How I Re-Configured My Brain to Read Research Papers
I want to preface this blog post by saying that I consider myself a decently good reader (in the sense of both understanding and retaining information), but despite that, I struggled when reading my first couple of research papers. I want this blog post to serve as proof that even if you feel like you might not be able to properly digest a 20+ page piece of dense, jargon-heavy technical writing, you actually can.
Let’s begin.
About a week ago, I decided to challenge myself to build an at-home version of Meta’s Segment Anything Model (SAM). This came as a result of two things. First, I was building a project using SAM, and while at a high level I knew why it worked the way it did, I wanted to properly understand it. Second, I realized that in order to ever have a shot at doing meaningful research in model development, advancement, or architecture, I would actually have to know the ins and outs of the models. I will reserve the details about the architecture of Vision Transformers and more for another blog post.
My starting point, the famous 2003 paper by Bengio et al. (A Neural Probabilistic Language Model), is 19 pages, which might seem intimidating at first, but once you account for figures and references, it is actually not that long. Before this, my usual approach to most papers was to start at the abstract and read silently, straight through to the conclusion. Not that this approach is particularly bad, but I felt that I could not retain certain information as well, and that if there were any technically heavy sections, I would get lost pretty quickly.
So I went on YouTube and found a couple of videos on how certain PhD students and researchers read their papers. The approach I liked the most was Dario Tringali’s. He reads the abstract, then skips directly to the conclusion, and then goes over the results. After that, he goes over the other potentially important parts, like the methods and introduction, but only if he needs to. The other thing he does is ask three questions of every paper:
What is the system that this paper describes?
What are the authors of the paper doing?
Why is this paper important?
While I did not employ his reading order, I did apply the question approach to the paper. As I read, I constantly reflected back on these three questions, trying to tie every new concept I learned back to the overall idea of what a multi-layer perceptron is. I found this very helpful, and I actually felt like I understood what I was reading. Another thing I changed is that I started reading out loud instead of in my head. I did this because reading in my head sometimes puts me into auto-pilot mode, whereas reading out loud keeps me conscious of every single word’s importance. Finally, I drew all over the paper: I highlighted, sketched figures, underlined, and so on. This really helped me capture the important information, and anything I did not understand I highlighted in red.
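Since the paper I kept tying every concept back to is, at its core, a multi-layer perceptron applied to language, here is a minimal sketch of that core idea. This is my own illustration (with made-up toy dimensions and random, untrained weights), not code from the paper: each context word maps to a learned embedding, the embeddings are concatenated and passed through a tanh hidden layer, and a softmax over the vocabulary predicts the next word.

```python
import math
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "on", "mat"]
V = len(VOCAB)
EMB_DIM, CONTEXT, HIDDEN = 4, 2, 8  # toy sizes, not the paper's

# Randomly initialized parameters; training is omitted in this sketch.
C = [[random.uniform(-0.1, 0.1) for _ in range(EMB_DIM)] for _ in range(V)]
W_h = [[random.uniform(-0.1, 0.1) for _ in range(CONTEXT * EMB_DIM)]
       for _ in range(HIDDEN)]
W_o = [[random.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in range(V)]

def next_word_probs(context_ids):
    # 1. Look up and concatenate the context word embeddings.
    x = [v for i in context_ids for v in C[i]]
    # 2. tanh hidden layer -- the "multi-layer perceptron" part.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_h]
    # 3. Softmax over the vocabulary gives next-word probabilities.
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in W_o]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Probability distribution over the next word, given the context "the cat".
probs = next_word_probs([0, 1])
```

With untrained weights the distribution is near-uniform, but the shape of the computation (embed, concatenate, hidden layer, softmax) is the concept I kept anchoring everything else in the paper to.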
After I finished the paper, I uploaded it to NotebookLM for two reasons. First, I wanted to quiz myself on my understanding of the paper, and NotebookLM is actually great at generating quizzes that are relatively helpful for testing your knowledge. Second, I wanted to ask questions about the parts I did not understand. For example, I had highlighted ‘text corpora’ because the context was not enough for me to understand what it meant, and NotebookLM gave me a really good explanation on the first attempt. I did this with many other parts of the paper, especially parameters and mathematical formulas.
This new approach to reading papers has made it so much more enjoyable for me to digest all this new information. It has sparked a new sense of excitement every time I get to dive deep into these technically challenging topics. I hope this small post helps you with your own paper-reading approach.