An ACM Student Member

My current research is focused on analyzing source code using Information Retrieval techniques.

Spotting Familiar Source Code Structures for Program Comprehension. 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, Bergamo, Italy, August 30 to September 4, 2015. - Venkatesh Vinayakarao.

Paper | Poster

Developers deal with the persistent problem of understanding non-trivial code snippets. To understand a given implementation, its issues, and the available choices, developers will benefit from reading relevant discussions and descriptions on the web. However, there is no easy way to know the relevant natural language terms needed to reach such descriptions from a code snippet, especially if the documentation is inadequate and the vocabulary used in the code is not helpful for web search. We propose an approach to solve this problem using a repository of topics and associated structurally variant snippets collected from a discussion forum. In this ongoing work, we take Java methods from the code samples of three Java books, match them with the repository, and associate the topics with 76.9% precision and 66.7% recall.
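For readers unfamiliar with the two measures reported above, here is a minimal sketch of how precision and recall can be computed for topic association. The function name and the example topic sets are hypothetical illustrations; they are not data or code from the paper.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for the topics an approach associated
    with a method, measured against the topics judged relevant."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: topics associated with a single Java method.
predicted_topics = {"binary search", "recursion", "sorting"}
relevant_topics = {"binary search", "recursion", "arrays", "divide and conquer"}

precision, recall = precision_recall(predicted_topics, relevant_topics)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision penalizes topics that were associated but are not relevant, while recall penalizes relevant topics that were missed, which is why the two figures are reported together.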

Structurally Heterogeneous Source Code Examples from Unstructured Knowledge Sources. Published in ACM SIGPLAN 2015 Workshop on Partial Evaluation and Program Manipulation, PEPM 2015. - Venkatesh Vinayakarao, Rahul Purandare, Aditya Nori.

Paper | Slides

Software developers rarely write code from scratch. With the existence of Wikipedia, discussion forums, books and blogs, it is hard to imagine a software developer not looking up these sources for sample code while building any non-trivial software system. While researchers have proposed approaches to retrieve relevant posts and code snippets, the need for finding variant implementations of functionally similar code snippets has been ignored. In this work, we propose an approach to automatically create a repository of structurally heterogeneous but functionally similar source code examples from unstructured sources. We evaluate the approach on Stack Overflow, a discussion forum that has approximately 19 million posts. The results of our evaluation indicate that the approach extracts structurally different snippets with a precision of 83%. A repository of such heterogeneous source code examples will be useful to programmers in learning different implementation strategies and to researchers working on problems such as program comprehension, semantic clones and code search.
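To give a feel for what "structurally heterogeneous but functionally similar" can mean, here is a rough sketch that fingerprints a Java snippet by the ordered sequence of control-flow keywords it contains. This is an assumption made purely for illustration and is not the technique used in the paper; the keyword list, the function names, and the two factorial snippets are all hypothetical.

```python
import re

# Control-flow keywords used as a crude structural fingerprint (an assumption).
CONTROL_KEYWORDS = ("for", "while", "do", "if", "else", "switch",
                    "try", "catch", "return")
_KEYWORD_PATTERN = re.compile(r"\b(" + "|".join(CONTROL_KEYWORDS) + r")\b")


def structural_signature(snippet: str) -> tuple:
    """Return the ordered control-flow keywords found in a Java snippet."""
    # Blank out string literals, then comments, so that keywords inside
    # them are ignored. A real tool would use a Java parser instead.
    cleaned = re.sub(r'"(?:\\.|[^"\\])*"', '""', snippet)
    cleaned = re.sub(r"//[^\n]*|/\*.*?\*/", "", cleaned, flags=re.DOTALL)
    return tuple(_KEYWORD_PATTERN.findall(cleaned))


def structurally_different(a: str, b: str) -> bool:
    """True when two snippets have different keyword signatures."""
    return structural_signature(a) != structural_signature(b)


# Two hypothetical snippets that compute the same function differently.
loop_version = """
int factorial(int n) {
    int result = 1;
    for (int i = 2; i <= n; i++) {
        result *= i;
    }
    return result;
}
"""

recursive_version = """
int factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}
"""

print(structural_signature(loop_version))
print(structural_signature(recursive_version))
print(structurally_different(loop_version, recursive_version))
```

A keyword sequence is only a coarse proxy for structure: it cannot tell whether two snippets are functionally similar, and it misses differences that do not show up in control flow.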

Implementation: GitHub

 Mining Code Snippets
Structure in Source Code

If you are not having fun, you are not the best researcher you can be!