The Magic of Models: The journey from complex problems to beautiful solutions often requires a stop at a station called models! Abstracting away unnecessary details helps us focus on the core of the problem and thereby find elegant solutions. In this talk, we apply our knowledge of vectors to model a real-world problem that arises wherever we need to compare natural language (say, English) sentences. Research Science Initiative - Summer Programme, CMI, Chennai, 28th May, 2019. Slides
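One common way to make the vector model above concrete is a bag-of-words representation compared by cosine similarity. The sketch below is an illustration under that assumption, not the talk's own material; the function names `sentence_vector` and `cosine_similarity` are hypothetical, and whitespace tokenization is a deliberate simplification.

```python
import math
from collections import Counter


def sentence_vector(sentence):
    """Map a sentence to a sparse count vector (word -> count).

    Assumption: lowercase, whitespace-separated tokens are good enough
    for illustration; real systems use a proper tokenizer.
    """
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    v1 = sentence_vector("the cat sat")
    v2 = sentence_vector("the cat ran")
    print(cosine_similarity(v1, v2))
```

Sentences that share no words score 0, and a sentence compared with itself scores 1 (up to floating-point rounding), which is what makes the angle between vectors a convenient notion of similarity.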
Big Data - Text Processing: Principles and Practices: Text processing is essential for many applications, and text remains our primary medium of communication over the web. In this talk, I explain the need for data structures using the example of suffix trees. In the second part of the talk, I show how vector space models have been used in search engines. With these two examples, we get a glimpse of the principles and practices of dealing with big textual data. SSN Analytica, SSN Institutions, Chennai, 11th May, 2019. Slides
Information Retrieval: An Overview: This talk starts with the notion and role of "information" in daily life. I then discuss the library and information science practices that led to the field of digital libraries and, later, to search engines. I introduce Information Retrieval (IR), covering the basics of indexing and query processing. As part of the indexing step, I discuss the vector space model for representing text, which leads to postings lists. As part of query processing, I discuss simple boolean queries (AND, OR, NOT) answered by merging the postings. We end the talk with a note on comparing and evaluating IR systems. Post-IOI (International Olympiad in Informatics) Training Camp Workshop, Chennai Mathematical Institute (CMI), 9th May, 2019. Slides
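The postings-merge idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming an inverted index that maps each term to a sorted list of document IDs; the helper names (`build_index`, `intersect`, `union`, `complement`) and the three toy documents are hypothetical and do not come from the talk.

```python
def build_index(docs):
    """Build an inverted index: term -> sorted list of document IDs."""
    index = {}
    # Visiting documents in ID order keeps every postings list sorted.
    for doc_id in sorted(docs):
        for term in set(docs[doc_id].lower().split()):
            index.setdefault(term, []).append(doc_id)
    return index


def intersect(p1, p2):
    """AND: walk both sorted lists in step, keeping IDs found in both."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out


def union(p1, p2):
    """OR: merge both sorted lists, emitting each ID once."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            out.append(p1[i])
            i += 1
        else:
            out.append(p2[j])
            j += 1
    out.extend(p1[i:])
    out.extend(p2[j:])
    return out


def complement(postings, all_ids):
    """NOT: every known document ID that is absent from the postings."""
    present = set(postings)
    return [doc_id for doc_id in all_ids if doc_id not in present]


if __name__ == "__main__":
    docs = {1: "new home sales", 2: "home sales rise", 3: "new car sales"}
    index = build_index(docs)
    print(intersect(index["new"], index["home"]))
    print(union(index["new"], index["home"]))
    print(complement(index["home"], sorted(docs)))
```

Because both lists are sorted, AND and OR each take time linear in the combined length of the two postings lists, which is the reason indexes keep postings in document-ID order.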
Code Variants and Their Retrieval: Information plays a crucial role in our day-to-day life. In this talk, I introduce the principles and practices behind information retrieval using the case of code variant search. Developers routinely analyze existing code, look for better reuse alternatives, and aim to develop high-quality code. Our research shows that 25% to 40% of developer discussions in bug reports are about variants. However, searching for such code variants over the web poses several challenges. Using knowledge-driven approaches, we propose a system to automate the search for code variants. Chennai Mathematical Institute (CMI), 9th January, 2019.
Principles and Practices Behind Building the Search Engines: Search engines play a key role in our daily life. Yet building a search engine is scientifically involved: it must respond in under a second while fetching only relevant content and all of the relevant content. We discuss some principles and practices of building search engines in this talk. IIIT Sri City, 6th October, 2018. Research Expo Material: Download here (zip file [5.01 MB])
Knowledge-Discovery based Approaches for Code Snippet Search: Compared to text, code snippets pose several challenges for retrieval. In this talk, we propose a MATF-based retrieval model, and hope to inspire the audience to see code search as a promising area of research. IIIT Sri City, 20th April, 2018.
Term Frequency and its Variants in Retrieval Models: Term frequency models and term frequency weighting schemes appear in several variants in the design of retrieval models for text. In this talk, we review the background, intuitions, and formulations of some of these variants. IIIT Delhi, 11th April, 2018.
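The abstract does not say which variants the talk covers. As an illustration only, the sketch below computes four textbook term frequency weightings (raw count, binary, log-scaled, and augmented); the function name `tf_variants` is hypothetical, and giving absent terms an augmented weight of 0 is a choice made here, not part of the formula itself.

```python
import math
from collections import Counter


def tf_variants(term, document_terms):
    """Return several common term frequency weights for one term.

    document_terms is the document as a list of tokens.
    """
    counts = Counter(document_terms)
    raw = counts[term]  # Counter returns 0 for absent terms
    max_count = max(counts.values()) if counts else 0
    return {
        # Raw count of the term in the document.
        "raw": raw,
        # Presence or absence only.
        "binary": 1 if raw > 0 else 0,
        # Log scaling dampens the effect of repeated occurrences.
        "log": 1 + math.log(raw) if raw > 0 else 0.0,
        # Normalized by the most frequent term, to offset document length.
        # Assumption: absent terms get 0 rather than the formula's 0.5.
        "augmented": 0.5 + 0.5 * raw / max_count if raw > 0 else 0.0,
    }


if __name__ == "__main__":
    doc = "to be or not to be".split()
    print(tf_variants("to", doc))
    print(tf_variants("or", doc))
```

The intuition shared by the damped variants is that a term occurring twenty times is rarely twenty times as important as a term occurring once.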
Code Search - Challenges and Applications: Searching for source code is challenging for several reasons. The naturalness of source code cannot be compared with that of natural languages like English. How does this impact indexing and querying? In this talk, I highlight the major challenges and research opportunities in this field. Neemrana University, Rajasthan, 24th September, 2017.
Porting InnerEye to Azure Cloud: InnerEye is a research project that uses state-of-the-art machine learning technology to build innovative tools for the automatic, quantitative analysis of three-dimensional radiological images. In this talk, I summarize my 12 weeks at Microsoft Research, where I successfully migrated the InnerEye ML models to the Azure cloud. Microsoft Research, Cambridge, United Kingdom, June 29, 2017.
Modeling Source Code to Support Retrieval-Based Applications: Advances in text retrieval do not apply directly to source code retrieval because the characteristics of source code differ from those of text. Here, I discuss the role of source code models in retrieval. The Doctoral Consortium at the Tenth ACM International Conference on Web Search and Data Mining, WSDM 2017, Cambridge, United Kingdom, February 6, 2017.
Reflections from a PM Research Fellow: My PhD is supported by the Prime Minister's Research Fellowship Scheme. In this talk, I first give a glimpse of my research, then discuss the challenges of doing a PhD in India, and finally elaborate on ways in which the PM fellowship helps in overcoming some of these hurdles. Panjab University, Chandigarh, 24th March, 2015.
Bayesian Data Analysis: Bayesian data analysis gives us a framework for updating our beliefs systematically. Suppose we know nothing about the fairness of a coin, and we see four consecutive heads in the first four flips. What should we infer about the coin? Is it biased towards heads? Should we bet on another head? IIIT Delhi, 12th November, 2014.
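One standard way to work through the coin question is a Beta-Binomial update. The sketch below assumes that "knowing nothing" is encoded as a uniform Beta(1, 1) prior on the probability of heads; that choice is an assumption made here, not something the abstract states, and the helper names are hypothetical.

```python
from fractions import Fraction


def update_beta(alpha, beta, heads, tails):
    """Posterior Beta parameters after observing the given flips."""
    return alpha + heads, beta + tails


def posterior_mean(alpha, beta):
    """Mean of Beta(alpha, beta): the predictive probability of heads."""
    return Fraction(alpha, alpha + beta)


# Assumed prior: Beta(1, 1), i.e. uniform over the heads probability p.
alpha, beta = update_beta(1, 1, heads=4, tails=0)

# Probability that the next flip is a head, given four heads so far.
p_next_head = posterior_mean(alpha, beta)

# For Beta(a, 1) the CDF is x**a, so P(p > 1/2) = 1 - (1/2)**a.
# This shortcut is valid only because beta is still 1 (no tails seen).
p_biased_to_heads = 1 - Fraction(1, 2) ** alpha

if __name__ == "__main__":
    print(alpha, beta)
    print(p_next_head)
    print(p_biased_to_heads)
```

Under this prior the posterior is Beta(5, 1), the chance of another head is 5/6, and the probability that the coin favors heads is 31/32. A different prior, for example one concentrated near a fair coin, would give a more cautious answer.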
Swarm Intelligence: How can loosely connected agents display collective intelligence? Can we learn something from them? Models based on metric and topological distance simulate swarm behavior. Several algorithms, such as Ant Colony Optimization and Particle Swarm Optimization, find applications in crowd simulation and ad-hoc networks. Let us take a look. IIIT Delhi, 1st March, 2014.
Beauty of Information Retrieval: Information retrieval is a fast-growing field concerned with searching to satisfy an information need. In this talk, we discuss some key challenges and interesting applications. IIIT Delhi, 1st March, 2014.