Welcome to the Big Data and Hadoop course! The massive increase in the availability of data has made its storage, management, and analysis extremely challenging. Various tools, technologies, and frameworks have emerged to address this challenge. Apache Hadoop is one such framework: it lets us handle big data by making distributed computing easier. Hadoop abstracts away concerns such as reliability, distributed file management, and distributed processing. In this course, we will start by understanding the characteristics of big data and the fundamental concepts of cloud computing. We will then explore the Hadoop ecosystem, specifically HDFS, MapReduce, Pig, and NoSQL databases. Our objective is to handle big data effectively and to build web applications and RESTful services in the cloud. This is an introductory course focused on the breadth of the big data landscape.
This is a group project. Groups must have 3 or 4 members. Submit a one-page report before the project demonstration.
There is no prescribed text for this course. Readings will be shared during the lectures.