Welcome to the Big Data and Hadoop course! With the growing availability of data, its storage, management, and analysis have become extremely challenging. Various tools, technologies, and frameworks have emerged to help address this challenge. Apache Hadoop is one such framework: it enables us to handle big data by making distributed computing easier. Hadoop abstracts away concerns such as reliability, distributed file management, and distributed processing. In this course, we will start by understanding the characteristics of big data and the fundamental concepts of cloud computing. We will then explore the Hadoop ecosystem, specifically HDFS, MapReduce, Pig, and NoSQL databases. Our objective is to handle big data effectively and to build web applications and RESTful services in the cloud. This is an introductory course focused on the breadth of the big data landscape.
This is a group project. Allowed group sizes are 3 and 4. Your project will be evaluated based on the following:
There is no prescribed text for this course. Readings will be shared during the lectures.