Distributed Computing and Big Data
Instructor: Venkatesh Vinayakarao
Term: Jan - Apr 2025
TA: Aniket Tiwari, Rohit Roy
Welcome to the Distributed Computing and Big Data course! The massive increase in the availability of data has made its storage, management, and analysis extremely challenging. Various tools, technologies, and frameworks have emerged to help address this challenge. Apache Hadoop is one such framework: it lets us handle big data by making distributed computing easier. Hadoop abstracts away concerns such as reliability, distributed file management, and distributed processing. In this course, we start with the characteristics of big data and the fundamental concepts of cloud computing. We then explore the Hadoop ecosystem, specifically HDFS, MapReduce, Pig, and NoSQL databases. Our objective is to understand how big data can be handled effectively. We will also briefly discuss web application development, with a special focus on RESTful services. This is an introductory course focused on the breadth of the big data landscape.
Key Learning Objectives
At the end of this course, you should be able to:
- Understand the fundamentals of distributed storage, using Hadoop HDFS as an example.
- Understand the fundamentals of distributed processing, using the MapReduce framework and Pig scripts.
- Understand NoSQL database concepts, using MongoDB and/or HBase.
- Understand web services, with a focus on RESTful services.
Lecture Schedule
Evaluation
Instrument | Max Marks |
Mid Exam | 25% |
Final Exam | 35% |
Assignments (4 × 10%) | 40% |
Pre-requisites
None.
Resources
Text
There is no prescribed text for this course.
References
Optional Readings
If you are not having fun, you are not the best student you can be!