CIS 545: Big Data Analytics
Programming Languages: Python, SQL, HTML
By Profs. Susan Davidson and Zachary Ives at Penn Engineering
January 12, 2022
Abstract
In the era of big data, we increasingly face the challenge of converting massive amounts of data into actionable knowledge. Given the limits of individual machines (compute power, memory, bandwidth), the solution is more and more often to clean, integrate, and process the data in parallel on many machines, using statistical machine learning techniques. This course focuses on the fundamentals of scaling computation to handle common data analytics tasks. I learned about basic tasks in collecting, wrangling, and structuring data; programming models for performing certain kinds of computation scalably across many compute nodes; common approaches to converting algorithms to such programming models; standard toolkits for data analysis that provide a wide variety of primitives; and popular distributed frameworks for analytics tasks such as filtering, graph analysis, clustering, and classification.
Date
January 12 – April 30, 2022
Time
12:00 AM
Location
Philadelphia, United States