Predicting Air Quality Index in India
CIS 545 Final Project
By Jia Xu, Yuqin Zhang, Yuluan Cao in Python, Big Data, Machine Learning
April 20, 2022
Air pollution has become one of the largest environmental health threats around the globe, and India is among the most polluted countries in the world. According to the 2021 World Air Quality Report, 12 of the 15 most polluted regional cities are in India.
We implemented several models to predict the Air Quality Index (AQI) at the station level in India and evaluated their performance: linear regression, gradient boosting regression, a neural network, and a SARIMA time series model.
Involvement
Performed exploratory data analysis and data visualization with Plotly at the state and city level; built and evaluated the Gradient Boosting Regression and Neural Network models that predict the Air Quality Index in India; and summarized findings on spatial patterns, temporal patterns, and feature importance.
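To give a feel for what fitting a gradient boosting regressor and reading off feature importance can look like, here is a minimal sketch using scikit-learn. It is not the project's code: the data is synthetic, the column names (`pm25`, `pm10`, `no2`, `noise`) are hypothetical, and the relationship between pollutants and AQI is an assumed toy formula chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the project's dataset).
rng = np.random.default_rng(0)
feature_names = ["pm25", "pm10", "no2", "noise"]  # hypothetical column names
X = rng.uniform(0.0, 200.0, size=(1000, 4))

# Assumed toy relationship: the target depends mostly on pm25,
# less on pm10 and no2, and not at all on the "noise" column.
y = 1.5 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0.0, 5.0, size=1000)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Illustrative hyperparameters, not tuned values from the project.
model = GradientBoostingRegressor(
    n_estimators=200, max_depth=3, learning_rate=0.1, random_state=0
)
model.fit(X_train, y_train)

# Evaluate on the held-out rows and rank features by impurity-based importance.
mae = mean_absolute_error(y_test, model.predict(X_test))
importances = dict(zip(feature_names, model.feature_importances_))
top_feature = max(importances, key=importances.get)

print(f"Test MAE: {mae:.2f}")
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

On real station data, the same pattern would apply after loading the measurements into a feature matrix, though a time-aware train/test split would likely be a better fit than a random one for temporally ordered readings.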