Python

Uncovering Inequities in Green Space (Green Space Data Challenge)

With rapid urbanization, green spaces such as parks, gardens, and other open areas are increasingly important to the quality of life of urban residents. As air and water pollution and overcrowding become part of everyday city life, the physical and mental health of urban communities has gained increasing attention in recent years. Not surprisingly, green spaces are part of the response. Open spaces such as parks give communities a cleaner environment that protects public health, for example by reducing the urban heat island effect and improving air quality. They also support social health by increasing community cohesion and wellbeing, and they can raise property values, among other positive environmental, social, and financial outcomes. As a result, the supply of green space and the ease with which it can be accessed are key concerns in urban planning and policy-making.

BrainSuite Demonstration

BrainSuite is a collection of open source software tools that enable largely automated processing of magnetic resonance images (MRI) of the human brain. The major functionality of these tools is to extract and parameterize the inner and outer surfaces of the cerebral cortex, to segment and label gray and white matter structures, and to analyze diffusion imaging data. BrainSuite also provides several tools for visualizing and interacting with the data.

Penn Artificial Intelligence Technology Collaboratory for Healthy Aging

I worked as a research assistant on this project. Through engagement and collaboration with stakeholders and academic and industry experts, I applied text mining to build a portal for AD research that serves as a central resource and knowledge base for technology identification and training.

Predicting Air Quality Index in India

This was my final project for CIS 545 (Big Data Analysis) in the Spring 2022 semester. We worked as a team on this air pollution big data problem because air is what keeps humans alive, and monitoring it and understanding its quality is of immense importance to our well-being. In the end, our project also won the Best Visualization outstanding project award among the whole class.

Fulton Bank Customer Default Prediction

This was my very first datathon, and a unique experience. Teams made up of both Fulton Bank employees and Penn students were challenged to develop insights and recommendations for Fulton Bank, using the bank's data, around selected business challenges in Consumer and Small Business, Fulton Financial Advisors, Information Technology, Marketing, and Risk Management.

Evaluating Academic Performance of Students Learning in Open University

Predicting students’ academic performance at school using regression methods is not a new area of interest. Machine learning methods, however, are relatively new in this field and have been flourishing in recent years. According to Ghorbani and Ghousi (2020), due to technological advancements, predicting students’ performance at school is among the most beneficial and significant research topics today. We therefore believe it is a meaningful area to focus on, and we decided to analyze the Open University Learning Analytics Dataset to study students’ academic performance.

MSSP 608: Practical Machine Learning Methods

This course prepared me to use machine learning tools effectively in applied contexts and to build skills including (1) feature representations of spreadsheet-based or text datasets; (2) training classification and regression models for prediction tasks; (3) evaluation of machine learning model accuracy and error analysis; and (4) reasoning about predictive models and making tradeoffs like bias vs. variance, granularity and annotation complexity in labeled training data, and the ethical application of predictive modeling to human-centered data.

MSSP 607: Practical Programming for Data Science

This course familiarized me with the core concepts of programming and the practice of software development for data-intensive applications in industry and government. After this course, I am comfortable (1) writing code to save to and load from files and spreadsheets using basic data structures like strings, lists, and maps; (2) manipulating data with code to perform tasks like generating aggregate statistics and filtering data into subsets; (3) effectively communicating findings from interactive, exploratory programming with others; and (4) working with technical teams, using best practices of software development when building line-of-business applications.
