MentalChat16K
A benchmark dataset of 16k counseling conversations for training and evaluating LLMs in mental-health assistance. Fine-tuned LLaMA-2, Mistral, Vicuna, and Zephyr models improved by 30% over their base models on 7 metrics.
From benchmark datasets and fine-tuned LLMs to award-winning hackathon builds. Each project connects AI research with a real human need.
A system of LLMs fine-tuned on MentalChat16K that outperforms base models and baselines on 7 mental-health metrics — evaluated by human experts alongside LLM judges (GPT-3.5, Gemini Pro).
Attention-enhanced ensemble learning for robust detection of referable diabetic retinopathy across heterogeneous fundus image sources — accepted at the 55th ARVO Annual Meeting.
A dual-market growth strategy for China’s ¥7T senior-care industry: AI-enabled financial planning and personalized care, targeting a 3-year breakeven with 90%+ occupancy.
A GCP pipeline for multi-source political data: fine-tuning Whisper on Spanish political speech and analyzing 100k+ news articles to quantify sentiment shifts in the Chávez era. In collaboration with UPenn, Columbia, and UC Berkeley.
AI-powered feedback for motivational interviewing, grounded in the MITI framework — built to help counselors refine their craft.
Geospatial analysis of urban green-space access and its community-health implications. 1st place in the Green Space Data Challenge, organized by Georgetown University’s Massive Data Institute; invited to present the winning project at the APDU Annual Conference in Arlington, VA (Jul 2023).
Text mining to build an Alzheimer’s research portal — a knowledge base for identifying technologies that support healthy aging.
MRI-based classifiers that predict dementia development across cross-sectional and longitudinal cohorts.
Automated MRI processing of the human brain with the open-source BrainSuite toolkit — surface extraction, registration, and analysis.
Big-data analysis of air pollution across India — winner of the Best Visualization award in Penn’s CIS 545 Big Data Analytics.