Bayesian Modeling of High-Dimensional Verbal Autopsy Data
I have been working on methods for the analysis of verbal autopsy (VA) data since 2012. VA is a widely adopted tool for estimating disease burdens in the developing world by interviewing caregivers of the deceased. A key first step in assigning causes of death using VA is to understand and characterize the high-dimensional symptoms and covariates associated with each cause of death. Our team has been developing Bayesian latent variable models to discover meaningful symptom-cause relationships from messy and limited data. Some recent papers in this area:
Learning Under Distribution Shift
Many important global and public health problems involve learning and prediction, and the heterogeneity in data distributions needs to be taken into account when these predictions are used for policy making. Recently, our team has developed several algorithms for cause-of-death assignment using VA data from multiple domains (countries, subnational regions, time periods, demographic groups, etc.) to achieve more robust mortality estimation. We have also developed a federated learning framework to ensemble models trained separately on different domains. We are extending these models to broader applications as well. Some recent papers in this area:
Small Area Estimation in Space and Time
Small area estimation (SAE) refers to the process of producing estimates of quantities of interest, such as the prevalence of diseases, for specific geographic areas, even when data are sparse or unavailable. We have been developing SAE methods using survey data in a variety of settings for binary, continuous, and composite indicators. Some recent papers in this area:
Full Data Lifecycle of Statistical Analysis
Statistical inference from the available data alone is often insufficient to produce relevant scientific insights. Much of my recent work can be characterized as designing principled workflows that span the entire data lifecycle, rather than just the analysis phase. Topics that I am interested in include (1) active and targeted design of experiments and data collection paradigms, (2) methods to elicit knowledge from human experts and integrate incomplete domain knowledge into analysis, (3) uncertainty propagation and quantification across the data pipeline for decision making, and (4) task-driven model comparison and selection. Some recent papers in this area:
Monitoring and Understanding COVID-19 and Disease Outbreaks
I have worked on methods for monitoring and quantifying the prevalence and transmission of COVID-19, and for evaluating the impact of the pandemic. Some recent papers in this area:
Open Source Software for Global Health Practitioners
I am a strong believer in open science and open-source software. Our group has developed a collection of tools for verbal autopsy analysis and small area estimation that are widely adopted by practitioners worldwide. More details can be found on the software page. We work closely with international organizations and have conducted training workshops in low- and middle-income countries (LMICs) over the years. Some papers focusing on software: