Yihan Hu

MPhil student · Health Data Science / Epidemiology · University of Cambridge · yh623@cam.ac.uk

MRC Epidemiology Unit

MRC Biostatistics Unit

Institute of Metabolic Science

University of Cambridge

I’m an MPhil student in Health Data Science / Epidemiology at the University of Cambridge, based across the MRC Epidemiology Unit, the MRC Biostatistics Unit, and the Institute of Metabolic Science. My current research sits at two main frontiers: nutritional epidemiology, with a particular focus on ultra-processed foods (UPFs) and how exposure to them is shaped by the digital food environment; and spatial data science, with an emphasis on US-based applications — most recently the post-pandemic shift in US housing market seasonality and its mobility-driven mechanisms.

The four threads below trace the papers that have come out of this work so far:

Clinical AI evaluation — methodological rigour for clinical large language model applications. Recent papers argue that conversational clinical LLM apps should be evaluated under a retrieval-grounded protocol, not just AUC (JMIR AI, 2026), and that end-to-end LLM clinical triage skips the intermediate reasoning attributes that the rest of medicine uses (Clinical Imaging, 2026).

Digital food environment — moving food policy and exposure measurement from physical retail to algorithm-driven platforms. Recent work appears in The Lancet (2026), on why ultra-processed food policy needs an explicit digital track, and in Public Health Nutrition (2026), on the gap between content prevalence and adolescent exposure on TikTok.

US housing market dynamics — seasonality, mobility, and search-and-matching frameworks. Yifei Huang (Northwestern) and I documented the post-pandemic shift in US housing seasonality from May/June to March/April using 33 years of FHFA and Census data (Real Estate, 2025). Cemil Selcuk (Cardiff) and I extended the Ngai-Tenreyro framework to monthly frequency and showed that the spring shift can be explained by a corresponding change in household mobility timing alone (SSRN, 2026).

Applied machine learning — clustering, segmentation, and interval-valued data representations for retail analytics.

The fastest way to reach me about a specific paper is by email. The full publication list is on the publications page.

selected publications

  1. Front. Public Health
    Protocol-aware epidemic forecasting across heterogeneous public health surveillance systems
    Yihan Hu, Jingyuan Han, and Mingxin Liu
    Frontiers in Public Health, 2026
  2. JMIR AI
    Toward Retrieval-Grounded Evaluation for Conversational Large Language Model-Based Risk Assessment
    Yihan Hu
    JMIR AI, 2026
  3. Clin. Imaging
    Comment on “Classifying the clinical significance of common breast pain symptoms using a large language model, ChatGPT (GPT-4)”
    Yihan Hu
    Clinical Imaging, 2026
  4. Public Health Nutr.
    Content prevalence is not adolescent exposure in TikTok influencer food marketing surveillance
    Yihan Hu
    Public Health Nutrition, 2026
  5. Lancet
    Ultra-processed food policy must regulate the screen as well as the street
    Yihan Hu
    The Lancet, 2026
  6. SSRN
    From Summer to Spring: A Shift in US Housing Market Seasonality
    Yihan Hu and Cemil Selcuk
    SSRN Working Paper, 2026
  7. Real Estate
    Seasonality in the US Housing Market: Post-Pandemic Shifts and Regional Dynamics
    Yihan Hu and Yifei Huang
    Real Estate, 2025