Role Snapshot
A student-level research data engineering position supporting the Division of Biomedical Informatics in developing data-driven public health research using large language models and machine learning techniques.
Key Responsibilities: Perform data preprocessing, prompt engineering, and fine-tuning/evaluation of LLMs for public health applications. May incorporate geospatial analysis and spatial context into research datasets using modern ML workflows.
Skills & Tools: Strong Python programming skills, familiarity with machine learning frameworks, and experience with data preprocessing. Knowledge of geospatial tools such as GeoPandas is a plus.
Qualifications: Current student or recent graduate; prior coursework in machine learning is strongly encouraged. Demonstrated Python proficiency and an interest in public health research are required.
Location: In-Person
Compensation: $55K–$75K/yr (estimated)
Job Description
The Department of Internal Medicine, Division of Biomedical Informatics, is seeking a student data engineer with strong Python programming skills to support data-driven public health research using large language models and modern machine learning techniques. The candidate will work on tasks including data preprocessing, prompt engineering, and fine-tuning and evaluating LLMs.
Familiarity with geospatial concepts and tools such as GeoPandas is a plus, as the position may involve incorporating spatial context into public health analyses.
Candidates with prior coursework in machine learning are strongly encouraged to apply.

