Bio
As far as labels go, Data Scientist is the one that fits best, although my day-to-day work spans quite a bit: building LLM apps for clinical NLP, designing databases and ETL pipelines, managing DevOps infrastructure, and occasionally doing something that actually looks like machine learning — all in the context of medical research at UT Southwestern.
I like working at the intersection of messy real-world data, scientific questions, and systems that actually have to run in production. This site is a place where I collect projects, experiments, and ideas that sit in that overlap.
When I started college, I was the prototypical premed — mostly because "being a doctor sounds cool." Then I remembered that in fifth grade I had to leave class during a lesson on the circulatory system because I nearly passed out. Medicine might not have been my calling, but science definitely was.
I majored in chemistry, which I loved, but in my final semester I took an informatics course and discovered something magical: results at the click of a button instead of a five-hour organic synthesis. The joy of iterating rapidly with software led me to my first role at UT Southwestern (UTSW), as a "clinical data specialist" in the radiation oncology department. It was my first experience with a job title not matching what I actually did: in practice, I was mostly doing bioinformatics, analyzing RNA-seq and microbiome data.
After completing my Master's degree in Data Science, I moved to my current role. Ironically, despite being in the bioinformatics department, I now do considerably less bioinformatics and considerably more systems, infrastructure, and applied AI.
What I've realized over the years is that my north star is understanding — or occasionally uncovering — how the hotdog is made. I'm happiest when I can peel back abstractions, trace systems to their roots, and figure out what's really going on under the hood, whether that's a statistical method, a production pipeline, or a messy real-world dataset.
Quick Facts
Education
- Master of Science in Data Science
- Bachelor of Science in Chemistry
- Both from the University of Texas at Austin
Experience
- Data Scientist, Bioinformatics — Sep 2023 – Present
- Clinical Data Specialist, Radiation Oncology — Feb 2020 – Aug 2023
- Both at the University of Texas Southwestern Medical Center
Public Service
- Designed and maintain the website for a regional food bank, Feeding Families of Alabama (WordPress)
My Cat Bagel
Publication Highlights
Hein D, Christie A, Holcomb M, Xie B, Jain AJ, Vento J, Rakheja N, Shakur AH, Christley S, Cowell LG, Brugarolas J, Jamieson AR, Kapur P.
Iterative refinement and goal articulation to optimize large language models for clinical information extraction.
npj Digital Medicine — 2025
Sanford NN, Shi Q, Hein D, Hall WA.
Benchmarks of success in radiotherapy vs systemic therapy: National Clinical Trials Network (NCTN) randomized controlled trials sponsored by the National Cancer Institute (NCI).
JNCI: Journal of the National Cancer Institute — 2025
Jamieson AR, Holcomb MJ, Dalton TO, Campbell KK, Vedovato S, Shakur AH, Kang S, Hein D, Lawson J, Danuser G, Scott DJ.
Rubrics to prompts: assessing medical student post-encounter notes with AI.
NEJM AI — 2024
Rhead B, Hein D*, Pouliot Y, Guinney J, De La Vega FM, Sanford NN.
Association of genetic ancestry with molecular tumor profiles in colorectal cancer.
Genome Medicine (*co-first author) — 2024
Lin G, Hein D, Liu P, Singal AG, Sanford NN.
Screening implications for distribution of colorectal cancer subsite by age and role of flexible sigmoidoscopy.
Cancers — 2024
Hein D, Coughlin LA, Poulides N, Koh AY, Sanford NN.
Assessment of distinct gut microbiome signatures in a diverse cohort of patients undergoing definitive treatment for rectal cancer.
Journal of Immunotherapy and Precision Oncology — 2024
Hein D, Deng W, Bleile M, Kazmi SA, Rhead B, De La Vega FM, Jones AL, Kainthla R, Jiang W, Cantarel B, Sanford NN.
Racial and ethnic differences in genomic profiling of early onset colorectal cancer.
JNCI: Journal of the National Cancer Institute — 2022
Full record: ORCID 0000-0002-8625-9528 · Google Scholar