Project 5

Automating Data Abstraction for the Generation of Clinical Registries

This project aims to engineer a natural language processing (NLP) pipeline that uses large language models (LLMs) to extract information from unstructured clinical notes, automating the generation of clinical registries.

This project achieved the following:

  • Formulated the LLM prompt and built an autograder to evaluate the pipeline's performance.
  • Presented the interim results at the Harvard DSI South Africa Training Program monthly seminar, in collaboration with two data scientists at Mayo Clinic.
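
As an illustration of what an autograder for this kind of pipeline might look like, the sketch below scores extracted registry fields against manually abstracted reference values. It is a minimal example under stated assumptions, not the project's actual implementation: the function names (normalize, grade), the field names, and the choice of exact-match scoring after light normalization are all hypothetical.

```python
from typing import Dict, List


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def grade(
    predictions: List[Dict[str, str]],
    gold: List[Dict[str, str]],
) -> Dict[str, float]:
    """Return per-field accuracy of extracted values against reference values.

    Each list element is one clinical note, represented as a mapping from
    registry field name to value. Scoring is exact match after normalization
    (an assumption; a real autograder might use fuzzy or clinical-code matching).
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must cover the same notes")

    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for pred, ref in zip(predictions, gold):
        for field, expected in ref.items():
            total[field] = total.get(field, 0) + 1
            # A field the pipeline failed to extract counts as incorrect.
            if normalize(pred.get(field, "")) == normalize(expected):
                correct[field] = correct.get(field, 0) + 1

    return {field: correct.get(field, 0) / total[field] for field in total}


# Hypothetical example data: two notes, two registry fields.
gold_records = [
    {"diagnosis": "Type 2 Diabetes", "smoker": "no"},
    {"diagnosis": "Hypertension", "smoker": "yes"},
]
extracted_records = [
    {"diagnosis": "type 2  diabetes", "smoker": "no"},
    {"diagnosis": "asthma", "smoker": "yes"},
]
scores = grade(extracted_records, gold_records)
```

Reporting accuracy per field rather than as a single number makes it easier to see which registry fields the prompt handles poorly.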