Project 5
Automating Data Abstraction for the Generation of Clinical Registries
This project engineers a natural language processing (NLP) pipeline that uses large language models (LLMs) to extract information from unstructured clinical notes, automating the generation of clinical registries.
This project achieved the following:
- Formulated the LLM prompt and built an autograder to evaluate pipeline performance.
- Presented interim results at the Harvard DSI South Africa Training Program monthly seminar, in collaboration with two data scientists at Mayo Clinic.
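
The autograder mentioned above is not described in detail here. As a rough illustration only, one simple way to score extracted registry fields against manually abstracted reference records is per-field exact-match accuracy. The function names (`normalize`, `grade`), the field names, and the sample records below are all hypothetical and are not taken from the project itself.

```python
from typing import Dict, List


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def grade(predictions: List[Dict[str, str]],
          gold: List[Dict[str, str]]) -> Dict[str, float]:
    """Return per-field exact-match accuracy over paired records.

    Assumption: predictions[i] and gold[i] describe the same clinical note,
    and the gold record defines which fields are scored.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")

    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for pred, ref in zip(predictions, gold):
        for field, expected in ref.items():
            total[field] = total.get(field, 0) + 1
            # A missing predicted field counts as an empty (wrong) answer.
            if normalize(pred.get(field, "")) == normalize(expected):
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / total[field] for field in total}


# Made-up records purely for illustration (not real patient data).
gold_records = [
    {"diagnosis": "Type 2 diabetes", "smoker": "no"},
    {"diagnosis": "Hypertension", "smoker": "yes"},
]
llm_records = [
    {"diagnosis": "type 2  diabetes", "smoker": "no"},
    {"diagnosis": "Hypertension", "smoker": "no"},
]

scores = grade(llm_records, gold_records)
print(scores)
```

Exact match after light normalization is the strictest and simplest choice; a real autograder for clinical text might instead need synonym handling, partial credit, or clinician review for free-text fields.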