Automated Data Harmonization
Aim 1: Conduct an exploratory analysis of responses provided for the medication list to identify variations (e.g., typos, specific acronyms, etc.) commonly used for drug names.
Aim 2: Develop a set of data generators to produce identified variations using the therapeutic_agent entity in the NCI Thesaurus.
Aim 3: Train and evaluate a harmonization model with the data generators to semi-automatically standardize responses for medication names.
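As a rough illustration of the kind of data generator described in Aim 2, the sketch below produces single-edit misspellings of a canonical drug name and pairs each one with its source name, which is the form of labeled data a harmonization model in Aim 3 could train on. Everything here is an assumption made for illustration: the function names typo_variants and training_pairs are hypothetical, the three edit operations (delete, swap, duplicate) stand in for whatever variations Aim 1 actually identifies, and the hard-coded names stand in for terms that would be drawn from the NCI Thesaurus.

```python
import random


def typo_variants(name, n=5, seed=0):
    """Return up to n distinct single-edit misspellings of name (hypothetical helper)."""
    if not name:
        return []
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    variants = set()
    attempts = 0
    # Cap the attempts so very short names cannot cause an endless loop.
    while len(variants) < n and attempts < 100 * n:
        attempts += 1
        chars = list(name)
        i = rng.randrange(len(chars))
        op = rng.choice(["delete", "swap", "duplicate"])
        if op == "delete" and len(chars) > 1:
            del chars[i]  # dropped letter
        elif op == "swap" and i + 1 < len(chars):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]  # transposed letters
        elif op == "duplicate":
            chars.insert(i, chars[i])  # doubled letter
        candidate = "".join(chars)
        if candidate != name:
            variants.add(candidate)
    return sorted(variants)


def training_pairs(canonical_names, n=5, seed=0):
    """Pair each noisy variant with the canonical name it was generated from."""
    pairs = []
    for k, name in enumerate(canonical_names):
        for variant in typo_variants(name, n=n, seed=seed + k):
            pairs.append((variant, name))
    return pairs


# Assumption: a hard-coded stand-in for names taken from the NCI Thesaurus.
pairs = training_pairs(["cisplatin", "tamoxifen"], n=3)
for noisy, canonical in pairs:
    print(f"{noisy} -> {canonical}")
```

Seeding the generator keeps the synthetic data reproducible across runs, which matters when the same generated set is used for both training and evaluation. Acronym or brand-name variations would need their own generators and are not covered by this sketch.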