Most of us have encountered an advertisement for a clinical trial at some point, perhaps online or in the subway. But while these colorful banners often appeal to people with broad health symptoms, such as difficulty walking or hearing, or with specific conditions like schizophrenia, the reality is that qualifying for a clinical trial is very challenging.
Eligibility criteria must be specific and extensive to ensure that only suitable patients are recruited and that clinical trials remain safe and can be evaluated effectively. However, traditional approaches to extracting this information from unstructured trial text are often tedious and prone to human error.
To this end, IMO Health’s team of artificial intelligence (AI) experts, including natural language processing (NLP) scientists, biomedical engineers, and others, developed a generalizable and scalable GPT-based system to pull eligibility criteria from clinical trial documents across disease domains. A recent study demonstrates how combining such models with clinical NLP techniques can significantly streamline the patient recruitment process and expedite the construction of criteria knowledge bases, leading to advancements in medical knowledge and improved patient care.
Last month, IMO Health’s Surabhi Datta, PhD, Sr. Staff NLP Scientist, and Xiaoyan Wang, PhD, FAMIA, Chief Scientist and Senior Vice President, Life Sciences Solutions, presented these findings at a JAMIA Journal Club webinar. AMIA®, or the American Medical Informatics Association®, is a community of professionals dedicated to improving patient care and healthcare reform through informatics. JAMIA is AMIA’s peer-reviewed journal for biomedical and health informatics, covering all activities in the field.
Read on for a recap of this study and its significance.
Objective: Leverage clinical NLP for data extraction
Researchers have used AI and clinical NLP techniques for years to automatically extract eligibility criteria from clinical trial documents. They’ve adopted various approaches, and some have even proposed combining models (an approach called ensemble learning) to enhance results.
Recently, GPT-based models, such as GPT-4, have gained attention for their ability to understand and generate text. However, no prior work had investigated these large language models (LLMs) for criteria information extraction.
IMO Health scientists developed an AI system to close this gap and conducted a research study to assess its capabilities, communicate its strengths, and identify areas for improvement.
Methods: Training and evaluating AutoCriteria
The team pulled clinical trial data for nine diseases from ClinicalTrials.gov and built an information extraction system, AutoCriteria. For each disease, they used three trials for prompt design, five for prompt calibration, and randomly selected 20 to evaluate the system.
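For a concrete picture of the setup, here is a minimal sketch of how trials for one disease might be pulled and partitioned. It assumes the public ClinicalTrials.gov v2 REST API; the endpoint, parameters, condition name, and helper function are illustrative and are not taken from the published AutoCriteria code.

```python
import random

import requests

def fetch_trials(condition: str, n: int) -> list[dict]:
    """Fetch up to n study records for one condition from ClinicalTrials.gov."""
    resp = requests.get(
        "https://clinicaltrials.gov/api/v2/studies",
        params={"query.cond": condition, "pageSize": n},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("studies", [])

# Partition one disease's trials as described in the study:
# 3 for prompt design, 5 for prompt calibration, and a random 20 for evaluation.
trials = fetch_trials("breast cancer", 100)  # condition name is illustrative
random.shuffle(trials)
design, calibration = trials[:3], trials[3:8]
evaluation = random.sample(trials[8:], 20)
```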
AutoCriteria comprises the following modules:
Preprocessing
Trial documents are often long and contain many rules. So, IMO Health experts first split the raw criteria text into two parts: Inclusion (who can join a trial) and Exclusion (who cannot). Then, they split each part into smaller chunks of 200 words and ran their system on each chunk separately, extracting critical details. Finally, they merged the per-chunk results, keeping the inclusion and exclusion criteria separate, as sketched below.
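A minimal sketch of this preprocessing step, assuming the criteria text uses the conventional “Inclusion Criteria:” and “Exclusion Criteria:” headers found in ClinicalTrials.gov records; the function names and sample text are illustrative, not the published implementation.

```python
import re

def split_criteria(raw: str) -> tuple[str, str]:
    """Split raw criteria text into its inclusion and exclusion sections."""
    parts = re.split(r"exclusion criteria:?", raw, flags=re.IGNORECASE)
    inclusion = re.sub(r"inclusion criteria:?", "", parts[0], flags=re.IGNORECASE)
    exclusion = parts[1] if len(parts) > 1 else ""
    return inclusion.strip(), exclusion.strip()

def chunk_words(text: str, size: int = 200) -> list[str]:
    """Break a section into 200-word chunks so each fits in a single prompt."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

raw = (
    "Inclusion Criteria: Age 18 or older. Hemoglobin greater than 10.0 g/dL. "
    "Exclusion Criteria: Prior chemotherapy within 6 months of enrollment."
)
inclusion, exclusion = split_criteria(raw)
inclusion_chunks = chunk_words(inclusion)  # each chunk is sent to the model separately
exclusion_chunks = chunk_words(exclusion)  # results are later merged per section
```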
Knowledge ingestion
IMO Health’s AI team worked with domain experts to identify each disease’s key medical terms and attributes. This information helped the model pinpoint essential details within the inclusion and exclusion criteria.
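The study’s knowledge format is not reproduced here, but conceptually the expert-curated terms can live in a small per-disease lookup that is rendered into the prompt. The structure and every term below are assumptions for illustration.

```python
# Hypothetical expert-curated knowledge: entity types and example terms
# that matter for one disease's eligibility criteria.
DISEASE_KNOWLEDGE = {
    "breast cancer": {
        "Lab test": ["hemoglobin", "absolute neutrophil count", "platelet count"],
        "Biomarker": ["HER2", "ER", "PR"],
        "Therapy": ["chemotherapy", "radiation therapy"],
    },
}

def knowledge_hint(disease: str) -> str:
    """Render the curated terms as a plain-text hint for the prompt."""
    entries = DISEASE_KNOWLEDGE.get(disease, {})
    return "\n".join(
        f"- {entity_type}: {', '.join(terms)}"
        for entity_type, terms in entries.items()
    )
```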
Prompt modeling
The scientists experimented with many prompts, iteratively developing, testing, and calibrating them. They ultimately created two comprehensive prompts, one for inclusion criteria and another for exclusion criteria.
Prompt composition:
1. General instruction
2. [Inclusion Criteria Text]: <criteria text>
3. Query part for Inclusion

Sample output:
- Entity type: Lab test
- Attribute: Hemoglobin
- Value: ≥ 10.0 g/dL
- Modifier: NA
- Source Sentence: Hemoglobin greater than 10.0 g/dL.

This figure shows a sample prompt template and its corresponding output.
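Following the three-part composition in the figure, a prompt for one inclusion chunk could be assembled roughly as follows. The instruction and query wording are placeholders; the study’s actual prompts are not reproduced here.

```python
def build_inclusion_prompt(chunk: str, hint: str) -> str:
    """Assemble the three-part inclusion prompt shown in the figure."""
    return (
        # 1. General instruction (placeholder wording)
        "You are extracting eligibility criteria from a clinical trial document.\n"
        f"Key terms for this disease:\n{hint}\n\n"
        # 2. The inclusion criteria text for this chunk
        f"[Inclusion Criteria Text]: {chunk}\n\n"
        # 3. Query part for inclusion (placeholder wording)
        "For each criterion, list its Entity type, Attribute, Value, "
        "Modifier, and Source Sentence."
    )

prompt = build_inclusion_prompt(
    "Hemoglobin greater than 10.0 g/dL.",
    "- Lab test: hemoglobin, absolute neutrophil count, platelet count",
)
```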
Postprocessing
This step involved processing the model’s responses to address output inconsistencies and integrating medical knowledge through simple rules. For example, a response occasionally contained a missing value for an entity, indicated by a vague placeholder phrase such as “gene name”; these cases were removed.
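One way to implement such a rule, assuming each extracted criterion arrives as a dict with the fields shown in the figure; the placeholder list is illustrative, not the study’s actual rule set.

```python
# Vague placeholders the model occasionally emits instead of a concrete value;
# the entries here are illustrative examples.
VAGUE_PLACEHOLDERS = {"gene name", "lab test name"}

def postprocess(extractions: list[dict]) -> list[dict]:
    """Drop extracted criteria whose Value field is a vague placeholder."""
    return [
        e for e in extractions
        if e.get("Value", "").strip().lower() not in VAGUE_PLACEHOLDERS
    ]

cleaned = postprocess([
    {"Entity type": "Lab test", "Attribute": "Hemoglobin", "Value": "≥ 10.0 g/dL"},
    {"Entity type": "Gene", "Attribute": "Mutation", "Value": "gene name"},  # dropped
])
```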
Evaluation
As part of this stage, the scientists evaluated the prompts manually and calibrated them repeatedly using expert feedback for every disease. For the final system assessment, they reviewed both quantitative metrics, such as precision, recall, and F1 scores, and qualitative ones, such as missing and incorrect criteria entities.
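For reference, the quantitative metrics are the standard ones computed from counts of true positives (TP), false positives (FP), and false negatives (FN) over the extracted criteria entities; this snippet is a generic illustration rather than the study’s evaluation code.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard precision, recall, and F1 from extraction match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```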
Results: Quantitative and qualitative metrics
The overall accuracy of AutoCriteria in identifying all contextual information across diseases was 78.95%. On the qualitative side, the team’s thematic analysis indicated that “accurate logic interpretation of criteria” was one of the model’s strengths, while “overlooking/neglecting the main criteria” was one of its weaknesses.
Significance: A promising future for clinical NLP in life sciences
This study demonstrates AutoCriteria’s potential to reduce the need for manual annotation when extracting granular eligibility information from trial documents. The prompts developed for the tool also generalize well across disease areas. Ultimately, the study suggests that AutoCriteria is a scalable solution capable of addressing the complexities of clinical trial processes in real-world settings.