Research

Explore groundbreaking, peer-reviewed research publications and award-winning challenges involving natural language processing (NLP) authored by IMO Health’s leading NLP scientists.

TOPICS

Clinical trial optimization

Leveraging LLMs in building a clinical trial retrieval system for patients

Clinical trials are a vital component in the advancement of new therapeutic approaches and medical treatments. However, recruiting patients for such trials in a timely and efficient manner is incredibly challenging. IMO Health researchers investigated the feasibility of integrating LLMs in identifying and ranking relevant trials for patients across multiple disorders. Their system, Patient2Trial, showed significant promise in boosting the efficiency of patient screening for trials by suggesting a ranked list of applicable trials to both patients and clinicians.

Patient2Trial: From patient to participant in clinical trials using large language models
ScienceDirect | January 2025
IMO Health authors Surabhi Datta, Kyeryoung Lee, Liang-Chin Huang, Hunki Paek, Roger Gildersleeve, Jonathan D. Gold, Deepak Pillai, Jingqi Wang, and Xiaoyan Wang

Analyzing adverse events in multiple myeloma

Multiple myeloma, a rare blood cancer, has a notable relapse rate and an evolving treatment landscape. As such, IMO Health’s Life Sciences team worked to develop an LLM-based tool capable of comprehensively analyzing the adverse events reported for its treatments. Based on an information extraction model, the tool is designed to conduct timely and large-scale analysis of myeloma treatment studies. The study showed AI’s versatility in conducting thorough dataset analysis for multiple myeloma studies and other therapeutic areas, furthering the field of health economics and outcomes research (HEOR).

Profiling Adverse Events in Multiple Myeloma: Insights from Clinical Trials Via Large Language Models
ISPOR | May 2024
IMO Health authors Hunki Paek, Kyeryoung Lee, Surabhi Datta, Liang-Chin Huang, Nneka Ofoegbu, Long He, Bin Lin, Jingqi Wang, Xiaoyan Wang

Assessing the consistency of oncology clinical trial reports

Conference abstracts are critical for sharing initial clinical trial findings, yet concerns exist about their alignment with final published results. This study leveraged an LLM to analyze the consistency of initial conference outcomes with subsequent publications, focusing on treatment efficacy and safety in oncology trials. The team’s LLM pipeline demonstrated high precision and recall in analyzing clinical trial data across diverse cancer types, underscoring the transformative potential of AI in enhancing clinical research and, ultimately, decision-making. The findings reveal that while dose-escalation phases might show variability, the key treatment outcomes at recommended doses remain consistent over time.

Unveiling Consistency: A Large-Scale Analysis of Conference Proceedings and Subsequent Publications in Oncology Clinical Trials Using Large Language Models
ASCO Annual Meeting | May 2024
IMO Health authors Kyeryoung Lee, Hunki Paek, Liang-Chin Huang, Surabhi Datta, Augustine Annan, Nneka Ofoegbu, Xiaoyan Wang

Advancing clinical trials with AI-powered eligibility criteria extraction

Existing NLP approaches face challenges in capturing fine-grained criteria within a given text and may lack applicability across various disease areas. This study evaluates a system that automatically extracts eligibility criteria, emphasizes contextual attributes, and can handle diverse diseases using a cutting-edge large language model. The results show high accuracy across a wide range of diseases, reducing the need for manual annotations, with the potential to optimize clinical trial workflows and accelerate the start of trials, offering advantages to both researchers and patients.

AutoCriteria: Advancing Clinical Trial Study with AI-Powered Eligibility Criteria Extraction
ISPOR | December 2023
IMO Health authors Surabhi Datta, Kyeryoung Lee, Hunki Paek, Frank J. Manion, Jingcheng Du, Liang-Chin Huang, Jingqi Wang, Xiaoyan Wang

Innovative visualization of clinical trial data to assess results and predict recruitment

Clinical trials are essential to finding safe and effective preventions and treatments, and they create an urgent need for better information retrieval that supports searching by specific eligibility criteria and structured trial information. IMO Health (Melax Tech) developed an innovative “COVID-19 Trial Graph” to consolidate information from registered clinical trials, making it easier to query and visualize the data. This novel approach to clinical trial data representation has numerous potential applications, such as predicting recruitment status and comparing trial similarities.

COVID-19 trial graph: a linked graph for COVID-19 clinical trials
NIH | August 2021
IMO Health authors Jingcheng Du and Jingqi Wang

Enhancing consent and authorization information in biomedical research

Informed consent documents serve as an important communication vehicle between the research team and potential study participants and become a ‘source of truth’ regarding the allowability of potential research actions. IMO Health (Melax Tech) uses Informed Consent Ontology (ICO) and Semantic Web Rule Language (SWRL) to navigate biomedical research data complexities. This approach shows robust capability to link entities within consent forms and has great potential for software integration and data management. It explores semantic and computational potentials in biomedical data, promising advancements in managing complex authorization information within software systems.

Expressing and Executing Informed Consent Permissions Using SWRL: The All of Us Use Case
NIH | February 2022
IMO Health author Frank J. Manion

Developing an ontology to facilitate interoperability in the informed consent process for clinical research

The informed consent process involves giving a subject adequate information to obtain voluntary agreement to participate in a study, as well as providing further information as the subject or situation requires. Yet informed consent is not represented consistently across electronic systems, which impedes productive data integration and sharing among those systems. IMO Health (Melax Tech) developed an Informed Consent Ontology (ICO) to facilitate interoperability and data integration for informed consent data in clinical research. ICO is aligned with the Basic Formal Ontology (BFO) and consists of 471 terms, including 137 ICO-specific terms and other terms imported from reliable ontologies. By leveraging Semantic Web technology, ICO ensures that consent information remains distributed and diverse. The paper highlights how ICO can model informed consent workflows and provides a standardized framework for representing informed consent data. This approach has promising implications for enhancing informed consent processes in clinical research.

Development of a BFO-Based Informed Consent Ontology (ICO)
ICBO | January 2014
IMO Health author Frank J. Manion

Patient journey and disease progression

Leveraging NLP in healthcare for rare disease management

Prader-Willi Syndrome is an extremely rare genetic condition that presents a wide range of symptoms, complicating researchers’ understanding of its trajectory. To overcome this obstacle, IMO Health researchers used NLP to extract clinical variables from both structured data and unstructured clinical notes concerning Prader-Willi Syndrome. Ultimately, the study illustrates how leveraging advanced NLP techniques to analyze EHRs can improve the understanding of rare diseases, significantly enhancing patient care and streamlining operations.

Building Patient Journeys for Prader-Willi Syndrome Patients: Insights from Electronic Health Records Through Natural Language Processing
ISPE | August 2024
IMO Health authors Kyeryoung Lee, Hunki Paek, Xiaoyan Wang

Assessing diet and disease relationships to understand disease progression

Neurodegenerative diseases, such as Alzheimer’s and Parkinson’s, lack effective treatments, but there is growing interest in understanding how diet might impact their progression. In this study, IMO Health researchers assessed biomedical information in thousands of publications to construct a knowledge graph that revealed relationships between diet, specific chemicals, species, and neurodegenerative diseases, shedding light on factors potentially affecting these conditions.

Knowledge Graph-based Neurodegenerative Diseases and Diet Relationship Discovery
CIBB | October 2021
IMO Health author Jingcheng Du

Identifying symptoms in clinical notes & standardizing to OMOP common data model

Developers at IMO Health (Melax Tech) used NLP to extract COVID-19 signs and symptoms, along with eight attributes, from clinical notes. The work used a hybrid approach that combined deep learning-based models, curated lexicons, and pattern-based rules to quickly extract entities and normalize them to standard terms in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The results accurately recognized important clinical concepts from free text in electronic health records (EHRs) to support accelerated clinical research.

COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model
JAMIA | March 2021
IMO Health authors Jingqi Wang and Frank J. Manion
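
The lexicon and pattern-rule parts of the hybrid approach described above can be illustrated with a minimal Python sketch. This is not the published system: the deep learning component is omitted, only one attribute (negation) is handled, and the lexicon entries, function name, and concept codes are hypothetical placeholders rather than real OMOP concept IDs.

```python
import re

# Hypothetical curated lexicon: surface form -> (standard term, placeholder code).
# The codes are illustrative placeholders, NOT real OMOP concept IDs.
LEXICON = {
    "fever": ("Fever", "CONCEPT_FEVER"),
    "febrile": ("Fever", "CONCEPT_FEVER"),
    "cough": ("Cough", "CONCEPT_COUGH"),
    "shortness of breath": ("Dyspnea", "CONCEPT_DYSPNEA"),
    "sob": ("Dyspnea", "CONCEPT_DYSPNEA"),
}

# Pattern-based rule (assumed for illustration): a negation cue that appears
# earlier in the same sentence marks the mention as negated.
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)


def extract_symptoms(note):
    """Return lexicon matches in a note, normalized and flagged for negation."""
    results = []
    for sentence in re.split(r"[.;\n]+", note):
        for surface, (term, code) in LEXICON.items():
            pattern = r"\b" + re.escape(surface) + r"\b"
            match = re.search(pattern, sentence, re.IGNORECASE)
            if match is None:
                continue
            # Look for a negation cue in the text preceding the mention.
            negated = NEGATION.search(sentence[:match.start()]) is not None
            results.append(
                {"text": match.group(0), "term": term, "code": code, "negated": negated}
            )
    return results


if __name__ == "__main__":
    note = "Patient reports fever and cough. Denies shortness of breath."
    for item in extract_symptoms(note):
        print(item)
```

In a fuller pipeline, a trained named entity recognition model would propose the mentions, and the lexicon and rules would refine and normalize them.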

Extracting adverse events from safety reports with deep learning models

Automated analysis of vaccine postmarketing surveillance narrative reports is important to understand the progression of rare but severe vaccine adverse events (AEs). This study implemented and evaluated state-of-the-art deep learning algorithms for named entity recognition to extract nervous system disorder-related events from vaccine safety reports. IMO Health researchers compared different algorithms to a domain-specific BERT model that was pre-trained using VAERS reports. They focused specifically on Guillain-Barré syndrome (GBS) related influenza vaccine safety reports from 1990 to 2016.

Extracting postmarketing adverse events from safety reports in the vaccine adverse event reporting system (VAERS) using deep learning
JAMIA | February 2021
IMO Health authors Jingcheng Du and Jingqi Wang

Extracting cancer data from pathology reports

NLP technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs); however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, IMO Health (Melax Tech) developed customizable modules for extracting cancer-related information from pathology reports with high performance and F-measures ranging from 0.80 to 0.98.

Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP
NIH | August 2019
IMO Health author Jingqi Wang
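
For readers unfamiliar with the F-measure reported above, the following Python sketch shows how it is commonly computed from extraction counts. It assumes the balanced F1 variant (beta = 1); the summary does not state which variant the paper used, and the function name and example counts are illustrative only.

```python
def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """Compute the F-measure from counts of extracted items.

    precision = TP / (TP + FP), recall = TP / (TP + FN)
    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Illustrative counts only, not taken from the paper.
    print(round(f_measure(true_pos=8, false_pos=2, false_neg=2), 2))
```

A score near 1.0 means the module both finds most of the relevant items (high recall) and rarely extracts incorrect ones (high precision).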

Systematic literature review

Leveraging AI for endometrial cancer literature reviews

To tackle the slow and resource-intensive process of gathering scientific data through traditional systematic literature review (SLR) methodologies, IMO Health researchers and Merck developed a new AI system with Generative Pre-Trained Transformer 4 (GPT-4) technology. The tool is designed to screen articles and extract relevant data from clinical trials, helping to address the rapidly growing volume of medical content quickly and efficiently. The system positively identified the majority (94%) of eligible articles in a fraction of the time required for manual identification, illustrating AI’s ability to streamline review processes and offer precise, timely insights.

Optimizing Systematic Literature Reviews in Endometrial Cancer: Leveraging AI for Real-Time Article Screening and Data Extraction in Clinical Trials
ISPOR | May 2024
IMO Health authors Hunki Paek, Kyeryoung Lee, Surabhi Datta, Majid Rastegar-Mojarad, Emma Foley, Liang-Chin Huang, Xiaoyan Wang, Julie Glasgow, Chris Liston

Automating abstract screenings with machine- and deep-learning models

Systematic literature reviews (SLRs), while critical to life science research, are notoriously time-consuming and labor-intensive. To address this issue, IMO Health partnered with a global pharma company to develop two disease-specific annotated corpora – one for human papillomavirus (HPV) associated diseases and the other for pneumococcal-associated pediatric diseases (PAPD) – and leverage machine- and deep-learning models to automate the SLR abstract screening process. The results demonstrate AI’s potential to automate abstract screenings, saving researchers valuable time and effort.

Machine learning models for abstract screening task – A systematic literature review application for health economics and outcome research
Springer Nature | May 2024
IMO Health authors Jingcheng Du, Ekin Soysal, Long He, Bin Lin, Jingqi Wang, Frank J. Manion

Accelerating evidence synthesis in observational studies with “living” systematic literature review

Systematic literature reviews (SLRs) are a critical tool to build a knowledge base, assist research gap analysis, synthesize evidence, direct research, and support FDA regulatory submission. IMO Health (Melax Tech) developed an NLP solution that automates all major steps in the SLR process with a system that proactively and continuously updates relevant literature in a timely manner. The system screens abstracts to predict articles’ relevance status based on their title, abstract, and other metadata and highlights supporting information. It uses named entity recognition to parse full-length articles and extract data elements from both texts and tables. With high accuracy scores on article screening and data element extraction, the solution helps scientists dedicate more time to ensuring data quality and synthesizing evidence, while staying current with literature related to observational studies.

A Natural Language Processing Solution for Health Economics and Outcomes Research Systematic Literature Review
ISPOR | June 2023
IMO Health authors Jingcheng Du, Frank J. Manion, Xiaoyan Wang

Drug repurposing

Accelerating drug repurposing with an AI-driven framework and knowledge graphs

Drug repurposing can be a cost-effective strategy to rapidly bring FDA-approved drugs to market for new indications; however, it typically requires extensive manual reviews of related literature, clinical trials, and relevant clinical data. IMO Health (Melax Tech) scientists used NLP to extract biomedical knowledge, integrate diverse data sources, and apply deep learning models to build knowledge graphs and scoring systems. The KnowledgeSphere framework has been used to extract biomedical entities and relations from 35 million PubMed abstracts and to identify successful drug repurposing cases.

Knowledgesphere: An Automated and Integrative Framework for Drug Repurposing Empowered By Knowledge Graph and AI Literature Review
ISPOR | June 2023
IMO Health authors Liang-Chin Huang, Jingcheng Du, Kyeryoung Lee, Jingqi Wang, Frank J. Manion, Xiaoyan Wang

Public health sentiment and social media listening

Informing public health through analysis of public sentiment

This study assessed public perceptions of measles using Twitter data. IMO Health (Melax Tech) developed a multi-task Convolutional Neural Network (MTCNN) model that outperformed other machine learning models in classifying measles-related tweets. The results highlight the potential of social media analysis and deep learning in informing public health responses to vaccine-preventable diseases.

Understanding Public Perceptions of Measles from Twitter Using Multi-Task Convolutional Neural Networks
IOS Press | October 2021
IMO Health author Jingcheng Du

Detecting vaccine misinformation in social media with machine learning

IMO Health (Melax Tech) used machine learning to identify misinformation related to the HPV vaccine on Reddit. The team analyzed over 28,000 posts and found that 25.63% contained vaccine misinformation, with safety concerns being the most common type. The findings suggest that machine learning can help combat vaccine misinformation on social media.

Using Machine Learning–Based Approaches for the Detection and Classification of Human Papillomavirus Vaccine Misinformation: Infodemiology Study of Reddit Discussions
NIH | August 2021
IMO Health author Jingcheng Du

NLP algorithms analyze vaccine sentiment on social media to inform public health actions

Prompted by the rapid spread of vaccine-related misinformation, this study used machine learning-based natural language processing (NLP) algorithms to analyze vaccine sentiment and hesitancy across social media platforms. The team collected and analyzed social media discussions from 2011 to 2021 related to HPV, MMR, and general vaccines on Twitter, Reddit, and YouTube. The results showed that machine-learning NLP algorithms achieved accuracy scores ranging from 0.51 to 0.78 for vaccine sentiment prediction and 0.69 to 0.91 for vaccine hesitancy prediction. Temporal trends revealed variations in social media activity, and the interactive dashboard the team developed could offer real-time insights to inform public health actions and improve vaccine uptake.

Vaccine Sentiments on Social Media: A Machine Learning-Powered Real-Time Monitoring System
ISPOR | June 2023
IMO Health authors Jingcheng Du, Frank J. Manion, Jingqi Wang, Xiaoyan Wang, Liang-Chin Huang

Clinical information extraction

NLP toolkit simplifies clinical information extraction with customizable pipelines

IMO Health (Melax Tech) developed a user-friendly NLP toolkit for clinical information extraction. It simplifies the process by breaking it down into customizable pipelines and using machine learning and rule-based methods.

CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines
JAMIA | November 2017
IMO Health author Jingqi Wang
