Research

Explore groundbreaking, peer-reviewed research publications and award-winning challenges involving natural language processing (NLP) authored by IMO’s leading NLP scientists.
TOPICS

Clinical trial optimization

Advancing clinical trials with AI-powered eligibility criteria extraction

Existing NLP approaches face challenges in capturing fine-grained criteria within a given text and may lack applicability across various disease areas. This study evaluates a system that automatically extracts eligibility criteria, emphasizes contextual attributes, and can handle diverse diseases using a cutting-edge large language model. The results show high accuracy across a wide range of diseases, reducing the need for manual annotations, with the potential to optimize clinical trial workflows and accelerate the start of trials, offering advantages to both researchers and patients.
AutoCriteria: Advancing Clinical Trial Study with AI-Powered Eligibility Criteria Extraction
ISPOR | December 2023
IMO authors Surabhi Datta, Kyeryoung Lee, Hunki Paek, Frank J. Manion, Jingcheng Du, Liang-Chin Huang, Jingqi Wang, Xiaoyan Wang

Innovative visualization of clinical trial data to assess results and predict recruitment

Clinical trials are an essential part of the effort to find safe and effective prevention and treatment, and they often create an urgent need for better information retrieval that supports searching by specific eligibility criteria and structured trial information. IMO (Melax Tech) developed an innovative “COVID-19 Trial Graph” to consolidate information from registered clinical trials, making it easier to query and visualize the data. This novel approach to clinical trial data representation has numerous potential applications, such as predicting recruitment status and comparing trial similarities.
COVID-19 trial graph: a linked graph for COVID-19 clinical trials
NIH | August 2021
IMO authors Jingcheng Du and Jingqi Wang

Enhancing consent and authorization information in biomedical research

Informed consent documents serve as an important communication vehicle between the research team and potential study participants and become a ‘source of truth’ regarding the allowability of potential research actions. IMO (Melax Tech) uses Informed Consent Ontology (ICO) and Semantic Web Rule Language (SWRL) to navigate biomedical research data complexities. This approach shows robust capability to link entities within consent forms and has great potential for software integration and data management. It explores semantic and computational potentials in biomedical data, promising advancements in managing complex authorization information within software systems.
Expressing and Executing Informed Consent Permissions Using SWRL: The All of Us Use Case
NIH | February 2022
IMO author Frank J. Manion

Developing an ontology to facilitate interoperability in the informed consent process for clinical research

The informed consent process involves giving a subject adequate information to obtain voluntary agreement to participate in a study, as well as providing further information as the subject or situation requires. Yet there is no consistent way of representing informed consent across electronic systems, which impedes productive data integration and sharing among those systems. IMO (Melax Tech) developed an Informed Consent Ontology (ICO) to facilitate interoperability and data integration for informed consent data in clinical research. ICO is aligned with the Basic Formal Ontology (BFO) and consists of 471 terms, including 137 ICO-specific terms and other terms imported from reliable ontologies. By leveraging Semantic Web technology, ICO ensures that consent information remains distributed and diverse. The paper highlights how ICO can model informed consent workflows and provides a standardized framework for representing informed consent data. This approach has promising implications for enhancing informed consent processes in clinical research.
Development of a BFO-Based Informed Consent Ontology (ICO)
ICBO | January 2014
IMO author Frank J. Manion

Patient journey and disease progression

Extracting adverse events from safety reports with deep learning models

Automated analysis of vaccine postmarketing surveillance narrative reports is important to understand the progression of rare but severe vaccine adverse events (AEs). This study implemented and evaluated state-of-the-art deep learning algorithms for named entity recognition to extract nervous system disorder-related events from vaccine safety reports. IMO researchers compared different algorithms to a domain-specific BERT model that was pre-trained using VAERS reports. They focused specifically on Guillain-Barré syndrome (GBS) related influenza vaccine safety reports from 1990 to 2016.
Extracting postmarketing adverse events from safety reports in the vaccine adverse event reporting system (VAERS) using deep learning
JAMIA | February 2021
IMO authors Jingcheng Du and Jingqi Wang

Assessing diet and disease relationships to understand disease progression

Neurodegenerative diseases, such as Alzheimer’s and Parkinson’s, lack effective treatments, but there is growing interest in understanding how diet might impact their progression. In this study, IMO researchers assessed biomedical information in thousands of publications to construct a knowledge graph that revealed relationships between diet, specific chemicals, species, and neurodegenerative diseases, shedding light on factors potentially affecting these conditions.
Knowledge Graph-based Neurodegenerative Diseases and Diet Relationship Discovery
CIBB | October 2021
IMO author Jingcheng Du

Identifying symptoms in clinical notes & standardizing to OMOP common data model

Developers at IMO (Melax Tech) used NLP to extract COVID-19 signs and symptoms, along with eight attributes, from clinical notes. The work used a hybrid approach combining deep learning-based models, curated lexicons, and pattern-based rules to quickly extract entities and normalize them to standard terms in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The results accurately recognized important clinical concepts from free text in electronic health records (EHRs) to support accelerated clinical research.
COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model
JAMIA | March 2021
IMO authors Jingqi Wang and Frank J. Manion

Extracting cancer data from pathology reports

NLP technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs); however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, IMO (Melax Tech) developed customizable modules for extracting cancer-related information from pathology reports. The modules achieved high performance, with F-measures ranging from 0.80 to 0.98.
Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP
NIH | August 2019
IMO author Jingqi Wang
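
For readers unfamiliar with the F-measure quoted above, the following is a minimal sketch of how such a score is commonly computed from extraction counts. The function name and the counts are hypothetical and chosen only for illustration; the evaluation protocol actually used in the paper is not described here.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from true-positive, false-positive,
    and false-negative counts (beta=1.0 gives the usual F1)."""
    # Precision: share of extracted items that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of gold-standard items that were extracted.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, not taken from the paper.
score = f_measure(tp=90, fp=10, fn=10)
print(round(score, 2))
```

An F-measure near 1.0 means a module both finds most of the target items and rarely extracts incorrect ones.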

Systematic literature review

Accelerating evidence synthesis in observational studies with “living” systematic literature review

Systematic literature reviews (SLRs) are a critical tool to build a knowledge base, assist research gap analysis, synthesize evidence, direct research, and support FDA regulatory submission. IMO (Melax Tech) developed an NLP solution that automates all major steps in the SLR process with a system that proactively and continuously updates relevant literature in a timely manner. The system screens abstracts to predict articles’ relevance status based on their title, abstract, and other metadata and highlights supporting information. It uses named entity recognition to parse full-length articles and extract data elements from both texts and tables. With high accuracy scores on article screening and data element extraction, the solution helps scientists dedicate more time to ensuring data quality and synthesizing evidence, while staying current with literature related to observational studies.
A Natural Language Processing Solution for Health Economics and Outcomes Research Systematic Literature Review
ISPOR | June 2023
IMO authors Jingcheng Du, Frank J. Manion, Xiaoyan Wang

Drug repurposing

Accelerating drug repurposing with an AI-driven framework and knowledge graphs

Drug repurposing can be a cost-effective strategy to rapidly bring FDA-approved drugs to market for new indications; however, it typically requires extensive manual reviews of related literature, clinical trials, and relevant clinical data. IMO (Melax Tech) scientists used NLP to extract biomedical knowledge, integrate diverse data sources, and apply deep learning models to build knowledge graphs and scoring systems. The KnowledgeSphere framework has been used to extract biomedical entities and relations from 35 million PubMed abstracts and to identify successful drug repurposing cases.
Knowledgesphere: An Automated and Integrative Framework for Drug Repurposing Empowered By Knowledge Graph and AI Literature Review
ISPOR | June 2023
IMO authors Liang-Chin Huang, Jingcheng Du, Kyeryoung Lee, Jingqi Wang, Frank J. Manion, Xiaoyan Wang

Public health sentiment and social media listening

Informing public health through analysis of public sentiment

This study assessed public perceptions of measles using Twitter data. IMO (Melax Tech) developed a multi-task Convolutional Neural Network (MTCNN) model that outperformed other machine learning models in classifying measles-related tweets. The results highlight the potential of social media analysis and deep learning in informing public health responses to vaccine-preventable diseases.
Understanding Public Perceptions of Measles from Twitter Using Multi-Task Convolutional Neural Networks
IOS Press | October 2021
IMO author Jingcheng Du

Detecting vaccine misinformation in social media with machine learning

IMO (Melax Tech) used machine learning to identify misinformation related to the HPV vaccine on Reddit. The team analyzed over 28,000 posts and found that 25.63% contained vaccine misinformation, with safety concerns being the most common type. The findings suggest that machine learning can help combat vaccine misinformation on social media.
Using Machine Learning–Based Approaches for the Detection and Classification of Human Papillomavirus Vaccine Misinformation: Infodemiology Study of Reddit Discussions
NIH | August 2021
IMO author Jingcheng Du

NLP algorithms analyze vaccine sentiment on social media to inform public health actions

Prompted by the rapid spread of vaccine-related misinformation, this study used machine learning-based natural language processing (NLP) algorithms to analyze vaccine sentiment and hesitancy across social media platforms. The team collected and analyzed social media discussions from 2011 to 2021 related to HPV, MMR, and general vaccines on Twitter, Reddit, and YouTube. The results showed that machine-learning NLP algorithms achieved accuracy scores ranging from 0.51 to 0.78 for vaccine sentiment prediction and 0.69 to 0.91 for vaccine hesitancy prediction. Temporal trends revealed variations in social media activity, and the interactive dashboard developed could offer real-time insights to inform public health actions and improve vaccine uptake.
Vaccine Sentiments on Social Media: A Machine Learning-Powered Real-Time Monitoring System
ISPOR | June 2023
IMO authors Jingcheng Du, Frank J. Manion, Jingqi Wang, Xiaoyan Wang, Liang-Chin Huang

Clinical information extraction

NLP toolkit simplifies clinical information extraction with customizable pipelines

IMO (Melax Tech) developed a user-friendly NLP toolkit for clinical information extraction. It simplifies the process by breaking it down into customizable pipelines and using machine learning and rule-based methods.
CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines
JAMIA | November 2017
IMO author Jingqi Wang