Resources


Database Credentialed Access

National Institutes of Health Stroke Scale (NIHSS) Annotations for the MIMIC-III Database

Jiayang Wang, Xiaoshuo Huang, Lin Yang, Jiao Li

A dataset of annotated NIHSS scale items and corresponding scores from stroke patients' discharge summaries in MIMIC-III.

Published: Jan. 25, 2021. Version: 1.0.0


Database Credentialed Access

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Aman Kansal, Emma Chen, Tom Jin, Pranav Rajpurkar, David Kim

A multimodal dataset of deidentified clinical and physiological data from emergency department visits, aimed at enabling research on patient outcomes, care processes, and the impact of continuous monitoring on treatment during and after the COVID-19 pandemic.

Published: March 3, 2025. Version: 1.0.0


Database Credentialed Access

SCRIPT X2B8 Dataset: per-day clinical features to model successful next-day extubation

Sam Fenske, Alec Peltekian, Mengjia Kang, Nikolay Markov, Anna Pawlowski, Luke Rasmussen, Thomas Stoeger, Benjamin Singer, GR Scott Budinger, Richard Wunderink, Alexander Misharin, Ankit Agrawal, Catherine A Gao

This dataset contains electronic health record (EHR) data from ICU patients receiving mechanical ventilation, aggregated on a daily basis, along with annotations of intubation, extubation, tracheostomy days, and cases of failed extubation. Data can b

Published: Jan. 28, 2025. Version: 1.0.0


Database Credentialed Access

TherLid: A Thermometry Linked Dataset

Jeremy Tan, Inês Martins, João Matos, Tiago Filipe Sousa Gonçalves, Tetsu Ohnuma, Jaime dos Santos Cardoso, Leo Anthony Celi, Vijay Krishnamoorthy, Andrea Lane, An Kwok Wong

TherLiD is an open-source dataset of 13,251 paired temperature readings (contact and infrared) from MIMIC-IV and eICU databases. With added demographics and derived data, it supports research on racial and ethnic disparities in infrared thermometry.

thermometry, intensive care unit, health equity, electronic health records

Published: Jan. 21, 2025. Version: 1.0.0


Database Credentialed Access

ENCoDE, mEasuring skiN Color to correct pulse Oximetry DisparitiEs: skin tone and clinical data from a prospective trial on acute care patients.

Sicheng Hao, Katelyn Dempsey, João Matos, Mahmoud Alwakeel, Jared Houghtaling, An Kwok Wong

A prospectively collected, EHR-linked database of skin tone measurements in OMOP format, with an emphasis on pulse oximetry disparities.

Published: Aug. 22, 2024. Version: 1.0.0


Database Credentialed Access

EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries

Sunjun Kweon, Jiyoun Kim, Heeyoung Kwak, Dongchul Cha, Hangyul Yoon, Kwang Hyun Kim, Jeewon Yang, Seunghyun Won, Edward Choi

An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries

Published: June 26, 2024. Version: 1.0.1


Database Credentialed Access

ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection

Sunjae Kwon, Xun Wang, Weisong Liu, Emily Druhl, Minhee Sung, Joel Reisman, Wenjun Li, Robert Kerns, William Becker, Hong Yu

The Opioid-Related Aberrant Behaviors (ORABs) Detection Dataset (ODD) is a large, expert-annotated, multi-label classification benchmark dataset for the ORAB detection task.

substance use, natural language processing, opioid related aberrant behavior

Published: Jan. 11, 2024. Version: 1.0.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, Majid Afshar

This is the data repository for BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization.

bionlp, clinical natural language processing, electronic health record, summarization

Published: Nov. 12, 2023. Version: 2.0.0


Database Credentialed Access

BOLD, a blood-gas and oximetry linked dataset

João Matos, Tristan Struja, Jack Gallifant, Luis Filipe Nakayama, Marie Charpignon, Xiaoli Liu, Jaime dos Santos Cardoso, Leo Anthony Celi, An Kwok Wong

An open-source pulse oximetry and arterial blood gas dataset derived from MIMIC-III, MIMIC-IV, and eICU-CRD.

pulse oximetry, intensive care unit, health equity, electronic health records

Published: Nov. 8, 2023. Version: 1.0


Database Open Access

MIMIC-IV Clinical Database Demo

Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Steven Horng, Leo Anthony Celi, Roger Mark

An openly available subset of patients in the MIMIC-IV database.

critical care, electronic health record, mimic

Published: Jan. 31, 2023. Version: 2.2