Resources


Database Credentialed Access

CORAL: expert-Curated medical Oncology Reports to Advance Language model inference

Madhumita Sushil, Vanessa Kennedy, Divneet Mandair, Brenda Miao, Travis Zack, Atul Butte

Medical oncology progress notes annotated with advanced, comprehensive oncology-relevant concepts and relationships.

information extraction, artificial intelligence, oncology, natural language processing, large language models, electronic health records

Published: Feb. 7, 2024. Version: 1.0


Database Open Access

A Multi-Modal Satellite Imagery Dataset for Public Health Analysis in Colombia

Sebastian A Cajas, David Restrepo, Dana Moukheiber, Kuan Ting Kuo, Chenwei Wu, David Santiago Garcia Chicangana, Atika Rahman Paddo, Mira Moukheiber, Lama Moukheiber, Sulaiman Moukheiber, Saptarshi Purkayastha, Diego M Lopez, Po-Chih Kuo, Leo Anthony Celi

A multi-modal satellite imagery dataset for public health analysis in Colombia, with spatiotemporally aligned satellite images and their corresponding metadata across 81 municipalities (2016-2018), facilitating multimodal AI applications.

multimodality, satellite imagery

Published: Jan. 30, 2024. Version: 1.0.0


Database Credentialed Access

RadCoref: Fine-tuning coreference resolution for different styles of clinical narratives

Yuxiang Liao, Hantao Liu, Irena Spasic

RadCoref is a small subset of MIMIC-CXR with manually annotated coreference mentions and clusters. Based on the annotated data, we fine-tuned a deep neural model and used it to annotate the whole MIMIC-CXR dataset. Both the manually annotated subset and the model-annotated dataset are available.

natural language processing, coreference resolution, radiology

Published: Jan. 30, 2024. Version: 1.0.0


Database Credentialed Access

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database

Marco Guevara, Shan Chen, Spencer Thomas, Danielle Bitterman

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database notes.

natural language processing, social determinants of health

Published: Jan. 24, 2024. Version: 1.0.1


Database Open Access

A Comprehensive Dataset of Pattern Electroretinograms for Ocular Electrophysiology Research: The PERG-IOBA Dataset

Itziar Fernández, Ruben Cuadrado Asensio, Yolanda Larriba, Cristina Rueda, Rosa M Coco-Martin

336 CSV records with 1354 PERG responses (in microvolts) from 304 subjects at IOBA. Includes age (in years), gender, diagnoses, and visual acuity on the logMAR scale.

Published: Jan. 19, 2024. Version: 1.0.0


Database Credentialed Access

ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection

Sunjae Kwon, Xun Wang, Weisong Liu, Emily Druhl, Minhee Sung, Joel Reisman, Wenjun Li, Robert Kerns, William Becker, Hong Yu

The Opioid-Related Aberrant Behavior (ORAB) Detection Dataset (ODD) is a large, expert-annotated, multi-label classification benchmark dataset for the ORAB detection task.

substance use, natural language processing, opioid related aberrant behavior

Published: Jan. 11, 2024. Version: 1.0.0


Database Credentialed Access

EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems

Konstantin Kotschenreuther

Dataset consisting of question-and-answer pairs synthetically generated from medical discharge summaries, designed to facilitate the training and development of large language models tailored for healthcare applications.

mimic-iv, clinical question-answering, medical discharge summaries, large language models

Published: Jan. 11, 2024. Version: 1.0.0


Database Open Access

Surface electromyographic signals collected during long-lasting ground walking of young able-bodied subjects

Francesco Di Nardo, Christian Morbidoni, Sandro Fioretti

The dataset is composed of long-lasting surface electromyographic (sEMG) signals recorded from ten muscles during ground walking of 31 young able-bodied subjects in the Movement Analysis Lab, Università Politecnica delle Marche, Ancona, Italy.

biomedical signals, muscle recruitment, surface emg signal, walking, gait analysis

Published: Jan. 9, 2024. Version: 1.0.1


Database Credentialed Access

Neurocritical care waveform recordings in pediatric patients

Thomas Heldt, Andrea Fanelli, Robert Tasker, Frederick Vonberg, Kerri LaRovere

The database contains waveform recordings, including arterial blood pressure, intracranial pressure, and cerebral blood flow velocity, from pediatric patients in neurocritical care.

intracranial pressure, arterial blood pressure, noninvasive icp, neurocritical care, neurotrauma, cerebral blood flow velocity, pediatric patients

Published: Jan. 8, 2024. Version: 1.0.0


Database Open Access

Patient-level dataset to study the effect of COVID-19 in people with Multiple Sclerosis

Hamza Khan, Lotte Geys, Peer Baneke, Giancarlo Comi, Liesbet Peeters

This dataset is part of the Global Data Sharing Initiative. The data were acquired by people with MS (PwMS) and clinicians using a fast data entry tool. The dataset includes demographics, comorbidities, hospital stay, and COVID-19 symptoms of PwMS.

Published: Jan. 2, 2024. Version: 1.0.1