Resources


Database Credentialed Access

CORAL: expert-Curated medical Oncology Reports to Advance Language model inference

Madhumita Sushil, Vanessa Kennedy, Divneet Mandair, Brenda Miao, Travis Zack, Atul Butte

Medical oncology progress notes annotated with advanced, comprehensive oncology-relevant concepts and relationships.

information extraction artificial intelligence oncology natural language processing large language models electronic health records

Published: Feb. 7, 2024. Version: 1.0


Database Credentialed Access

Establishment of a Chinese critical care database from electronic healthcare records in a tertiary care medical center

Senjun Jin, Lin Chen, Kun Chen, Zhongheng Zhang

Chinese critical care database from electronic healthcare records in a tertiary care medical center

database china critical care

Published: Jan. 19, 2023. Version: 1.0


Database Open Access

Eye Tracking Dataset for the 12-Lead Electrocardiogram Interpretation of Medical Practitioners and Students

Mohammed Tahri Sqalli, Dena Al-Thani, Mohamed Elshazly, Mohammed Al-Hijji

The project collects an eye-tracking dataset to understand the visual behavior of medical practitioners and students with different expertise levels when interpreting 12-lead electrocardiograms.

human vision medical students ecg interpretation medical image interpretation medical practice medical education human-computer interaction eye-tracking medical practitioners visual expertise ecg electrocardiogram

Published: March 16, 2022. Version: 1.0.0


Model Credentialed Access

Characterization of Stigmatizing Language in Medical Records

Keith Harrigian, Ayah Zirikly, Brant Chee, Alya Ahmad, Anne Links, Somnath Saha, Mary Catherine Beach, Mark Dredze

A suite of classifiers for detecting three types of stigmatizing language in electronic medical records. Trained on MIMIC-IV discharge notes.

clinical natural language processing domain transfer bias stigmatizing language large language models mimic

Published: Nov. 6, 2023. Version: 1.0.0


Database Credentialed Access

Medical Expert Annotations of Unsupported Facts in Doctor-Written and LLM-Generated Patient Summaries

Stefan Hegselmann, Shannon Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang

Annotations, labeled by two medical experts, for unsupported facts in 100 original MIMIC patient summaries (discharge instructions) and for hallucinations in 100 patient summaries generated by a Large Language Model (LLM).

Published: April 28, 2024. Version: 1.0.0


Database Credentialed Access

Medication Extraction Labels for MIMIC-IV-Note Clinical Database

Akshay Goel, Almog Gueta, Omry Gilon, Sofia Erell, Amir Feder

Medication extraction NLP labels for 600 discharge summaries in the MIMIC-IV-Note dataset.

Published: Dec. 12, 2023. Version: 1.0.0


Database Restricted Access

VinDr-SpineXR: A large annotated medical image dataset for spinal lesions detection and classification from radiographs

Hieu Huy Pham, Hieu Nguyen Trung, Ha Quy Nguyen

VinDr-SpineXR: A large annotated medical image dataset for spinal lesions detection and classification from radiographs

Published: Aug. 24, 2021. Version: 1.0.0


Database Credentialed Access

Deidentified Medical Text

Margaret Douglass, Bill Long, George Moody, Peter Szolovits, Li-wei Lehman, Roger Mark, Gari D. Clifford

Gold standard corpus of 2,434 deidentified nursing notes

medical text nursing notes hipaa de-identification

Published: Dec. 18, 2007. Version: 1.0


Database Credentialed Access

MIMIC-Ext-MIMIC-CXR-VQA: A Complex, Diverse, And Large-Scale Visual Question Answering Dataset for Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei JI, Eric Chang, Tackeun Kim, Edward Choi

We introduce MIMIC-Ext-MIMIC-CXR-VQA, a complex, diverse, and large-scale dataset designed for Visual Question Answering (VQA) tasks within the medical domain, focusing primarily on chest radiographs.

question answering multimodal benchmark radiology evaluation visual question answering electronic health records deep learning machine learning chest x-ray

Published: July 19, 2024. Version: 1.0.0