Database Credentialed Access

MIMIC-IV-Ext-MDS-ED: Multimodal Decision Support in the Emergency Department - a Benchmark Dataset for Diagnoses and Deterioration Prediction in Emergency Medicine

Juan Miguel Lopez Alcaraz, Nils Strodthoff

Published: Sept. 12, 2024. Version: 1.0.0


When using this resource, please cite:
Lopez Alcaraz, J. M., & Strodthoff, N. (2024). MIMIC-IV-Ext-MDS-ED: Multimodal Decision Support in the Emergency Department - a Benchmark Dataset for Diagnoses and Deterioration Prediction in Emergency Medicine (version 1.0.0). PhysioNet. https://doi.org/10.13026/p90d-vd84.

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.

Abstract

Measurable progress in the development of medical decision support systems has been hindered by a lack of comprehensive datasets. Many available datasets focus on narrow prediction tasks and do not include a diverse range of data types, which limits their effectiveness in real-world clinical settings. This issue is particularly critical in emergency care, where accurate and timely diagnoses and the ability to predict patient deterioration are essential.

To address these challenges, we present a new dataset derived from MIMIC-IV, created specifically for benchmarking multimodal decision support systems in emergency departments. This dataset includes data from the first 1.5 hours after the patient's arrival, covering demographics, biometrics, vital signs (including trends), lab results (including trends), and ECG waveforms. It allows for the evaluation of predictive models across a broad spectrum of clinical conditions, including both cardiac and non-cardiac conditions (1428 ICD-10 codes), as well as clinical deterioration measures (15 labels covering 6 clinical deterioration conditions, ICU admission at two horizons, and mortality at 7 different time horizons).

The integration of diverse data types aims to enhance the clinical relevance and robustness of decision support systems, facilitating more accurate and timely predictions in acute care scenarios. We release this dataset to encourage further research and innovation in emergency medicine and to provide a resource for the reliable benchmarking of multimodal AI models in the field.


Background

The MIMIC-IV-Ext-MDS-ED dataset, as outlined in our recent publication [1], advances the capability to train unified prediction models using ECG waveforms and tabular clinical metadata for a wide range of cardiac and non-cardiac diagnostic conditions and deterioration events. This dataset enables the training of models that not only predict various cardiac conditions but also non-cardiac issues based on a single ECG input.

The dataset integrates ECG recordings from MIMIC-IV-ECG [3] with clinical ground truth data from MIMIC-IV [4,5]. It merges ECG traces with detailed discharge diagnoses, including ICD-10-CM codes, to offer a comprehensive view of patient health. Additional context is provided by data from MIMIC-IV-ED [6], which includes information on emergency department stays, diagnoses, and vital signs. To ensure robust benchmarking, the dataset includes stratified splits based on diagnosis, age, and gender, as outlined in [2,9]. For implementation details and technical guidance, please refer to the accompanying code repository [7].


Methods

The data acquisition for the MIMIC-IV-Ext-MDS-ED dataset involved integrating multiple data sources to create a comprehensive resource for emergency medicine research. This integration process focused on merging ECG recordings with diverse patient features and clinical outcomes.
From MIMIC-IV-ECG-Ext-ICD [2,9], we collected the diagnostic ICD-10 codes and the stratified splits.

The MIMIC-IV-ECG [3] dataset provided ECG recordings. Clinical data were drawn from MIMIC-IV [4,5] (tables admissions, diagnoses_icd, d_labitems, labevents, icustays, procedures_icd, omr), which cover patient admissions, diagnostic codes, laboratory tests, ICU stays, and procedures. Additionally, MIMIC-IV-ED [6] (tables edstays, diagnosis, pyxis, vitalsign, and triage) details emergency department stays, diagnoses, medication administration, vital signs, and triage information.


Data Description

We provide one main table file, mds_ed.csv, with 129095 rows (ECG records) and 1935 columns. The columns are divided into feature modalities (general, demographics, biometrics, vital parameters, laboratory values) and targets (diagnoses and deterioration). Below, we describe the columns provided in the main table.
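Because every modality and target group is identified by a column-name prefix (described in the subsections below), the header can be partitioned programmatically. The following is a minimal sketch, assuming the column names are available as a list of strings. The example header is hypothetical and for illustration only, and the `vitals_` prefix for the vital-parameter columns is an assumption that should be checked against the actual header of mds_ed.csv.

```python
from collections import defaultdict

# Prefixes named in the column descriptions. "vitals_" is an assumption;
# confirm it against the actual header of mds_ed.csv.
PREFIXES = (
    "general_", "demographics_", "biometrics_", "vitals_",
    "labvalues_", "diagnoses_", "deterioration_",
)


def group_columns(columns, prefixes=PREFIXES):
    """Map each prefix to the column names that start with it."""
    groups = defaultdict(list)
    for name in columns:
        for prefix in prefixes:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
        else:
            # No known prefix matched this column name.
            groups["unmatched"].append(name)
    return dict(groups)


# Hypothetical header excerpt, for illustration only.
example_header = [
    "general_strat_fold", "general_file_name",
    "demographics_age", "biometrics_bmi",
    "labvalues_lactate_mean", "diagnoses_I10", "deterioration_icu_24h",
]
groups = group_columns(example_header)

# general_ columns are for linkage and bookkeeping, not model inputs.
input_columns = [
    col
    for prefix in ("demographics_", "biometrics_", "vitals_", "labvalues_")
    for col in groups.get(prefix, [])
]
target_columns = groups.get("diagnoses_", []) + groups.get("deterioration_", [])
```

Keeping the general_ group out of `input_columns` reflects the recommendation that these columns are not used as model inputs.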

A) General (22/1935)

Associated columns carry names that start with the prefix general_. Features from this category are not intended to be used as input features for prediction models. Most notably, they make it possible to link tabular features to the corresponding ECG waveforms from MIMIC-IV-ECG.

General features
name                        description
general_file_name           path to the waveform
general_study_id            study id within MIMIC-IV-ECG
general_subject_id          subject id within MIMIC-IV-ECG
general_ecg_time            time of the waveform collection
general_ed_stay_id          ED stay identifier
general_ed_hadm_id          hospital admission identifier sourced from the ED system
general_ed_diag_ed          ICD-10-CM ED discharge diagnoses sourced from the ED system
general_ed_diag_hosp        ICD-10-CM hospital discharge diagnoses sourced from the ED system
general_anchor_age          age at 'anchor_year'
general_anchor_year         specified 'anchor_year'
general_dod                 date of death (if applicable)
general_ecg_no_within_stay  enumerates ECGs within a given ED/hospital stay
general_strat_fold          stratified folds using multi-label stratification as in [2] (applied to diagnoses, gender, age (binned), and outpatient status)
general_intime              patient ED admission time
general_outtime             patient ED discharge time
general_race                patient ethnicity
general_90min               end of feature collection window
general_mortality_hours     hours from admission to mortality
general_mortality_days      days from admission to mortality
general_hosp_dischtime      patient hospital discharge time
general_icu_time_hours      hours from patient ED admission to ICU admission
general_data                the index of the ECG waveform from MIMIC-IV-ECG

B) Input features: demographics (7/1935)

These columns represent the tabular features for the demographics modality. These features include age, gender, and ethnicity. Associated columns carry names that start with the prefix demographics_.

C) Input features: biometrics (3/1935)

These columns represent the tabular features for the biometrics modality. These features include weight, height, and BMI. Associated columns carry names that start with the prefix biometrics_.

D) Input features: vital parameters (and trends) (55/1935)

These columns represent the tabular features for the vital parameters modality. These features include temperature, heart rate, respiration rate, oxygen saturation, systolic blood pressure, and diastolic blood pressure. Because several measurements of each feature can be collected within the 1.5-hour window, we capture trends for each feature via statistical aggregation functions: mean, median, minimum, maximum, standard deviation, first, last, rate of change, and the slope of a linear model fitted with the time difference in minutes between value collection and arrival as the independent variable and the measured values as the dependent variable. Associated columns carry names that start with the prefix vitals_. The different aggregations can be distinguished by the corresponding suffix of the column name (_mean, _median, _min, _max, _std, _first, _last, _change, _coeff).
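The aggregation step can be illustrated with a short sketch. It is not the authors' implementation: it assumes that measurements arrive as (minutes since arrival, value) pairs, that the standard deviation is the sample standard deviation, and that the rate of change is the difference between the last and first value divided by the elapsed minutes. The function name `aggregate_trend` is hypothetical.

```python
from statistics import mean, median, stdev


def aggregate_trend(measurements):
    """Summarize (minutes_since_arrival, value) pairs from the feature window.

    Returns a dict keyed by the documented suffixes. Entries that cannot be
    computed (e.g. a slope from a single measurement) are None.
    """
    keys = ("mean", "median", "min", "max", "std",
            "first", "last", "change", "coeff")
    points = sorted(measurements)  # chronological order
    if not points:
        return dict.fromkeys(keys)
    times = [t for t, _ in points]
    values = [v for _, v in points]
    out = {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        # Assumption: sample standard deviation (n - 1 denominator).
        "std": stdev(values) if len(values) > 1 else None,
        "first": values[0],
        "last": values[-1],
        "change": None,
        "coeff": None,
    }
    elapsed = times[-1] - times[0]
    if elapsed > 0:
        # Assumption: rate of change = (last - first) per elapsed minute.
        out["change"] = (values[-1] - values[0]) / elapsed
        # Ordinary least-squares slope of value against minutes since arrival.
        t_bar, v_bar = mean(times), mean(values)
        sxx = sum((t - t_bar) ** 2 for t in times)
        sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
        out["coeff"] = sxy / sxx
    return out


# Three heart-rate readings at 0, 30, and 60 minutes after arrival.
heart_rate = aggregate_trend([(0, 80.0), (30, 90.0), (60, 100.0)])
```

With a single measurement in the window, the spread and trend entries (std, change, coeff) are left undefined rather than set to zero; how the released table encodes such cases should be checked in the data itself.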

E) Input features: Laboratory values (and trends) (405/1935)

These columns represent the tabular features for the laboratory values modality. These features include absolute basophil count, absolute eosinophil count, absolute lymphocyte count, alanine aminotransferase (ALT), albumin, alkaline phosphatase, aspartate aminotransferase (AST), bands, base excess, basophils, bicarbonate, bilirubin (direct), bilirubin (total), C-reactive protein, calcium (total), carboxyhemoglobin, chloride, creatine kinase (CK), creatine kinase (MB isoenzyme), creatinine, eosinophils, fibrinogen (functional), free calcium, glucose, hematocrit, hemoglobin, INR (PT), lactate, lymphocytes, magnesium, neutrophils, oxygen saturation, PT, PTT, phosphate, platelet count, potassium, RDW, red blood cells, sodium, troponin T, urea nitrogen, white blood cells, pCO2, and pH.

As above, for each of these values we compute aggregated features: mean, median, minimum, maximum, standard deviation, first, last, rate of change, and the slope of a linear model fitted with the time difference in minutes between value collection and arrival as the independent variable and the measured values as the dependent variable. Associated columns carry names that start with the prefix labvalues_. The different aggregations can be distinguished by the corresponding suffix of the column name (_mean, _median, _min, _max, _std, _first, _last, _change, _coeff).

F) Prediction targets: diagnoses (1428/1935)

These columns represent the targets for the diagnoses task, following the ICD-10 code procedure of [2,9]. Associated columns carry names that start with the prefix diagnoses_.

G) Prediction targets: deterioration (15/1935)

These columns represent the targets for the deterioration task, following commonly used definitions [8]. In addition to the binary values, we include a special token (-999) that marks target labels excluded for a given record because the corresponding clinical workflow is absent for the entire patient visit; this allows such labels to be skipped so that they do not negatively affect training and evaluation. Associated columns carry names that start with the prefix deterioration_.
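One possible way to honor the -999 token is to leave the affected labels out of the loss or metric. The sketch below shows a masked binary cross-entropy in plain Python; the function name `masked_bce` and the choice of loss are illustrative assumptions, not the procedure used in [1].

```python
import math

EXCLUDED = -999  # special token documented for excluded deterioration labels


def masked_bce(targets, probabilities, eps=1e-7):
    """Mean binary cross-entropy over labels that are not excluded.

    Returns None when every label is excluded.
    """
    losses = []
    for y, p in zip(targets, probabilities):
        if y == EXCLUDED:
            continue  # excluded labels contribute nothing to the loss
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    return sum(losses) / len(losses) if losses else None


# The second label is excluded, so its prediction has no effect.
loss = masked_bce([1, EXCLUDED, 0], [0.5, 0.9, 0.5])
```

The same masking idea applies to evaluation metrics: compute each metric per target column over the records whose label is 0 or 1 only.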


Usage Notes

For comparability, we invite people to follow the benchmarking recommendations below. We refer to the original publication [1] for the first benchmarking results and [7] for the corresponding code repository.

Benchmarking recommendations

Within MIMIC-IV-ECG-Ext-ICD, the patients of the MIMIC-IV-ECG dataset were assigned to twenty stratified folds (column general_strat_fold): folds 0 to 17 for training, fold 18 for validation and model selection, and fold 19 for testing. For benchmarking purposes, we select only the first ECG of each stay (general_ecg_no_within_stay == 0) in the validation and test folds, which prevents patients with a large number of ECGs per stay from biasing the model evaluation; all ECGs are kept in the training folds. For a more comprehensive description and first benchmarking results across data modality scenarios, see [1] and the corresponding code repository [7].
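The split rule can be sketched as follows, assuming each record is a dict keyed by column name (for example, as produced by `csv.DictReader`). The function name `split_records` and the toy rows are hypothetical.

```python
TRAIN_FOLDS = set(range(18))  # folds 0 to 17
VAL_FOLD = 18
TEST_FOLD = 19


def split_records(rows):
    """Split rows (dicts keyed by column name) into train/val/test lists."""
    splits = {"train": [], "val": [], "test": []}
    for row in rows:
        # int(float(...)) tolerates values read as "18" or "18.0".
        fold = int(float(row["general_strat_fold"]))
        first_ecg = int(float(row["general_ecg_no_within_stay"])) == 0
        if fold in TRAIN_FOLDS:
            splits["train"].append(row)  # keep every ECG for training
        elif fold == VAL_FOLD and first_ecg:
            splits["val"].append(row)  # first ECG of the stay only
        elif fold == TEST_FOLD and first_ecg:
            splits["test"].append(row)  # first ECG of the stay only
    return splits


# Toy rows for illustration only.
toy_rows = [
    {"general_strat_fold": "3", "general_ecg_no_within_stay": "0"},
    {"general_strat_fold": "3", "general_ecg_no_within_stay": "2"},
    {"general_strat_fold": "18", "general_ecg_no_within_stay": "0"},
    {"general_strat_fold": "18", "general_ecg_no_within_stay": "1"},
    {"general_strat_fold": "19", "general_ecg_no_within_stay": "0"},
    {"general_strat_fold": "19", "general_ecg_no_within_stay": "3"},
]
toy_splits = split_records(toy_rows)
```

Later ECGs of a stay in folds 18 and 19 are dropped rather than moved to training, so no patient contributes to more than one split.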

For the benchmark results from [1], ECG samples were resampled to 100 Hz, with missing signal values linearly interpolated and infrequent missing values at sequence boundaries replaced with zero. Signals were clipped to a maximum amplitude of 3 mV. Apart from resampling, handling missing values, and clipping, no further preprocessing was applied to the raw ECG signals.
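These three steps can be sketched for a single ECG lead in plain Python. Several details are assumptions rather than the procedure from [1]: the order of the steps (fill gaps, then resample, then clip), the use of linear interpolation for resampling (which applies no anti-aliasing filter), and symmetric clipping to the range from -3 mV to 3 mV. The input sampling rate is passed in as `fs_in` and should be read from the waveform header; all function names are hypothetical.

```python
import math


def fill_missing(signal):
    """Linearly interpolate interior gaps (None/NaN); zero-fill boundary gaps."""
    x = [None if v is None or math.isnan(v) else float(v) for v in signal]
    known = [i for i, v in enumerate(x) if v is not None]
    filled = [0.0] * len(x)  # gaps at the sequence boundaries stay at zero
    for i in known:
        filled[i] = x[i]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            weight = (i - a) / (b - a)
            filled[i] = x[a] + weight * (x[b] - x[a])
    return filled


def resample_linear(signal, fs_in, fs_out=100):
    """Resample by linear interpolation (an assumed method, see the text)."""
    n_out = int(round(len(signal) * fs_out / fs_in))
    out = []
    for j in range(n_out):
        pos = j * fs_in / fs_out  # position measured in input samples
        lo = min(int(pos), len(signal) - 1)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] + frac * (signal[hi] - signal[lo]))
    return out


def clip(signal, limit_mv=3.0):
    """Clip amplitudes to [-limit_mv, limit_mv] (symmetry is an assumption)."""
    return [max(-limit_mv, min(limit_mv, v)) for v in signal]


def preprocess_lead(signal_mv, fs_in, fs_out=100, limit_mv=3.0):
    """Fill gaps, resample to fs_out, and clip one lead given in mV."""
    return clip(resample_linear(fill_missing(signal_mv), fs_in, fs_out), limit_mv)


# Toy lead sampled at a hypothetical 200 Hz with gaps at both ends.
toy_lead = [None, 1.0, None, 3.0, 5.0, None]
processed = preprocess_lead(toy_lead, fs_in=200)
```

For real use, a library resampler with proper low-pass filtering may be preferable; the code repository [7] is the reference for the exact procedure.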

See the demo.ipynb notebook in [7] for an example of how to access each modality and target. Note that the data described here does not include the masking strategy described in [1], which imputes the median value if a value is missing and adds an auxiliary binary variable for each original feature to indicate whether the original value was missing; see demo.ipynb (in the associated code repository) for an exemplary workflow.
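A minimal sketch of median imputation with a missing-value indicator for a single feature is given below. It is written in the spirit of the masking strategy but is not taken from demo.ipynb. Computing the median on the training folds only, and the fallback value of 0.0 when a feature is never observed, are assumptions; the function names are hypothetical.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fit_median(train_values):
    """Median of the observed (non-missing) training values."""
    observed = [v for v in train_values if not is_missing(v)]
    # Assumption: fall back to 0.0 if the feature is never observed.
    return median(observed) if observed else 0.0


def impute_with_indicator(values, fill_value):
    """Return (imputed values, 0/1 missing indicators) for one feature."""
    imputed = [fill_value if is_missing(v) else v for v in values]
    indicator = [1 if is_missing(v) else 0 for v in values]
    return imputed, indicator


# Toy lactate values; the median is fitted on the training portion only.
train_lactate = [1.0, None, 3.0, 5.0]
fill = fit_median(train_lactate)
test_imputed, test_missing = impute_with_indicator([None, 2.0], fill)
```

Fitting the fill value on the training folds and reusing it for validation and test keeps information from the evaluation folds out of the preprocessing.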


Release Notes

1.0.0 Initial release of the dataset.


Ethics

This study utilized data from the publicly available Medical Information Mart for Intensive Care (MIMIC) database. The use of MIMIC data for research purposes is governed by the Health Insurance Portability and Accountability Act (HIPAA) in the United States, and researchers are required to adhere to strict ethical guidelines when accessing and using this data.


Conflicts of Interest

The authors declare no conflicts of interest.


References

  1. Lopez Alcaraz, J. M., Bouma, H., & Strodthoff, N. (2024). MDS-ED: Multimodal Decision Support in the Emergency Department - a Benchmark Dataset for Diagnoses and Deterioration Prediction in Emergency Medicine. arXiv preprint arXiv:2407.17856
  2. Strodthoff, N., Lopez Alcaraz, J.M., & Haverkamp, W. (2024). Prospects for Artificial Intelligence-Enhanced ECG as a Unified Screening Tool for Cardiac and Non-Cardiac Conditions – An Explorative Study in Emergency Care, European Heart Journal - Digital Health, ztae039. https://doi.org/10.1093/ehjdh/ztae039
  3. Gow, B., Pollard, T., Nathanson, L. A., Johnson, A., Moody, B., Fernandes, C., Greenbaum, N., Waks, J. W., Eslami, P., Carbonati, T., Chaudhari, A., Herbst, E., Moukheiber, D., Berkowitz, S., Mark, R., & Horng, S. (2023). MIMIC-IV-ECG: Diagnostic Electrocardiogram Matched Subset (version 1.0). PhysioNet. https://doi.org/10.13026/4nqg-sb35
  4. Johnson, A., Bulgarelli, L., Pollard, T., Horng, S., Celi, L. A., & Mark, R. (2023). MIMIC-IV (version 2.2). PhysioNet. https://doi.org/10.13026/6mm1-ek67
  5. Johnson, A. E. W., Bulgarelli, L., Shen, L., et al. (2023). MIMIC-IV, a freely accessible electronic health record dataset. Sci Data 10, 1. https://doi.org/10.1038/s41597-022-01899-x
  6. Johnson, A., Bulgarelli, L., Pollard, T., Celi, L. A., Mark, R., & Horng, S. (2023). MIMIC-IV-ED (version 2.2). PhysioNet. https://doi.org/10.13026/5ntk-km72
  7. Lopez Alcaraz, J. M., Bouma, H., & Strodthoff, N. (2024). Code repository. https://doi.org/10.5281/zenodo.13753554
  8. Mitchell, O. J., Dewan, M., Wolfe, H. A., Roberts, K. J., Neefe, S., Lighthall, G., ... & Abella, B. S. (2022). Defining physiological decompensation: an expert consensus and retrospective outcome validation. Critical care explorations, 4(4), e0677. https://doi.org/10.1097/CCE.0000000000000677
  9. Strodthoff, N., Lopez Alcaraz, J. M., & Haverkamp, W. (2024). MIMIC-IV-ECG-Ext-ICD: Diagnostic labels for MIMIC-IV-ECG (version 1.0.1). PhysioNet. https://doi.org/10.13026/ypt5-9d58

Access

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
PhysioNet Credentialed Health Data License 1.5.0

Data Use Agreement:
PhysioNet Credentialed Health Data Use Agreement 1.5.0

Required training:
CITI Data or Specimens Only Research
