CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images 1.0.0
# CheXmask Database Data Dictionary
## CSV File Structure
Each dataset in the CheXmask Database is provided as a separate CSV file. Below are detailed descriptions of all fields/variables present in these files.
### Image Identification
**Field Name**: Image ID
**Description**: Reference identifier linking each row to the original image in its source dataset
**Type**: String
**Notes**: Format varies by source dataset (ChestX-ray8, CheXpert, MIMIC-CXR-JPG, PadChest, VinDr-CXR)
### Quality Metrics
**Field Name**: Dice RCA (Max)
**Description**: Maximum Dice Similarity Coefficient estimated via Reverse Classification Accuracy (RCA)
**Type**: Float
**Range**: 0.0 to 1.0
**Units**: Dimensionless
**Notes**: Higher values indicate better segmentation quality
**Field Name**: Dice RCA (Mean)
**Description**: Mean Dice Similarity Coefficient estimated via Reverse Classification Accuracy (RCA)
**Type**: Float
**Range**: 0.0 to 1.0
**Units**: Dimensionless
**Notes**: Masks with a value of at least 0.7 are recommended for use
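
The recommended threshold can be applied as a simple row filter. The following is a minimal sketch using only the Python standard library. It assumes the CSV column headers match the field names listed in this dictionary; the sample rows and the `filter_by_quality` helper are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical two-row sample; real files contain more columns and rows.
SAMPLE_CSV = """Image ID,Dice RCA (Max),Dice RCA (Mean)
img_001,0.95,0.88
img_002,0.71,0.64
"""


def filter_by_quality(rows, threshold=0.7, field="Dice RCA (Mean)"):
    """Keep rows whose quality score is at or above the threshold."""
    return [row for row in rows if float(row[field]) >= threshold]


# csv.DictReader yields one dict per row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = filter_by_quality(rows)
print([row["Image ID"] for row in kept])
```

For a real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle and adjust the column names if the headers differ.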
### Anatomical Features
**Field Name**: Landmarks
**Description**: Set of points representing organ contours, generated by the HybridGNet model
**Type**: Array of coordinates
**Format**: JSON array of [x,y] coordinates
**Units**: Pixels
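
As an illustration of the format described above, a landmarks cell might be parsed as follows. This is a sketch that assumes the cell holds a JSON array of `[x, y]` pairs exactly as stated; the `parse_landmarks` helper and the sample string are hypothetical, and the actual serialization in a given file may differ.

```python
import json


def parse_landmarks(raw):
    """Parse a Landmarks cell into a list of (x, y) tuples in pixel units."""
    pairs = json.loads(raw)
    points = []
    for pair in pairs:
        # Each entry is expected to be a two-element [x, y] array.
        if not isinstance(pair, list) or len(pair) != 2:
            raise ValueError(f"expected an [x, y] pair, got {pair!r}")
        x, y = pair
        points.append((float(x), float(y)))
    return points


# Hypothetical sample value, not taken from the dataset.
points = parse_landmarks("[[10, 20], [30.5, 40]]")
print(points)
```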
**Field Name**: Left Lung
**Description**: Segmentation mask for left lung
**Type**: String
**Format**: Run-length encoding (RLE)
**Notes**: Must be decoded using the Height and Width fields
**Field Name**: Right Lung
**Description**: Segmentation mask for right lung
**Type**: String
**Format**: Run-length encoding (RLE)
**Notes**: Must be decoded using the Height and Width fields
**Field Name**: Heart
**Description**: Segmentation mask for heart
**Type**: String
**Format**: Run-length encoding (RLE)
**Notes**: Must be decoded using the Height and Width fields
### Image Dimensions
**Field Name**: Height
**Description**: Height of segmentation mask
**Type**: Integer
**Units**: Pixels
**Notes**: Required for decoding RLE masks
**Field Name**: Width
**Description**: Width of segmentation mask
**Type**: Integer
**Units**: Pixels
**Notes**: Required for decoding RLE masks
## Additional Notes
1. **RLE Format**: Run-length encoding is used to compress the binary mask data. Each RLE string encodes the mask as a sequence of (start_position, run_length) pairs over the flattened image. A decoding script is available in the GitHub repository.
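
To make the (start_position, run_length) scheme concrete, here is a minimal decoding sketch in plain Python. It is not the repository's script, which remains the authoritative reference. Several details are assumptions not stated in this dictionary: that values are separated by whitespace, that start positions are 1-based (exposed here as the hypothetical `one_indexed` parameter), and that the flattened mask is in row-major order.

```python
def decode_rle(rle, height, width, one_indexed=True):
    """Decode 'start length start length ...' into a height x width binary mask.

    Assumptions: whitespace-separated integers, row-major flattening, and
    1-based start positions unless one_indexed is set to False.
    """
    numbers = [int(token) for token in rle.split()]
    if len(numbers) % 2 != 0:
        raise ValueError("RLE string must contain (start, length) pairs")

    flat = [0] * (height * width)
    for start, length in zip(numbers[0::2], numbers[1::2]):
        begin = start - 1 if one_indexed else start
        end = begin + length
        if begin < 0 or end > len(flat):
            raise ValueError("run falls outside the mask dimensions")
        flat[begin:end] = [1] * length

    # Split the flat vector into rows of `width` pixels each.
    return [flat[r * width:(r + 1) * width] for r in range(height)]


# Hypothetical toy example: a 3 x 4 mask with two runs.
mask = decode_rle("2 3 9 2", height=3, width=4)
for row in mask:
    print(row)
```

The Height and Width fields of the same row supply the `height` and `width` arguments. If decoded masks look shifted by one pixel or transposed, the indexing or ordering assumption is the first thing to check against the repository's script.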