CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images 1.0.0

File: <base>/DATA_DICTIONARY.md (2,391 bytes)
# CheXmask Database Data Dictionary

## CSV File Structure
Each dataset in the CheXmask Database is provided as a separate CSV file. Below are detailed descriptions of all fields/variables present in these files.

### Image Identification
**Field Name**: Image ID  
**Description**: Reference identifier linking to the original image in the source dataset  
**Type**: String  
**Notes**: Format varies by source dataset (ChestX-ray8, CheXpert, MIMIC-CXR-JPG, PadChest, VinDr-CXR)  

### Quality Metrics

**Field Name**: Dice RCA (Max)  
**Description**: Maximum Dice Similarity Coefficient from Reverse Classification Accuracy  
**Type**: Float  
**Range**: 0.0 to 1.0  
**Units**: Dimensionless  
**Notes**: Higher values indicate better segmentation quality  

**Field Name**: Dice RCA (Mean)  
**Description**: Mean Dice Similarity Coefficient from Reverse Classification Accuracy  
**Type**: Float  
**Range**: 0.0 to 1.0  
**Units**: Dimensionless  
**Notes**: Higher values indicate better segmentation quality. The recommended threshold for use is >= 0.7  
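
The recommended threshold can be applied as a simple row filter. The following is a minimal sketch, assuming the CSV column headers match the field names in this dictionary exactly (`Image ID`, `Dice RCA (Mean)`); the actual header for the image identifier may differ between source datasets. The sample rows and the `filter_by_mean_dice` helper are hypothetical and only illustrate the comparison.

```python
import csv
import io

# Hypothetical two-row excerpt; column names are assumed to match this dictionary.
SAMPLE_CSV = """Image ID,Dice RCA (Max),Dice RCA (Mean)
img_001,0.95,0.91
img_002,0.72,0.64
"""


def filter_by_mean_dice(csv_file, threshold=0.7):
    """Return rows whose 'Dice RCA (Mean)' is at or above the threshold."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if float(row["Dice RCA (Mean)"]) >= threshold]


kept = filter_by_mean_dice(io.StringIO(SAMPLE_CSV))
print([row["Image ID"] for row in kept])  # ['img_001']
```

For a real file, pass an opened file object (for example `open(path, newline="")`) instead of the in-memory sample.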

### Anatomical Features

**Field Name**: Landmarks  
**Description**: Set of points representing organ contours generated by the HybridGNet model  
**Type**: Array of coordinates  
**Format**: JSON array of [x,y] coordinates  
**Units**: Pixels  
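
As an illustration of the format described above, a Landmarks cell could be parsed as follows. This is a sketch under the assumption that the cell holds a valid JSON array of `[x, y]` pairs; the `parse_landmarks` helper and the three-point example are hypothetical and not taken from the dataset.

```python
import json


def parse_landmarks(field):
    """Parse a Landmarks cell into a list of (x, y) tuples in pixel units."""
    points = json.loads(field)
    return [(float(x), float(y)) for x, y in points]


# Hypothetical three-point contour, for illustration only.
example = "[[10, 20], [15, 25], [12, 30]]"
landmarks = parse_landmarks(example)

# Bounding box of the contour: (x_min, y_min, x_max, y_max).
xs = [x for x, _ in landmarks]
ys = [y for _, y in landmarks]
bbox = (min(xs), min(ys), max(xs), max(ys))
print(bbox)  # (10.0, 20.0, 15.0, 30.0)
```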

**Field Name**: Left Lung  
**Description**: Segmentation mask for left lung  
**Type**: String  
**Format**: Run-length encoding (RLE)  
**Notes**: Must be decoded using provided dimensions  

**Field Name**: Right Lung  
**Description**: Segmentation mask for right lung  
**Type**: String  
**Format**: Run-length encoding (RLE)  
**Notes**: Must be decoded using provided dimensions  

**Field Name**: Heart  
**Description**: Segmentation mask for heart  
**Type**: String  
**Format**: Run-length encoding (RLE)  
**Notes**: Must be decoded using provided dimensions  

### Image Dimensions

**Field Name**: Height  
**Description**: Height of segmentation mask  
**Type**: Integer  
**Units**: Pixels  
**Notes**: Required for decoding RLE masks  

**Field Name**: Width  
**Description**: Width of segmentation mask  
**Type**: Integer  
**Units**: Pixels  
**Notes**: Required for decoding RLE masks  

## Additional Notes

1. **RLE Format**: Run-length encoding is used to compress the binary mask data. Each RLE string encodes the mask as a sequence of (start_position, run_length) pairs, and the Height and Width fields are required to reshape the decoded values into a 2D mask. A decoding script is available in the GitHub repository.
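
The (start_position, run_length) scheme can be sketched as below. This is not the repository's decoding script; it is a minimal illustration that assumes whitespace-separated integers, 1-based start positions, and row-major pixel order. If any of these assumptions does not hold for the files, the repository script should take precedence. The `decode_rle` helper and the sample string are hypothetical.

```python
def decode_rle(rle, height, width):
    """Decode an RLE string into a height x width list of 0/1 rows."""
    values = [int(v) for v in rle.split()]
    if len(values) % 2:
        raise ValueError("RLE string must contain (start, length) pairs")

    flat = bytearray(height * width)  # all background (0) initially
    for start, length in zip(values[0::2], values[1::2]):
        begin = start - 1  # assumption: start positions are 1-based
        if begin < 0 or begin + length > len(flat):
            raise ValueError("run falls outside the mask")
        flat[begin:begin + length] = b"\x01" * length

    # Assumption: pixels are stored row by row (row-major order).
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]


# Hypothetical 3 x 4 mask with two runs: positions 2-4 and 9-10.
mask = decode_rle("2 3 9 2", height=3, width=4)
for row in mask:
    print(row)
# [0, 1, 1, 1]
# [0, 0, 0, 0]
# [1, 1, 0, 0]
```

In practice, `height` and `width` would come from the Height and Width fields of the same CSV row as the mask string.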