
Medical Dataset Browser

Exploring, retrieving and analyzing publicly available medical image datasets

Shanghai AI Lab GMAI

🎉 Welcome to Project Imaging-X 🌟

A Comprehensive Survey of Medical Imaging Datasets by GMAI

This work constitutes the first large-scale survey of more than 1,000 publicly available medical imaging datasets, with the objective of facilitating the development of robust and generalizable medical foundation models.

Medical imaging datasets represent the cornerstone of AI-driven healthcare; however, they remain fragmented, small in scale, and narrowly task-specific. This survey characterizes modalities, clinical tasks, anatomical coverage, and annotation types, and outlines integrative strategies to maximize their collective utility.

📖 Read more: Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Survey Overview
Filter Mode 1 · JSON Rules

Edit the JSON below and run Phase 1&2 (rule-based filtering). Optionally run Phase 3&4 (selection + summary).

Field guide & phase meaning

Phases:

  • Phase 1 · Harmonisation: standardize dimensions/modalities, anatomy, dataset quality and impact.
  • Phase 2 · Alignment: align task semantics and labels across datasets.
  • Phase 3 · Blueprint: cluster datasets and assess integrative potential and scale.
  • Phase 4 · Indexing: build public indices and visual summaries for easy access.

Key fields:

  • dimension: one of 2D, 3D, video.
  • modalities: e.g., ["CT","MR","PET","Fundus"].
  • task_types: e.g., ["classification","segmentation","detection","regression"].
  • license_allowlist: restrict to licenses in this list (or null for no restriction).
  • include_unlabeled: keep datasets without labels (default true).
  • min_valid_image_n_per_dataset: minimum total images per dataset.
  • anatomy_whitelist: only keep datasets whose organ/anatomy matches these keywords.
  • release_date_min: minimum release year (e.g., 2020 or "2020-01-01").
  • allow_3d_as_2d_sources: also accept 3D datasets as sources of 2D data when dimension="2D".
  • selection: Phase 3&4 constraints, e.g., min_datasets_per_modality, min_orgs_per_modality, storage_budget_gb.
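
As a rough illustration of how the key fields above might combine into a Phase 1&2 rule set, the sketch below parses a JSON rule set and applies it to a few made-up dataset records. The record keys (`name`, `dimension`, `modality`, `task`, `organ`, `images`, `year`, `license`, `labeled`), the helpers `min_year` and `passes_rules`, and all sample values are assumptions for illustration only; the browser's actual schema and matching logic may differ.

```python
import json

# Hypothetical rule set using the field names from the guide above.
RULES_JSON = """
{
  "dimension": "2D",
  "modalities": ["CT", "MR"],
  "task_types": ["segmentation"],
  "license_allowlist": null,
  "include_unlabeled": false,
  "min_valid_image_n_per_dataset": 100,
  "anatomy_whitelist": ["liver", "brain"],
  "release_date_min": "2020-01-01",
  "allow_3d_as_2d_sources": true
}
"""

def min_year(value):
    """Accept either a year (2020) or a date string ("2020-01-01")."""
    return int(str(value)[:4])

def passes_rules(record, rules):
    """Return True if a record satisfies every rule (assumed semantics)."""
    dims = {rules["dimension"]}
    if rules["dimension"] == "2D" and rules.get("allow_3d_as_2d_sources"):
        dims.add("3D")  # 3D datasets may serve as 2D sources
    if record["dimension"] not in dims:
        return False
    if record["modality"] not in rules["modalities"]:
        return False
    if record["labeled"]:
        if record["task"] not in rules["task_types"]:
            return False
    elif not rules.get("include_unlabeled", True):
        return False
    allow = rules.get("license_allowlist")
    if allow is not None and record["license"] not in allow:
        return False  # null allowlist means no license restriction
    if record["images"] < rules["min_valid_image_n_per_dataset"]:
        return False
    organ = record["organ"].lower()
    if not any(k.lower() in organ for k in rules["anatomy_whitelist"]):
        return False
    return record["year"] >= min_year(rules["release_date_min"])

# Made-up records; none of these are real datasets.
RECORDS = [
    {"name": "liver-ct-demo", "dimension": "3D", "modality": "CT", "task": "segmentation",
     "organ": "Liver", "images": 500, "year": 2021, "license": "CC-BY-4.0", "labeled": True},
    {"name": "brain-mr-cls", "dimension": "2D", "modality": "MR", "task": "classification",
     "organ": "Brain", "images": 900, "year": 2022, "license": "CC-BY-4.0", "labeled": True},
    {"name": "liver-ct-tiny", "dimension": "2D", "modality": "CT", "task": "segmentation",
     "organ": "Liver", "images": 50, "year": 2023, "license": "CC0", "labeled": True},
    {"name": "brain-mr-raw", "dimension": "2D", "modality": "MR", "task": None,
     "organ": "Brain", "images": 700, "year": 2021, "license": "CC0", "labeled": False},
    {"name": "brain-mr-old", "dimension": "2D", "modality": "MR", "task": "segmentation",
     "organ": "Brain", "images": 400, "year": 2019, "license": "CC0", "labeled": True},
]

rules = json.loads(RULES_JSON)
kept = [r["name"] for r in RECORDS if passes_rules(r, rules)]
print(kept)
```

In this sketch only the first record survives: the others fail on task type, minimum image count, the unlabeled exclusion, and release year respectively. Setting `include_unlabeled` back to its default of true would additionally keep the unlabeled record.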
Filter Mode 2 · Quick Filters (search + dropdowns)
Dimension Distribution
Modality Distribution
Task Distribution
Phase 1&2 – by Modality (Bar)
Phase 1&2 – Tasks (Doughnut)
Phase 3 – by Modality (Bar)
Phase 3 – Tasks (Doughnut)
Phase 4 – Summary Table aggregated by modality
Modality | #Datasets | Total Images | #Orgs | Selected / All
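
One way the columns of this summary table could be computed is sketched below. It assumes that #Datasets, Total Images, and #Orgs are counted over the selected datasets and that "Selected / All" compares the selected count with all datasets of that modality; the page does not state this, so treat the semantics, the `summarize_by_modality` helper, the record keys, and the sample values as hypothetical.

```python
from collections import defaultdict

def summarize_by_modality(all_records, selected_names):
    """Aggregate selected datasets per modality (assumed column semantics)."""
    totals = defaultdict(int)  # all datasets per modality
    rows = defaultdict(lambda: {"datasets": 0, "images": 0, "orgs": set()})
    for rec in all_records:
        totals[rec["modality"]] += 1
        if rec["name"] in selected_names:
            row = rows[rec["modality"]]
            row["datasets"] += 1
            row["images"] += rec["images"]
            row["orgs"].add(rec["organization"])
    # One tuple per modality: (modality, #datasets, total images, #orgs, selected/all)
    return [
        (m, r["datasets"], r["images"], len(r["orgs"]),
         f'{r["datasets"]} / {totals[m]}')
        for m, r in sorted(rows.items())
    ]

# Made-up records for illustration only.
CATALOG = [
    {"name": "a", "modality": "CT", "images": 500, "organization": "OrgA"},
    {"name": "b", "modality": "CT", "images": 300, "organization": "OrgB"},
    {"name": "c", "modality": "CT", "images": 50, "organization": "OrgA"},
    {"name": "d", "modality": "MR", "images": 200, "organization": "OrgC"},
]

summary = summarize_by_modality(CATALOG, {"a", "b", "d"})
for row in summary:
    print(" | ".join(str(cell) for cell in row))
```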

Dataset Name | Dimension | Modality | Task | Organ | Images | Release Year | Organization | License | Homepage