AtlasPatch

An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology.

Ahmed Alagha*, Christopher Leclerc, Yousef Kotp†, Omar Abdelwahed†, Calvin Moras, Peter Rentopoulos, Rose Rostami, Bich Ngoc Nguyen, Jumanah Baig, Abdelhakim Khellaf, Vincent Quoc-Huy Trinh, Rabeb Mizouni, Hadi Otrok, Jamal Bentahar, Mahdi S. Hosseini*

*Project Lead, †Equal Contributor

Abstract

Whole-slide image (WSI) preprocessing, typically comprising tissue detection followed by patch extraction, is foundational to AI-driven and image-based computational pathology workflows. It remains a major computational bottleneck: existing tools either rely on inaccurate heuristic thresholding for tissue detection or adopt AI-based approaches that are trained on limited-diversity data and operate at the patch level, incurring substantial computational cost. We present AtlasPatch, an efficient and scalable slide preprocessing framework for accurate tissue detection and high-throughput patch extraction with minimal computational overhead. AtlasPatch's tissue detection module is trained on a heterogeneous, semi-manually annotated dataset of ~35,000 WSI thumbnails via efficient fine-tuning of the Segment Anything 2 (SAM2) model. The tool extrapolates tissue masks from thumbnails to full-resolution slides and extracts patch coordinates at user-specified magnifications, with options to stream patches directly into commonly used image encoders for embedding generation or to export patch images for storage, all efficiently parallelized across CPUs and GPUs to maximize throughput. We evaluate AtlasPatch on segmentation accuracy, computational complexity, and downstream multiple-instance learning, matching state-of-the-art tools while operating at a fraction of their computational cost.
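
As an illustration of the thumbnail-to-full-resolution extrapolation step described above, the sketch below maps a binary thumbnail-level tissue mask to level-0 patch coordinates using OpenSlide. This is a simplified, hypothetical example under assumed defaults (function name, patch size, tissue-coverage threshold), not AtlasPatch's actual API.

      # Minimal sketch: project a thumbnail-level tissue mask onto the
      # full-resolution slide grid and keep patches with enough tissue.
      # Illustrative only; names, patch size, and threshold are assumptions.
      import numpy as np
      import openslide

      def patch_coords_from_mask(slide_path, mask, patch_size=256, min_tissue=0.5):
          """Return level-0 (x, y) coordinates of patches that cover tissue."""
          mask = np.asarray(mask, dtype=bool)            # thumbnail-level binary mask
          slide = openslide.OpenSlide(slide_path)
          full_w, full_h = slide.dimensions              # level-0 width/height
          scale_x = full_w / mask.shape[1]               # thumbnail -> level 0
          scale_y = full_h / mask.shape[0]

          coords = []
          for y in range(0, full_h - patch_size + 1, patch_size):
              for x in range(0, full_w - patch_size + 1, patch_size):
                  # Footprint of this patch in thumbnail-mask coordinates.
                  mx0, mx1 = int(x / scale_x), int((x + patch_size) / scale_x) + 1
                  my0, my1 = int(y / scale_y), int((y + patch_size) / scale_y) + 1
                  if mask[my0:my1, mx0:mx1].mean() >= min_tissue:
                      coords.append((x, y))
          return coords

Selected coordinates can then be read at the desired magnification (e.g., via slide.read_region) and either exported as images or streamed into an image encoder.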

Features

Segmenter

Our high-quality tissue detector generates masks using the Segment Anything Model 2 (SAM2), fine-tuned on a large and diverse annotated dataset. This dataset of over 35,000 whole-slide image (WSI) thumbnails was curated to span multiple organs and tissue types, institutions, scanner vendors, and acquisition protocols, as well as variations in illumination, tissue fragment number and size, tissue boundary definition, and histologic heterogeneity. We fine-tuned SAM2 for the tissue-versus-background task by freezing the backbone and training only the normalization layers.
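
The normalization-only fine-tuning strategy can be sketched generically in PyTorch: freeze every backbone parameter, then re-enable gradients only for normalization modules. The norm types selected and the commented-out model loader below are illustrative assumptions, not the exact SAM2 training code.

      # Minimal sketch of normalization-only fine-tuning (assumptions noted above).
      import torch
      import torch.nn as nn

      NORM_TYPES = (nn.LayerNorm, nn.GroupNorm, nn.BatchNorm1d, nn.BatchNorm2d)

      def freeze_all_but_norm(model):
          """Freeze the backbone; return only the normalization parameters to train."""
          for p in model.parameters():
              p.requires_grad = False                # freeze everything
          trainable = []
          for module in model.modules():
              if isinstance(module, NORM_TYPES):     # LayerNorm, GroupNorm, BatchNorm
                  for p in module.parameters():
                      p.requires_grad = True         # unfreeze norm weights/biases
                      trainable.append(p)
          return trainable

      # Hypothetical usage: optimize only the normalization parameters.
      # model = build_sam2(...)                      # assumed SAM2 loader, not shown
      # optimizer = torch.optim.AdamW(freeze_all_but_norm(model), lr=1e-4)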

Speed Comparison

Tools compared: AtlasPatch, CLAM, Trident-GrandQC, Trident-Hest.

All runs compare the speed of tissue segmentation and patch extraction on the same 100 whole-slide images, executed on identical hardware (video playback sped up 10x).

Citation


      @software{atlaspatch,
        title = {AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology},
        author = {Ahmed Alagha and Christopher Leclerc and Yousef Kotp and Omar Abdelwahed and Calvin Moras and Peter Rentopoulos and Rose Rostami and Bich Ngoc Nguyen and Jumanah Baig and Abdelhakim Khellaf and Vincent Quoc-Huy Trinh and Rabeb Mizouni and Hadi Otrok and Jamal Bentahar and Mahdi S. Hosseini},
        year = {2025},
        url = {https://github.com/AtlasAnalyticsLab/AtlasPatch},
      }