Weakly supervised tumour segmentation of Colorectal cancer to quantify CDX2-loss in whole-slide-images
Authors
Areemitporn, Chananthan ; Babichanthiran, Adssayah ; Skaarup, Sandra Zain
Term
4th term
Publication year
2026
Submitted on
2026-06-01
Pages
97
Abstract
This project addresses tumor segmentation in colorectal cancer, a task complicated by substantial morphological heterogeneity and by limitations in digital pathology. The aim was to develop a weakly supervised, semi-automatic pipeline that segments tumor-rich regions on pan-cytokeratin (PCK) whole-slide images and then co-registers the resulting masks to matched CDX2-stained slides to quantify CDX2 loss. The dataset comprised 258 unlabeled whole-slide images with both PCK and CDX2 stains. Patch-based features were extracted and clustered with unsupervised methods; K-means produced the most morphologically meaningful clusters, which served as pseudo-labels, and a Markov Random Field was applied to enforce spatial consistency within the region of interest. These pseudo-labels were used to train classifiers, of which a multilayer perceptron (MLP) with binary cross-entropy loss, Adam optimization, ReLU activations, and dropout performed best after hyperparameter tuning. Tumor masks were generated on PCK images and co-registered to CDX2 images via downsampling, initial rigid alignment, and multimodal registration. CDX2 expression was then quantified patch-wise using color deconvolution (DAB extraction), scoring, and classification to capture heterogeneity. According to the excerpt, the MLP achieved high segmentation performance (99% accuracy and 100% F1-score), while other metrics (Dice, IoU) were not specified in the text. Overall, the work indicates that an unsupervised/weakly supervised approach can produce useful tumor masks and enable quantification of CDX2 loss in whole-slide images; the excerpt does not provide a detailed evaluation of registration or quantification beyond feasibility.
This project addresses the challenge of segmenting tumors in colorectal cancer, which arises from substantial morphological heterogeneity and limitations in digital pathology. The purpose was to develop a weakly supervised, semi-automatic pipeline that segments tumor-rich tissue on pan-cytokeratin (PCK) stained whole-slide images and then co-registers the segmentation masks to the corresponding CDX2-stained images in order to quantify CDX2 loss. The dataset consisted of 258 unannotated whole-slide images with both PCK and CDX2. Methodologically, patch-based features were extracted and grouped with unsupervised clustering, where K-means gave the most morphologically meaningful clusters for use as pseudo-labels; a Markov Random Field approach ensured spatial consistency in the region of interest. The pseudo-labels were used to train classification models, where a multilayer perceptron (MLP) with binary cross-entropy, Adam optimization, ReLU activations, and dropout achieved the best performance after hyperparameter tuning. The segmentation masks were generated on PCK images and co-registered to CDX2 via downscaling, initial rigid alignment, and multimodal registration. CDX2 expression was subsequently quantified patch-wise by color deconvolution (DAB extraction), scoring, and classification in order to describe heterogeneity. According to the excerpt, the MLP model achieved high segmentation performance (accuracy 99% and F1-score 100%), while other measures (Dice, IoU) were not specified in the text. Overall, the work shows that an unsupervised/weakly supervised approach can deliver usable tumor masks and enable quantification of CDX2 loss in whole-slide images; detailed results for registration and quantification beyond feasibility do not appear in this excerpt.
[This abstract has been generated with the help of AI directly from the project full text]
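The pseudo-labelling step described in the abstract (cluster patch features with K-means, then train an MLP on the cluster assignments) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, uses synthetic Gaussian features as a hypothetical stand-in for real patch features, and omits the Markov Random Field smoothing. Dropout is also left out because scikit-learn's MLPClassifier does not offer it. All variable names and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for patch features: two well-separated groups
# in a 16-dimensional feature space (not real histology data).
n_per_group, n_features = 300, 16
features = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(n_per_group, n_features)),
    rng.normal(loc=4.0, scale=1.0, size=(n_per_group, n_features)),
])

# Step 1: unsupervised clustering. The cluster ids act as pseudo-labels;
# which id corresponds to "tumour-rich" would need manual inspection.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
pseudo_labels = kmeans.fit_predict(features)

# Step 2: train a classifier on the pseudo-labels. For two classes,
# MLPClassifier minimises log-loss (binary cross-entropy) and is
# configured here with ReLU activations and the Adam solver.
X_train, X_test, y_train, y_test = train_test_split(
    features, pseudo_labels, test_size=0.25, random_state=0,
    stratify=pseudo_labels,
)
mlp = MLPClassifier(
    hidden_layer_sizes=(32,),  # assumed size, chosen only for the sketch
    activation="relu",
    solver="adam",
    max_iter=500,
    random_state=0,
)
mlp.fit(X_train, y_train)

# Agreement between the MLP and the held-out pseudo-labels.
agreement = mlp.score(X_test, y_test)
print(f"agreement with pseudo-labels: {agreement:.3f}")
```

Note that a score computed this way measures how well the classifier reproduces the clustering, not how well it matches expert annotations, since the labels themselves come from K-means.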
