Unsupervised grading of prostate cancer from the feature space of a convolutional neural network
Author: Lind, Henrik Paaske
Term: 4th term
Publication year: 2022
Submitted on: 2022-06-01
Pages: 82
Abstract
Background and aim: Treatment decisions for prostate cancer rely on pathologists grading tissue aggressiveness with the Gleason score. Because tissue morphology is highly heterogeneous, pathologists' ratings can disagree, which risks inappropriate treatment. Objective, data-driven measures are therefore needed. H&E-stained whole-slide images (WSIs) contain rich visual patterns that a convolutional neural network (CNN) can use to learn a feature space that naturally separates tissue by appearance. This study asked whether a Gleason score can be inferred from a CNN's feature space even when grade labels are not used.

Method: Patches from H&E-stained WSIs were fed into a multi-output CNN with both a reconstruction head and a multi-class classification head to learn tissue features. Two model configurations (with and without grade labels) were trained and compared. Performance was assessed on unseen test images using mean squared error for reconstruction, and a confusion matrix, precision, recall, and F1-score for classification. To quantify a Gleason score from the learned features, the model features were extracted and reduced with principal component analysis (PCA), retaining the components that explained >90% of the variance. A point-to-point score (PPS) algorithm was developed that assigns a score as the accumulated Euclidean distance along a sequence of k-means cluster centroids, running from benign tissue features toward the features of higher Gleason grades.

Results: Mean feature value differences from benign to Gleason 3, 4, and 5 were (0.005, 0.059), (-0.041, 0.147), (-0.272, 0.187) with grade labels and (-0.202, 0.153), (-0.290, 0.153), (-0.438, 0.170) without grade labels. Using 10 and 25 k-means clusters, most benign patches fell at distances 0–0.6, Gleason 3 at 0.6–0.9, and Gleason 4–5 at 0.9–1.2 (up to 1.4 with 25 clusters).

Conclusion: A clearer separation of tissue features can be achieved without using grade labels, and the PPS algorithm can suggest a Gleason score based on accumulated distance from benign tissue features.
[This abstract has been rewritten with the help of AI based on the project's original abstract]
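The point-to-point score described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how the centroid sequence is built or how a patch is mapped onto it, so the greedy nearest-neighbour ordering, the nearest-centroid assignment, and the function names `order_centroids`, `accumulated_distances`, and `pps` are all assumptions made for illustration. The toy 2-D centroids stand in for k-means centroids computed on PCA-reduced CNN features, and the index of the benign centroid is assumed to be known.

```python
import math


def order_centroids(centroids, start):
    """Chain the centroids starting at index `start`.

    Assumption: the next centroid is always the closest unvisited one
    (greedy nearest neighbour); ties are broken by the lower index.
    """
    order = [start]
    remaining = set(range(len(centroids))) - {start}
    while remaining:
        last = centroids[order[-1]]
        nxt = min(remaining, key=lambda i: (math.dist(last, centroids[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def accumulated_distances(centroids, order):
    """Map each centroid index to the cumulative Euclidean path length
    travelled along the chain to reach it (0.0 for the first centroid)."""
    acc = {order[0]: 0.0}
    total = 0.0
    for prev, cur in zip(order, order[1:]):
        total += math.dist(centroids[prev], centroids[cur])
        acc[cur] = total
    return acc


def pps(feature, centroids, acc):
    """Score a patch feature by the accumulated distance of its nearest
    centroid (assumed assignment rule)."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: (math.dist(feature, centroids[i]), i),
    )
    return acc[nearest]


# Toy centroids; index 0 plays the role of the benign cluster (assumption).
centroids = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
order = order_centroids(centroids, start=0)
acc = accumulated_distances(centroids, order)

print(order)                            # chain of centroid indices
print(pps((0.1, 0.1), centroids, acc))  # patch near the benign centroid
print(pps((2.9, 3.8), centroids, acc))  # patch near the end of the chain
```

Under these assumptions a patch close to the benign centroid scores 0, and the score grows with the path length travelled through feature space. That matches the abstract's reading of larger accumulated distances as higher Gleason grades, though the distance ranges reported there come from the trained model, not from this toy data.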
