Efficient annotation bootstrapping for cell identification in follicular lymphoma.

Last updated Apr 23, 2025, in Computer Methods and Programs in Biomedicine, by Adam Krawczyk, Aleksandra Osowska-Kurczab, Sławomir Pakuło, Wojciech Kotłowski, Zaneta Swiderska-Chadaj

TLDR

Reducing annotation costs in digital pathology: A study compares three methods for annotating whole slide images and proposes a hybrid architecture for detecting centroblasts and centrocytes.

Abstract

In the medical field of digital pathology, many tasks rely on visual assessments of tissue patterns or cells, presenting an opportunity to apply computer vision methods. However, acquiring a substantial number of annotations for developing deep learning algorithms remains a bottleneck. The annotation process is inherently biased due to various constraints, including labor shortages, high costs, time inefficiencies, and a strongly imbalanced distribution of labels. This study explores available solutions for reducing the costs of annotation bootstrapping in the challenging task of follicular lymphoma diagnosis. We compare three distinct approaches to annotation bootstrapping: extensive manual annotations, active learning, and weak supervision. We propose a hybrid architecture for centroblast and centrocyte detection from whole slide images, based on a custom cell encoder and contextual encoding derived from foundation models for digital pathology. We collected a dataset of 41 whole slide images scanned with a 20x objective lens at a resolution of 0.24 μm/pixel, from which 12,704 cell annotations were gathered. Applying our proposed active learning workflow led to an almost twofold increase in the number of samples within the minority class. The best bootstrapping method improved the overall performance of the detection algorithm by 18 percentage points, yielding a macro-averaged F1-score, precision, and recall of 63%. The results of this study may find applications in other digital pathology problems, particularly for tasks involving a lack of homogeneous cell clusters within whole slide images.

Overview

The study explores solutions for reducing annotation bootstrapping costs in follicular lymphoma diagnosis, a challenging task in digital pathology.
The study compares three annotation bootstrapping methods: extensive manual annotations, active learning, and weak supervision.
The study proposes a hybrid architecture for centroblast and centrocyte detection from whole slide images, using a custom cell encoder and contextual encoding derived from foundation models for digital pathology.
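The summary does not describe how the study's active learning workflow selects cells for annotation. As a rough illustration of the general idea only, the sketch below uses entropy-based uncertainty sampling, a common generic strategy that is assumed here and not taken from the paper. The function names, the cell identifiers, and the two-class probability layout are all hypothetical.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy of one predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Pick the `budget` unlabeled cells whose predictions are most uncertain.

    `predictions` maps a hypothetical cell id to the model's class
    probabilities, e.g. [P(centroblast), P(centrocyte)]. This is generic
    uncertainty sampling, assumed for illustration; it is not the
    workflow used in the study.
    """
    ranked = sorted(
        predictions,
        key=lambda cell_id: prediction_entropy(predictions[cell_id]),
        reverse=True,
    )
    return ranked[:budget]


# Toy predictions for three unlabeled cells (made-up values).
toy_predictions = {
    "cell_a": [0.99, 0.01],  # confident prediction
    "cell_b": [0.50, 0.50],  # maximally uncertain
    "cell_c": [0.70, 0.30],  # moderately uncertain
}
to_annotate = select_for_annotation(toy_predictions, budget=2)
```

In a loop of this kind, the selected cells would go to an annotator, the model would be retrained on the enlarged label set, and the selection would repeat. Because rare classes tend to produce uncertain predictions, such a loop can surface more minority-class samples than random sampling, though whether that explains the study's result is not stated in the summary.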

Comparative Analysis & Findings

The proposed active learning workflow led to an almost twofold increase in the number of samples within the minority class.
The best bootstrapping method improved the overall performance of the detection algorithm by 18 percentage points.
With the best bootstrapping method, the detection algorithm reached a macro-averaged F1-score, precision, and recall of 63%.
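For readers unfamiliar with macro averaging: each metric is computed per class and the per-class values are then averaged with equal weight, so a rare class counts as much as a common one. The sketch below shows this for a simple classification setting. The function name, the labels "CB" and "CC", and the toy data are illustrative assumptions, and the study's actual evaluation protocol (for example, how detections are matched to annotations) is not described in this summary.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    # Equal weight per class, regardless of how many samples each class has.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with an imbalanced label set (made-up data):
# "CB" = centroblast (minority), "CC" = centrocyte (majority).
y_true = ["CB", "CB", "CC", "CC", "CC", "CC"]
y_pred = ["CB", "CC", "CC", "CC", "CC", "CB"]
macro_precision, macro_recall, macro_f1 = macro_scores(y_true, y_pred)
```

In this toy case the minority class scores 0.5 and the majority class 0.75 on every metric, so each macro average is 0.625, even though 4 of 6 predictions (about 67%) are correct. That gap is why macro averages are often preferred when labels are strongly imbalanced.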

Implications and Future Directions

The results may have applications in other digital pathology problems, particularly for tasks involving a lack of homogeneous cell clusters within whole slide images.
The comparison suggests that annotation bootstrapping strategies, including active learning and weak supervision, can reduce annotation costs while improving detection performance.
Future research can build upon the hybrid architecture proposed in this study, exploring new applications and challenges in digital pathology.