Unsupervised Phenotype Discovery in High-Content Imaging via Archetypes and Self-Supervised Learning

October 27, 2025

Identifying novel therapeutic candidates for complex diseases remains a major challenge in modern drug discovery. To address this, biopharmaceutical research increasingly relies on automated, high-throughput screening assays using cell culture models to evaluate thousands of compounds in parallel. However, the sheer volume of imaging data these screens produce makes systematic expert review impractical, so phenotype discovery and classification depend on extensive, and often biased, manual curation.

A common strategy to mitigate this issue is archetypal analysis, which identifies extreme, prototypical phenotypes (archetypes) within a dataset and describes every other sample as a mixture of them. In this work, we introduce an end-to-end deep learning framework that simultaneously learns embeddings from high-content images and uncovers phenotypic structures without supervision [1]. Building on these representations, we apply self-supervised learning to construct a phenotypic embedding space, enabling intuitive visual exploration and downstream assay analysis.
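
For readers unfamiliar with archetypal analysis, the following is a minimal sketch of the classical, linear formulation, not the deep framework presented in [1]. It assumes the images have already been reduced to embedding vectors (random data stands in for them here) and approximates each sample as a convex combination of k archetypes, which are themselves convex combinations of samples. The function names, the number of archetypes, and the alternating projected-gradient solver are illustrative choices made for this sketch.

```python
import numpy as np


def project_to_simplex(V):
    """Euclidean projection of each row of V onto the probability simplex."""
    n, k = V.shape
    U = np.sort(V, axis=1)[:, ::-1]           # rows sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, k + 1)
    rho = (U - css / idx > 0).sum(axis=1)     # size of the support of each projection
    theta = css[np.arange(n), rho - 1] / rho  # per-row shift
    return np.maximum(V - theta[:, None], 0.0)


def archetypal_analysis(X, k, n_iter=200, seed=0):
    """Approximate X (n x d) as A @ B @ X, with the rows of A and B on the simplex.

    Hypothetical helper for illustration: alternating projected gradient
    steps on the squared reconstruction error ||X - A B X||^2.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = project_to_simplex(rng.random((n, k)))  # sample -> archetype weights
    B = project_to_simplex(rng.random((k, n)))  # archetype -> sample weights
    xxt_norm = np.linalg.norm(X @ X.T, 2)       # spectral norm, reused every iteration
    for _ in range(n_iter):
        Z = B @ X                               # current archetypes (k x d)
        step_a = 1.0 / (2.0 * np.linalg.norm(Z @ Z.T, 2) + 1e-12)
        A = project_to_simplex(A - step_a * 2.0 * (A @ Z - X) @ Z.T)
        step_b = 1.0 / (2.0 * np.linalg.norm(A.T @ A, 2) * xxt_norm + 1e-12)
        B = project_to_simplex(B - step_b * 2.0 * A.T @ (A @ B @ X - X) @ X.T)
    return A, B, B @ X


# Random vectors stand in for image embeddings (assumption for this sketch).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
A, B, Z = archetypal_analysis(X, k=3)
```

Each row of A gives the mixture weights of one sample over the three archetypes in Z, so a sample whose weight is concentrated on a single archetype lies close to that extreme phenotype.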

Comprehensive experiments on industry-relevant assays demonstrate that our approach outperforms existing unsupervised and supervised methods, providing a scalable and unbiased pipeline for drug screening and functional genomics [2].

 

[1] Wieser, M., et al. "Revisiting Deep Archetypal Analysis for Phenotype Discovery in High Content Imaging." 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE Computer Society, 2025.

[2] Siegismund, D., et al. "Self-supervised representation learning for high-content screening." International Conference on Medical Imaging with Deep Learning. PMLR, 2022.

