Presented at the Swiss Image-Based Screening Conference, Basel, Switzerland
Image-based high-content screens (HCS) are mostly analyzed as follows: the images are subjected to automated image analysis using protocols defined during assay development, yielding numerical data at the object level, which are then averaged to the well level. After a quick-pass QC, the final activity or potency of individual compounds is typically determined from one to three readouts, which are often normalized, and hit lists are generated by simple filtering rules.
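As an illustration only, the conventional workflow described above can be sketched in a few lines of Python. The well names, compound names, readout values, the percent-of-control normalization, and the 70% threshold are all hypothetical assumptions chosen for the example; they are not taken from any specific screen or from the pipeline presented here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical object-level records: (well, compound, one readout value per cell).
objects = [
    ("A01", "DMSO", 10.0), ("A01", "DMSO", 12.0),
    ("A02", "DMSO", 9.0), ("A02", "DMSO", 9.0),
    ("B01", "cmpd_1", 4.0), ("B01", "cmpd_1", 6.0),
    ("B02", "cmpd_2", 9.0), ("B02", "cmpd_2", 11.0),
]

def well_means(records):
    """Average object-level values to one value per well."""
    per_well = defaultdict(list)
    compound_of = {}
    for well, compound, value in records:
        per_well[well].append(value)
        compound_of[well] = compound
    return {w: (compound_of[w], mean(vals)) for w, vals in per_well.items()}

def percent_of_control(wells, control="DMSO"):
    """Normalize each well to the mean of the control wells (assumed scheme)."""
    ctrl = mean(v for c, v in wells.values() if c == control)
    return {w: (c, 100.0 * v / ctrl) for w, (c, v) in wells.items()}

def hits(normalized, control="DMSO", threshold=70.0):
    """Simple filtering rule: non-control compounds below the threshold."""
    return sorted(c for c, v in normalized.values()
                  if c != control and v < threshold)

wells = well_means(objects)
normalized = percent_of_control(wells)
hit_list = hits(normalized)
print(hit_list)
```

With a single readout and a fixed cut-off, such a rule is easy to apply but cannot capture phenotypes that only show up as a combination of features.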
While such a procedure seems suitable for production screens with very well-defined biology, in phenotypic screens or MOA studies the relevant experimental end points are often more complex or not even defined a priori. Diverse data analysis strategies exist to cope with this challenge, but they often require expert knowledge, specialized tools, or expensive, highly tailored custom software.
Here we describe a comprehensive HCS analysis pipeline that allows rapid exploration of multiple analytical strategies. The fully automated pipeline includes image analysis, data pre-processing, feature reduction, and both unsupervised and supervised (machine learning) methods. Its modular design allows flexible selection of the analytical strategy best suited to a given target biology. The outcome is evaluated using a set of performance measures, including validation against a known ground truth, state-of-the-art data visualization techniques, and review of the classification results.
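The idea of a modular pipeline with interchangeable steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation presented here: the function names, the z-score pre-processing, the low-variance filter used as feature reduction, the nearest-centroid classifier, and the toy well-level feature vectors are all hypothetical stand-ins.

```python
from statistics import mean, pstdev

def zscore(rows):
    """Pre-processing: scale each feature column to zero mean, unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns keep scale 1
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def drop_low_variance(rows, min_sd=1e-9):
    """Feature reduction: remove feature columns that are (near-)constant."""
    cols = list(zip(*rows))
    keep = [i for i, c in enumerate(cols) if pstdev(c) > min_sd]
    return [[row[i] for i in keep] for row in rows]

def run_pipeline(rows, steps):
    """Apply the selected steps in order; swapping steps changes the strategy."""
    for step in steps:
        rows = step(rows)
    return rows

def fit_centroids(rows, labels):
    """Supervised step: one mean feature vector per class."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {lab: [mean(col) for col in zip(*members)]
            for lab, members in groups.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: sqdist(centroids[lab], row))

# Hypothetical well-level feature vectors; the second feature is constant.
features = [
    [1.0, 5.0, 0.2], [1.2, 5.0, 0.1], [0.9, 5.0, 0.3],
    [3.0, 5.0, 0.9], [3.2, 5.0, 1.0], [2.9, 5.0, 0.8],
]
ground_truth = ["inactive"] * 3 + ["active"] * 3

processed = run_pipeline(features, [zscore, drop_low_variance])

# Train on four wells, then validate on two held-out wells with known labels.
train_idx, test_idx = [0, 1, 3, 4], [2, 5]
centroids = fit_centroids([processed[i] for i in train_idx],
                          [ground_truth[i] for i in train_idx])
predictions = [predict(centroids, processed[i]) for i in test_idx]
accuracy = mean(1.0 if p == ground_truth[i] else 0.0
                for p, i in zip(predictions, test_idx))
print(predictions, accuracy)
```

Because each step is a plain function from feature rows to feature rows, a different analytical strategy amounts to passing a different list of steps to `run_pipeline`, while the validation against ground truth stays unchanged.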
Using benchmark data, we show the versatility and efficiency of this pipeline when exploring analysis strategies for complex phenotypic HCS. Using real-life pharma data, we aim to find general, easy, and fast analysis strategies for common experiment types, so that many more screening scientists can analyze phenotypic HCS easily and rapidly.