TEACHER
Date: 24 July (Thursday)
Time: 09:05 – 09:25 (GMT+8)
Professor
University of California, San Francisco
Twenty years ago, we introduced high-content image-based phenotypic screens (HCS) (Perlman, Science, 2004). Our work demonstrated that the cellular phenotypic profiles of compounds with similar mechanisms of action tend to cluster together. This allowed the function of an uncharacterized compound to be predicted by guilt-by-association, that is, by comparing its profile with those of well-characterized “reference” compounds in the same dataset.
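As an illustration of the guilt-by-association idea, the following is a minimal sketch that assigns a query compound the annotation of its most similar reference profile. The cosine-similarity measure, the function name `predict_by_association`, the three-feature toy profiles, and the mechanism labels are all assumptions made for this example; they are not taken from the original study.

```python
import numpy as np

def predict_by_association(query, ref_profiles, ref_labels):
    """Return the label and cosine similarity of the reference profile closest to `query`."""
    # Normalize to unit length so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    r = ref_profiles / np.linalg.norm(ref_profiles, axis=1, keepdims=True)
    sims = r @ q
    best = int(np.argmax(sims))
    return ref_labels[best], float(sims[best])

# Hypothetical toy data: three reference compounds, three phenotypic features each.
ref_profiles = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [0.0, 1.0, 0.8],
])
ref_labels = ["microtubule", "microtubule", "DNA damage"]

# An uncharacterized compound profiled in the same dataset.
query = np.array([0.1, 0.9, 0.9])
label, score = predict_by_association(query, ref_profiles, ref_labels)
print(label, round(score, 3))
```

In practice a prediction would usually rest on several neighbors and a similarity threshold rather than a single nearest reference; the single-neighbor rule is used here only to keep the sketch short.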
HCS is now a standard platform for screening large compound libraries in academia and the pharmaceutical industry, and its widespread adoption has produced a rapidly growing number of datasets. Omics studies measure a consistent set of features (such as genes or proteins), so their datasets can be combined to gain synergy. HCS studies, by contrast, are highly customized in their experimental design, computational pipelines, and choice of reference drugs, and therefore produce heterogeneous phenotypic profiles that cannot be directly compared. A critical, long-standing challenge is how to integrate these diverse but currently isolated HCS datasets.
To address this, we developed CLIPn, a contrastive deep-learning approach for aligning heterogeneous HCS resources. This AI-powered framework is designed to enable cross-dataset “transitive” prediction, whereby the function of an uncharacterized compound screened in one dataset can be predicted through comparison with reference compounds profiled in other datasets. We applied CLIPn to 14 diverse HCS datasets generated using different experimental systems and computational pipelines over the past 20 years. By integrating these datasets, we predicted and experimentally validated functions for compounds that could not be characterized in the original, isolated HCS studies. Our work demonstrates, for the first time, that accurate “transitive” predictions can be made across diverse HCS profile resources.
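The abstract does not describe the architecture or loss function used by CLIPn. As a generic illustration of what contrastive alignment means, the sketch below computes a supervised-contrastive-style loss over latent embeddings. Profiles that share a reference category are treated as positives regardless of which dataset they came from, so the loss is low when same-category profiles from different datasets sit close together. The function name, the temperature value, and the toy embeddings are assumptions made for this example only.

```python
import numpy as np

def supervised_contrastive_loss(z, labels, temperature=0.5):
    """Mean negative log-probability of same-label pairs under a softmax over similarities."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    labels = np.asarray(labels)
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Positives: other samples with the same category label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    # Scaled cosine similarities, with self-similarity excluded from the softmax.
    sim = np.where(not_self, z @ z.T / temperature, -np.inf)
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos_counts = positives.sum(axis=1)
    valid = pos_counts > 0  # skip anchors that have no positive partner
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_counts[valid]).mean())

# Hypothetical latent embeddings: rows 0 and 2 from one dataset, rows 1 and 3 from another.
z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])

# Labels that match the geometry (well aligned) versus labels that do not.
aligned = supervised_contrastive_loss(z, [0, 0, 1, 1])
misaligned = supervised_contrastive_loss(z, [0, 1, 0, 1])
print(aligned < misaligned)
```

Once profiles from different datasets share such a latent space, a transitive prediction reduces to a nearest-reference lookup in that space, with the references drawn from any of the integrated datasets.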