Cancer cell lines are important tools for understanding disease mechanisms as well as for the development and pre-testing of appropriate therapeutics. However, cancer cell lines are prone to contamination and misidentification, which devalue observations with respect to the assumed original cancer type. The ICLAC (International Cell Line Authentication Committee) has confirmed more than 500 problematic cell lines, many of which are commonly used in research, resulting in ~32,000 publications with questionable results according to a recently published estimate.
Large-scale surveys of identification errors for cancer cell lines have been mostly limited to text-based analysis. Importantly, such analyses cannot estimate the concordance between cell line data and corresponding primary tumors of a given disease. Here, the use of genomic profiling data for the review of published cell line experiments provides a novel, highly promising approach to this problem.
We propose a strategic collaboration between the team behind the Cellosaurus reference cell line resource at the University of Geneva (lead: Prof. Amos Bairoch), and the group "Theoretical Oncogenomics" (lead: Prof. Michael Baudis) at the University of Zurich. Synergies will arise from resources and data curation of the Geneva team, combined with cancer profiling data and expertise in genome data analysis of the Zurich group.
Recently, the Zurich group has developed a method to quantify similarities between cell lines as well as type-matched primary tumors. Initially, 3675 genomic profiling experiments of 1539 distinct cell lines were processed and data was mapped to corresponding Cellosaurus entries. Probe-level genome data was visualised and re-calibrated, using arrayMap data and software. Future refinements of the method will include linkage disequilibrium (LD) models for cell line identity mapping, to make identification and contamination status of a cell line unambiguous.
The primary aim of the proposed project will be to provide a high-quality tool for researchers to assess their cell lines, collaboratively developed by the groups at Zurich and Geneva. This tool will provide a two-step identification of the cell lines. The first step is genomic cell line fingerprinting based on genomic variation profiles and, where available, LD block analysis. In the second step, cell line variation data is compared to the vast arrayMap & Progenetix cancer datasets, providing a similarity score for the genome of interest compared to primary tumor profiles. Our resource will provide researchers in the areas of cancer genomics, physiology and pharmacology with an important tool for optimising resource use and guaranteeing correct interpretation of expensive and time-consuming experiments. Support of this Data Science proposal through the UNIGE - UZH collaborative framework would allow us to develop this proposal into a long-term strategy with appropriate funding support and increasing collaborative participation.
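The two-step identification described above could be illustrated roughly as follows. This is only a minimal sketch and not the method developed by the Zurich group: it assumes that each genome is reduced to a list of per-bin copy-number states (-1 loss, 0 neutral, 1 gain), uses simple bin concordance as a stand-in for fingerprinting, and uses a Pearson correlation against the averaged tumor profile as a stand-in for the similarity score. All function names, the reference labels "REF-A" and "REF-B", and the example values are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def concordance(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of genomic bins with an identical copy-number state."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and equally binned")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def best_match(query: Sequence[int],
               references: Dict[str, Sequence[int]]) -> Tuple[str, float]:
    """Step 1 (assumed form): find the most concordant reference cell line."""
    scored = ((name, concordance(query, prof)) for name, prof in references.items())
    return max(scored, key=lambda pair: pair[1])


def tumor_similarity(query: Sequence[int],
                     tumors: List[Sequence[int]]) -> float:
    """Step 2 (assumed form): correlate the query with the mean tumor profile."""
    mean_profile = [sum(column) / len(tumors) for column in zip(*tumors)]
    return pearson(query, mean_profile)


# Hypothetical toy data: six genomic bins per profile.
query = [1, 1, 0, -1, 0, 0]
references = {"REF-A": [1, 1, 0, -1, 0, 0], "REF-B": [0, 0, 1, 0, -1, -1]}
tumors = [[1, 1, 0, -1, 0, 0], [1, 0, 0, -1, 0, 0]]

match_name, match_score = best_match(query, references)
similarity = tumor_similarity(query, tumors)
```

In this toy setting a high concordance with a single reference would support the claimed identity, while a low correlation with the type-matched tumor profiles would flag a cell line whose genome no longer resembles the disease it is meant to model. Real data would require segment-level or probe-level values and suitable thresholds, which are not specified here.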
Prof. Dr. Amos Bairoch, University of Geneva
Prof. Dr. Michael Baudis, University of Zurich
Rahel Paloots, University of Zurich