Biocuration 2016, Geneva

Harvesting Cancer Genome Data

Michael Baudis


While the analysis of cancer genomes using high-throughput technologies has generated tens of thousands of oncogenomic profiles, meta-analyses of these datasets are greatly inhibited by limited data access, technical fragmentation and a multitude of raw data and annotation formats.

For the arrayMap cancer genome resource, our group collects, re-processes and annotates cancer and associated reference data from genomic array experiments, retrieved from public repositories (e.g. NCBI GEO, EBI ArrayExpress) as well as from publication supplements and through direct requests to the primary producers. So far, our resources provide pre-formatted data for more than 50,000 cancer genome profiling experiments, derived from 340 array platforms and representing more than 700 original publications.

Here, I will present some aspects of the shifting landscape of cancer genome data production and publication, an overview of the data accessible through our resources, and an outlook on developments, especially with regard to work performed in the context of the Global Alliance for Genomics and Health.