Copy number aberration (CNA) is frequently observed in cancer genomes. Meta-analysis of genomic variations helps to disentangle the complex molecular mechanisms underlying tumorigenesis and to identify and characterize molecular subtypes. Over the years, cancer genomics research has produced a considerable amount of data that remains segregated by study. The Progenetix project (www.progenetix.org), initiated in 2001, aims to systematize published cancer genome profiles and provide accurate annotation to facilitate integrative analysis. Since the last update of the Progenetix database in 2013, genomics and cancer research have seen significant advances in molecular genetics technology, annotation and reference refinement, data standard harmonization, and increasingly structured and systematic data gathering. Continuous effort has been dedicated to data integration, curation and maintenance to provide the most comprehensive representation of cancer genome profiling data.
Since the last release, there has been significant expansion in sample number, cancer type diversity, publications, study cohorts and technology platforms; improvements in data quality and ontology representation; and structural changes to the database schema and to data access, including the web interface. In this presentation, we detail the 2020 update of Progenetix in six parts: (1) Introduction, (2) New metadata features with domain-specific mapping, (3) Data sources with sample expansion and processing pipeline, (4) Data standard and object model, (5) Adopted Beacon protocol and associated features, (6) New web interface and features.