Welcome to the baudisgroup Pages

The baudisgroup website represents projects and information by the Computational Oncogenomics Group of the University of Zurich (UZH) and the Swiss Institute of Bioinformatics (SIB). For visitors more interested in Particle Astrophysics, we strongly recommend the website of another, although related, Professor Baudis.

The Computational Oncogenomics Group's research focus lies in the exploration of structural genome variations in cancer. Our work centres around our Progenetix resource of curated molecular-cytogenetic and sequencing data. Specific projects explore computational methods, genomics of selected tumour entities and genomic variant patterns across malignancies. As members of the Global Alliance for Genomics and Health, the group is developing standards in biocuration and data sharing for genomic variants and phenotypic data, for instance in driving development of the ELIXIR Beacon project. Other research is related to genome data epistemology, e.g. geographic and diagnostic sampling biases in cancer studies.

Latest News & Publications

A systematic benchmark of copy number variation detection tools for high density SNP genotyping arrays

M.N. van Baardwijk, L.S.E.M. Heijnen, H. Zhao, M. Baudis and A.P. Stubbs

Genomics (Elsevier). 2024 Nov 14.

Abstract: Copy Number Variations (CNVs) are crucial in various diseases, especially cancer, but detecting them accurately from SNP genotyping arrays remains challenging. Therefore, this study benchmarked five CNV detection tools (PennCNV, QuantiSNP, iPattern, EnsembleCNV, and R-GADA) using SNP array and WGS data from 2002 individuals of the DRAGEN re-analysis of the 1000 Genomes project.

Copy number variation heterogeneity reveals biological inconsistency in hierarchical cancer classifications

Research Article

Ziying Yang, Paula Carrio-Cordo and Michael Baudis

Molecular Cytogenetics (Springer Nature). doi: 10.1186/s13039-024-00692-2

Abstract: Cancers are heterogeneous diseases with unifying features of abnormal and consuming cell growth, where the deregulation of normal cellular functions is initiated by the accumulation of genomic mutations in cells of - potentially - any organ. At diagnosis malignancies typically present with patterns of somatic genome variants on diverse levels of heterogeneity. Among the different types of genomic alterations, copy number variants (CNV) represent a distinct, near-ubiquitous class of structural variants. Cancer classifications are foundational for patient care and oncology research. Terminologies such as the National Cancer Institute Thesaurus provide large sets of hierarchical cancer classification vocabularies and promote data interoperability and ontology-driven computational analysis. To find out how categorical classifications correspond to genomic observations, we conducted a meta-analysis of inter-sample genomic heterogeneity for classification hierarchies on CNV profiles from 97,142 individual samples across 512 cancer entities, and evaluated recurring CNV signatures across diagnostic subsets. Our results highlight specific biological mechanisms across cancer entities with the potential for improvement of patient stratification and future enhancement of cancer classification systems and provide some indications for cooperative genomic events across distinct clinical entities.

Data Resources, Sharing, Discovery in Biomedical Genetics and Cancer Genomics

Seminar at National Cancer Center Hospital Tsukiji Campus

In this seminar you will learn about Progenetix, a large open resource in cancer genomics focussing on curation and sharing of somatic copy number variation (CNV) data, as well as other resources and research topics of the Theoretical Cytogenetics and Oncogenomics group at the University of Zurich and the Swiss Institute of Bioinformatics.

The second part of the presentation will make the case for why such resources cannot be sufficient and introduce the approaches shepherded by the Global Alliance for Genomics and Health (GA4GH).

Global Alliance for Genomics and Health - Promoting a New Paradigm for Data Discovery in Biomedical Genomics

Seminar at Tohoku University Sendai

The Global Alliance for Genomics and Health (GA4GH) has developed into a major international organization to address the challenges of data discovery in biomedical genomics. The Beacon standard, developed by GA4GH with technical support by the European bioinformatics organization ELIXIR and many international stakeholders, has become a default for implementing modern concepts of federated data discovery and access.

Data Discovery in Biomedical Genomics - Time for a New Paradigm

The XIVth MEDICAL GENETICS TÂRGU MUREȘ
SOCIETATEA ROMÂNĂ DE GENETICĂ MEDICALĂ

In biomedical genetics and genomics the last decades have led to the recognition that the vast amount and diversity of genomic variations require large-scale approaches to data analysis, both in rare disease genetics and in cancer genomics. While projects establishing high quality data from large cohorts are highly valuable and necessary to drive the field, the true power of genomics can only be leveraged through access to data from different studies and cohorts beyond geographic or regulatory boundaries.

Since its foundation in 2013 the Global Alliance for Genomics and Health (GA4GH) has emerged as the main international organization to address the challenges of data discovery, access and utilization in genomics and related fields. The Beacon standard, developed by GA4GH with technical support by the European bioinformatics organization ELIXIR and many international stakeholders, has become a default for implementing modern concepts of federated data discovery and access.

This presentation will provide an introduction to data federation in biomedical genomics with emphasis on the use of the Beacon protocol for genomic data discovery, together with a general introduction to the principles and mechanisms of genomics standards development by GA4GH.

Congratulations to Rahel!

Rahel Paloots today passed her PhD defence

Congratulations to Rahel Paloots for passing her defence for a PhD with the topic

Molecular Heterogeneity among Cancer Cell Lines and their Correspondence to Primary Neoplasias

Rahel has been a contributor to progenetix.org and the main developer behind cancercelllines.org (a resource for, well, cancer cell lines…) - but also to the GA4GH & ELIXIR Beacon project.

Opening Cancer Genomics - Deploying the GA4GH Beacon protocol

Zürich Cancer ORD Workshop

The "Moving ORD practices into cancer care" project explores and tests ORD principles in the context of cancer care, through a combination of engagement with relevant communities at the UZH as well as a technical demonstrator of a cutting-edge data discovery technology. The cancer community at UZH is being mobilized through a dedicated workshop and topical meetings, to discuss challenges and opportunities of ORD in the oncology domain. A pilot installation of an established ORD approach (Beacon technology by the Global Alliance for Genomics and Health) is being implemented to test the sharing of cancer related -omics and associated data, with a focus on the specific data access and security aspects arising in this setting.

The ELIXIR hCNV Community - Making complex genomics accessible

ELIXIR Webcast

Michael Baudis, Antonio Rausell & Krzysztof Poterlowicz

Abstract: Genomic copy number variants (CNV) are a major contributor to human genome variation and important factors in rare disease genetics and cancer genomics. However, the complexity of CNV detection technologies, the lack of standardised annotation formats and the fragmentation of cytogenetic and genomic communities have so far limited large scale utilization of CNV profiles in computational genomics.

cancercelllines.org listed in Expasy

Entry in the Swiss Institute of Bioinformatics Catalogue

Our recently launched cancer cell line genomics site cancercelllines.org is now listed as one of the resources in the Swiss Institute of Bioinformatics’ Expasy catalogue.

Genomic Data Mining and The Case for Open Data Standards

ZHAW ICLS colloquium

The last decades have seen an enormous amount of data generation in the area of biomedical genomics, driven by a feedback loop of technological developments and increasing recognition of human genomic complexities, in health and disease. While individual projects frequently are tailored towards addressing specific research questions, the approach to many problems, especially in rare diseases and cancer genomics, requires both curated data collections and access to data from international repositories and study centers.

Here I will present our oncogenomic data resources and research projects, and how our work on data discovery and exchange standards as part of an international community can improve data accessibility through the support of federated discovery and retrieval protocols.

GA4GH Connect April 2024 in Ascona

Spring 2024 GA4GH Connect working meeting co-organized by our group

We're proud to host the next Spring GA4GH Connect meeting in April 2024 at the Congressi Stefano Franscini on the Monte Verità in Ascona. This will provide an excellent opportunity for Swiss genomics and bioinformatics to, well, connect with the international "genomics and health" community and projects.

Genome Data - Opportunities and Risks for Medicine and Society

Lecture at the Seniorenuniversität Winterthur

In medical research, genome data are used for more precise diagnoses and for selecting personalized therapies. Genome data are also used outside medicine, for example in genealogical research or in forensics. This potential, however, is accompanied by risks.

When genomic data from millions of people are collected, who should protect the individual and society from misuse of such data? The lecture addresses questions regarding the need for such data in medicine and research, and the risks that may arise from it.

Why do we need genomic data from millions of individuals? How can my data pose a danger to me? And who should protect me as an individual from misuse?

cancercelllines.org - a Novel Resource for Genomic Variants in Cancer Cell Lines

DATABASE Article

Rahel Paloots and Michael Baudis

Database (Oxford). 2024 Apr 30:2024:baae030. doi: 10.1093/database/baae030
bioRxiv preprint (2023-12-13): https://doi.org/10.1101/2023.12.12.571281

Abstract: Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource - cancercelllines.org - with the aim to enable easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variants (SNVs) data. We have gathered over 5,600 copy number profiles as well as SNV annotations for 16,000 cell lines and provide this data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 API and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme.

Availability and Implementation: Our resource is publicly available on the web at cancercelllines.org.

Federated genomic discoveries: Deploying the GA4GH Beacon protocol

Virtual Seminar
GHGA Lecture Series

With the ever-increasing amount of genomic data produced in the context of research studies, population analyses and medical diagnostics, the need for access to genomic information beyond administrative or geographic boundaries has become a matter of eminent importance.

Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines

Literature-derived annotations as entry point for data exploration

Ellery Smith, Rahel Paloots, Dimitris Giagkos, Michael Baudis and Kurt Stockinger

Bioinformatics Advances, vbae045, doi.org/10.1093/bioadv/vbae045
Previous arXiv preprint (2023-07-03): https://doi.org/10.48550/arXiv.2307.00933

Motivation: With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume (Lubowitz et al., 2021). As a consequence, in the fields of biological, medical and clinical research, domain experts have to sift through massive amounts of scientific text to find relevant information. However, this process is extremely tedious and slow to be performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction. Results: In this work, we present the design, implementation and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data in the domain of cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variants plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidences, allowing for deep, yet rapid, literature search, using existing structured data as a springboard.

Availability and Implementation: Our system is publicly available on the web at cancercelllines.org.

Contact: The authors can be contacted at ellery.smith@zhaw.ch or rahel.paloots@uzh.ch.

Twelve quick tips for deploying a Beacon

Some hints for Beacon developers & implementers

Lauren A Fromont, Mauricio Moldes, Michael Baudis, Anthony J Brookes, Arcadi Navarro and Jordi Rambla

PLoS Comput Biol. 2024 Mar 1;20(3):e1011817.

Introduction: In the age of data-driven biomedical research and clinical practice, the sharing of genomic and clinical data for health research and personalized medicine has become an important contributor to improved diagnosis and treatment. From the data owner’s perspective, potential benefits include improved treatments, personalization of healthcare practice, and more effective control of disease proliferation. However, the requirement for high levels of data security to protect sensitive information presents a barrier to data discovery and sharing.

Beacon is designed to enable the benefits of data discovery while minimizing the associated risks...

Genomic Data Sharing Standard Development with GA4GH and ELIXIR
Opportunities and Pitfalls in Federated Data Discovery

DMLS Lecture Series
University of Zurich Department of Molecular Life Sciences

In this presentation Michael talks about the role of GA4GH (and ELIXIR) in the development of standards and practices for genomic data exchange, some general principles, how his group got involved in these efforts - but also some pitfalls ...

Structural Genome Variations in Cancer and the Case for Open Data Standards

Cancer Genomics Seminar at Utrecht
Hubrecht Institute and Princess Maxima Center for Pediatric Oncology

The presentation includes notes about work on improving the representation of genomic copy number variations (CNV), GA4GH and its Beacon protocol, as well as challenges for genomic data privacy.

labelSeg: segment annotation for tumor copy number alteration profiles

A tool to assign relative SCNA levels to segments

Hangjia Zhao and Michael Baudis

Briefings in Bioinformatics (Oxford). 2024 Jan 31;2024:bbad541.

Abstract: Somatic copy number alterations (SCNAs) are a predominant type of oncogenomic alterations that affect a large proportion of the genome in the majority of cancer samples. Current technologies allow high-throughput measurement of such copy number aberrations, generating results consisting of frequently large sets of SCNA segments. However, the automated annotation and integration of such data are particularly challenging because the measured signals reflect biased, relative copy number ratios. In this study, we introduce labelSeg, an algorithm designed for rapid and accurate annotation of CNA segments, with the aim of enhancing the interpretation of tumor SCNA profiles.

Beaconize this: Databases for Cancer Genomics and the Development of Open Data Standards

Seminar at the Bioinformatics club of the Centre de Recherche des Cordeliers (CRC)
Université Paris Cité

In this seminar at the Centre de Recherche des Cordeliers in Paris Michael presents the work of the group, with special emphasis on the role of the Progenetix oncogenomics resources and tools in the development, implementation and testing of the Beacon standard of the Global Alliance for Genomics and Health (GA4GH).

pgxRpi Accepted by Bioconductor

The R wrapper for the Progenetix API, pgxRpi, is now part of the 'devel' branch and will be included in the upcoming 3.19 release in mid-April 2024.

CNV Project at biohackathon23

Participating at #BioHackEU23 in Barcelona with a CNV reference resource project

With other members of the hCNV community some of us will participate at this year's Biohackathon Europe event. The main project will address the creation of the template for a "beaconized" public resource for reference (i.e. not disease associated) copy number variation data, including the necessary tooling for the import from e.g. VCF or BED file variants into Beacon backends (such as our bycon environment).

Swiss-Korean Life Science Symposium

The 10th Swiss-Korean Life Science Symposium in Seoul

As representative of the Swiss delegation and particularly of the University of Zurich (UZH), Michael will be a presenter and panel discussion participant at the 10th Swiss-Korean Life Science Symposium in Seoul, together with members of the Swiss and Korean life sciences and personalized health academic and industrial communities.

Progenetix as SIB and ELIXIR Resource

Recognizing the Progenetix platform as Swiss contribution to the European bioinformatics resources ecosystem

The Progenetix resource has finally been recognized as an official contribution to the ELIXIR European bioinformatics ecosystem. Besides Expasy, Progenetix is now linked through ELIXIR's resource page. Or just go directly to progenetix.org (and its daughter project cancercelllines.org).

Short tandem repeat mutations regulate gene expression in colorectal cancer

Exploring STR patterns and their relation to expression changes in cancer

Max A Verbiest, Oxana Lundström, Feifei Xia, Michael Baudis, Tugce Bilgin Sonay, Maria Anisimova

bioRxiv preprint, doi: https://doi.org/10.1101/2023.11.29.569189

Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression levels to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. The increased mutability of eSTRs in MSI tumours may be an early indication that eSTR mutations can confer a selective advantage to tumours. Future extensions of our findings into larger cohorts could uncover new STR-based targets in the treatment of cancer.
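
The idea of a linear model relating eSTR length to gene expression can be sketched with an ordinary least-squares fit on made-up numbers. This is an illustration only, not the authors' model or data; the function names `fit_line` and `predict_change` and all values below are hypothetical.

```python
# Illustrative sketch: simple linear regression of expression on STR length.
# All names and numbers are hypothetical and not taken from the study.

def fit_line(lengths, expression):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_change(slope, delta_length):
    """Predicted expression change for a change in repeat length (in repeat units)."""
    return slope * delta_length


# Toy data: STR length (repeat units) and expression level (arbitrary units).
str_lengths = [10, 11, 12, 13, 14]
expr_levels = [2.0, 2.5, 3.0, 3.5, 4.0]

slope, intercept = fit_line(str_lengths, expr_levels)
# Predicted expression change if the STR contracts by two repeat units.
change = predict_change(slope, -2)
print(slope, intercept, change)
```

Under such a model, a contraction or expansion of the repeat translates directly into a predicted shift in expression, scaled by the fitted slope.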

ELIXIR All Hands Dublin

Baudisgroup presentations at the AHM 2023 in Dublin

Rahel, Hangjia & Michael for the group

At the ELIXIR All Hands Meeting 2023 in Dublin our group presented several posters about our resources and work in standards development.

Phenopacket-tools: Building and validating GA4GH Phenopackets

Bioinformatics tools and examples for working with the Phenopackets standard
Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. (2023) Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One. 18:e0285433.

Abstract: The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers.

Theoretical Cytogenetics and Oncogenomics

DMLS Tandem Talks

Michael Baudis

In this short presentation Michael provides an overview of the group's work in cancer genomics resources, data analysis and standard development, including the involvement in the Global Alliance for Genomics and Health GA4GH.

  • Cancer Genome Profiles
  • Oncogenomic Data Resources
  • Bioinformatics Methods
  • Data Exchange Standards for Genomics and Personalized Health

ZHAW Visitors for Cancercelllines Hackathon

Implementing INODE-driven literature collections

For some months our group has worked with members of Kurt Stockinger's team from the ZHAW on a cancer cell lines use case for the INODE project. In the last 2 days we had a site visit for a first implementation of the use-case-specific system on cancercelllines.org. More information to follow - and thanks to Ellery & Dimitris for the great work!

Candidate targets of copy number deletion events across 17 cancer types

Identifying cancer related genes against the background of somatic CNV events

Huang Q and Baudis M

doi: 10.3389/fgene.2022.1017657
Previous bioRxiv preprint (2022-06-29): doi.org/10.1101/2022.06.29.498080

Abstract: Genome variation is the direct cause of cancer and driver of its clonal evolution. While the impact of many point mutations can be evaluated through their modification of individual genomic elements, even a single copy number aberration (CNA) may encompass hundreds of genes and therefore pose challenges to untangle potentially complex functional effects. However, consistent, recurring and disease-specific patterns in the genome-wide CNA landscape imply that particular CNA may promote cancer-type-specific characteristics. Discerning essential cancer-promoting alterations from the inherent co-dependency in CNA would improve the understanding of mechanisms of CNA and provide new insights into cancer biology and potential therapeutic targets.

Genomic Resource Built with GA4GH Standards

EORTC PAMM Firenze

Michael Baudis

This brief presentation introduces the Progenetix resource, the Global Alliance for Genomics and Health as a developer of standards for data sharing in biomedical genomics, as well as the use of Progenetix in GA4GH standards development.

Beacon v2 - Onboarding Strategies & Feature Examples

Beacon Sessions at GA4GH Connect

Michael Baudis

The Beacon Sessions at GA4GH Connect November 2022 targeted the migration of existing and implementation of new v2 Beacons, with emphasis on the "how to get there easily" rather than on all Beacon v2 features.

Genomics Data Federation through Global Alliance for Genomics and Health Standards: Development and Implementation of the GA4GH Beacon Protocol

Seminar Yonsei University Medical School Seoul

Michael Baudis

In this Seoul meeting presentation Michael introduces the Global Alliance for Genomics and Health and its involvement in genomics standards development, followed by a discussion of the Beacon protocol and the role of the Progenetix resource in its development.

Beacon v2 - Feature-rich Implementation of the Genomic Data Discovery Protocol

GA4GH 2022 Plenary Barcelona

Michael Baudis

The “Beacon” protocol - developed with support from ELIXIR, the European bioinformatics infrastructure organization, as a standard of the Global Alliance for Genomics and Health (GA4GH) - represents an emerging standard for an “Internet for Genomics”. While the initial version of the protocol served as a widely adopted test bed for the sharing of genomic variants over federated query systems connecting hundreds of internationally distributed resources, version 2 of the protocol provides a framework for extended, metadata-rich query and response options in both public and restricted federated access scenarios.

GA4GH Phenopackets: A Practical Introduction

Phenopackets v2 introduction with practical examples

Ladewig MS, Jacobsen JO, Wagner AH, Danis D, Kassaby BE, Gargano M, Groza T, Baudis M, Steinhaus R, Seelow D, Bechrakis NE, Mungall CJ, Schofield PN, Elemento O, Smith L, McMurry JA, Munoz-Torres M, Haendel MA and Robinson PN

Abstract: The Global Alliance for Genomics and Health (GA4GH) is developing a suite of coordinated standards for genomics for healthcare. The Phenopacket is a new GA4GH standard for sharing disease and phenotype information that characterizes an individual person, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments. A detailed example is presented that illustrates how to use the schema to represent the clinical course of a patient with retinoblastoma, including demographic information, the clinical diagnosis, phenotypic features and clinical measurements, an examination of the extirpated tumor, therapies, and the results of genomic analysis. The Phenopacket Schema, together with other GA4GH data and technical standards, will enable data exchange and provide a foundation for the computational analysis of disease and phenotype information to improve our ability to diagnose and conduct research on all types of disorders, including cancer and rare diseases.

The Phenopacket software is available at github.com/phenopackets/.

Beacon - Ethical & Legal Aspects of a Genomic Data Discovery Protocol

DSI Ethics Project Pitch

Michael Baudis

Here Michael provides a very brief presentation about the GA4GH Beacon protocol, especially as a target for projects discussing the ethical implications of genome data discovery & sharing as well as the relevant legal frameworks, with emphasis on the Swiss context.

A cancer genomics resource built on GA4GH standards

Rahel Paloots, Michael Baudis

CGC St Louis 2022

Progenetix is a cancer genomics resource that includes genomic profiling data as well as biomedical annotations and provenance data for cancer studies. The main goal of the Progenetix database is to provide easy, open access for research studies and clinical diagnostics. To facilitate sharing of genomic data, Progenetix complies with and contributes to GA4GH and Beacon data standards. Beacon, developed with the support from ELIXIR (the European bioinformatics infrastructure organization), started out as a protocol to share genomic variants over federated queries.

A cancer genomics reference resource powered by GA4GH standards

Roche Data Science Seminar

Michael Baudis

The presentation reports about the Progenetix cancer genomics resource and its role in the GA4GH ecosystem & the Beacon genomics API development process.

The GA4GH Phenopacket schema defines a computable representation of clinical data

Phenopackets v2 publication

Cell Genomics

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Buske OJ, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Essenwanger A, Freimuth RR, ... , Haendel MA, Robinson PN, The GAGHPMC.

Abstract: Despite great strides made in the development and wide acceptance of standards for exchanging structured information about genomic variants, progress in standards for computational phenotype analysis for translational genomics has lagged behind. Phenotypic features (signs, symptoms, laboratory and imaging findings, results of physiological tests, etc.) are of high clinical importance, yet exchanging them in conjunction with genomic variation information is often overlooked or even neglected.

Implementation of the GA4GH Beacon protocol for discovery and sharing of genomic copy number variation data

ESHG Vienna 2022

Michael Baudis

Background & Objectives: Genomic copy number variations (CNV) are a major contributor to inter-individual genomic variation, can be causative events in rare diseases, and in particular represent the majority of the mutational landscape in most malignancies. While specific CNV events and some recurring patterns have contributed to the identification of individual cancer drivers and the recognition of cancer subtypes, the complexity of genomic CNV patterns requires large amounts of well-defined genomic profiles for statistically meaningful analyses. At the other end of the spectrum, in the area of rare disease genomics the potential pathogenicity of individual CNV events requires validation against a vast set of disease-related and reference genomic profiles and annotations.

Progenetix & BeaconPlus - An open cancer genomics resource on a stack of Beacon code...

ELIXIR All Hands Amsterdam 2022

Michael Baudis

Here Michael provides some overview of the multi-year trajectory of the Beacon API development, and how BeaconPlus & Progenetix have been utilized for "implementation driven design".

Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond

Beacon v2 publication

Rambla J, Baudis M, Ariosa R, Beck T, Fromont LA, Navarro A, Paloots R, Rueda M, Saunders G, Singh B, Spalding JD.

Human Mutation. 2022 Mar 17. PMID:35297548

Abstract: Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data, through providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers.
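
The yes/no query model described for version 1 can be illustrated with a minimal Python sketch. It is not the Beacon API or any GA4GH reference implementation: the `Variant` record, the `TOY_DATASET` contents (the coordinates are made up), and the `beacon_exists` function are hypothetical names chosen for illustration only.

```python
# Illustrative sketch of a "does this variant exist?" lookup.
# All names and coordinates are hypothetical; a real Beacon is a web service
# with its own request and response schemas.
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str  # reference sequence name, e.g. "17"
    pos: int    # position on the reference
    ref: str    # reference base(s)
    alt: str    # alternate base(s)


# A toy in-memory "dataset" standing in for a provider's variant collection.
TOY_DATASET = {
    Variant("17", 7674220, "C", "T"),
    Variant("9", 21971000, "G", "A"),
}


def beacon_exists(dataset, chrom, pos, ref, alt):
    """Answer only True ("yes") or False ("no"); no record details are returned."""
    return Variant(chrom, pos, ref, alt) in dataset


print(beacon_exists(TOY_DATASET, "17", 7674220, "C", "T"))
print(beacon_exists(TOY_DATASET, "1", 1, "A", "G"))
```

The point of the sketch is that the caller learns only whether a matching variant is present, never the underlying records.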

Continue reading
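The "yes" or "no" query model of Beacon version 1 described in the abstract above can be illustrated with a small sketch. This is not the Beacon API or its schema: the `Variant` record, the `ToyBeacon` class and the coordinates are hypothetical and chosen only to show the idea that a provider discloses the presence of a variant without returning the underlying records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record; not the Beacon schema."""
    reference_name: str   # chromosome label, e.g. "17"
    start: int            # position (arbitrary illustrative values below)
    reference_bases: str
    alternate_bases: str


class ToyBeacon:
    """In-memory stand-in for a Beacon provider hosting one dataset."""

    def __init__(self, variants):
        self._variants = frozenset(variants)

    def exists(self, query: Variant) -> bool:
        # Only presence or absence is disclosed, never the matching records.
        return query in self._variants


beacon = ToyBeacon([
    Variant("17", 7577120, "G", "A"),
    Variant("9", 21971000, "C", "T"),
])

hit = beacon.exists(Variant("17", 7577120, "G", "A"))
miss = beacon.exists(Variant("17", 7577120, "G", "T"))
print(hit, miss)
```

A real Beacon answers such queries over HTTP against much larger collections; the sketch only captures the boolean contract.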

hCNV Implementation Studies Old and New

ELIXIR Human Data Communities

Michael Baudis

This presentation provided an overview of the hCNV community, implementation studies and ongoing work, e.g. interaction with the GA4GH VRS standard group and Beacon development.

Continue reading

Technical, legal and ethics aspects of genomic data sharing

Genomes | Privacy | Laws | Society - DSI Ethics Seminar

Michael Baudis

The presentation introduces the need for sharing and federated discovery of genome data in the contexts of personalized health and genomic research, and some of the current developments in international standards and practices in the area. Continue reading

The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine

Phenopackets v2 preprint

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Freimuth RR, ..., Haendel MA, Robinson PN.

medRxiv, 2021.11.27.21266944. doi:10.1101/2021.11.27.21266944

Abstract Despite great strides in the development and wide acceptance of standards for exchanging structured information about genomic variants, there is no corresponding standard for exchanging phenotypic data, and this has impeded the sharing of phenotypic information for computational analysis. Here, we introduce the Global Alliance for Genomics and Health (GA4GH) Phenopacket schema, which supports exchange of computable longitudinal case-level phenotypic information for diagnosis and research of all types of disease including Mendelian and complex genetic diseases, cancer, and infectious diseases. Continue reading

The GA4GH Variation Representation Specification (VRS): a Computational Framework for the Precise Representation and Federated Identification of Molecular Variation.

Alex H. Wagner, Lawrence Babb, Gil Alterovitz, Michael Baudis, Matthew Brush, Daniel L. Cameron, Melissa Cline, Malachi Griffith, Obi L. Griffith, ..., Melissa Konopko, Heidi L. Rehm, Andrew D. Yates, Robert R. Freimuth, Reece K. Hart

Wagner, Alex H. et al. Cell Genomics, Volume 1, Issue 2, 100027 doi:10.1016/j.xgen.2021.100027
bioRxiv. 2021.01.15.426843 (2021-01-15)

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced “verse”), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. Continue reading
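The idea of "globally consistent and unique computed identifiers" mentioned in the abstract above can be sketched as a digest over a canonical serialization: two parties who describe the same variation arrive at the same identifier without consulting a central registry. The following is a minimal sketch under stated assumptions. The field names, the `toy` prefix and the JSON canonicalization are hypothetical simplifications and do not reproduce the serialization rules of the VRS specification; only the general approach (a truncated SHA-512 digest, base64url-encoded) follows the spirit of GA4GH computed identifiers.

```python
import base64
import hashlib
import json


def truncated_digest(blob: bytes) -> str:
    """SHA-512, truncated to 24 bytes, base64url-encoded (32 characters)."""
    digest = hashlib.sha512(blob).digest()[:24]
    return base64.urlsafe_b64encode(digest).decode("ascii")


def computed_id(obj: dict, prefix: str = "toy") -> str:
    # Canonical form: sorted keys and no insignificant whitespace, so that
    # key order in the input does not change the identifier.
    # This is a simplification, not the VRS serialization.
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return f"{prefix}:{truncated_digest(blob)}"


# Hypothetical variation records with illustrative field names.
a = computed_id({"sequence": "chr17", "start": 100, "end": 101, "alt": "T"})
b = computed_id({"alt": "T", "end": 101, "start": 100, "sequence": "chr17"})
c = computed_id({"sequence": "chr17", "start": 100, "end": 101, "alt": "G"})

print(a == b)  # same content, different key order
print(a == c)  # different alternate allele
```

Because the identifier depends only on content, independently operated resources can compare identifiers directly, which is what makes federated identification possible.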

International federation of genomic medicine databases using GA4GH standards

Adrian Thorogood, Heidi L. Rehm, Peter Goodhand, Angela J.H. Page, Yann Joly, Michael Baudis, Jordi Rambla, Arcadi Navarro, Tommi H. Nyronen, Mikael Linden, Edward S. Dove, Marc Fiume, Michael Brudno, Melissa S. Cline, Ewan Birney

Thorogood, Adrian et al. Cell Genomics, Volume 1, Issue 2, 100032 doi:10.1016/j.xgen.2021.100032

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.

Continue reading

GA4GH: International policies and standards for data sharing across genomic research and healthcare

Heidi L. Rehm, Angela J.H. Page, Lindsay Smith, Jeremy B. Adams, Gil Alterovitz, Lawrence J. Babb, Maxmillian P. Barkley, Michael Baudis, Michael J.S. Beauvais, Tim Beck, Jacques S. Beckmann, Sergi Beltran, David Bernick, Alexander Bernier, James K. Bonfield, Tiffany F. Boughtwood, Guillaume Bourque, Sarion R. Bowers, Anthony J. Brookes, Michael Brudno, Matthew H. Brush, David Bujold, Tony Burdett, Orion J. Buske, Moran N. Cabili, Daniel L. Cameron, Robert J. Carroll, Esmeralda Casas-Silva, Debyani Chakravarty, Bimal P. Chaudhari, Shu Hui Chen, J. Michael Cherry, Justina Chung, Melissa Cline, Hayley L. Clissold, Robert M. Cook-Deegan, Mélanie Courtot, ..., Peter Goodhand, Kathryn North, Ewan Birney

Rehm, Heidi L. et al. Cell Genomics, Volume 1, Issue 2, 100029 doi:10.1016/j.xgen.2021.100029

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. Continue reading

A cancer genomics resource built around and driving GA4GH standards

GRIC sponsored workshop with the Swiss Institute of Bioinformatics

Michael Baudis

The Progenetix oncogenomics resource provides sample-specific cancer genome profiling data and biomedical annotations as well as provenance data for cancer studies. With more than 100k genomic copy number variation (CNV) profiles from over 700 cancer types, Progenetix empowers comparative analyses beyond individual studies and diagnostic concepts.

Continue reading

A Standardized Format for Federated Genomic Data Exchange

The GA4GH Beacon Protocol Presented at BC2 Basel 2021
Session "Federating computational analyses with GA4GH standards"

Michael Baudis

During the "Federating computational analyses with GA4GH standards" workshop at BC2 2021, Michael presented the history and current status of the Beacon project, as well as its integration with specific data resources and analysis initiatives.

Continue reading

Cancer genomics reference resource and toolkit around GA4GH standards

ESHG 2021

Q. Huang, B. Gao, R. Paloots, P. Carrio-Cordo, Z. Yang, M. Baudis

This poster presentation at the European Society of Human Genetics meeting 2021 discusses the integration and development of GA4GH standards by the Progenetix oncogenomics resource.

Continue reading

Progenetix - An open reference resource for copy number variation data in cancer

Qingyao Huang

Continue reading

The Progenetix oncogenomic resource in 2021

Article describing the current content & technical status of progenetix.org

Qingyao Huang, Paula Carrio Cordo, Bo Gao, Rahel Paloots, Michael Baudis

Database (Oxford). 2021 Jul 17;2021:baab043.

Abstract In cancer, copy number aberrations (CNAs) represent a type of nearly ubiquitous and frequently extensive structural genome variations. To disentangle the molecular mechanisms underlying tumorigenesis as well as identify and characterize molecular subtypes, the comparative and meta-analysis of large genomic variant collections can be of immense importance. Over the last decades, cancer genomic profiling projects have resulted in a large amount of somatic genome variation profiles, however segregated in a multitude of individual studies and datasets. The Progenetix project, initiated in 2001, curates individual cancer CNA profiles and associated metadata from published oncogenomic studies and data repositories with the aim to empower integrative analyses spanning all different cancer biologies. Continue reading

hCNV Community and Implementation Studies

Michael Baudis

ELIXIR All Hands 2021 Human Data Day

At the Human Data Day Michael presents a very brief overview of the ending and upcoming ELIXIR hCNV implementation studies.

Continue reading

hCNV data and the Progenetix Beacon

Presentation at ELIXIR All Hands 2021

Michael Baudis

ELIXIR All Hands 2021

This presentation gives a brief overview of the use of the Progenetix resource to test and implement a genomics reference resource using the emerging Beacon v2 protocol.

Continue reading

Signatures of Discriminative CNA in 31 Cancer Subtypes

Bo Gao and Michael Baudis (2021)

Published at Frontiers in Genetics, 2021-05-13

Abstract

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles pose great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have been focused on the association of CNA to pre-selected "driver" genes with limited application to rare drivers and other genomic elements.

Continue reading

Progenetix, Beacon and GA4GH at RDA

Research Data Alliance - RDA Virtual Plenary 17

Concepts | Status | History | Outlook

Michael Baudis

Research Data Alliance - RDA Virtual Plenary 17

This seminar gives an overview of the current state of the Progenetix Beacon project and the overall connection to the Global Alliance for Genomics and Health (GA4GH).

Continue reading

Implementing GA4GH Standards to Drive an Open Oncogenomics Resource

Research Seminar Kinderspital Zürich - Neuroonkologie

Michael Baudis

Seminar Neurooncology Childrens Hospital Zürich

This seminar gives an overview of the history & current state of the Progenetix resource, its role in Beacon API development and the overall connection to the Global Alliance for Genomics and Health (GA4GH).

Continue reading

Discovering copy number variation across multiple cancer types

Qingyao Huang

Abstract

Genomic variations are a direct cause of tumor formation and an accomplice in its continuous evolution. While point mutations can be pinpointed to a targeted genetic element, copy number variations (CNVs) involve copy number gain or loss of a large DNA segment which often covers hundreds of genetic elements in one event. Continue reading

EACR conference - The Progenetix Oncogenomic Resource

Continue reading

GA4GH Connect - Beacon v2 and SchemaBlocks

GA4GH Connect 2020

Michael Baudis

Beacon v2 Structural Variants [slides]
SchemaBlocks {S}[B] [slides]
Continue reading

Copy number variant heterogeneity among cancer types reflects inconsistent concordance with diagnostic classifications

Paula Carrio Cordo and Michael Baudis

bioRxiv. doi: 10.1101/2021.03.01.433348

This article explores the correlation between subsets of cancer entities, grouped by their somatic CNV patterns, and levels of diagnostic classification systems.

Continue reading

Genomic data and Privacy

Michael Baudis

ETHZ Lecture

The understanding of the impact of individual inherited and somatic genome variants on phenotypes and diseases requires a thorough understanding of the occurrence of such variants amongst populations in general and carriers of the phenotypes and diseases in particular. This information can only be provided through the inclusion of data from a multitude of genome resources in variant evaluation efforts, including such from outside (international) jurisdictions. However, opening such resources carries the inherent risk of breaching privacy, particularly through re-identification of individuals or their relatives and potentially through the exposure of individual genome-related personal information including phenotypic and "performance" prediction and relative disease risk.

Continue reading

Beacon v2 – Towards flexible use and clinical applications for a reference genomic data protocol

SPHN Webinar

Michael Baudis

Genomic “Beacons” provide discovery services for genomic data using the Beacon API developed as a key driver project of the Global Alliance for Genomics and Health (GA4GH). The Beacon protocol itself defines an open standard for genomics data discovery and provides a framework for web services responding to queries against genomic data collections, for instance from population based or disease specific genome repositories. Continue reading

Update of Progenetix Oncogenomics Resource

Research Progress Report, DMLS, University of Zurich

Qingyao Huang

Copy number aberration (CNA) is frequently observed in cancer genomes. Meta-analysis of genomic variations helps to disentangle the multiplex molecular mechanism underlying tumorigenesis as well as identify and characterize molecular subtypes. Over the years, cancer genomic research has resulted in a considerable amount of data segregated by studies. The Progenetix project (www.progenetix.org), initiated in 2001, aims to systematize the published cancer genomic profiles and provide accurate annotation to facilitate integrative analysis. Continue reading

Welcome to Ziying

Today Ziying Yang arrived as a new member of the baudisgroup.

Welcome Ziying!

Continue reading

GA4GH Beacon v2 at GA4GH Plenary

GA4GH Beacon v2 - Evolving Reference Standard for Genomic Data Exchange

GA4GH 8th Plenary

Gary Saunders, Jordi Rambla de Argila, Anthony Brookes, Juha Törnroos and Michael Baudis

For the ELIXIR Beacon project, GA4GH Discovery work stream and the international network of Beacon API developers

The Beacon driver project was one of the earliest initiatives of the Global Alliance for Genomics and Health, with the Beacon v1.0 API as the first approved GA4GH standard. Version 2 of the protocol is slated to provide fundamental changes, towards an Internet of Genomics foundational standard:

* requests beyond genomic variants ("filters")
* payload responses, secured through open AAI
* aligning w/ GA4GH standards (Phenopackets, VRS, DUO...) through SchemaBlocks {S}[B]
* working with international partners on deployment of advanced implementations

Continue reading

Progenetix - A cancer genomics reference resource around GA4GH standards

GA4GH 8th Plenary

Michael Baudis

The Progenetix oncogenomics resource provides sample-specific cancer genome profiling data and biomedical annotations as well as provenance data from cancer studies. Especially through currently 113322 curated genomic copy number variation (CNV) profiles from 1600 individual studies representing over 500 cancer types (NCIt), Progenetix empowers aggregate and comparative analyses which vastly exceed individual studies or single diagnostic concepts. Continue reading

Cancer Data - ELIXIR::GA4GH: Advancing genomics resources through standards and ontologies

ECCB2020

Michael Baudis

Continue reading

The Ubiquitin Ligase TRIP12 Limits PARP1 Trapping and Constrains PARP Inhibitor Efficiency

Marco Gatti, Ralph Imhof, Qingyao Huang, Michael Baudis, Matthias Altmeyer

Cell Rep. 2020 Aug 4 DOI: 10.1016/j.celrep.2020.107985

Abstract PARP inhibitors (PARPi) cause synthetic lethality in BRCA-deficient tumors. Whether specific vulnerabilities to PARPi exist beyond BRCA mutations and related defects in homology-directed repair (HDR) is not well understood. Here, we identify the ubiquitin E3 ligase TRIP12 as negative regulator of PARPi sensitivity. Continue reading

Beacon v2 - Towards Flexible Use and Clinical Applications for a Reference Genomic Data Sharing Protocol

Personalized Health Technologies 2020

Michael Baudis

Beacons provide discovery services for genomic data using the Beacon API developed under the leadership of ELIXIR, as a key driver project of the Global Alliance for Genomics and Health (GA4GH). The Beacon protocol itself defines an open standard for genomics data discovery. It provides a framework for public web services responding to queries against genomic data collections, for instance from population based or disease specific genome repositories. Sites offering beacons can scale through aggregation in "Beacon Networks", which distribute single genome queries among a potentially large number of international beacons and assemble their responses. Continue reading
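The aggregation pattern described above, in which a "Beacon Network" distributes a single query among many beacons and assembles their responses, can be sketched as follows. This is an illustration under stated assumptions, not the Beacon Network implementation: the beacons are plain in-process callables, the query is a hypothetical string key, and `make_beacon`, `query_network` and `broken_beacon` are names invented for this example.

```python
from typing import Callable, Dict, Optional

# A "beacon" here is any callable answering True/False for a query string.
BeaconFn = Callable[[str], bool]


def make_beacon(known_variants) -> BeaconFn:
    """Build a toy beacon that knows a fixed set of variant keys."""
    known = set(known_variants)
    return lambda query: query in known


def broken_beacon(query: str) -> bool:
    """Simulates a beacon that cannot be reached."""
    raise ConnectionError("beacon unreachable")


def query_network(beacons: Dict[str, BeaconFn], query: str) -> dict:
    """Send one query to every beacon and assemble the responses."""
    responses: Dict[str, Optional[bool]] = {}
    for name, beacon in beacons.items():
        try:
            responses[name] = beacon(query)
        except Exception:
            # An unreachable beacon should not fail the whole network query.
            responses[name] = None
    return {
        "query": query,
        "exists": any(r is True for r in responses.values()),
        "responses": responses,
    }


network = {
    "beacon_a": make_beacon({"17:7577120:G>A"}),
    "beacon_b": make_beacon({"9:21971000:C>T"}),
    "beacon_c": broken_beacon,
}

result = query_network(network, "17:7577120:G>A")
print(result)
```

Recording a failed beacon as `None` rather than `False` keeps "no answer" distinguishable from "variant not present", which matters when responses from many independent sites are combined. A production network would issue the requests concurrently over HTTP.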

ELIXIR All Hands - Beacon Evolution

ELIXIR All Hands 2020 - Beacon Workshop

Michael Baudis

This presentation covers some of Beacon's origins, features and directions.

Continue reading

Oncology Informatics: Status Quo and Outlook - Review

Paul Martin Putora, Michael Baudis, Beth M. Beadle, Issam El Naqa, Frank A. Giordano and Nils H. Nicolay

Oncology, 2020-05-14. DOI 10.1159/000507586 (Review)

Abstract Oncology has undergone rapid progress, with emerging developments in areas including cancer stem cells, molecularly targeted therapies, genomic analyses, and individually tailored immunotherapy. These advances have expanded the tools available in the fight against cancer. Some of these have seen broad media coverage resulting in justified public attention. However, these achievements have only been possible due to rapid developments in the expanding field of biomedical informatics and information technology (IT). Continue reading

Swissnex SF: Laura & Michael Baudis - Life & Family

Originally planned for their stays at UCB and LBNL, Laura & Michael were interviewed by Tabea Stoeckel from swissnex San Francisco about their stay in the Bay Area and their research & family life as internationally active scientists.

Continue reading

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Bo Gao and Michael Baudis (2020)

bioRxiv, 2019-07-31. DOI 10.1101/720854
Genomics, Volume 112, Issue 5, September 2020, Pages 3331-3341, accepted 2020-05-06 doi.org/10.1016/j.ygeno.2020.05.008.
Background

Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Continue reading

SchemaBlocks and GA4GH TASC

A brief presentation about SchemaBlocks concepts and its possible integration into the new GA4GH TASC effort.

Continue reading

A harmonized meta-knowledgebase of clinical interpretations of somatic genomic variants in cancer

Alex H. Wagner, Brian Walsh, Georgia Mayfield, David Tamborero, Dmitriy Sonkin, Kilannin Krysiak, Jordi Deu-Pons, Ryan P. Duren, Jianjiong Gao, Julie McMurry, Sara Patterson, Catherine del Vecchio Fitz, Beth A. Pitel, ..., Nuria Lopez-Bigas, Mark Lawler, Jeremy Goecks, Malachi Griffith, Obi L. Griffith, Adam A. Margolin & Variant Interpretation for Cancer Consortium

Nature Genetics volume 52, pages 448–457 (2020)

Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. Continue reading

Geographic assessment of cancer genome profiling studies

Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)

DATABASE, Volume 2020, 2020, baaa009, doi.org/10.1093/database/baaa009
bioRxiv preprint, 2020-01-11. DOI 10.1101/827683

Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. Continue reading

SWISSNEX SF Lunch Seminar - Data Mining in Genomics

Genomic Research and Personalised Health Strategies

Resources | Standards | Protocols | Tools | Discourse

These are the slides of a short presentation, given (virtually, since COVID-19) for a SWISSNEX San Francisco lunch meeting.

Continue reading

Enabling population assignment from cancer genomes with SNP2pop

Huang Q and Baudis M. (2020)

Sci Rep 10, 4846 (2020). doi.org/10.1038/s41598-020-61854-x

Abstract In many cancers, incidence, treatment efficacy and overall prognosis vary between geographic populations. Studies disentangling the contributing factors may help in both understanding cancer biology and tailoring therapeutic interventions. Ancestry estimation in such studies should preferably be driven by genomic data, due to frequently missing or erroneous self-reported or inferred metadata. While respective algorithms have been demonstrated for baseline genomes, such a strategy has not been shown for cancer genomes carrying a substantial somatic mutation load. We have developed a bioinformatics tool for the assignment of population groups from genome profiling data for both unaltered and cancer genomes. Continue reading

BBOP Presentation - Baudisgroup Projects & Interests

Continue reading

ELIXIR Beacon Project - Networking Resources Across and Beyond ELIXIR Human Data Communities

ELIXIR Open Day - Wellcome Trust Genome Campus Hinxton

Michael Baudis

In this presentation I introduce the Beacon project and provide my opinions about its future trajectory, and especially its role in driving the alignment of ELIXIR and GA4GH projects in relation to (human) genome data sharing.

Continue reading

GA4GH SchemaBlocks for Human Cell Atlas

This is a presentation of the SchemaBlocks initiative and the overall GA4GH context, for the Human Cell Atlas project, given by Michael at one of their teleconferences.

Continue reading

Geographic assessment of cancer genome profiling studies

Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)

bioRxiv, 2020-11-01. DOI 10.1101/827683

Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. Continue reading

Talk at St. Gallen Radiation Oncology - Bioinformatics and Data Exchange

3rd St. Gallen Radiation Oncology Informatics Meeting

Bioinformatics and data exchange for genomics in an international context

Michael Baudis

The presentation at the St. Gallen meeting introduced the audience to the group's research and resources, and how this is connected to the different national & international data standards and sharing initiatives.

Continue reading

Talk at AMED Tokyo - Cancer Genomics and Standards

Mini-Symposium about CNV and Data Standards at AMED Japan, Tokyo

Cancer Genomics and Implementation of Data Driven Standards for Genomic Data Exchange

Michael Baudis

At this meeting, several Japanese participants presented their research and results, with a focus on Copy Number Variants and other structural genome variations. Continue reading

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

BC2 2019, Basel

Bo Gao

Abstract

Background:
Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Currently, CNV profiles of tumor samples are generated using genomic micro-arrays or high-throughput DNA sequencing. Regardless of the underlying technology, genomic copy number data is derived from the relative assessment and integration of multiple signals, with the data generation process being prone to contamination from several sources. Estimated copy number values have no absolute and linear correlation to their corresponding DNA levels, and the extent of deviation differs between sample profiles, which poses a great challenge for data integration and comparison in large scale genome analysis. Continue reading

Cancer cell lines in focus: somatic copy number & germline variation

BC2 2019, Basel

Qingyao Huang

Abstract

Background:
Human cell lines are convenient model systems in cancer research, for validation of proposed molecular mechanisms as well as to evaluate potential therapeutic approaches, e.g. through high-throughput screening of potential anti-tumour compounds against cancer cell line panels. However, conclusions about biological pathways or pharmacological potential depend on a close molecular relation between the cancer type represented and the cell line model used for analyses. Continue reading

Structural Genome Variants in Cancer: Research, resources standards

Seminar at the University of Florence

Seminar Universita degli Studi Firenze - Dipartimento di Biologia

Structural Genome Variants in Cancer: Research, resources standards

Michael Baudis

Abstract

Genomic copy number variations are major contributors to malignant transformation and progression and constitute - at least in their quantitative extension - the largest contributors to genomic mutation landscapes, in the majority of cancer types. Such mutations occur in the vast majority of tumors as somatic genome alterations (sCNV) during clonal development and expansion and are promoted by a variety of mechanisms leading to extended or focal changes in the number of genomic segments. Continue reading

Leveraging European infrastructures to access 1 million human genomes by 2022

Gary Saunders, Michael Baudis, Regina Becker, Sergi Beltran, Christophe Béroud, Ewan Birney, Cath Brooksbank, Søren Brunak, Marc Van den Bulcke, Rachel Drysdale, Salvador Capella-Gutierrez, Paul Flicek, ..., Niklas Blomberg, and Serena Scollen

Nature Reviews Genetics volume 20, pages 693–701 (2019)

Abstract Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.

Continue reading

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Bo Gao and Michael Baudis (2019)

bioRxiv, 2019-07-31. DOI 10.1101/720854

Abstract Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive “somatic” CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Continue reading

ELIXIR All Hands - Beacon Introduction

Michael Baudis

This presentation was the opener for the ELIXIR Beacon session, and introduces current developments, especially the interactions between GA4GH :: Discovery and ELIXIR Beacon.

Continue reading

HGVS 2019 - Development of Standards for Genomic Data Exchange

Human Genome Variation Society - Gothenburg 2019

Implementation Driven Development of Standards for Genomic Data Exchange from Cancer Genome Data Collections

Michael Baudis

Abstract

Cancers are genomic diseases, arising from the clonal propagation of somatic mutation events, with a limited contribution from inherited genomic variants. Genomic copy number variations are major contributors to malignant transformation and progression and constitute - at least in their quantitative extension - the largest contributors to genomic mutation landscapes, in the majority of cancer types. Continue reading

Connecting the silos - Genomic Data Standards, Resources and the Global Alliance for Genomics and Health

R&D Data Intelligence Leaders Forum Basel

Michael Baudis

Abstract

This presentation discusses the need for data sharing in genomics, provides information about the Global Alliance for Genomics and Health (GA4GH), and shows some of our group's contributions, especially regarding Beacon development.

Continue reading

Federated discovery and sharing of genomic data using Beacons

Miroslav Cupak, Stephen Keenan, Jordi Rambla, Sabela de la Torre, Stephanie Dyke, Anthony Brookes, Knox Carey, David Lloyd, Peter Goodhand, Maximilian Haeussler, Michael Baudis, Heinz Stockinger, Lena Dolman, Ilkka Lappalainen, Juha Törnroos, Mikael Linden, John Spalding, Saif Ur-Rehman, Angela Page, Paul Flicek, Susheel Varma, Gary Saunders, Serena Scollen, Stephen Sherry, David Haussler, Beacon Project Team

Nat Biotechnol (2019), accepted 2019-01-23

Abstract The Beacon Project (github.com/ga4gh-beacon/) is a GA4GH initiative that is developing an open specification for genetic variation discovery and sharing. The project is demonstrating the willingness of international organizations to work together to define standards for, and actively engage in, genomic data sharing. In the two years since the project’s inception, over 90 Beacons have been lit by 35 organizations serving over 200 datasets. Continue reading

DNA copy number imbalances in primary cutaneous lymphomas (PCL)

Gug G, Huang Q, Chiticariu E, Solovan C and Baudis M (2019)

JEADV, 2019-01-19. doi.org/10.1111/jdv.15442

The article has been published with the Journal of the European Academy of Dermatology and Venereology on January 19, 2019. A corresponding preprint can be accessed through [bioRxiv].

Background

Cutaneous lymphomas (CL) represent a clinically defined group of extranodal non-Hodgkin lymphomas harbouring heterogeneous and incompletely delineated molecular aberrations. Over the past decades, molecular studies have identified several chromosomal aberrations, but the interpretation of individual genomic studies can be challenging.

Objective

With a comprehensive meta-analysis, we aim to delineate genomic alterations for different types of CL and propose a more accurate classification in line with their various pathogenicity. Continue reading

Enabling population assignment from cancer genomes with SNP2pop

Huang Q and Baudis M. (2019)

bioRxiv, 2019-01-14. doi.org/10.1101/368647 (first version 2018-07-14)

Abstract For a variety of human malignancies, incidence, treatment efficacy and overall prognosis show considerable variation between different populations and ethnic groups. Disentangling the effects related to particular population backgrounds can help in both understanding cancer biology and in tailoring therapeutic interventions. Because self-reported or inferred patient data can be incomplete or misleading due to migration and genomic admixture, a data-driven ancestry estimation should be preferred. While algorithms to analyze ancestry structure from healthy individuals have been developed, an easy-to-use tool to assign population groups based on genotyping data from SNP profiles is still missing and benchmarking for the validity of population assignment strategy for aberrant cancer genomes was not tested. Continue reading

2018 09 18 BIO390 Michael Baudis Introduction to Bioinformatics

UZH BIO390 "Introduction to Bioinformatics"

Bioinformatics - Introduction

Michael Baudis

Abstract

First lecture in the UZH BIO390 "Introduction to Bioinformatics" series, introducing concepts and scope of bioinformatics as a field - 2018 version.

Continue reading

ECCB 2018 - Beacon

Abstract: ECCB 2018

ELIXIR Beacon - A Driver Project for the Global Alliance for Genomics and Health

Michael Baudis for the ELIXIR Beacon Project

The Global Alliance for Genomics and Health (GA4GH) develops standards and guidelines to facilitate the international sharing of genomic and health related metadata. The creation of GA4GH work stream products is moved forward through driver projects, which address particular scientific, technical, regulatory or security related aspects of data access and sharing. Continue reading

Registered access: authorizing data access

Dyke SOM, Linden M, Lappalainen I, De Argila JR, Carey K, Lloyd D, Spalding JD, Cabili MN, Kerry G, Foreman J, Cutts T, Shabani M, Rodriguez LL, Haeussler M, Walsh B, Jiang X, Wang S, Perrett D, Boughtwood T, ..., Rehm HL, Baudis M, Sherry ST, Kato K, Knoppers BM, Baker D, and Flicek P

European Journal of Human Genetics (2018)

Abstract The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model—“registered access”—to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.

Continue reading