Welcome to the baudisgroup Pages

The baudisgroup website presents projects and information from the Computational Oncogenomics Group of the University of Zurich (UZH) and the Swiss Institute of Bioinformatics (SIB). For visitors more interested in Particle Astrophysics, we strongly recommend the website of another, although related, Professor Baudis.

The Computational Oncogenomics Group's research focus lies in the exploration of structural genome variations in cancer. Our work centres around our Progenetix resource of curated molecular-cytogenetic and sequencing data. Specific projects explore computational methods, genomics of selected tumour entities and genomic variant patterns across malignancies. As members of the Global Alliance for Genomics and Health, the group is developing standards in biocuration and data sharing for genomic variants and phenotypic data, for instance in driving development of the ELIXIR Beacon project. Other research is related to genome data epistemology, e.g. geographic and diagnostic sampling biases in cancer studies.

Latest News & Publications

Congratulations to Rahel!

Rahel Paloots today passed her PhD defence

Congratulations to Rahel Paloots on passing her PhD defence on the topic

**Molecular Heterogeneity among Cancer Cell Lines and their Correspondence to Primary Neoplasias**

Rahel has been a contributor to progenetix.org and to the GA4GH & ELIXIR Beacon project, as well as the main developer behind cancercelllines.org (a resource for, well, cancer cell lines…).

Opening Cancer Genomics - Deploying the GA4GH Beacon protocol

Zürich Cancer ORD Workshop

The "Moving ORD practices into cancer care" project explores and tests Open Research Data (ORD) principles in the context of cancer care, combining engagement with relevant communities at UZH with a technical demonstrator of a cutting-edge data discovery technology. The cancer community at UZH is being mobilized through a dedicated workshop and topical meetings to discuss challenges and opportunities of ORD in the oncology domain. A pilot installation of an established ORD approach (Beacon technology by the Global Alliance for Genomics and Health) is being implemented to test the sharing of cancer-related -omics and associated data, with a focus on the specific data access and security aspects that arise here.

The ELIXIR hCNV Community - Making complex genomics accessible

ELIXIR Webcast

Michael Baudis, Antonio Rausell & Krzysztof Poterlowicz

Abstract Genomic copy number variants (CNV) are a major contributor to human genome variation and important factors in rare disease genetics and cancer genomics. However, the complexity of CNV detection technologies, the lack of standardised annotation formats and the fragmentation of cytogenetic and genomic communities have so far limited large scale utilization of CNV profiles in computational genomics. Continue reading

cancercelllines.org listed in Expasy

Entry in the Swiss Institute of Bioinformatics Catalogue

Our recently launched cancer cell line genomics site cancercelllines.org is now listed as one of the resources in the Swiss Institute of Bioinformatics’ Expasy catalogue.

Genomic Data Mining and The Case for Open Data Standards

ZHAW ICLS colloquium

The last decades have seen an enormous amount of data generation in the area of biomedical genomics, driven by a feedback loop of technological developments and increasing recognition of human genomic complexities, in health and disease. While individual projects frequently are tailored towards addressing specific research questions, the approach to many problems, especially in rare diseases and cancer genomics, requires both curated data collections and access to data from international repositories and study centers.

Here I will present our oncogenomic data resources and research projects, and how our work on data discovery and exchange standards as part of an international community can improve data accessibility through the support of federated discovery and retrieval protocols.

GA4GH Connect April 2024 in Ascona

Spring 2024 GA4GH Connect working meeting co-organized by our group

We're proud to host the next Spring GA4GH Connect meeting in April 2024 at the Congressi Stefano Franscini on the Monte Verità in Ascona. This will provide an excellent opportunity for Swiss genomics and bioinformatics to, well, connect with the international "genomics and health" community and projects.

Genome Data - Opportunities and Risks for Medicine and Society

Lecture at the Seniorenuniversität Winterthur

In medical research, genome data are used for more precise diagnoses and for selecting personalized therapies. Genome data are also used outside of medicine, for example in genealogical research or in forensics. This potential, however, is accompanied by risks.

When genomic data are collected from millions of people, who should protect the individual and society from the misuse of such data? The lecture addresses questions about the need for such data in medicine and research, and about the risks that may arise as a result.

Why do we need genomic data from millions of individuals? How can my data pose a danger to me? And who should protect me as an individual from misuse?

cancercelllines.org - a Novel Resource for Genomic Variants in Cancer Cell Lines

DATABASE Article

Rahel Paloots and Michael Baudis

Database (Oxford). 2024 Apr 30;2024:baae030. doi: 10.1093/database/baae030

bioRxiv preprint (2023-12-13): https://doi.org/10.1101/2023.12.12.571281

Abstract: Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource - cancercelllines.org - with the aim to enable easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variants (SNVs) data. We have gathered over 5,600 copy number profiles as well as SNV annotations for 16,000 cell lines and provide this data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 API and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme.

Availability and Implementation: Our resource is publicly available on the web at cancercelllines.org.
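To give a rough idea of what a Beacon v2 variant query can look like, the sketch below only assembles a request URL and never contacts a server. The endpoint path (`/beacon/g_variants`), the parameter names (`referenceName`, `start`, `end`, `variantType`) and the example values are assumptions modelled on the genomic variant request parameters of the Beacon v2 specification. They are not taken from the article and should be checked against the resource's own API documentation.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration; take the real endpoint from the
# resource's API documentation.
BASE_URL = "https://cancercelllines.org/beacon/g_variants"


def build_variant_query(reference_name: str, start: int, end: int, variant_type: str) -> str:
    """Assemble a GET URL for a range-style variant query (no request is sent)."""
    params = {
        "referenceName": reference_name,  # chromosome or sequence identifier
        "start": start,                   # start of the queried genomic range
        "end": end,                       # end of the queried genomic range
        "variantType": variant_type,      # e.g. a deletion label
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Illustrative values only, not a query taken from the publication.
url = build_variant_query("9", 21000000, 23000000, "DEL")
print(url)
```

Building the URL separately from sending it keeps the query easy to inspect and log before any data leaves the client.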

Federated genomic discoveries: Deploying the GA4GH Beacon protocol

Virtual Seminar
GHGA Lecture Series

With the ever increasing amount of genomic data produced in the context of research studies, population analyses and medical diagnostics the need for access to genomic information beyond administrative or geographic boundaries has become a matter of eminent importance. Continue reading

Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines

Literature-derived annotations as entry point for data exploration

Ellery Smith, Rahel Paloots, Dimitris Giagkos, Michael Baudis and Kurt Stockinger

Bioinformatics Advances, vbae045, doi.org/10.1093/bioadv/vbae045

Previous arXiv preprint (2023-07-03): https://doi.org/10.48550/arXiv.2307.00933

Motivation: With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume (Lubowitz et al., 2021). As a consequence, in the fields of biological, medical and clinical research, domain experts have to sift through massive amounts of scientific text to find relevant information. However, this process is extremely tedious and slow to be performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction.

Results: In this work, we present the design, implementation and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data in the domain of cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variants plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidences, allowing for deep, yet rapid, literature search, using existing structured data as a springboard.

Availability and Implementation: Our system is publicly available on the web at cancercelllines.org.

Contact: The authors can be contacted at ellery.smith@zhaw.ch or rahel.paloots@uzh.ch.

Twelve quick tips for deploying a Beacon

Some hints for Beacon developers & implementers

Lauren A Fromont, Mauricio Moldes, Michael Baudis, Anthony J Brookes, Arcadi Navarro and Jordi Rambla

PLoS Comput Biol. 2024 Mar 1;20(3):e1011817.

doi: 10.1371/journal.pcbi.1011817.
PMID: 38427629

Introduction: In the age of data-driven biomedical research and clinical practice, the sharing of genomic and clinical data for health research and personalized medicine has become an important contributor to improved diagnosis and treatment. From the data owner’s perspective, potential benefits include improved treatments, personalization of healthcare practice, and more effective control of disease proliferation. However, the requirement for high levels of data security to protect sensitive information presents a barrier to data discovery and sharing.

Beacon is designed to enable the benefits of data discovery while minimizing the associated risks...

Genomic Data Sharing Standard Development with GA4GH and ELIXIR
Opportunities and Pitfalls in Federated Data Discovery

DMLS Lecture Series
University of Zurich Department of Molecular Life Sciences

In this presentation Michael talks about the role of GA4GH (and ELIXIR) in the development of standards and practices for genomic data exchange, some general principles, how his group became involved in these efforts - but also some pitfalls ...

Structural Genome Variations in Cancer and the Case for Open Data Standards

Cancer Genomics Seminar at Utrecht
Hubrecht Institute and Princess Maxima Center for Pediatric Oncology

The presentation includes notes about work on improving the representation of genomic copy number variations (CNV), on GA4GH and its Beacon protocol, and on challenges around genomic data privacy.

labelSeg: segment annotation for tumor copy number alteration profiles

A tool to assign relative SCNA levels to segments

Hangjia Zhao and Michael Baudis

Briefings in Bioinformatics (Oxford). 2024 Jan 31;2024:bbad541.

doi: 10.1093/bib/bbad541
PMID: 38300514
bioRxiv. doi: doi.org/10.1101/2023.05.17.541097

Abstract Somatic copy number alterations (SCNAs) are a predominant type of oncogenomic alterations that affect a large proportion of the genome in the majority of cancer samples. Current technologies allow high-throughput measurement of such copy number aberrations, generating results consisting of frequently large sets of SCNA segments. However, the automated annotation and integration of such data are particularly challenging because the measured signals reflect biased, relative copy number ratios. In this study, we introduce labelSeg, an algorithm designed for rapid and accurate annotation of CNA segments, with the aim of enhancing the interpretation of tumor SCNA profiles. Continue reading

Beaconize this: Databases for Cancer Genomics and the Development of Open Data Standards

Seminar at the Bioinformatics club of the Centre de Recherche des Cordeliers (CRC)
Université Paris Cité

In this seminar at the Centre de Recherche des Cordeliers in Paris Michael presents the work of the group, with special emphasis on the role of the Progenetix oncogenomics resources and tools in the development, implementation and testing of the Beacon standard of the Global Alliance for Genomics and Health (GA4GH).

pgxRpi Accepted by Bioconductor

pgxRpi, the R wrapper for the Progenetix API, is now part of the Bioconductor 'devel' branch and will be included in the upcoming 3.19 release in mid-April 2024.

CNV Project at biohackathon23

Participating at #BioHackEU23 in Barcelona with a CNV reference resource project

With other members of the hCNV community, some of us will participate in this year's Biohackathon Europe event. The main project will address the creation of the template for a "beaconized" public resource for reference (i.e. not disease associated) copy number variation data, including the necessary tooling for the import from e.g. VCF or BED file variants into Beacon backends (such as our bycon environment).

Swiss-Korean Life Science Symposium

The 10th Swiss-Korean Life Science Symposium in Seoul

As a representative of the Swiss delegation, and particularly of the University of Zurich (UZH), Michael will be a presenter and panel discussion participant at the 10th Swiss-Korean Life Science Symposium in Seoul, together with members of the Swiss and Korean life sciences and personalized health academic and industrial communities.

Progenetix as SIB and ELIXIR Resource

Recognizing the Progenetix platform as Swiss contribution to the European bioinformatics resources ecosystem

The Progenetix resource has finally been recognized as an official contribution to the ELIXIR European bioinformatics ecosystem. Besides Expasy, Progenetix is now linked through ELIXIR's resource page. Or just go directly to progenetix.org (and its daughter project cancercelllines.org).

Short tandem repeat mutations regulate gene expression in colorectal cancer

Exploring STR patterns and their relation to expression changes in cancer

Max A Verbiest, Oxana Lundström, Feifei Xia, Michael Baudis, Tugce Bilgin Sonay, Maria Anisimova

doi: https://doi.org/10.1101/2023.11.29.569189

Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression levels to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. The increased mutability of eSTRs in MSI tumours may be an early indication that eSTR mutations can confer a selective advantage to tumours. Future extensions of our findings into larger cohorts could uncover new STR-based targets in the treatment of cancer.
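The idea of using a linear model to predict an expression change from a change in STR length can be illustrated with a minimal sketch. It assumes a single predictor (repeat length) and ordinary least squares; the model specification actually used in the study is not given here and may differ. All numbers are made up for illustration and are not data from the preprint.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values: repeat length (in repeat units) of one eSTR across
# five tumours, and the expression level of the associated gene.
str_lengths = [10, 11, 12, 13, 14]
expression = [5.0, 4.6, 4.1, 3.9, 3.4]

slope, intercept = fit_line(str_lengths, expression)


def predicted_expression_change(delta_units):
    """Predicted expression change for a length change of delta_units."""
    return slope * delta_units


# Example: predicted effect of a mutation that removes two repeat units.
print(predicted_expression_change(-2))
```

In such a model the slope carries the whole prediction: the expected expression change is simply the slope multiplied by the number of repeat units gained or lost.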

ELIXIR All Hands Dublin

Baudisgroup presentations at the AHM 2023 in Dublin

Rahel, Hangjia & Michael for the group

At the ELIXIR All Hands Meeting 2023 in Dublin our group presented several posters about our resources and work in standards development.

Phenopacket-tools: Building and validating GA4GH Phenopackets

Bioinformatics tools and examples for working with the Phenopackets standard

Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. (2023) Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One. 18:e0285433.

Abstract The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Continue reading

Theoretical Cytogenetics and Oncogenomics

DMLS Tandem Talks

Michael Baudis

In this short presentation Michael provides an overview of the group's work in cancer genomics resources, data analysis and standard development, including the involvement in the Global Alliance for Genomics and Health GA4GH.

Cancer Genome Profiles
Oncogenomic Data Resources
Bioinformatics Methods
Data Exchange Standards for Genomics and Personalized Health

ZHAW Visitors for Cancercelllines Hackathon

Implementing INODE-driven literature collections

For some months our group has worked with members of Kurt Stockinger's team from the ZHAW on a cancer cell lines use case for the INODE project. In the last 2 days we had a site visit for a first implementation of the use-case-specific system on cancercelllines.org. More information to follow - and thanks to Ellery & Dimitris for the great work!

Candidate targets of copy number deletion events across 17 cancer types

Identifying cancer related genes against the background of somatic CNV events

Huang Q and Baudis M

doi: 10.3389/fgene.2022.1017657

Previous bioRxiv preprint (first version 2022-06-29), doi.org/10.1101/2022.06.29.498080

Abstract Genome variation is the direct cause of cancer and driver of its clonal evolution. While the impact of many point mutations can be evaluated through their modification of individual genomic elements, even a single copy number aberration (CNA) may encompass hundreds of genes and therefore pose challenges to untangle potentially complex functional effects. However, consistent, recurring and disease-specific patterns in the genome-wide CNA landscape imply that particular CNA may promote cancer-type-specific characteristics. Discerning essential cancer-promoting alterations from the inherent co-dependency in CNA would improve the understanding of mechanisms of CNA and provide new insights into cancer biology and potential therapeutic targets. Continue reading

Genomic Resource Built with GA4GH Standards

EORTC PAMM Firenze

Michael Baudis

This brief presentation introduces the Progenetix resource, the Global Alliance for Genomics and Health as a developer of standards for data sharing in biomedical genomics, as well as the use of Progenetix in GA4GH standards development.

Beacon v2 - Onboarding Strategies & Feature Examples

Beacon Sessions at GA4GH Connect

Michael Baudis

The Beacon Sessions at GA4GH Connect November 2022 targeted the migration of existing and implementation of new v2 Beacons, with emphasis on the "how to get there easily" rather than on all Beacon v2 features. Continue reading

Genomics Data Federation through Global Alliance for Genomics and Health Standards: Development and Implementation of the GA4GH Beacon Protocol

Seminar Yonsei University Medical School Seoul

Michael Baudis

In this Seoul meeting presentation Michael introduces the Global Alliance for Genomics and Health and its involvement in genomics standards development, followed by a discussion of the Beacon protocol and the role of the Progenetix resource in its development. Continue reading

Beacon v2 - Feature-rich Implementation of the Genomic Data Discovery Protocol

GA4GH 2022 Plenary Barcelona

Michael Baudis

The “Beacon” protocol - developed with support from ELIXIR, the European bioinformatics infrastructure organization, as a standard of the Global Alliance for Genomics and Health (GA4GH) - represents an emerging standard for an “Internet for Genomics”. While the initial version of the protocol served as a widely adopted test bed for the sharing of genomic variants over federated query systems connecting hundreds of internationally distributed resources, the version 2 of the protocol provides a framework for extended, metadata-rich query and response options in both public and restricted federated access scenarios. Continue reading

GA4GH Phenopackets: A Practical Introduction

Phenopackets v2 introduction with practical examples

Ladewig MS, Jacobsen JO, Wagner AH, Danis D, Kassaby BE, Gargano M, Groza T, Baudis M, Steinhaus R, Seelow D, Bechrakis NE, Mungall CJ, Schofield PN, Elemento O, Smith L, McMurry JA, Munoz-Torres M, Haendel MA and Robinson PN

Advanced Genetics 2022, 2200016.

Abstract The Global Alliance for Genomics and Health (GA4GH) is developing a suite of coordinated standards for genomics for healthcare. The Phenopacket is a new GA4GH standard for sharing disease and phenotype information that characterizes an individual person, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments. A detailed example is presented that illustrates how to use the schema to represent the clinical course of a patient with retinoblastoma, including demographic information, the clinical diagnosis, phenotypic features and clinical measurements, an examination of the extirpated tumor, therapies, and the results of genomic analysis. The Phenopacket Schema, together with other GA4GH data and technical standards, will enable data exchange and provide a foundation for the computational analysis of disease and phenotype information to improve our ability to diagnose and conduct research on all types of disorders, including cancer and rare diseases.

The Phenopacket software is available at github.com/phenopackets/.

Beacon - Ethical & Legal Aspects of a Genomic Data Discovery Protocol

DSI Ethics Project Pitch

Michael Baudis

Here Michael provides a very brief presentation about the GA4GH Beacon protocol, especially as a target for projects discussing the ethical implications of genome data discovery & sharing as well as the relevant legal frameworks, with emphasis on the Swiss context. Continue reading

A cancer genomics resource built on GA4GH standards

Rahel Paloots, Michael Baudis

CGC St Louis 2022

Progenetix is a cancer genomics resource that includes genomic profiling data as well as biomedical annotations and provenance data for cancer studies. The main goal of the Progenetix database is to provide easy, open access for research studies and clinical diagnostics. To facilitate sharing of genomic data, Progenetix complies with and contributes to GA4GH and Beacon data standards. Beacon, developed with support from ELIXIR (the European bioinformatics infrastructure organization), started out as a protocol to share genomic variants over federated queries.

A cancer genomics reference resource powered by GA4GH standards

Roche Data Science Seminar

Michael Baudis

The presentation reports on the Progenetix cancer genomics resource and its role in the GA4GH ecosystem & the Beacon genomics API development process.

The GA4GH Phenopacket schema defines a computable representation of clinical data

Phenopackets v2 publication

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Buske OJ, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Essenwanger A, Freimuth RR, ... , Haendel MA, Robinson PN, The GAGHPMC.

Nature Biotechnology. 2022;40:817-820. PMID:35705716

Abstract Despite great strides made in the development and wide acceptance of standards for exchanging structured information about genomic variants, progress in standards for computational phenotype analysis for translational genomics has lagged behind. Phenotypic features (signs, symptoms, laboratory and imaging findings, results of physiological tests, etc.) are of high clinical importance, yet exchanging them in conjunction with genomic variation information is often overlooked or even neglected. Continue reading

Implementation of the GA4GH Beacon protocol for discovery and sharing of genomic copy number variation data

ESHG Vienna 2022

Michael Baudis

Background & Objectives Genomic copy number variations (CNV) are a major contributor to inter-individual genomic variation, can be causative events in rare diseases, but especially represent the majority of the mutational landscape in most malignancies. While specific CNV events and some recurring patterns have contributed to the identification of individual cancer drivers and the recognition of cancer subtypes, the complexity of genomic CNV patterns requires large amounts of well-defined genomic profiles for statistically meaningful analyses. At the other end of the spectrum, in the area of rare disease genomics the potential pathogenicity of individual CNV events requires validation against a vast set of disease-related and reference genomic profiles and annotations.

Progenetix & BeaconPlus - An open cancer genomics resource on a stack of Beacon code...

ELIXIR All Hands Amsterdam 2022

Michael Baudis

Here Michael provides some overview of the multi-year trajectory of the Beacon API development, and how BeaconPlus & Progenetix have been utilized for "implementation driven design".

Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond

Beacon v2 publication

Rambla J, Baudis M, Ariosa R, Beck T, Fromont LA, Navarro A, Paloots R, Rueda M, Saunders G, Singh B, Spalding JD.

Human Mutation. 2022 Mar 17. PMID:35297548

Abstract Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data, through providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers.
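The "yes" or "no" principle of Beacon version 1 can be sketched with a toy, in-memory example. The class and dataset names, the variant coordinates and the data layout are hypothetical and chosen for illustration only; this is not the Beacon API or any reference implementation.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


class ToyBeacon:
    """In-memory stand-in for a Beacon provider (hypothetical, illustration only)."""

    def __init__(self, datasets: dict[str, set[Variant]]):
        self._datasets = datasets

    def exists(self, query: Variant) -> bool:
        """Answer only yes or no: is the variant present in any hosted dataset?"""
        return any(query in variants for variants in self._datasets.values())


# Made-up variants in two made-up datasets.
beacon = ToyBeacon({
    "dataset_a": {Variant("17", 7674220, "C", "T")},
    "dataset_b": {Variant("9", 21971100, "G", "A")},
})

print(beacon.exists(Variant("17", 7674220, "C", "T")))
print(beacon.exists(Variant("1", 12345, "A", "G")))
```

The caller learns only whether a matching variant exists, not which dataset or sample carries it, which is what keeps the disclosed information minimal.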

hCNV Implementation Studies Old and New

ELIXIR Human Data Communities

Michael Baudis

This presentation provided an overview of the hCNV community, implementation studies and ongoing work, e.g. interaction with the GA4GH VRS standard group and Beacon development.

Technical, legal and ethics aspects of genomic data sharing

Genomes | Privacy | Laws | Society - DSI Ethics Seminar

Michael Baudis

The presentation introduces the need for sharing and federated discovery of genome data in the contexts of personalized health and genomic research, and some of the current developments in international standards and practices in the area. Continue reading

The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine

Phenopackets v2 preprint

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Freimuth RR, ..., Haendel MA, Robinson PN.

medRxiv, 2021.11.27.21266944. doi:10.1101/2021.11.27.21266944

Abstract Despite great strides in the development and wide acceptance of standards for exchanging structured information about genomic variants, there is no corresponding standard for exchanging phenotypic data, and this has impeded the sharing of phenotypic information for computational analysis. Here, we introduce the Global Alliance for Genomics and Health (GA4GH) Phenopacket schema, which supports exchange of computable longitudinal case-level phenotypic information for diagnosis and research of all types of disease including Mendelian and complex genetic diseases, cancer, and infectious diseases. Continue reading

The GA4GH Variation Representation Specification (VRS): a Computational Framework for the Precise Representation and Federated Identification of Molecular Variation.

Alex H. Wagner, Lawrence Babb, Gil Alterovitz, Michael Baudis, Matthew Brush, Daniel L. Cameron, Melissa Cline, Malachi Griffith, Obi L. Griffith, ..., Melissa Konopko, Heidi L. Rehm, Andrew D. Yates, Robert R. Freimuth, Reece K. Hart

Wagner, Alex H. et al. Cell Genomics, Volume 1, Issue 2, 100027 doi:10.1016/j.xgen.2021.100027

bioRxiv. version 2021.01.15.426843. (2021-01-15)

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced “verse”), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. Continue reading

International federation of genomic medicine databases using GA4GH standards

Adrian Thorogood, Heidi L. Rehm, Peter Goodhand, Angela J.H. Page, Yann Joly, Michael Baudis, Jordi Rambla, Arcadi Navarro, Tommi H. Nyronen, Mikael Linden, Edward S. Dove, Marc Fiume, Michael Brudno, Melissa S. Cline, Ewan Birney

Thorogood, Adrian et al. Cell Genomics, Volume 1, Issue 2, 100032 doi:10.1016/j.xgen.2021.100032

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.

GA4GH: International policies and standards for data sharing across genomic research and healthcare

Heidi L. Rehm, Angela J.H. Page, Lindsay Smith, Jeremy B. Adams, Gil Alterovitz, Lawrence J. Babb, Maxmillian P. Barkley, Michael Baudis, Michael J.S. Beauvais, Tim Beck, Jacques S. Beckmann, Sergi Beltran, David Bernick, Alexander Bernier, James K. Bonfield, Tiffany F. Boughtwood, Guillaume Bourque, Sarion R. Bowers, Anthony J. Brookes, Michael Brudno, Matthew H. Brush, David Bujold, Tony Burdett, Orion J. Buske, Moran N. Cabili, Daniel L. Cameron, Robert J. Carroll, Esmeralda Casas-Silva, Debyani Chakravarty, Bimal P. Chaudhari, Shu Hui Chen, J. Michael Cherry, Justina Chung, Melissa Cline, Hayley L. Clissold, Robert M. Cook-Deegan, Mélanie Courtot, ..., Peter Goodhand, Kathryn North, Ewan Birney

Rehm, Heidi L. et al. Cell Genomics, Volume 1, Issue 2, 100029 doi:10.1016/j.xgen.2021.100029

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. Continue reading

A cancer genomics resource built around and driving GA4GH standards

GRIC sponsored workshop with the Swiss Institute of Bioinformatics

Michael Baudis¶

The Progenetix oncogenomics resource provides sample-specific cancer genome profiling data and biomedical annotations as well as provenance data for cancer studies. With more than 100k genomic copy number variation (CNV) profiles from over 700 cancer types, Progenetix empowers comparative analyses beyond individual studies and diagnostic concepts.

A Standardized Format for Federated Genomic Data Exchange

The GA4GH Beacon Protocol Presented at BC² Basel 2021
Session "Federating computational analyses with GA4GH standards"

Michael Baudis¶

During the "Federating computational analyses with GA4GH standards" workshop at BC² 2021, Michael presented the history and current status of the Beacon project, as well as its integration with specific data resources and analysis initiatives.

Cancer genomics reference resource and toolkit around GA4GH standards

ESHG 2021

Q. Huang, B. Gao, R. Paloots, P. Carrio-Cordo, Z. Yang, M. Baudis¶

This poster presentation at the European Society of Human Genetics meeting 2021 discusses the integration and development of GA4GH standards by the Progenetix oncogenomics resource.

Progenetix - An open reference resource for copy number variation data in cancer

Qingyao Huang¶

The Progenetix oncogenomic resource in 2021

Article describing the current content & technical status of progenetix.org

Qingyao Huang, Paula Carrio Cordo, Bo Gao, Rahel Paloots, Michael Baudis¶

Database (Oxford). 2021 Jul 17;2021:baab043.¶

doi: 10.1093/database/baab043.
PMID: 34272855
PMCID: PMC8285936.
bioRxiv. doi: doi.org/10.1101/2021.02.15.428237

Abstract In cancer, copy number aberrations (CNAs) represent a type of nearly ubiquitous and frequently extensive structural genome variations. To disentangle the molecular mechanisms underlying tumorigenesis as well as identify and characterize molecular subtypes, the comparative and meta-analysis of large genomic variant collections can be of immense importance. Over the last decades, cancer genomic profiling projects have resulted in a large amount of somatic genome variation profiles, however segregated in a multitude of individual studies and datasets. The Progenetix project, initiated in 2001, curates individual cancer CNA profiles and associated metadata from published oncogenomic studies and data repositories with the aim to empower integrative analyses spanning all different cancer biologies. Continue reading

hCNV Community and Implementation Studies

Michael Baudis¶

ELIXIR All Hands 2021 Human Data Day ¶

At the Human Data Day Michael presents a very brief overview of the ending and upcoming ELIXIR hCNV implementation studies.

hCNV data and the Progenetix Beacon

Presentation at ELIXIR All Hands 2021

Michael Baudis¶

ELIXIR All Hands 2021 ¶

This presentation gives a brief overview of the use of the Progenetix resource to test and implement a genomics reference resource using the emerging Beacon v2 protocol.

Signatures of Discriminative CNA in 31 Cancer Subtypes

Bo Gao and Michael Baudis (2021)¶

Published at Frontiers in Genetics, 2021-05-13¶

Abstract¶

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles pose great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have been focused on the association of CNA to pre-selected "driver" genes with limited application to rare drivers and other genomic elements.

Progenetix, Beacon and GA4GH at RDA

Research Data Alliance - RDA Virtual Plenary 17

Concepts | Status | History | Outlook¶

Michael Baudis¶

Research Data Alliance - RDA Virtual Plenary 17 ¶

This seminar gives an overview of the current state of the Progenetix Beacon project and the overall connection to the Global Alliance for Genomics and Health (GA4GH).

Implementing GA4GH Standards to Drive an Open Oncogenomics Resource

Research Seminar Kinderspital Zürich - Neuroonkologie

Michael Baudis¶

Seminar Neurooncology Children's Hospital Zürich¶

This seminar gives an overview of the history & current state of the Progenetix resource, its role in Beacon API development and the overall connection to the Global Alliance for Genomics and Health (GA4GH).

Discovering copy number variation across multiple cancer types

Qingyao Huang¶

Abstract

Genomic variations are a direct cause of tumor formation and an accomplice in its continuous evolution. While point mutations can be pinpointed to a targeted genetic element, copy number variations (CNVs) involve copy number gain or loss of a large DNA segment which often covers hundreds of genetic elements in one event. Continue reading

EACR conference - The Progenetix Oncogenomic Resource

GA4GH Connect - Beacon v2 and SchemaBlocks

GA4GH Connect 2020 ¶

Michael Baudis¶

Beacon v2 Structural Variants [slides]¶

SchemaBlocks {S}[B] [slides]¶

Copy number variant heterogeneity among cancer types reflects inconsistent concordance with diagnostic classifications

Paula Carrio Cordo and Michael Baudis¶

bioRxiv. doi: doi.org/10.1101/2021.03.01.433348 ¶

This article explores the correlation between subsets of cancer entities, grouped by their somatic CNV patterns, and levels of diagnostic classification systems.

Genomic data and Privacy

Michael Baudis¶

ETHZ Lecture¶

The understanding of the impact of individual inherited and somatic genome variants on phenotypes and diseases requires a thorough understanding of the occurrence of such variants amongst populations in general and carriers of the phenotypes and diseases in particular. This information can only be provided through the inclusion of data from a multitude of genome resources in variant evaluation efforts, including such from outside (international) jurisdictions. However, opening such resources carries the inherent risk of breaching privacy, particularly through re-identification of individuals or their relatives and potentially through the exposure of individual genome-related personal information including phenotypic and "performance" prediction and relative disease risk.

Beacon v2 – Towards flexible use and clinical applications for a reference genomic data protocol

SPHN Webinar ¶

Michael Baudis¶

Genomic “Beacons” provide discovery services for genomic data using the Beacon API developed as a key driver project of the Global Alliance for Genomics and Health (GA4GH). The Beacon protocol itself defines an open standard for genomics data discovery and provides a framework for web services responding to queries against genomic data collections, for instance from population based or disease specific genome repositories. Continue reading
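As a rough illustration of what a query against such a web service can look like, the sketch below only assembles a request URL and does not contact any server. The base URL is a placeholder, and the parameter names (`referenceName`, `start`, `end`, `variantType`, `filters`) and the `g_variants` path reflect our reading of the Beacon v2 request model; they should be checked against the specification and the documentation of the beacon being queried. The coordinates and the ontology code are arbitrary example values.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the address of a real Beacon v2 service.
BEACON_BASE = "https://beacon.example.org/api"


def build_variant_query(reference_name, start, end, variant_type, filters=()):
    """Return a GET URL for a Beacon-style genomic range query."""
    params = {
        "referenceName": reference_name,
        "start": start,
        "end": end,
        "variantType": variant_type,
    }
    if filters:
        # Several filter terms are sent as one comma-separated value.
        params["filters"] = ",".join(filters)
    return f"{BEACON_BASE}/g_variants?{urlencode(params)}"


# Example values only: a deletion query restricted by one ontology term.
url = build_variant_query("9", 21000000, 22000000, "DEL", filters=["NCIT:C3058"])
print(url)
```

Keeping the URL construction separate from the HTTP call makes it easy to inspect or log a query before it is sent to one or several beacons.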

Update of Progenetix Oncogenomics Resource

Research Progress Report, DMLS, University of Zurich ¶

Qingyao Huang¶

Copy number aberration (CNA) is frequently observed in cancer genomes. Meta-analysis of genomic variations helps to disentangle the multiplex molecular mechanism underlying tumorigenesis as well as identify and characterize molecular subtypes. Over the years, cancer genomic research has resulted in a considerable amount of data segregated by studies. The Progenetix project (www.progenetix.org), initiated in 2001, aims to systematize the published cancer genomic profiles and provide accurate annotation to facilitate integrative analysis. Continue reading

Welcome to Ziying

Today Ziying Yang arrived as a new member of the baudisgroup.

Welcome Ziying!

GA4GH Beacon v2 at GA4GH Plenary

GA4GH Beacon v2 - Evolving Reference Standard for Genomic Data Exchange¶

GA4GH 8th Plenary ¶

Gary Saunders, Jordi Rambla de Argila, Anthony Brookes, Juha Törnroos and Michael Baudis¶

For the ELIXIR Beacon project, GA4GH Discovery work stream and the international network of Beacon API developers¶

The Beacon driver project was one of the earliest initiatives of the Global Alliance for Genomics and Health, with the Beacon v1.0 API as the first approved GA4GH standard. Version 2 of the protocol is slated to provide fundamental changes, towards an Internet of Genomics foundational standard:

* requests beyond genomic variants ("filters")
* payload responses, secured through open AAI
* aligning with GA4GH standards (Phenopackets, VRS, DUO...) through SchemaBlocks {S}[B]
* working with international partners on deployment of advanced implementations
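To make the idea of "filters" and graded responses more concrete, here is a minimal sketch of a request body that combines a genomic range with an ontology filter. The layout (`meta`, `query`, `requestParameters`, `filters`, `requestedGranularity`) is an assumption based on our understanding of the Beacon v2 request model and may differ in detail from the released specification; the helper `build_beacon_request` and all values are illustrative only.

```python
import json


def build_beacon_request(request_parameters, filter_ids, granularity="count"):
    """Assemble a Beacon v2 style POST body as a plain dictionary."""
    return {
        "meta": {"apiVersion": "2.0"},
        "query": {
            # Genomic part of the query (where to look, what kind of variant).
            "requestParameters": request_parameters,
            # Non-genomic part: ontology terms restricting the matched records.
            "filters": [{"id": filter_id} for filter_id in filter_ids],
            # Assumed levels: "boolean", "count" or "record".
            "requestedGranularity": granularity,
        },
    }


body = build_beacon_request(
    {"referenceName": "9", "start": [21000000], "end": [22000000], "variantType": "DEL"},
    ["NCIT:C3058"],
)
print(json.dumps(body, indent=2))
```

Requesting only a count rather than full records is one way a beacon can answer a discovery query without exposing record-level data.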

Progenetix - A cancer genomics reference resource around GA4GH standards

GA4GH 8th Plenary ¶

Michael Baudis¶

The Progenetix oncogenomics resource provides sample-specific cancer genome profiling data and biomedical annotations as well as provenance data from cancer studies. Especially through currently 113322 curated genomic copy number variation (CNV) profiles from 1600 individual studies representing over 500 cancer types (NCIt), Progenetix empowers aggregate and comparative analyses which vastly exceed individual studies or single diagnostic concepts. Continue reading

Cancer Data - ELIXIR::GA4GH: Advancing genomics resources through standards and ontologies

ECCB2020 ¶

Michael Baudis¶

The Ubiquitin Ligase TRIP12 Limits PARP1 Trapping and Constrains PARP Inhibitor Efficiency

Marco Gatti, Ralph Imhof, Qingyao Huang, Michael Baudis, Matthias Altmeyer¶

Cell Rep. 2020 Aug 4 DOI: 10.1016/j.celrep.2020.107985 ¶

Abstract PARP inhibitors (PARPi) cause synthetic lethality in BRCA-deficient tumors. Whether specific vulnerabilities to PARPi exist beyond BRCA mutations and related defects in homology-directed repair (HDR) is not well understood. Here, we identify the ubiquitin E3 ligase TRIP12 as negative regulator of PARPi sensitivity. Continue reading

Beacon v2 - Towards Flexible Use and Clinical Applications for a Reference Genomic Data Sharing Protocol

Personalized Health Technologies 2020 ¶

Michael Baudis¶

Beacons provide discovery services for genomic data using the Beacon API developed under the leadership of ELIXIR, as a key driver project of the Global Alliance for Genomics and Health (GA4GH). The Beacon protocol itself defines an open standard for genomics data discovery. It provides a framework for public web services responding to queries against genomic data collections, for instance from population based or disease specific genome repositories. Sites offering beacons can scale through aggregation in "Beacon Networks", which distribute single genome queries among a potentially large number of international beacons and assemble their responses. Continue reading
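The aggregation idea behind "Beacon Networks" can be sketched in a few lines. The example below uses in-memory stand-ins instead of remote services, so the function names, the response fields (`beaconId`, `exists`) and the variant tuple are all hypothetical; a real network would issue HTTP requests and handle timeouts, errors and access control.

```python
from concurrent.futures import ThreadPoolExecutor


def make_stub_beacon(beacon_id, known_variants):
    """Return a callable standing in for one remote beacon (no network I/O)."""
    def query(variant):
        return {"beaconId": beacon_id, "exists": variant in known_variants}
    return query


def query_network(beacons, variant):
    """Send one query to every beacon and assemble the individual answers."""
    with ThreadPoolExecutor(max_workers=len(beacons)) as pool:
        # pool.map keeps the responses in the same order as the beacons.
        responses = list(pool.map(lambda beacon: beacon(variant), beacons))
    return {
        "exists": any(r["exists"] for r in responses),
        "responses": responses,
    }


# Made-up variant (chromosome, position, reference base, alternate base).
variant = ("17", 7674220, "C", "T")
beacons = [
    make_stub_beacon("beacon-a", {variant}),
    make_stub_beacon("beacon-b", set()),
]
result = query_network(beacons, variant)
print(result["exists"], [r["beaconId"] for r in result["responses"] if r["exists"]])
```

The aggregated answer is positive as soon as one member beacon reports a match, while the per-beacon responses remain available for follow-up with the individual resource.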

ELIXIR All Hands - Beacon Evolution

ELIXIR All Hands 2020 - Beacon Workshop¶

Michael Baudis¶

This presentation covers some of Beacon's origins, features and directions.

Oncology Informatics: Status Quo and Outlook - Review

Paul Martin Putora, Michael Baudis, Beth M. Beadle, Issam El Naqa, Frank A. Giordano and Nils H. Nicolay¶

Oncology, 2020-05-14. DOI 10.1159/000507586 (Review)¶

Abstract Oncology has undergone rapid progress, with emerging developments in areas including cancer stem cells, molecularly targeted therapies, genomic analyses, and individually tailored immunotherapy. These advances have expanded the tools available in the fight against cancer. Some of these have seen broad media coverage resulting in justified public attention. However, these achievements have only been possible due to rapid developments in the expanding field of biomedical informatics and information technology (IT). Continue reading

Swissnex SF: Laura & Michael Baudis - Life & Family

Originally planned for their stays at UCB and LBNL, Laura & Michael were interviewed by Tabea Stoeckel from swissnex San Francisco about their stay in the Bay Area and their research & family life as internationally active scientists.

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Bo Gao and Michael Baudis (2020)¶

bioRxiv, 2019-07-31. DOI 10.1101/720854 ¶

Genomics, Volume 112, Issue 5, September 2020, Pages 3331-3341, accepted 2020-05-06 doi.org/10.1016/j.ygeno.2020.05.008.¶

Background¶

Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Continue reading

SchemaBlocks and GA4GH TASC

A brief presentation about SchemaBlocks concepts and its possible integration into the new GA4GH TASC effort.

A harmonized meta-knowledgebase of clinical interpretations of somatic genomic variants in cancer

Alex H. Wagner, Brian Walsh, Georgia Mayfield, David Tamborero, Dmitriy Sonkin, Kilannin Krysiak, Jordi Deu-Pons, Ryan P. Duren, Jianjiong Gao, Julie McMurry, Sara Patterson, Catherine del Vecchio Fitz, Beth A. Pitel, ..., Nuria Lopez-Bigas, Mark Lawler, Jeremy Goecks, Malachi Griffith, Obi L. Griffith, Adam A. Margolin & Variant Interpretation for Cancer Consortium¶

Nature Genetics volume 52, pages 448–457 (2020)¶

Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. Continue reading

Geographic assessment of cancer genome profiling studies

Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)¶

DATABASE, Volume 2020, 2020, baaa009, doi.org/10.1093/database/baaa009 ¶

bioRxiv preprint, 2020-01-11. DOI 10.1101/827683 ¶

Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. Continue reading

SWISSNEX SF Lunch Seminar - Data Mining in Genomics

Genomic Research and Personalised Health Strategies¶

Resources | Standards | Protocols | Tools | Discourse¶

These are the slides of a short presentation, given (virtually, since COVID-19) for a SWISSNEX San Francisco lunch meeting.

Enabling population assignment from cancer genomes with SNP2pop

Huang Q and Baudis M. (2020)¶

Sci Rep 10, 4846 (2020). doi.org/10.1038/s41598-020-61854-x ¶

Abstract In many cancers, incidence, treatment efficacy and overall prognosis vary between geographic populations. Studies disentangling the contributing factors may help in both understanding cancer biology and tailoring therapeutic interventions. Ancestry estimation in such studies should preferably be driven by genomic data, due to frequently missing or erroneous self-reported or inferred metadata. While respective algorithms have been demonstrated for baseline genomes, such a strategy has not been shown for cancer genomes carrying a substantial somatic mutation load. We have developed a bioinformatics tool for the assignment of population groups from genome profiling data for both unaltered and cancer genomes. Continue reading

BBOP Presentation - Baudisgroup Projects & Interests

ELIXIR Beacon Project - Networking Resources Across and Beyond ELIXIR Human Data Communities

ELIXIR Open Day - Wellcome Trust Genome Campus Hinxton¶

Michael Baudis¶

In this presentation I introduce the Beacon project and provide my opinions about its future trajectory, and especially its role in driving the alignment of ELIXIR and GA4GH projects related to (human) genome data sharing.

GA4GH SchemaBlocks for Human Cell Atlas

This is a presentation of the SchemaBlocks initiative and the overall GA4GH context, for the Human Cell Atlas project, given by Michael at one of their teleconferences.

Geographic assessment of cancer genome profiling studies

Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)¶

bioRxiv, 2020-11-01. DOI 10.1101/827683 ¶

Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. Continue reading

Talk at St. Gallen Radiation Oncology - Bioinformatics and Data Exchange

3rd St. Gallen Radiation Oncology Informatics Meeting¶

Bioinformatics and data exchange for genomics in an international context¶

Michael Baudis¶

The presentation at the St. Gallen meeting introduced the audience to the group's research and resources, and how this is connected to the different national & international data standards and sharing initiatives.

Talk at AMED Tokyo - Cancer Genomics and Standards

Mini-Symposium about CNV and Data Standards at AMED Japan, Tokyo ¶

Cancer Genomics and Implementation of Data Driven Standards for Genomic Data Exchange¶

Michael Baudis¶

At this meeting, several Japanese participants presented their research and results, with a focus on Copy Number Variants and other structural genome variations. Continue reading

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

BC2 2019, Basel¶

Bo Gao¶

Abstract¶

Background:
Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Currently, CNV profiles of tumor samples are generated using genomic micro-arrays or high-throughput DNA sequencing. Regardless of the underlying technology, genomic copy number data is derived from the relative assessment and integration of multiple signals, with the data generation process being prone to contamination from several sources. Estimated copy number values have no absolute and linear correlation to their corresponding DNA levels, and the extent of deviation differs between sample profiles which poses a great challenge for data integration and comparison in large scale genome analysis. Continue reading
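One reason why measured values deviate from the underlying copy number can be illustrated with the commonly used mixture model for tumour samples containing normal cells. The sketch below is not the calibration method presented in this work; it only computes the expected log2 ratio of a segment under the simplifying assumptions of a diploid normal genome, a single tumour clone and no technical noise. The function name and parameters are illustrative.

```python
import math


def expected_log2_ratio(tumor_copies, purity, normal_copies=2):
    """Expected log2 ratio for a segment in a tumour/normal cell mixture."""
    # Average copy number over tumour cells (fraction = purity) and normal cells.
    mixed = purity * tumor_copies + (1 - purity) * normal_copies
    return math.log2(mixed / normal_copies)


for purity in (1.0, 0.5):
    gain = expected_log2_ratio(3, purity)  # single-copy gain
    loss = expected_log2_ratio(1, purity)  # single-copy loss
    print(f"purity={purity}: gain={gain:.3f} loss={loss:.3f}")
# purity=1.0: gain=0.585 loss=-1.000
# purity=0.5: gain=0.322 loss=-0.415
```

The same single-copy gain or loss yields different signal levels depending on tumour purity, which is why profiles from different samples cannot be compared on a shared threshold without some form of calibration.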

Cancer cell lines in focus: somatic copy number & germline variation

BC2 2019, Basel¶

Qingyao Huang¶

Abstract¶

Background:
Human cell lines are convenient model systems in cancer research, for validation of proposed molecular mechanisms as well as to evaluate potential therapeutic approaches, e.g. through high-throughput screening of potential anti-tumour compounds against cancer cell line panels. However, conclusions about biological pathways or pharmacological potential depend on a close molecular relation between the cancer type represented and the cell line model used for analyses. Continue reading

Structural Genome Variants in Cancer: Research, resources standards

Seminar at the University of Florence

Seminar Universita degli Studi Firenze - Dipartimento di Biologia¶

Structural Genome Variants in Cancer: Research, resources standards¶

Michael Baudis¶

Abstract¶

Genomic copy number variations are major contributors to malignant transformation and progression and constitute - at least in their quantitative extension - the largest contributors to genomic mutation landscapes, in the majority of cancer types. Such mutations occur in the vast majority of tumors as somatic genome alterations (sCNV) during clonal development and expansion and are promoted by a variety of mechanisms leading to extended or focal changes in the number of genomic segments. Continue reading

Leveraging European infrastructures to access 1 million human genomes by 2022

Gary Saunders, Michael Baudis, Regina Becker, Sergi Beltran, Christophe Béroud, Ewan Birney, Cath Brooksbank, Søren Brunak, Marc Van den Bulcke, Rachel Drysdale, Salvador Capella-Gutierrez, Paul Flicek, ..., Niklas Blomberg, and Serena Scollen¶

Nature Reviews Genetics volume 20, pages 693–701 (2019)¶

Abstract Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Bo Gao and Michael Baudis (2019)¶

bioRxiv, 2019-07-31. DOI 10.1101/720854 ¶

Abstract Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive “somatic” CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Continue reading

ELIXIR All Hands - Beacon Introduction

Michael Baudis¶

This presentation was the opener for the ELIXIR Beacon session, and introduces current developments, especially the interactions between GA4GH :: Discovery and ELIXIR Beacon.

HGVS 2019 - Development of Standards for Genomic Data Exchange

Human Genome Variation Society - Gothenburg 2019¶

Implementation Driven Development of Standards for Genomic Data Exchange from Cancer Genome Data Collections¶

Michael Baudis¶

Abstract¶

Cancers are genomic diseases, arising from the clonal propagation of somatic mutation events, with a limited contribution from inherited genomic variants. Genomic copy number variations are major contributors to malignant transformation and progression and constitute - at least in their quantitative extension - the largest contributors to genomic mutation landscapes, in the majority of cancer types. Continue reading

Connecting the silos - Genomic Data Standards, Resources and the Global Alliance for Genomics and Health

R&D Data Intelligence Leaders Forum Basel¶

Michael Baudis¶

Abstract¶

This presentation discusses the need for data sharing in genomics, provides information about the Global Alliance for Genomics and Health (GA4GH), and shows some of our group's contributions, especially regarding Beacon development.

Federated discovery and sharing of genomic data using Beacons

Miroslav Cupak, Stephen Keenan, Jordi Rambla, Sabela de la Torre, Stephanie Dyke, Anthony Brookes, Knox Carey, David Lloyd, Peter Goodhand, Maximilian Haeussler, Michael Baudis, Heinz Stockinger, Lena Dolman, Ilkka Lappalainen, Juha Törnroos, Mikael Linden, John Spalding, Saif Ur-Rehman, Angela Page, Paul Flicek, Susheel Varma, Gary Saunders, Serena Scollen, Stephen Sherry, David Haussler, Beacon Project Team¶

Nat Biotechnol (2019), accepted 2019-01-23¶

Abstract The Beacon Project (github.com/ga4gh-beacon/) is a GA4GH initiative that is developing an open specification for genetic variation discovery and sharing. The project is demonstrating the willingness of international organizations to work together to define standards for, and actively engage in, genomic data sharing. In the two years since the project’s inception, over 90 Beacons have been lit by 35 organizations serving over 200 datasets. Continue reading

DNA copy number imbalances in primary cutaneous lymphomas (PCL)

Gug G, Huang Q, Chiticariu E, Solovan C and Baudis M (2019)¶

JEADV, 2019-01-19. doi.org/10.1111/jdv.15442 ¶

The article has been published with the Journal of the European Academy of Dermatology and Venereology on January 19, 2019. A corresponding preprint can be accessed through [bioRxiv].

Background

Cutaneous lymphomas (CL) represent a clinically defined group of extranodal non-Hodgkin lymphomas harbouring heterogeneous and incompletely delineated molecular aberrations. Over the past decades, molecular studies have identified several chromosomal aberrations, but the interpretation of individual genomic studies can be challenging.

Objective

With a comprehensive meta-analysis, we aim to delineate genomic alterations for different types of CL and propose a more accurate classification in line with their various pathogenicity. Continue reading

Enabling population assignment from cancer genomes with SNP2pop

Huang Q and Baudis M. (2019)¶

bioRxiv, 2019-01-14. doi.org/10.1101/368647 (first version 2018-07-14)¶

Abstract For a variety of human malignancies, incidence, treatment efficacy and overall prognosis show considerable variation between different populations and ethnic groups. Disentangling the effects related to particular population backgrounds can help in both understanding cancer biology and in tailoring therapeutic interventions. Because self-reported or inferred patient data can be incomplete or misleading due to migration and genomic admixture, a data-driven ancestry estimation should be preferred. While algorithms to analyze ancestry structure from healthy individuals have been developed, an easy-to-use tool to assign population groups based on genotyping data from SNP profiles is still missing and benchmarking for the validity of population assignment strategy for aberrant cancer genomes was not tested. Continue reading

2018 09 18 BIO390 Michael Baudis Introduction to Bioinformatics

UZH BIO390 "Introduction to Bioinformatics"¶

Bioinformatics - Introduction¶

Michael Baudis¶

Abstract¶

First lecture in the UZH BIO390 "Introduction to Bioinformatics" series, introducing concepts and scope of bioinformatics as a field - 2018 version.

ECCB 2018 - Beacon

Abstract: ECCB 2018 ¶

ELIXIR Beacon - A Driver Project for the Global Alliance for Genomics and Health¶

Michael Baudis for the ELIXIR Beacon Project¶

The Global Alliance for Genomics and Health (GA4GH) develops standards and guidelines to facilitate the international sharing of genomic and health related metadata. The creation of GA4GH work stream products is moved forward through driver projects, which address particular scientific, technical, regulatory or security related aspects of data access and sharing. Continue reading

Registered access: authorizing data access

Dyke SOM, Linden M, Lappalainen I, De Argila JR, Carey K, Lloyd D, Spalding JD, Cabili MN, Kerry G, Foreman J, Cutts T, Shabani M, Rodriguez LL, Haeussler M, Walsh B, Jiang X, Wang S, Perrett D, Boughtwood T, ..., Rehm HL, Baudis M, Sherry ST, Kato K, Knoppers BM, Baker D, and Flicek P¶

European Journal of Human Genetics (2018)¶

Abstract The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model—“registered access”—to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.

Mountains and Chasms - Surveying the Oncogenomic Publication Landscape

Carrio Cordo P and Baudis M. (2018)¶

Preprints 2018, 2018070618 (doi: 10.20944/preprints201807.0618.v1).¶

Oncology (2018; online Oct 26)¶

Abstract Cancers arise from the accumulation of somatic genome mutations, with varying contributions of intrinsic (i.e. genetic predisposition) and extrinsic (i.e. environmental) factors. For the understanding of malignant clones, precise information about their genomic composition has to be correlated with morphological, clinical and individual features, in the context of the available medical knowledge. Continue reading

Population assignment from cancer genome profiling data

Huang Q and Baudis M. (2018)¶

bioRxiv, 2018-07-14. doi:10.1101/368647¶

Abstract For a variety of human malignancies, incidence, treatment efficacy and overall prognosis show considerable variation between different populations and ethnic groups. Disentangling the effects related to particular population backgrounds can help in both understanding cancer biology and in tailoring therapeutic interventions. Because self-reported or inferred patient data can be incomplete or misleading due to migration and genomic admixture, a data-driven ancestry estimation should be preferred. While tools to map and utilize ancestry information from healthy individuals have been introduced, a population assignment based on genotyping data from somatic variation profiling of cancer samples is still missing. Continue reading

A harmonized meta-knowledgebase of clinical interpretations of cancer genomic variants

Wagner AH, Walsh B, Mayfield G, Tamborero D, Sonkin D, Krysiak K, Deu Pons J, Duren R, Gao J, McMurry J, Patterson S, Del Vecchio Fitz C, Sezerman OU, Warner J, Rieke DT, Aittokallio T, Cerami E, Ritter D, Schriml LM, Haendel M, Raca G, Madhavan S, Baudis M, ..., Griffith M, Griffith OL, and Margolin A¶

bioRxiv. doi:10.1101/366856¶

Precision oncology relies on the accurate discovery and interpretation of genomic variants to enable individualized therapy selection, diagnosis, or prognosis. However, knowledgebases containing clinical interpretations of somatic cancer variants are highly disparate in interpretation content, structure, and supporting primary literature, reducing consistency and impeding consensus when evaluating variants and their relevance in a clinical setting. Continue reading

Qingyao - Institute Progress Report

IMLS Progress Report

Towards understanding population effect on cancer

Qingyao Huang

Abstract

With a combination of ~50,000 curated oncogenomic array datasets from the arrayMap database and ~20,000 profiles from the TCGA project depository, we perform a meta-analysis to investigate the influence of genetic background on CNV patterns in cancer. From sequencing data of 26 worldwide populations from the 1000 Genomes Project, we extract SNP markers and use them for subsequent sample analysis. Continue reading

The ELIXIR Beacon in 2018: A driver project of GA4GH

ELIXIR All Hands, Berlin

Michael Baudis

Abstract

The core mission of the Global Alliance for Genomics and Health is to "...enable genomic data sharing for the benefit of human health". One of the instruments to enact this mission is the selection and support of driver projects, which address particular scientific, technical, regulatory or security related aspects of federated access to human genomes and related metadata. Continue reading

Latest News & Publications

Congratulations to Rahel!

Rahel Paloots today passed her PhD defence

Opening Cancer Genomics - Deploying the GA4GH Beacon protocol

Zürich Cancer ORD Workshop

The ELIXIR hCNV Community - Making complex genomics accessible

ELIXIR Webcast

Michael Baudis, Antonio Rausell & Krzysztof Poterlowicz

cancercelllines.org listed in Expasy

Entry in the Swiss Institute of Bioinformatics Catalogue

Genomic Data Mining and The Case for Open Data Standards

ZHAW ICLS colloquium

GA4GH Connect April 2024 in Ascona

Spring 2024 GA4GH Connect working meeting co-organized by our group

Genome Data - Opportunities and Risks for Medicine and Society

Lecture at the Seniorenuniversität Winterthur

cancercelllines.org - a Novel Resource for Genomic Variants in Cancer Cell Lines

DATABASE Article

Rahel Paloots and Michael Baudis

Database (Oxford). 2024 Apr 30:2024:baae030. doi: 10.1093/database/baae030

bioRxiv preprint (2023-12-13): https://doi.org/10.1101/2023.12.12.571281

Federated genomic discoveries: Deploying the GA4GH Beacon protocol

Virtual Seminar
GHGA Lecture Series

Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines

Literature-derived annotations as entry point for data exploration

Ellery Smith, Rahel Paloots, Dimitris Giagkos, Michael Baudis and Kurt Stockinger

Bioinformatics Advances, vbae045, doi.org/10.1093/bioadv/vbae045

Previous arXiv preprint (2023-07-03): https://doi.org/10.48550/arXiv.2307.00933

Twelve quick tips for deploying a Beacon

Some hints for Beacon developers & implementers

Lauren A Fromont, Mauricio Moldes, Michael Baudis, Anthony J Brookes, Arcadi Navarro and Jordi Rambla

PLoS Comput Biol. 2024 Mar 1;20(3):e1011817.

Genomic Data Sharing Standard Development with GA4GH and ELIXIR
Opportunities and Pitfalls in Federated Data Discovery

DMLS Lecture Series
University of Zurich Department of Molecular Life Sciences

Structural Genome Variations in Cancer and the Case for Open Data Standards

Cancer Genomics Seminar at Utrecht
Hubrecht Institute and Princess Maxima Center for Pediatric Oncology

labelSeg: segment annotation for tumor copy number alteration profiles

A tool to assign relative SCNA levels to segments

Hangjia Zhao and Michael Baudis

Briefings in Bioinformatics (Oxford). 2024 Jan 31;2024:bbad541.

Beaconize this: Databases for Cancer Genomics and the Development of Open Data Standards

Seminar at the Bioinformatics club of the Centre de Recherche des Cordeliers (CRC)
Université Paris Cité

pgxRpi Accepted by Bioconductor

CNV Project at biohackathon23

Participating in #BioHackEU23 in Barcelona with a CNV reference resource project

Swiss-Korean Life Science Symposium

The 10th Swiss-Korean Life Science Symposium in Seoul

Progenetix as SIB and ELIXIR Resource

Recognizing the Progenetix platform as a Swiss contribution to the European bioinformatics resources ecosystem

Short tandem repeat mutations regulate gene expression in colorectal cancer

Exploring STR patterns and their relation to expression changes in cancer

Max A Verbiest, Oxana Lundström, Feifei Xia, Michael Baudis, Tugce Bilgin Sonay, Maria Anisimova

doi: https://doi.org/10.1101/2023.11.29.569189

ELIXIR All Hands Dublin

Baudisgroup presentations at the AHM 2023 in Dublin

Rahel, Hangjia & Michael for the group

Phenopacket-tools: Building and validating GA4GH Phenopackets

Bioinformatics tools and examples for working with the Phenopackets standard

Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. (2023) Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One. 18:e0285433.

Theoretical Cytogenetics and Oncogenomics

DMLS Tandem Talks

Michael Baudis

ZHAW Visitors for Cancercelllines Hackathon

Implementing INODE-driven literature collections

Candidate targets of copy number deletion events across 17 cancer types

Identifying cancer related genes against the background of somatic CNV events

Huang Q and Baudis M

doi: 10.3389/fgene.2022.1017657

previous bioRxiv (first 2022-06-29), doi.org/10.1101/2022.06.29.498080

Genomic Resource Built with GA4GH Standards

EORTC PAMM Firenze

Michael Baudis

Beacon v2 - Onboarding Strategies & Feature Examples

Beacon Sessions at GA4GH Connect

Michael Baudis

Genomics Data Federation through Global Alliance for Genomics and Health Standards: Development and Implementation of the GA4GH Beacon Protocol

Seminar Yonsei University Medical School Seoul

Michael Baudis

Beacon v2 - Feature-rich Implementation of the Genomic Data Discovery Protocol

Advanced Genetics 2022, 2200016.

Nature Biotechnology. 2022;40:817-820. PMID:35705716

Human Mutation. 2022 Mar 17. PMID:35297548

medRxiv, 2021.11.27.21266944. doi:10.1101/2021.11.27.21266944

Wagner, Alex H. et al. Cell Genomics, Volume 1, Issue 2, 100027 doi:10.1016/j.xgen.2021.100027

Thorogood, Adrian et al. Cell Genomics, Volume 1, Issue 2, 100032 doi:10.1016/j.xgen.2021.100032

Rehm, Heidi L. et al. Cell Genomics, Volume 1, Issue 2, 100029 doi:10.1016/j.xgen.2021.100029

The GA4GH Beacon Protocol Presented at BC² Basel 2021
Session "Federating computational analyses with GA4GH standards"

ELIXIR All Hands 2021 Human Data Day