Welcome to the baudisgroup Pages
The baudisgroup website presents projects and information by the Computational Oncogenomics Group of the University of Zurich (UZH) and the Swiss Institute of Bioinformatics (SIB). For visitors more interested in Particle Astrophysics, we strongly recommend the website of another, although related, Professor Baudis.
The Computational Oncogenomics Group's research focus lies in the exploration of structural genome variations in cancer. Our work centres around our Progenetix resource of curated molecular-cytogenetic and sequencing data. Specific projects explore computational methods, genomics of selected tumour entities and genomic variant patterns across malignancies. As members of the Global Alliance for Genomics and Health, the group is developing standards in biocuration and data sharing for genomic variants and phenotypic data, for instance in driving development of the ELIXIR Beacon project. Other research is related to genome data epistemology, e.g. geographic and diagnostic sampling biases in cancer studies.
Latest News & Publications
Upcoming: Data Resources, Sharing, Discovery in Biomedical Genetics and Cancer Genomics
Seminar at National Cancer Center Hospital Tsukiji Campus
In this seminar you will learn about Progenetix, a large open resource in cancer genomics focussing on curation and sharing of somatic copy number variation (CNV) data, as well as other resources and research topics of the Theoretical Cytogenetics and Oncogenomics group at the University of Zurich and the Swiss Institute of Bioinformatics.
The second part of the presentation will make a case why such resources cannot be sufficient and introduce the approaches shepherded by the Global Alliance for Genomics and Health (GA4GH) Continue reading
Global Alliance for Genomics and Health - Promoting a New Paradigm for Data Discovery in Biomedical Genomics
Seminar at Tohoku University Sendai
The Global Alliance for Genomics and Health (GA4GH) has developed into a major international organization to address the challenges of data discovery in biomedical genomics. The Beacon standard developed by GA4GH with technical support by the European bioinformatics organization ELIXIR and many international stakeholders has become a default for implementing modern concepts of federated data discovery and access. Continue reading
Data Discovery in Biomedical Genomics - Time for a New Paradigm
The XIVth MEDICAL GENETICS TÂRGU MUREȘ
SOCIETATEA ROMÂNĂ DE GENETICĂ MEDICALĂ
In biomedical genetics and genomics, the last decades have led to the recognition that the vast amount and diversity of genomic variations require large-scale approaches to data analysis in both rare disease genetics and cancer genomics. While projects establishing high quality data from large cohorts are highly valuable and necessary to drive the field, the true power of genomics can only be leveraged through access to data from different studies and cohorts beyond geographic or regulatory boundaries.
Since its foundation in 2013 the Global Alliance for Genomics and Health (GA4GH) has emerged as the main international organization to address the challenges of data discovery, access and utilization in genomics and related fields. The Beacon standard developed by GA4GH with technical support by the European bioinformatics organization ELIXIR and many international stakeholders has become a default for implementing modern concepts of federated data discovery and access.
This presentation will provide an introduction to data federation in biomedical genomics with emphasis on the use of the Beacon protocol for genomic data discovery, together with a general introduction to the principles and mechanisms of genomics standards development by GA4GH.
Continue reading
Congratulations to Rahel!
Rahel Paloots today passed her PhD defence
Congratulations to Rahel Paloots for passing her defence for a PhD with the topic
Molecular Heterogeneity among Cancer Cell Lines and their Correspondence to Primary Neoplasias
Rahel has been a contributor to progenetix.org and the main developer behind cancercelllines.org (a resource for, well, cancer cell lines…) - but also to the GA4GH & ELIXIR Beacon project.
Continue reading
Opening Cancer Genomics - Deploying the GA4GH Beacon protocol
Zürich Cancer ORD Workshop
The "Moving ORD practices into cancer care" project explores and tests ORD principles in the context of cancer care, through a combination of engagement with relevant communities at the UZH as well as a technical demonstrator of a cutting-edge data discovery technology. The cancer community at UZH is being mobilized through a dedicated workshop and topical meetings, to discuss challenges and opportunities of ORD in the oncology domain. A pilot installation of an established ORD approach (Beacon technology by the Global Alliance for Genomics and Health) is being implemented to test the sharing of cancer related -omics and associated data, with a focus on the specific data access and security aspects that arise in this context.
Continue reading
The ELIXIR hCNV Community - Making complex genomics accessible
ELIXIR Webcast
Michael Baudis, Antonio Rausell & Krzysztof Poterlowicz
Abstract Genomic copy number variants (CNV) are a major contributor to human genome variation and important factors in rare disease genetics and cancer genomics. However, the complexity of CNV detection technologies, the lack of standardised annotation formats and the fragmentation of cytogenetic and genomic communities have so far limited large scale utilization of CNV profiles in computational genomics. Continue reading
cancercelllines.org listed in Expasy
Entry in the Swiss Institute of Bioinformatics Catalogue
Our recently launched cancer cell line genomics site cancercelllines.org is now listed as one of the resources in the Swiss Institute of Bioinformatics’ Expasy catalogue.
Continue reading
Genomic Data Mining and The Case for Open Data Standards
ZHAW ICLS colloquium
The last decades have seen an enormous amount of data generation in the area of biomedical genomics, driven by a feedback loop of technological developments and increasing recognition of human genomic complexities, in health and disease. While individual projects frequently are tailored towards addressing specific research questions, the approach to many problems especially in rare diseases and cancer genomics requires both curated data collections and the access to data from international repositories and study centers.
Here I will present our oncogenomic data resources and research projects, and how our work on data discovery and exchange standards as part of an international community can improve data accessibility through the support of federated discovery and retrieval protocols.
Continue reading
GA4GH Connect April 2024 in Ascona
Spring 2024 GA4GH Connect working meeting co-organized by our group
We're proud to host the next Spring GA4GH Connect meeting in April 2024 at the Congressi Stefano Franscini on the Monte Verità in Ascona. This will provide an excellent opportunity for Swiss genomics and bioinformatics to, well, connect with the international "genomics and health" community and projects.
Continue reading
Genome Data - Opportunities and Risks for Medicine and Society
Lecture at the Seniorenuniversität Winterthur
In medical research, genome data are used for more precise diagnoses and for selecting personalized therapies. Genome data are also used outside of medicine, for example in genealogical research or in forensics. This potential, however, is accompanied by risks.
When genomic data are collected from millions of people, who should protect the individual and society from the misuse of such data? The lecture addresses questions regarding the need for such data in medicine and research, and the risks that may arise from it.
Why do we need genomic data from millions of individuals? How can my data pose a danger to me? And who should protect me as an individual from misuse?
Continue reading
cancercelllines.org - a Novel Resource for Genomic Variants in Cancer Cell Lines
DATABASE Article
Rahel Paloots and Michael Baudis
Database (Oxford). 2024 Apr 30:2024:baae030. doi: 10.1093/database/baae030
bioRxiv preprint (2023-12-13): https://doi.org/10.1101/2023.12.12.571281
Abstract: Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource - cancercelllines.org - with the aim to enable easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variants (SNVs) data. We have gathered over 5,600 copy number profiles as well as SNV annotations for 16,000 cell lines and provide this data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 API and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme.
Availability and Implementation: Our resource is publicly available on the web at cancercelllines.org.
Continue reading
Federated genomic discoveries: Deploying the GA4GH Beacon protocol
Virtual Seminar
GHGA Lecture Series
With the ever increasing amount of genomic data produced in the context of research studies, population analyses and medical diagnostics the need for access to genomic information beyond administrative or geographic boundaries has become a matter of eminent importance. Continue reading
Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines
Literature-derived annotations as entry point for data exploration
Ellery Smith, Rahel Paloots, Dimitris Giagkos, Michael Baudis and Kurt Stockinger
Bioinformatics Advances, vbae045, doi.org/10.1093/bioadv/vbae045
Previous arXiv preprint (2023-07-03): https://doi.org/10.48550/arXiv.2307.00933
Motivation: With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume (Lubowitz et al., 2021). As a consequence, in the fields of biological, medical and clinical research, domain experts have to sift through massive amounts of scientific text to find relevant information. However, this process is extremely tedious and slow to be performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction.
Results: In this work, we present the design, implementation and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data in the domain of cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variants plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidences, allowing for deep, yet rapid, literature search, using existing structured data as a springboard.
Availability and Implementation: Our system is publicly available on the web at cancercelllines.org.
Contact: The authors can be contacted at ellery.smith@zhaw.ch or rahel.paloots@uzh.ch.
Continue reading
Twelve quick tips for deploying a Beacon
Some hints for Beacon developers & implementers
Lauren A Fromont, Mauricio Moldes, Michael Baudis, Anthony J Brookes, Arcadi Navarro and Jordi Rambla
PLoS Comput Biol. 2024 Mar 1;20(3):e1011817.
- doi: 10.1371/journal.pcbi.1011817.
- PMID: 38427629
Introduction: In the age of data-driven biomedical research and clinical practice, the sharing of genomic and clinical data for health research and personalized medicine has become an important contributor to improved diagnosis and treatment. From the data owner’s perspective, potential benefits include improved treatments, personalization of healthcare practice, and more effective control of disease proliferation. However, the requirement for high levels of data security to protect sensitive information presents a barrier to data discovery and sharing.
Beacon is designed to enable the benefits of data discovery while minimizing the associated risks...
Continue reading
Genomic Data Sharing Standard Development with GA4GH and ELIXIR
Opportunities and Pitfalls in Federated Data Discovery
DMLS Lecture Series
University of Zurich Department of Molecular Life Sciences
In this presentation Michael talks about the role of GA4GH (and ELIXIR) in the development of standards and practices for genomic data exchange, some general principles, how his group got involved in these efforts - but also some pitfalls ...
Continue reading
Structural Genome Variations in Cancer and the Case for Open Data Standards
Cancer Genomics Seminar at Utrecht
Hubrecht Institute and Princess Maxima Center for Pediatric Oncology
The presentation includes notes about work on improving the representation of genomic copy number variations (CNV), GA4GH and its Beacon protocol, as well as challenges in genomic data privacy.
Continue reading
labelSeg: segment annotation for tumor copy number alteration profiles
A tool to assign relative SCNA levels to segments
Hangjia Zhao and Michael Baudis
Briefings in Bioinformatics (Oxford). 2024 Jan 31;2024:bbad541.
- doi: 10.1093/bib/bbad541
- PMID: 38300514
- bioRxiv. doi: 10.1101/2023.05.17.541097
Abstract Somatic copy number alterations (SCNAs) are a predominant type of oncogenomic alterations that affect a large proportion of the genome in the majority of cancer samples. Current technologies allow high-throughput measurement of such copy number aberrations, generating results consisting of frequently large sets of SCNA segments. However, the automated annotation and integration of such data are particularly challenging because the measured signals reflect biased, relative copy number ratios. In this study, we introduce labelSeg, an algorithm designed for rapid and accurate annotation of CNA segments, with the aim of enhancing the interpretation of tumor SCNA profiles. Continue reading
Beaconize this: Databases for Cancer Genomics and the Development of Open Data Standards
Seminar at the Bioinformatics club of the Centre de Recherche des Cordeliers (CRC)
Université Paris Cité
In this seminar at the Centre de Recherche des Cordeliers in Paris Michael presents the work of the group, with special emphasis on the role of the Progenetix oncogenomics resources and tools in the development, implementation and testing of the Beacon standard of the Global Alliance for Genomics and Health (GA4GH).
Continue reading
pgxRpi Accepted by Bioconductor
The R wrapper for the Progenetix API, pgxRpi, is now part of the 'devel' branch and will be included in the upcoming 3.19 release in mid-April 2024.
Continue reading
CNV Project at biohackathon23
Participating at #BioHackEU23 in Barcelona with a CNV reference resource project
With other members of the hCNV community some of us will participate at this year's
Biohackathon Europe event. The main project will address the creation of the template
for a "beaconized" public resource for reference (i.e. not disease associated)
copy number variation data, including the necessary tooling for the import from
e.g. VCF or BED file variants into Beacon backends (such as our bycon
environment).
Swiss-Korean Life Science Symposium
The 10th Swiss-Korean Life Science Symposium in Seoul
As a representative of the Swiss delegation, and particularly of the University of Zurich (UZH), Michael will be a presenter and panel discussion participant at the 10th Swiss-Korean Life Science Symposium in Seoul, together with members of the Swiss and Korean life sciences and personalized health academic and industrial communities.
Continue reading
Progenetix as SIB and ELIXIR Resource
Recognizing the Progenetix platform as Swiss contribution to the European bioinformatics resources ecosystem
The Progenetix resource has finally been recognized as an official contribution to the ELIXIR European bioinformatics ecosystem. Besides Expasy, Progenetix is now linked through ELIXIR's resource page. Or just go directly to progenetix.org (and its daughter project cancercelllines.org).
Continue reading
Short tandem repeat mutations regulate gene expression in colorectal cancer
Exploring STR patterns and their relation to expression changes in cancer
Max A Verbiest, Oxana Lundström, Feifei Xia, Michael Baudis, Tugce Bilgin Sonay, Maria Anisimova
doi: https://doi.org/10.1101/2023.11.29.569189
Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression levels to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. The increased mutability of eSTRs in MSI tumours may be an early indication that eSTR mutations can confer a selective advantage to tumours. Future extensions of our findings into larger cohorts could uncover new STR-based targets in the treatment of cancer.
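The linear relationship between eSTR length and expression that the abstract describes can be illustrated with an ordinary least-squares fit. The following is a minimal sketch, not the authors' pipeline: the names `LinearFit` and `fit_estr_model` and all numbers are hypothetical, and a model with a single predictor (repeat length) is assumed.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LinearFit:
    """Hypothetical container for a one-predictor linear model."""
    slope: float
    intercept: float

    def predict(self, repeat_length: float) -> float:
        # Expected expression level at a given STR length.
        return self.intercept + self.slope * repeat_length

    def expression_change(self, old_length: float, new_length: float) -> float:
        # Predicted change in expression when the STR length mutates.
        return self.slope * (new_length - old_length)


def fit_estr_model(lengths: Sequence[float], expression: Sequence[float]) -> LinearFit:
    """Ordinary least-squares fit of expression against STR length."""
    if len(lengths) != len(expression) or len(lengths) < 2:
        raise ValueError("need at least two paired observations")
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("all STR lengths are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, expression))
    slope = sxy / sxx
    return LinearFit(slope=slope, intercept=mean_y - slope * mean_x)


if __name__ == "__main__":
    # Made-up repeat lengths (in units) and expression values for one locus.
    lengths = [10, 11, 12, 13, 14]
    expression = [5.1, 5.6, 6.2, 6.4, 7.0]
    model = fit_estr_model(lengths, expression)
    print("slope:", model.slope)
    print("predicted change for a 2-unit contraction:",
          model.expression_change(old_length=13, new_length=11))
```

In this toy setting the slope is the predicted expression change per repeat unit, so the effect of a mutation is the slope multiplied by the change in length.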
Continue reading
ELIXIR All Hands Dublin
Baudisgroup presentations at the AHM 2023 in Dublin
Rahel, Hangjia & Michael for the group
At the ELIXIR All Hands Meeting 2023 in Dublin our group presented several posters about our resources and work in standards development.
Continue reading
Phenopacket-tools: Building and validating GA4GH Phenopackets
Bioinformatics tools and examples for working with the Phenopackets standard
Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. (2023) Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One. 18:e0285433.
Abstract The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Continue reading
Theoretical Cytogenetics and Oncogenomics
DMLS Tandem Talks
Michael Baudis
In this short presentation Michael provides an overview of the group's work in cancer genomics resources, data analysis and standard development, including the involvement in the Global Alliance for Genomics and Health GA4GH.
- Cancer Genome Profiles
- Oncogenomic Data Resources
- Bioinformatics Methods
- Data Exchange Standards for Genomics and Personalized Health
ZHAW Visitors for Cancercelllines Hackathon
Implementing INODE-driven literature collections
For some months our group has worked with members of Kurt Stockinger's team from the ZHAW on a cancer cell lines use case for the INODE project. In the last 2 days we had a site visit for a first implementation of the use case specific system on cancercelllines.org. More information to follow - and thanks to Ellery & Dimitris for the great work!
Continue reading
Candidate targets of copy number deletion events across 17 cancer types
Identifying cancer related genes against the background of somatic CNV events
Huang Q and Baudis M
doi: 10.3389/fgene.2022.1017657
previous bioRxiv (first 2022-06-29), doi.org/10.1101/2022.06.29.498080
Abstract Genome variation is the direct cause of cancer and driver of its clonal evolution. While the impact of many point mutations can be evaluated through their modification of individual genomic elements, even a single copy number aberration (CNA) may encompass hundreds of genes and therefore pose challenges to untangle potentially complex functional effects. However, consistent, recurring and disease-specific patterns in the genome-wide CNA landscape imply that particular CNA may promote cancer-type-specific characteristics. Discerning essential cancer-promoting alterations from the inherent co-dependency in CNA would improve the understanding of mechanisms of CNA and provide new insights into cancer biology and potential therapeutic targets. Continue reading
Genomic Resource Built with GA4GH Standards
EORTC PAMM Firenze
Michael Baudis
This brief presentation introduces the Progenetix resource, the Global Alliance for Genomics and Health as a developer of standards for data sharing in biomedical genomics, as well as the use of Progenetix in GA4GH standards development.
Continue reading
Beacon v2 - Onboarding Strategies & Feature Examples
Beacon Sessions at GA4GH Connect
Michael Baudis
The Beacon Sessions at GA4GH Connect November 2022 targeted the migration of existing and implementation of new v2 Beacons, with emphasis on the "how to get there easily" rather than on all Beacon v2 features. Continue reading
Genomics Data Federation through Global Alliance for Genomics and Health Standards: Development and Implementation of the GA4GH Beacon Protocol
Seminar Yonsei University Medical School Seoul
Michael Baudis
In this Seoul meeting presentation Michael introduces the Global Alliance for Genomics and Health and its involvement in genomics standards development, followed by a discussion of the Beacon protocol and the role of the Progenetix resource in its development. Continue reading
Beacon v2 - Feature-rich Implementation of the Genomic Data Discovery Protocol
GA4GH 2022 Plenary Barcelona
Michael Baudis
The “Beacon” protocol - developed with support from ELIXIR, the European bioinformatics infrastructure organization, as a standard of the Global Alliance for Genomics and Health (GA4GH) - represents an emerging standard for an “Internet for Genomics”. While the initial version of the protocol served as a widely adopted test bed for the sharing of genomic variants over federated query systems connecting hundreds of internationally distributed resources, the version 2 of the protocol provides a framework for extended, metadata-rich query and response options in both public and restricted federated access scenarios. Continue reading
GA4GH Phenopackets: A Practical Introduction
Phenopackets v2 introduction with practical examples
Ladewig MS, Jacobsen JO, Wagner AH, Danis D, Kassaby BE, Gargano M, Groza T, Baudis M, Steinhaus R, Seelow D, Bechrakis NE, Mungall CJ, Schofield PN, Elemento O, Smith L, McMurry JA, Munoz-Torres M, Haendel MA and Robinson PN
Advanced Genetics 2022, 2200016. LINK
Abstract The Global Alliance for Genomics and Health (GA4GH) is developing a suite of coordinated standards for genomics for healthcare. The Phenopacket is a new GA4GH standard for sharing disease and phenotype information that characterizes an individual person, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments. A detailed example is presented that illustrates how to use the schema to represent the clinical course of a patient with retinoblastoma, including demographic information, the clinical diagnosis, phenotypic features and clinical measurements, an examination of the extirpated tumor, therapies, and the results of genomic analysis. The Phenopacket Schema, together with other GA4GH data and technical standards, will enable data exchange and provide a foundation for the computational analysis of disease and phenotype information to improve our ability to diagnose and conduct research on all types of disorders, including cancer and rare diseases.
The Phenopacket software is available at github.com/phenopackets/.
Continue reading
Beacon - Ethical & Legal Aspects of a Genomic Data Discovery Protocol
DSI Ethics Project Pitch
Michael Baudis
Here Michael provides a very brief presentation about the GA4GH Beacon protocol, especially as a target for projects discussing the ethical implications of genome data discovery & sharing as well as the relevant legal frameworks, with emphasis on the Swiss context. Continue reading
A cancer genomics resource built on GA4GH standards
Rahel Paloots, Michael Baudis
CGC St Louis 2022
Progenetix is a cancer genomics resource that includes genomic profiling data as well as biomedical annotations and provenance data for cancer studies. The main goal of the Progenetix database is to provide easy, open access for research studies and clinical diagnostics. To facilitate sharing of genomic data, Progenetix complies with and contributes to GA4GH and Beacon data standards. Beacon, developed with the support from ELIXIR (the European bioinformatics infrastructure organization), started out as a protocol to share genomic variants over federated queries.
Continue reading
A cancer genomics reference resource powered by GA4GH standards
Roche Data Science Seminar
Michael Baudis
The presentation reports about the Progenetix cancer genomics resource and its role in the GA4GH ecosystem & the Beacon genomics API development process.
Continue reading
The GA4GH Phenopacket schema defines a computable representation of clinical data
Phenopackets v2 publication
Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Buske OJ, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Essenwanger A, Freimuth RR, ... , Haendel MA, Robinson PN, The GAGHPMC.
Nature Biotechnology. 2022;40:817-820. LINK | PMID:35705716
Abstract Despite great strides made in the development and wide acceptance of standards for exchanging structured information about genomic variants, progress in standards for computational phenotype analysis for translational genomics has lagged behind. Phenotypic features (signs, symptoms, laboratory and imaging findings, results of physiological tests, etc.) are of high clinical importance, yet exchanging them in conjunction with genomic variation information is often overlooked or even neglected. Continue reading
Implementation of the GA4GH Beacon protocol for discovery and sharing of genomic copy number variation data
ESHG Vienna 2022
Michael Baudis
Background & Objectives Genomic copy number variations (CNV) are a major contributor to inter-individual genomic variation, can be causative events in rare diseases, but especially represent the majority of the mutational landscape in most malignancies. While specific CNV events and some recurring patterns have contributed to the identification of individual cancer drivers and the recognition of cancer subtypes, the complexity of genomic CNV patterns requires large amounts of well-defined genomic profiles for statistically meaningful analyses. At the other end of the spectrum, in the area of rare disease genomics the potential pathogenicity of individual CNV events requires validation against a vast set of disease-related and reference genomic profiles and annotations.
Continue reading
Progenetix & BeaconPlus - An open cancer genomics resource on a stack of Beacon code...
ELIXIR All Hands Amsterdam 2022
Michael Baudis
Here Michael provides some overview of the multi-year trajectory of the Beacon API development, and how BeaconPlus & Progenetix have been utilized for "implementation driven design".
Continue reading
Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond
Beacon v2 publication
Rambla J, Baudis M, Ariosa R, Beck T, Fromont LA, Navarro A, Paloots R, Rueda M, Saunders G, Singh B, Spalding JD.
Human Mutation. 2022 Mar 17. PMID:35297548
Abstract Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data, through providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers.
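The yes/no query model of Beacon version 1 can be sketched with a toy in-memory lookup. This is only an illustration under simplifying assumptions: `Variant`, `ToyBeacon` and their fields are hypothetical names chosen for this example and do not reproduce the actual Beacon request or response schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant key (not the Beacon schema)."""
    reference_name: str    # chromosome label, e.g. "17"
    start: int             # position; the coordinate convention is an assumption
    reference_bases: str
    alternate_bases: str


class ToyBeacon:
    """Answers only whether a variant is present in any hosted dataset."""

    def __init__(self, datasets: Dict[str, Iterable[Variant]]) -> None:
        # Store each dataset as a set for constant-time membership tests.
        self._datasets: Dict[str, Set[Variant]] = {
            name: set(variants) for name, variants in datasets.items()
        }

    def exists(self, variant: Variant) -> bool:
        # The caller learns "yes" or "no", never which dataset or sample matched.
        return any(variant in variants for variants in self._datasets.values())


if __name__ == "__main__":
    beacon = ToyBeacon({
        "cohort_a": [Variant("17", 7_674_220, "G", "A")],
        "cohort_b": [Variant("9", 21_971_000, "C", "T")],
    })
    print(beacon.exists(Variant("17", 7_674_220, "G", "A")))
    print(beacon.exists(Variant("17", 7_674_220, "G", "T")))
```

Returning a single boolean keeps the disclosed information minimal, which is the design idea behind the "yes" or "no" responses described above.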
Continue reading
hCNV Implementation Studies Old and New
ELIXIR Human Data Communities
Michael Baudis
This presentation provided an overview about the hCNV community, implementation studies and ongoing work, e.g. interaction with the GA4GH VRS standard group and Beacon development.
Continue reading
Technical, legal and ethics aspects of genomic data sharing
Genomes | Privacy | Laws | Society - DSI Ethics Seminar
Michael Baudis
The presentation introduces the need for sharing and federated discovery of genome data in the contexts of personalized health and genomic research and some of the current developments in international standards and practices in the area. Continue reading
The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine
Phenopackets v2 preprint
Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Freimuth RR, ..., Haendel MA, Robinson PN.
medRxiv, 2021.11.27.21266944. doi:10.1101/2021.11.27.21266944
Abstract Despite great strides in the development and wide acceptance of standards for exchanging structured information about genomic variants, there is no corresponding standard for exchanging phenotypic data, and this has impeded the sharing of phenotypic information for computational analysis. Here, we introduce the Global Alliance for Genomics and Health (GA4GH) Phenopacket schema, which supports exchange of computable longitudinal case-level phenotypic information for diagnosis and research of all types of disease including Mendelian and complex genetic diseases, cancer, and infectious diseases. Continue reading
The GA4GH Variation Representation Specification (VRS): a Computational Framework for the Precise Representation and Federated Identification of Molecular Variation.
Alex H. Wagner, Lawrence Babb, Gil Alterovitz, Michael Baudis, Matthew Brush, Daniel L. Cameron, Melissa Cline, Malachi Griffith, Obi L. Griffith, ..., Melissa Konopko, Heidi L. Rehm, Andrew D. Yates, Robert R. Freimuth, Reece K. Hart
Wagner, Alex H. et al. Cell Genomics, Volume 1, Issue 2, 100027 doi:10.1016/j.xgen.2021.100027
bioRxiv. 2021.01.15.426843 (2021-01-15)
Note
This article was published as part of a special GA4GH edition of Cell Genomics.
Abstract Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced “verse”), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. Continue reading
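The idea of a computed identifier can be illustrated with a short sketch. This is not the VRS reference implementation: the sorted-key JSON serialization, the helper names and the example allele fields are assumptions made for illustration only. The digest step (SHA-512, truncated to 24 bytes, base64url-encoded) follows the truncated-digest scheme used by GA4GH, but the exact serialization rules should be taken from the specification itself.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """Base64url-encode the first 24 bytes of the SHA-512 digest of blob."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def computed_identifier(obj: dict, prefix: str = "ga4gh:VA.") -> str:
    """Derive an identifier from an object's content alone (hypothetical helper).

    Assumption: sorted-key, whitespace-free JSON stands in for the
    canonical serialization that the real specification defines.
    """
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return prefix + sha512t24u(blob)


# Hypothetical allele descriptions; the field names are illustrative only.
allele_a = {"sequence": "chr7", "start": 140753335, "end": 140753336, "alt": "T"}
allele_b = {"alt": "T", "end": 140753336, "start": 140753335, "sequence": "chr7"}

# The same content yields the same identifier, regardless of key order
# or of which system computed it.
id_a = computed_identifier(allele_a)
id_b = computed_identifier(allele_b)
```

Because the identifier depends only on the content, two independent resources can arrive at the same identifier without a central registry, which is the property that makes federated identification possible.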
International federation of genomic medicine databases using GA4GH standards
Adrian Thorogood, Heidi L. Rehm, Peter Goodhand, Angela J.H. Page, Yann Joly, Michael Baudis, Jordi Rambla, Arcadi Navarro, Tommi H. Nyronen, Mikael Linden, Edward S. Dove, Marc Fiume, Michael Brudno, Melissa S. Cline, Ewan Birney¶
Thorogood, Adrian et al. Cell Genomics, Volume 1, Issue 2, 100032 doi:10.1016/j.xgen.2021.100032¶
Note¶
This article was published as part of a special GA4GH edition of Cell Genomics.
Abstract We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.
Continue reading
GA4GH: International policies and standards for data sharing across genomic research and healthcare
Heidi L. Rehm, Angela J.H. Page, Lindsay Smith, Jeremy B. Adams, Gil Alterovitz, Lawrence J. Babb, Maxmillian P. Barkley, Michael Baudis, Michael J.S. Beauvais, Tim Beck, Jacques S. Beckmann, Sergi Beltran, David Bernick, Alexander Bernier, James K. Bonfield, Tiffany F. Boughtwood, Guillaume Bourque, Sarion R. Bowers, Anthony J. Brookes, Michael Brudno, Matthew H. Brush, David Bujold, Tony Burdett, Orion J. Buske, Moran N. Cabili, Daniel L. Cameron, Robert J. Carroll, Esmeralda Casas-Silva, Debyani Chakravarty, Bimal P. Chaudhari, Shu Hui Chen, J. Michael Cherry, Justina Chung, Melissa Cline, Hayley L. Clissold, Robert M. Cook-Deegan, Mélanie Courtot, ..., Peter Goodhand, Kathryn North, Ewan Birney¶
Rehm, Heidi L. et al. Cell Genomics, Volume 1, Issue 2, 100029 doi:10.1016/j.xgen.2021.100029¶
Note¶
This article was published as part of a special GA4GH edition of Cell Genomics.
Abstract The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. Continue reading
A cancer genomics resource built around and driving GA4GH standards
GRIC sponsored workshop with the Swiss Institute of Bioinformatics
Michael Baudis¶
The Progenetix oncogenomics resource provides sample-specific cancer genome profiling data and biomedical annotations as well as provenance data for cancer studies. With more than 100k genomic copy number variation (CNV) profiles from over 700 cancer types, Progenetix empowers comparative analyses beyond individual studies and diagnostic concepts.
Continue reading
A Standardized Format for Federated Genomic Data Exchange
The GA4GH Beacon Protocol Presented at BC2 Basel 2021
Session "Federating computational analyses with GA4GH standards"
Michael Baudis¶
During the "Federating computational analyses with GA4GH standards" workshop at BC2 2021 Michael presented the history and current status of the Beacon project, as well as its integration with specific data resources and analysis initiatives.
Continue reading
Cancer genomics reference resource and toolkit around GA4GH standards
ESHG 2021
Q. Huang, B. Gao, R. Paloots, P. Carrio-Cordo, Z. Yang, M. Baudis¶
This poster presentation at the European Society of Human Genetics meeting 2021 discusses the integration and development of GA4GH standards by the Progenetix oncogenomics resource.
Continue reading
Progenetix - An open reference resource for copy number variation data in cancer
Qingyao Huang¶
Continue reading
The Progenetix oncogenomic resource in 2021
Article describing the current content & technical status of progenetix.org
Qingyao Huang, Paula Carrio Cordo, Bo Gao, Rahel Paloots, Michael Baudis¶
Database (Oxford). 2021 Jul 17;2021:baab043.¶
- doi: 10.1093/database/baab043.
- PMID: 34272855
- PMCID: PMC8285936.
- bioRxiv. doi: 10.1101/2021.02.15.428237
Abstract In cancer, copy number aberrations (CNAs) represent a type of nearly ubiquitous and frequently extensive structural genome variations. To disentangle the molecular mechanisms underlying tumorigenesis as well as identify and characterize molecular subtypes, the comparative and meta-analysis of large genomic variant collections can be of immense importance. Over the last decades, cancer genomic profiling projects have resulted in a large amount of somatic genome variation profiles, however segregated in a multitude of individual studies and datasets. The Progenetix project, initiated in 2001, curates individual cancer CNA profiles and associated metadata from published oncogenomic studies and data repositories with the aim to empower integrative analyses spanning all different cancer biologies. Continue reading
hCNV Community and Implementation Studies
Michael Baudis¶
ELIXIR All Hands 2021 Human Data Day¶
At the Human Data Day Michael presents a very brief overview of the ending and upcoming ELIXIR hCNV implementation studies.
Continue reading
hCNV data and the Progenetix Beacon
Presentation at ELIXIR All Hands 2021
Michael Baudis¶
ELIXIR All Hands 2021¶
This presentation gives a brief overview of the use of the Progenetix resource to test and implement a genomics reference resource using the emerging Beacon v2 protocol.
Continue reading
Signatures of Discriminative CNA in 31 Cancer Subtypes
Bo Gao and Michael Baudis (2021)¶
Published at Frontiers in Genetics, 2021-05-13¶
Abstract¶
Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles pose great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have been focused on the association of CNA to pre-selected "driver" genes with limited application to rare drivers and other genomic elements.
Continue reading
Progenetix, Beacon and GA4GH at RDA
Research Data Alliance - RDA Virtual Plenary 17
Concepts | Status | History | Outlook¶
Michael Baudis¶
Research Data Alliance - RDA Virtual Plenary 17¶
This seminar gives an overview of the current state of the Progenetix Beacon project and the overall connection to the Global Alliance for Genomics and Health (GA4GH).
Continue reading
Implementing GA4GH Standards to Drive an Open Oncogenomics Resource
Research Seminar Kinderspital Zürich - Neuroonkologie
Michael Baudis¶
Seminar Neurooncology Children's Hospital Zürich¶
This seminar gives an overview of the history & current state of the Progenetix resource, its role in Beacon API development and the overall connection to the Global Alliance for Genomics and Health (GA4GH).
Continue reading
Discovering copy number variation across multiple cancer types
Qingyao Huang¶
Abstract
Genomic variations are a direct cause of tumor formation and an accomplice in its continuous evolution. While point mutations can be pinpointed to a targeted genetic element, copy number variations (CNVs) involve copy number gain or loss of a large DNA segment which often covers hundreds of genetic elements in one event. Continue reading
EACR conference - The Progenetix Oncogenomic Resource
Continue reading
GA4GH Connect - Beacon v2 and SchemaBlocks
GA4GH Connect 2020¶
Michael Baudis¶
Beacon v2 Structural Variants [slides]¶
SchemaBlocks {S}[B] [slides]¶
Continue reading
Copy number variant heterogeneity among cancer types reflects inconsistent concordance with diagnostic classifications
Paula Carrio Cordo and Michael Baudis¶
bioRxiv. doi: 10.1101/2021.03.01.433348¶
This article explores the correlation between subsets of cancer entities, grouped by their somatic CNV patterns, and levels of diagnostic classification systems.
Continue reading
Genomic data and Privacy
Michael Baudis¶
ETHZ Lecture¶
The understanding of the impact of individual inherited and somatic genome variants on phenotypes and diseases requires a thorough understanding of the occurrence of such variants amongst populations in general and carriers of the phenotypes and diseases in particular. This information can only be provided through the inclusion of data from a multitude of genome resources in variant evaluation efforts, including such from outside (international) jurisdictions. However, opening such resources carries the inherent risk of breaching privacy, particularly through re-identification of individuals or their relatives and potentially through the exposure of individual genome-related personal information including phenotypic and "performance" prediction and relative disease risk.
Continue reading
Beacon v2 – Towards flexible use and clinical applications for a reference genomic data protocol
SPHN Webinar¶
Michael Baudis¶
Genomic “Beacons” provide discovery services for genomic data using the Beacon API developed as a key driver project of the Global Alliance for Genomics and Health (GA4GH). The Beacon protocol itself defines an open standard for genomics data discovery and provides a framework for web services responding to queries against genomic data collections, for instance from population based or disease specific genome repositories. Continue reading
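A Beacon-style allele query can be sketched as a plain HTTP request with a handful of parameters. The example below is a minimal illustration, not an official client: the endpoint URL is hypothetical, the parameter names are modelled on Beacon v1-style allele requests, and the `exists` field in the response is an assumption about the response shape. No network call is made.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint, used only for illustration.
EXAMPLE_BEACON_URL = "https://beacon.example.org/query"


def build_allele_query(base_url, assembly_id, reference_name, start,
                       reference_bases, alternate_bases):
    """Assemble a query URL asking whether a given allele has been observed."""
    params = {
        "assemblyId": assembly_id,
        "referenceName": reference_name,
        "start": start,
        "referenceBases": reference_bases,
        "alternateBases": alternate_bases,
    }
    return f"{base_url}?{urlencode(params)}"


def variant_exists(response: dict) -> bool:
    """Read the yes/no answer from a decoded response (assumed `exists` field)."""
    return bool(response.get("exists", False))


# Example: build the URL for a single-nucleotide variant on chromosome 7.
query_url = build_allele_query(
    EXAMPLE_BEACON_URL, "GRCh38", "7", 140753335, "A", "T"
)
```

The point of the design is that the response discloses only whether a matching variant exists in the collection, so a resource can be discoverable without exposing its underlying records.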
Update of Progenetix Oncogenomics Resource
Research Progress Report, DMLS, University of Zurich¶
Qingyao Huang¶
Copy number aberration (CNA) is frequently observed in cancer genomes. Meta-analysis of genomic variations helps to disentangle the multiplex molecular mechanism underlying tumorigenesis as well as identify and characterize molecular subtypes. Over the years, cancer genomic research has resulted in a considerable amount of data segregated by studies. The Progenetix project (www.progenetix.org), initiated in 2001, aims to systematize the published cancer genomic profiles and provide accurate annotation to facilitate integrative analysis. Continue reading
Welcome to Ziying
Today Ziying Yang arrived as a new member of the baudisgroup.
Welcome Ziying!
Continue reading
GA4GH Beacon v2 at GA4GH Plenary
GA4GH Beacon v2 - Evolving Reference Standard for Genomic Data Exchange¶
GA4GH 8th Plenary¶
Gary Saunders, Jordi Rambla de Argila, Anthony Brookes, Juha Törnroos and Michael Baudis¶
For the ELIXIR Beacon project, GA4GH Discovery work stream and the international network of Beacon API developers¶
The Beacon driver project was one of the earliest initiatives of the Global Alliance for Genomics and Health with the Beacon v1.0 API as the first approved GA4GH standard. Version 2 of the protocol is slated to provide fundamental changes, towards an Internet of Genomics foundational standard:
- requests beyond genomic variants ("filters")
- payload responses, secured through open AAI
- aligning w/ GA4GH standards (Phenopackets, VRS, DUO...) through SchemaBlocks {S}[B]
- working with international partners on deployment of advanced implementations
Continue reading
Progenetix - A cancer genomics reference resource around GA4GH standards
GA4GH 8th Plenary¶
Michael Baudis¶
The Progenetix oncogenomics resource provides sample-specific cancer genome profiling data and biomedical annotations as well as provenance data from cancer studies. Especially through its currently 113322 curated genomic copy number variation (CNV) profiles from 1600 individual studies representing over 500 cancer types (NCIt), Progenetix empowers aggregate and comparative analyses which vastly exceed individual studies or single diagnostic concepts. Continue reading
Cancer Data - ELIXIR::GA4GH: Advancing genomics resources through standards and ontologies
ECCB2020¶
Michael Baudis¶
Additional Links¶
Continue reading
The Ubiquitin Ligase TRIP12 Limits PARP1 Trapping and Constrains PARP Inhibitor Efficiency
Marco Gatti, Ralph Imhof, Qingyao Huang, Michael Baudis, Matthias Altmeyer¶
Cell Rep. 2020 Aug 4 DOI: 10.1016/j.celrep.2020.107985¶
Abstract PARP inhibitors (PARPi) cause synthetic lethality in BRCA-deficient tumors. Whether specific vulnerabilities to PARPi exist beyond BRCA mutations and related defects in homology-directed repair (HDR) is not well understood. Here, we identify the ubiquitin E3 ligase TRIP12 as negative regulator of PARPi sensitivity. Continue reading
Beacon v2 - Towards Flexible Use and Clinical Applications for a Reference Genomic Data Sharing Protocol
Personalized Health Technologies 2020¶
Michael Baudis¶
Beacons provide discovery services for genomic data using the Beacon API developed under the leadership of ELIXIR, as a key driver project of the Global Alliance for Genomics and Health (GA4GH). The Beacon protocol itself defines an open standard for genomics data discovery. It provides a framework for public web services responding to queries against genomic data collections, for instance from population based or disease specific genome repositories. Sites offering beacons can scale through aggregation in "Beacon Networks", which distribute single genome queries among a potentially large number of international beacons and assemble their responses. Continue reading
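The aggregation idea behind a "Beacon Network" can be sketched as a fan-out of one query to several beacons, followed by assembly of their answers. The code below is a toy model under stated assumptions: beacons are simulated as in-memory functions rather than web services, and the names `make_beacon`, `query_network` and `NetworkResponse` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

Query = Dict[str, object]
Beacon = Callable[[Query], bool]
VariantKey = Tuple[str, int, str]


@dataclass
class NetworkResponse:
    exists: bool                 # True if any beacon reports the variant
    per_beacon: Dict[str, bool]  # individual answers, keyed by beacon name


def make_beacon(known_variants: Set[VariantKey]) -> Beacon:
    """Simulate one beacon backed by an in-memory set of variants."""
    def answer(query: Query) -> bool:
        key = (query["referenceName"], query["start"], query["alternateBases"])
        return key in known_variants
    return answer


def query_network(beacons: Dict[str, Beacon], query: Query) -> NetworkResponse:
    """Send the same query to every beacon and assemble the responses."""
    per_beacon = {name: beacon(query) for name, beacon in beacons.items()}
    return NetworkResponse(exists=any(per_beacon.values()), per_beacon=per_beacon)


# Two simulated beacons holding different (made-up) variant sets.
network = {
    "beacon_a": make_beacon({("7", 140753335, "T")}),
    "beacon_b": make_beacon({("17", 7674220, "A")}),
}
result = query_network(
    network, {"referenceName": "7", "start": 140753335, "alternateBases": "T"}
)
```

A real network would issue the requests over HTTP, typically in parallel and with timeouts, but the assembly step stays the same: each beacon answers independently and the network reports both the combined and the individual results.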
ELIXIR All Hands - Beacon Evolution
ELIXIR All Hands 2020 - Beacon Workshop¶
Michael Baudis¶
This presentation covers some of Beacon's origins, features and directions.
Links¶
Continue reading
Oncology Informatics: Status Quo and Outlook - Review
Paul Martin Putora, Michael Baudis, Beth M. Beadle, Issam El Naqa, Frank A. Giordano and Nils H. Nicolay¶
Oncology, 2020-05-14. DOI 10.1159/000507586 (Review)¶
Abstract Oncology has undergone rapid progress, with emerging developments in areas including cancer stem cells, molecularly targeted therapies, genomic analyses, and individually tailored immunotherapy. These advances have expanded the tools available in the fight against cancer. Some of these have seen broad media coverage resulting in justified public attention. However, these achievements have only been possible due to rapid developments in the expanding field of biomedical informatics and information technology (IT). Continue reading
Swissnex SF: Laura & Michael Baudis - Life & Family
Originally planned for their stays at UCB and LBNL, Laura & Michael were interviewed by Tabea Stoeckel from swissnex San Francisco about their stay in the Bay Area and their research & family life as internationally active scientists.
Continue reading
Minimum Error Calibration and Normalization for Genomic Copy Number Analysis
Bo Gao and Michael Baudis (2020)¶
bioRxiv, 2019-07-31. DOI 10.1101/720854¶
Genomics, Volume 112, Issue 5, September 2020, Pages 3331-3341, accepted 2020-05-06 doi.org/10.1016/j.ygeno.2020.05.008.¶
Background¶
Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Continue reading
SchemaBlocks and GA4GH TASC
A brief presentation about SchemaBlocks concepts and its possible integration into the new GA4GH TASC effort.
Additional Links¶
Continue reading
A harmonized meta-knowledgebase of clinical interpretations of somatic genomic variants in cancer
Alex H. Wagner, Brian Walsh, Georgia Mayfield, David Tamborero, Dmitriy Sonkin, Kilannin Krysiak, Jordi Deu-Pons, Ryan P. Duren, Jianjiong Gao, Julie McMurry, Sara Patterson, Catherine del Vecchio Fitz, Beth A. Pitel, ..., Nuria Lopez-Bigas, Mark Lawler, Jeremy Goecks, Malachi Griffith, Obi L. Griffith, Adam A. Margolin & Variant Interpretation for Cancer Consortium¶
Nature Genetics volume 52, pages 448–457 (2020)¶
Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. Continue reading
Geographic assessment of cancer genome profiling studies
Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)¶
DATABASE, Volume 2020, 2020, baaa009, doi.org/10.1093/database/baaa009¶
bioRxiv preprint, 2020-01-11. DOI 10.1101/827683¶
Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. Continue reading
SWISSNEX SF Lunch Seminar - Data Mining in Genomics
Genomic Research and Personalised Health Strategies¶
Resources | Standards | Protocols | Tools | Discourse¶
These are the slides of a short presentation, given virtually (due to COVID-19) for a SWISSNEX San Francisco lunch meeting.
Additional Links¶
Continue reading
Enabling population assignment from cancer genomes with SNP2pop
Huang Q and Baudis M. (2020)¶
Sci Rep 10, 4846 (2020). doi.org/10.1038/s41598-020-61854-x¶
Abstract In many cancers, incidence, treatment efficacy and overall prognosis vary between geographic populations. Studies disentangling the contributing factors may help in both understanding cancer biology and tailoring therapeutic interventions. Ancestry estimation in such studies should preferably be driven by genomic data, due to frequently missing or erroneous self-reported or inferred metadata. While respective algorithms have been demonstrated for baseline genomes, such a strategy has not been shown for cancer genomes carrying a substantial somatic mutation load. We have developed a bioinformatics tool for the assignment of population groups from genome profiling data for both unaltered and cancer genomes. Continue reading
BBOP Presentation - Baudisgroup Projects & Interests
Links¶
Continue reading
ELIXIR Beacon Project - Networking Resources Across and Beyond ELIXIR Human Data Communities
ELIXIR Open Day - Wellcome Trust Genome Campus Hinxton¶
Michael Baudis¶
In this presentation I introduce the Beacon project and provide my opinions about its future trajectory, and especially its role in driving the alignment of ELIXIR and GA4GH projects related to (human) genome data sharing.
Links¶
Continue reading
GA4GH SchemaBlocks for Human Cell Atlas
This is a presentation of the SchemaBlocks initiative and the overall GA4GH context, for the Human Cell Atlas project, given by Michael at one of their teleconferences.
Additional Links¶
Continue reading
Geographic assessment of cancer genome profiling studies
Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)¶
bioRxiv, 2020-11-01. DOI 10.1101/827683¶
Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. Continue reading
Talk at St. Gallen Radiation Oncology - Bioinformatics and Data Exchange
3rd St. Gallen Radiation Oncology Informatics Meeting¶
Bioinformatics and data exchange for genomics in an international context¶
Michael Baudis¶
The presentation at the St. Gallen meeting introduced the audience to the group's research and resources, and how this is connected to the different national & international data standards and sharing initiatives.
Continue reading
Talk at AMED Tokyo - Cancer Genomics and Standards
Mini-Symposium about CNV and Data Standards at AMED Japan, Tokyo¶
Cancer Genomics and Implementation of Data Driven Standards for Genomic Data Exchange¶
Michael Baudis¶
At this meeting, several Japanese participants presented their research and results, with a focus on Copy Number Variants and other structural genome variations. Continue reading
Minimum Error Calibration and Normalization for Genomic Copy Number Analysis
BC2 2019, Basel¶
Bo Gao¶
Abstract¶
Background:
Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Currently, CNV profiles of tumor samples are generated using genomic micro-arrays or high-throughput DNA sequencing. Regardless of the underlying technology, genomic copy number data is derived from the relative assessment and integration of multiple signals, with the data generation process being prone to contamination from several sources. Estimated copy number values have no absolute and linear correlation to their corresponding DNA levels, and the extent of deviation differs between sample profiles which poses a great challenge for data integration and comparison in large scale genome analysis.
Continue reading
Cancer cell lines in focus: somatic copy number & germline variation
BC2 2019, Basel¶
Qingyao Huang¶
Abstract¶
Background:
Human cell lines are convenient model systems in cancer research, for validation of proposed molecular mechanisms as well as to evaluate potential therapeutic approaches, e.g. through high-throughput screening of potential anti-tumour compounds against cancer cell line panels. However, conclusions about biological pathways or pharmacological potential depend on a close molecular relation between the cancer type represented and the cell line model used for analyses.
Continue reading
Structural Genome Variants in Cancer: Research, resources standards
Seminar at the University of Florence
Seminar Universita degli Studi Firenze - Dipartimento di Biologia¶
Structural Genome Variants in Cancer: Research, resources standards¶
Michael Baudis¶
Abstract¶
Genomic copy number variations are major contributors to malignant transformation and progression and constitute - at least in their quantitative extension - the largest contributors to genomic mutation landscapes, in the majority of cancer types. Such mutations occur in the vast majority of tumors as somatic genome alterations (sCNV) during clonal development and expansion and are promoted by a variety of mechanisms leading to extended or focal changes in the number of genomic segments. Continue reading
Leveraging European infrastructures to access 1 million human genomes by 2022
Gary Saunders, Michael Baudis, Regina Becker, Sergi Beltran, Christophe Béroud, Ewan Birney, Cath Brooksbank, Søren Brunak, Marc Van den Bulcke, Rachel Drysdale, Salvador Capella-Gutierrez, Paul Flicek, ..., Niklas Blomberg, and Serena Scollen¶
Nature Reviews Genetics volume 20, pages 693–701 (2019)¶
Abstract Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.
Continue reading
Minimum Error Calibration and Normalization for Genomic Copy Number Analysis
Bo Gao and Michael Baudis (2019)¶
bioRxiv, 2019-07-31. DOI 10.1101/720854¶
Abstract Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive “somatic” CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Continue reading
ELIXIR All Hands - Beacon Introduction
Michael Baudis¶
This presentation was the opener for the ELIXIR Beacon session, and introduces current developments, especially the interactions between GA4GH :: Discovery and ELIXIR Beacon.
Continue reading
HGVS 2019 - Development of Standards for Genomic Data Exchange
Human Genome Variation Society - Gothenburg 2019¶
Implementation Driven Development of Standards for Genomic Data Exchange from Cancer Genome Data Collections¶
Michael Baudis¶
Abstract¶
Cancers are genomic diseases, arising from the clonal propagation of somatic mutation events, with a limited contribution from inherited genomic variants. Genomic copy number variations are major contributors to malignant transformation and progression and constitute - at least in their quantitative extension - the largest contributors to genomic mutation landscapes, in the majority of cancer types. Continue reading
Connecting the silos - Genomic Data Standards, Resources and the Global Alliance for Genomics and Health
R&D Data Intelligence Leaders Forum Basel¶
Michael Baudis¶
Abstract¶
This presentation discusses the need for data sharing in genomics, provides information about the Global Alliance for Genomics and Health (GA4GH), and shows some of our group's contributions, especially regarding Beacon development.
Continue reading
Federated discovery and sharing of genomic data using Beacons
Miroslav Cupak, Stephen Keenan, Jordi Rambla, Sabela de la Torre, Stephanie Dyke, Anthony Brookes, Knox Carey, David Lloyd, Peter Goodhand, Maximilian Haeussler, Michael Baudis, Heinz Stockinger, Lena Dolman, Ilkka Lappalainen, Juha Törnroos, Mikael Linden, John Spalding, Saif Ur-Rehman, Angela Page, Paul Flicek, Susheel Varma, Gary Saunders, Serena Scollen, Stephen Sherry, David Haussler, Beacon Project Team¶
Nat Biotechnol (2019), accepted 2019-01-23¶
Abstract The Beacon Project (github.com/ga4gh-beacon/) is a GA4GH initiative that is developing an open specification for genetic variation discovery and sharing. The project is demonstrating the willingness of international organizations to work together to define standards for, and actively engage in, genomic data sharing. In the two years since the project’s inception, over 90 Beacons have been lit by 35 organizations serving over 200 datasets. Continue reading
DNA copy number imbalances in primary cutaneous lymphomas (PCL)
Gug G, Huang Q, Chiticariu E, Solovan C and Baudis M (2019)¶
JEADV, 2019-01-19. doi.org/10.1111/jdv.15442¶
The article was published in the Journal of the European Academy of Dermatology and Venereology on January 19, 2019. A corresponding preprint can be accessed through [bioRxiv].
Background
Cutaneous lymphomas (CL) represent a clinically defined group of extranodal non-Hodgkin lymphomas harbouring heterogeneous and incompletely delineated molecular aberrations. Over the past decades, molecular studies have identified several chromosomal aberrations, but the interpretation of individual genomic studies can be challenging.
Objective
With a comprehensive meta-analysis, we aim to delineate genomic alterations for different types of CL and propose a more accurate classification in line with their various pathogenicity. Continue reading
Enabling population assignment from cancer genomes with SNP2pop
Huang Q and Baudis M. (2019)¶
bioRxiv, 2019-01-14. doi.org/10.1101/368647 (first version 2018-07-14)¶
Abstract For a variety of human malignancies, incidence, treatment efficacy and overall prognosis show considerable variation between different populations and ethnic groups. Disentangling the effects related to particular population backgrounds can help in both understanding cancer biology and in tailoring therapeutic interventions. Because self-reported or inferred patient data can be incomplete or misleading due to migration and genomic admixture, a data-driven ancestry estimation should be preferred. While algorithms to analyze ancestry structure from healthy individuals have been developed, an easy-to-use tool to assign population groups based on genotyping data from SNP profiles is still missing and benchmarking for the validity of population assignment strategy for aberrant cancer genomes was not tested. Continue reading
2018 09 18 BIO390 Michael Baudis Introduction to Bioinformatics
UZH BIO390 "Introduction to Bioinformatics"¶
Bioinformatics - Introduction¶
Michael Baudis¶
Abstract¶
First lecture in the UZH BIO390 "Introduction to Bioinformatics" series, introducing concepts and scope of bioinformatics as a field - 2018 version.
Continue reading
ECCB 2018 - Beacon
Abstract: ECCB 2018¶
ELIXIR Beacon - A Driver Project for the Global Alliance for Genomics and Health¶
Michael Baudis for the ELIXIR Beacon Project¶
The Global Alliance for Genomics and Health (GA4GH) develops standards and guidelines to facilitate the international sharing of genomic and health related metadata. The creation of GA4GH work stream products is moved forward through driver projects, which address particular scientific, technical, regulatory or security related aspects of data access and sharing. Continue reading
Registered access: authorizing data access
Dyke SOM, Linden M, Lappalainen I, De Argila JR, Carey K, Lloyd D, Spalding JD, Cabili MN, Kerry G, Foreman J, Cutts T, Shabani M, Rodriguez LL, Haeussler M, Walsh B, Jiang X, Wang S, Perrett D, Boughtwood T, ..., Rehm HL, Baudis M, Sherry ST, Kato K, Knoppers BM, Baker D, and Flicek P¶
European Journal of Human Genetics (2018)¶
Abstract The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model—“registered access”—to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.
Continue reading
Mountains and Chasms - Surveying the Oncogenomic Publication Landscape
Carrio Cordo P and Baudis M. (2018)¶
Preprints 2018, 2018070618 (doi: 10.20944/preprints201807.0618.v1).¶
Oncology (2018; online Oct 26)¶
Abstract Cancers arise from the accumulation of somatic genome mutations, with varying contributions of intrinsic (i.e. genetic predisposition) and extrinsic (i.e. environmental) factors. For the understanding of malignant clones, precise information about their genomic composition has to be correlated with morphological, clinical and individual features, in the context of the available medical knowledge. Continue reading
Population assignment from cancer genome profiling data
Huang Q and Baudis M. (2018)¶
bioRxiv, 2018-07-14. doi:10.1101/368647¶
Abstract For a variety of human malignancies, incidence, treatment efficacy and overall prognosis show considerable variation between different populations and ethnic groups. Disentangling the effects related to particular population backgrounds can help in both understanding cancer biology and in tailoring therapeutic interventions. Because self-reported or inferred patient data can be incomplete or misleading due to migration and genomic admixture, a data-driven ancestry estimation should be preferred. While tools to map and utilize ancestry information from healthy individuals have been introduced, a population assignment based on genotyping data from somatic variation profiling of cancer samples is still missing. Continue reading