Baudisgroup Publications

A list of publications can also be retrieved through EuropePMC.

cancercelllines.org - a Novel Resource for Genomic Variants in Cancer Cell Lines

DATABASE Article

Rahel Paloots and Michael Baudis

Database (Oxford). 2024 Apr 30;2024:baae030. doi: 10.1093/database/baae030

bioRxiv preprint (2023-12-13): https://doi.org/10.1101/2023.12.12.571281

Abstract: Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource - cancercelllines.org - with the aim to enable easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variants (SNVs) data. We have gathered over 5,600 copy number profiles as well as SNV annotations for 16,000 cell lines and provide this data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 API and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme.

Availability and Implementation: Our resource is publicly available on the web at cancercelllines.org.

Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines

Literature-derived annotations as entry point for data exploration

Ellery Smith, Rahel Paloots, Dimitris Giagkos, Michael Baudis and Kurt Stockinger

Bioinformatics Advances, vbae045, doi.org/10.1093/bioadv/vbae045

Previous arXiv preprint (2023-07-03): https://doi.org/10.48550/arXiv.2307.00933

Motivation: With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume (Lubowitz et al., 2021). As a consequence, in the fields of biological, medical and clinical research, domain experts have to sift through massive amounts of scientific text to find relevant information. However, this process is extremely tedious and slow to be performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction.

Results: In this work, we present the design, implementation and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data in the domain of cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variants plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidences, allowing for deep, yet rapid, literature search, using existing structured data as a springboard.

Availability and Implementation: Our system is publicly available on the web at cancercelllines.org.

Contact: The authors can be contacted at ellery.smith@zhaw.ch or rahel.paloots@uzh.ch.

Twelve quick tips for deploying a Beacon

Some hints for Beacon developers & implementers

Lauren A Fromont, Mauricio Moldes, Michael Baudis, Anthony J Brookes, Arcadi Navarro and Jordi Rambla

PLoS Comput Biol. 2024 Mar 1;20(3):e1011817.

doi: 10.1371/journal.pcbi.1011817.
PMID: 38427629

Introduction: In the age of data-driven biomedical research and clinical practice, the sharing of genomic and clinical data for health research and personalized medicine has become an important contributor to improved diagnosis and treatment. From the data owner’s perspective, potential benefits include improved treatments, personalization of healthcare practice, and more effective control of disease proliferation. However, the requirement for high levels of data security to protect sensitive information presents a barrier to data discovery and sharing.

Beacon is designed to enable the benefits of data discovery while minimizing the associated risks...

labelSeg: segment annotation for tumor copy number alteration profiles

A tool to assign relative SCNA levels to segments

Hangjia Zhao and Michael Baudis

Briefings in Bioinformatics (Oxford). 2024 Jan 31;2024:bbad541.

doi: 10.1093/bib/bbad541
PMID: 38300514
bioRxiv. doi: doi.org/10.1101/2023.05.17.541097

Abstract Somatic copy number alterations (SCNAs) are a predominant type of oncogenomic alterations that affect a large proportion of the genome in the majority of cancer samples. Current technologies allow high-throughput measurement of such copy number aberrations, generating results consisting of frequently large sets of SCNA segments. However, the automated annotation and integration of such data are particularly challenging because the measured signals reflect biased, relative copy number ratios. In this study, we introduce labelSeg, an algorithm designed for rapid and accurate annotation of CNA segments, with the aim of enhancing the interpretation of tumor SCNA profiles.

Short tandem repeat mutations regulate gene expression in colorectal cancer

Exploring STR patterns and their relation to expression changes in cancer

Max A Verbiest, Oxana Lundström, Feifei Xia, Michael Baudis, Tugce Bilgin Sonay, Maria Anisimova

doi: https://doi.org/10.1101/2023.11.29.569189

Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression levels to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. The increased mutability of eSTRs in MSI tumours may be an early indication that eSTR mutations can confer a selective advantage to tumours. Future extensions of our findings into larger cohorts could uncover new STR-based targets in the treatment of cancer.
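The idea of a linear model relating eSTR length to expression can be illustrated with a minimal sketch. It assumes ordinary least squares on a single eSTR and uses made-up numbers; the function names `fit_line` and `predict_change` are hypothetical, and the models in the study itself may differ in form and covariates.

```python
def fit_line(lengths, expression):
    """Ordinary least-squares fit of expression = slope * length + intercept."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_change(slope, delta_length):
    """Predicted expression change for a repeat-length change of delta_length units."""
    return slope * delta_length


# Hypothetical data: repeat length (in units) and expression level per tumour.
toy_lengths = [10, 11, 12, 13, 14]
toy_expression = [5.0, 4.5, 4.0, 3.5, 3.0]

slope, intercept = fit_line(toy_lengths, toy_expression)
print(slope, intercept)
print(predict_change(slope, 2))  # predicted change for an expansion by two units
```

In this toy setting the fitted slope is the per-unit expression change, so a mutation that alters the repeat length by some number of units maps directly to a predicted expression shift.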

Phenopacket-tools: Building and validating GA4GH Phenopackets

Bioinformatics tools and examples for working with the Phenopackets standard

Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. (2023) Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One. 18:e0285433.

Abstract The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers.

Candidate targets of copy number deletion events across 17 cancer types

Identifying cancer related genes against the background of somatic CNV events

Huang Q and Baudis M

doi: 10.3389/fgene.2022.1017657

Previous bioRxiv preprint (first version 2022-06-29), doi.org/10.1101/2022.06.29.498080

Abstract Genome variation is the direct cause of cancer and driver of its clonal evolution. While the impact of many point mutations can be evaluated through their modification of individual genomic elements, even a single copy number aberration (CNA) may encompass hundreds of genes and therefore pose challenges to untangle potentially complex functional effects. However, consistent, recurring and disease-specific patterns in the genome-wide CNA landscape imply that particular CNA may promote cancer-type-specific characteristics. Discerning essential cancer-promoting alterations from the inherent co-dependency in CNA would improve the understanding of mechanisms of CNA and provide new insights into cancer biology and potential therapeutic targets.

GA4GH Phenopackets: A Practical Introduction

Phenopackets v2 introduction with practical examples

Ladewig MS, Jacobsen JO, Wagner AH, Danis D, Kassaby BE, Gargano M, Groza T, Baudis M, Steinhaus R, Seelow D, Bechrakis NE, Mungall CJ, Schofield PN, Elemento O, Smith L, McMurry JA, Munoz-Torres M, Haendel MA and Robinson PN

Advanced Genetics 2022, 2200016.

Abstract The Global Alliance for Genomics and Health (GA4GH) is developing a suite of coordinated standards for genomics for healthcare. The Phenopacket is a new GA4GH standard for sharing disease and phenotype information that characterizes an individual person, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments. A detailed example is presented that illustrates how to use the schema to represent the clinical course of a patient with retinoblastoma, including demographic information, the clinical diagnosis, phenotypic features and clinical measurements, an examination of the extirpated tumor, therapies, and the results of genomic analysis. The Phenopacket Schema, together with other GA4GH data and technical standards, will enable data exchange and provide a foundation for the computational analysis of disease and phenotype information to improve our ability to diagnose and conduct research on all types of disorders, including cancer and rare diseases.

The Phenopacket software is available at github.com/phenopackets/.

The GA4GH Phenopacket schema defines a computable representation of clinical data

Phenopackets v2 publication

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Buske OJ, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Essenwanger A, Freimuth RR, ... , Haendel MA, Robinson PN, The GAGHPMC.

Nature Biotechnology. 2022;40:817-820. PMID:35705716

Abstract Despite great strides made in the development and wide acceptance of standards for exchanging structured information about genomic variants, progress in standards for computational phenotype analysis for translational genomics has lagged behind. Phenotypic features (signs, symptoms, laboratory and imaging findings, results of physiological tests, etc.) are of high clinical importance, yet exchanging them in conjunction with genomic variation information is often overlooked or even neglected.

Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond

Beacon v2 publication

Rambla J, Baudis M, Ariosa R, Beck T, Fromont LA, Navarro A, Paloots R, Rueda M, Saunders G, Singh B, Spalding JD.

Human Mutation. 2022 Mar 17. PMID:35297548

Abstract Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data, through providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers.
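The yes/no principle described above can be sketched in a few lines. This is a conceptual illustration only, using an in-memory set as a stand-in for a hosted dataset; the `Variant` class, the `beacon_exists` function and the example coordinates are hypothetical and do not reproduce the Beacon API or its request format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant description used as a lookup key."""
    reference_name: str
    start: int
    reference_bases: str
    alternate_bases: str


def beacon_exists(dataset: set, query: Variant) -> bool:
    """Answer only whether the variant is present, without exposing any records."""
    return query in dataset


# Made-up dataset contents for illustration.
toy_dataset = {
    Variant("17", 7577120, "G", "A"),
    Variant("7", 140453135, "A", "T"),
}

print(beacon_exists(toy_dataset, Variant("17", 7577120, "G", "A")))
print(beacon_exists(toy_dataset, Variant("1", 12345, "C", "T")))
```

The point of the design is that the caller learns a single boolean per query; no sample-level or individual-level data leaves the provider.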

The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine

Phenopackets v2 preprint

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Freimuth RR, ..., Haendel MA, Robinson PN.

medRxiv, 2021.11.27.21266944. doi:10.1101/2021.11.27.21266944

Abstract Despite great strides in the development and wide acceptance of standards for exchanging structured information about genomic variants, there is no corresponding standard for exchanging phenotypic data, and this has impeded the sharing of phenotypic information for computational analysis. Here, we introduce the Global Alliance for Genomics and Health (GA4GH) Phenopacket schema, which supports exchange of computable longitudinal case-level phenotypic information for diagnosis and research of all types of disease including Mendelian and complex genetic diseases, cancer, and infectious diseases.

The GA4GH Variation Representation Specification (VRS): a Computational Framework for the Precise Representation and Federated Identification of Molecular Variation.

Alex H. Wagner, Lawrence Babb, Gil Alterovitz, Michael Baudis, Matthew Brush, Daniel L. Cameron, Melissa Cline, Malachi Griffith, Obi L. Griffith, ..., Melissa Konopko, Heidi L. Rehm, Andrew D. Yates, Robert R. Freimuth, Reece K. Hart

Wagner, Alex H. et al. Cell Genomics, Volume 1, Issue 2, 100027 doi:10.1016/j.xgen.2021.100027

bioRxiv. 2021.01.15.426843 (2021-01-15)

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced “verse”), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers.
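To give a feel for how a computed identifier can be globally consistent, the sketch below implements the GA4GH truncated digest (`sha512t24u`: SHA-512, first 24 bytes, base64url-encoded) and applies it to a toy object. The `toy_identifier` function and its JSON serialization are simplifying assumptions for illustration; the VRS specification defines its own serialization rules and identifier prefixes, which are not reproduced here.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """Truncated digest: SHA-512, keep the first 24 bytes, base64url-encode."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def toy_identifier(obj: dict) -> str:
    """Digest of a deterministic serialization (hypothetical, not the VRS rules)."""
    # Sorted keys and fixed separators make the byte string independent of key order.
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return sha512t24u(blob)


# The same made-up content with different key order yields the same identifier.
allele_a = {"position": 100, "sequence": "T", "type": "ToyAllele"}
allele_b = {"type": "ToyAllele", "sequence": "T", "position": 100}

print(toy_identifier(allele_a) == toy_identifier(allele_b))
print(sha512t24u(b""))
```

Because the identifier depends only on the content, two independent systems that describe the same variation in the same way arrive at the same identifier without consulting a central registry.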

International federation of genomic medicine databases using GA4GH standards

Adrian Thorogood, Heidi L. Rehm, Peter Goodhand, Angela J.H. Page, Yann Joly, Michael Baudis, Jordi Rambla, Arcadi Navarro, Tommi H. Nyronen, Mikael Linden, Edward S. Dove, Marc Fiume, Michael Brudno, Melissa S. Cline, Ewan Birney

Thorogood, Adrian et al. Cell Genomics, Volume 1, Issue 2, 100032 doi:10.1016/j.xgen.2021.100032

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.

GA4GH: International policies and standards for data sharing across genomic research and healthcare

Heidi L. Rehm, Angela J.H. Page, Lindsay Smith, Jeremy B. Adams, Gil Alterovitz, Lawrence J. Babb, Maxmillian P. Barkley, Michael Baudis, Michael J.S. Beauvais, Tim Beck, Jacques S. Beckmann, Sergi Beltran, David Bernick, Alexander Bernier, James K. Bonfield, Tiffany F. Boughtwood, Guillaume Bourque, Sarion R. Bowers, Anthony J. Brookes, Michael Brudno, Matthew H. Brush, David Bujold, Tony Burdett, Orion J. Buske, Moran N. Cabili, Daniel L. Cameron, Robert J. Carroll, Esmeralda Casas-Silva, Debyani Chakravarty, Bimal P. Chaudhari, Shu Hui Chen, J. Michael Cherry, Justina Chung, Melissa Cline, Hayley L. Clissold, Robert M. Cook-Deegan, Mélanie Courtot, ..., Peter Goodhand, Kathryn North, Ewan Birney

Rehm, Heidi L. et al. Cell Genomics, Volume 1, Issue 2, 100029 doi:10.1016/j.xgen.2021.100029

Note

This article was published as part of a special GA4GH edition of Cell Genomics.

Abstract The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution.

The Progenetix oncogenomic resource in 2021

Article describing the current content & technical status of progenetix.org

Qingyao Huang, Paula Carrio Cordo, Bo Gao, Rahel Paloots, Michael Baudis

Database (Oxford). 2021 Jul 17;2021:baab043.

doi: 10.1093/database/baab043.
PMID: 34272855
PMCID: PMC8285936.
bioRxiv. doi: doi.org/10.1101/2021.02.15.428237

Abstract In cancer, copy number aberrations (CNAs) represent a type of nearly ubiquitous and frequently extensive structural genome variations. To disentangle the molecular mechanisms underlying tumorigenesis as well as identify and characterize molecular subtypes, the comparative and meta-analysis of large genomic variant collections can be of immense importance. Over the last decades, cancer genomic profiling projects have resulted in a large amount of somatic genome variation profiles, however segregated in a multitude of individual studies and datasets. The Progenetix project, initiated in 2001, curates individual cancer CNA profiles and associated metadata from published oncogenomic studies and data repositories with the aim to empower integrative analyses spanning all different cancer biologies.

Signatures of Discriminative CNA in 31 Cancer Subtypes

Bo Gao and Michael Baudis (2021)

Published in Frontiers in Genetics, 2021-05-13

Abstract

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles pose great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have been focused on the association of CNA to pre-selected "driver" genes with limited application to rare drivers and other genomic elements.

Copy number variant heterogeneity among cancer types reflects inconsistent concordance with diagnostic classifications

Paula Carrio Cordo and Michael Baudis

bioRxiv. doi: doi.org/10.1101/2021.03.01.433348

This article explores the correlation between subsets of cancer entities, grouped by their somatic CNV patterns, and levels of diagnostic classification systems.

The Ubiquitin Ligase TRIP12 Limits PARP1 Trapping and Constrains PARP Inhibitor Efficiency

Marco Gatti, Ralph Imhof, Qingyao Huang, Michael Baudis, Matthias Altmeyer

Cell Rep. 2020 Aug 4. DOI: 10.1016/j.celrep.2020.107985

Abstract PARP inhibitors (PARPi) cause synthetic lethality in BRCA-deficient tumors. Whether specific vulnerabilities to PARPi exist beyond BRCA mutations and related defects in homology-directed repair (HDR) is not well understood. Here, we identify the ubiquitin E3 ligase TRIP12 as negative regulator of PARPi sensitivity.

Oncology Informatics: Status Quo and Outlook - Review

Paul Martin Putora, Michael Baudis, Beth M. Beadle, Issam El Naqa, Frank A. Giordano and Nils H. Nicolay

Oncology, 2020-05-14. DOI 10.1159/000507586 (Review)

Abstract Oncology has undergone rapid progress, with emerging developments in areas including cancer stem cells, molecularly targeted therapies, genomic analyses, and individually tailored immunotherapy. These advances have expanded the tools available in the fight against cancer. Some of these have seen broad media coverage resulting in justified public attention. However, these achievements have only been possible due to rapid developments in the expanding field of biomedical informatics and information technology (IT).

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Bo Gao and Michael Baudis (2020)

bioRxiv, 2019-07-31. DOI 10.1101/720854

Genomics, Volume 112, Issue 5, September 2020, Pages 3331-3341, accepted 2020-05-06 doi.org/10.1016/j.ygeno.2020.05.008.

Background

Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes.

A harmonized meta-knowledgebase of clinical interpretations of somatic genomic variants in cancer

Alex H. Wagner, Brian Walsh, Georgia Mayfield, David Tamborero, Dmitriy Sonkin, Kilannin Krysiak, Jordi Deu-Pons, Ryan P. Duren, Jianjiong Gao, Julie McMurry, Sara Patterson, Catherine del Vecchio Fitz, Beth A. Pitel, ..., Nuria Lopez-Bigas, Mark Lawler, Jeremy Goecks, Malachi Griffith, Obi L. Griffith, Adam A. Margolin & Variant Interpretation for Cancer Consortium

Nature Genetics volume 52, pages 448–457 (2020)

Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations.

Geographic assessment of cancer genome profiling studies

Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)

DATABASE, Volume 2020, 2020, baaa009, doi.org/10.1093/database/baaa009

bioRxiv preprint, 2020-01-11. DOI 10.1101/827683

Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome.

Enabling population assignment from cancer genomes with SNP2pop

Huang Q and Baudis M. (2020)

Sci Rep 10, 4846 (2020). doi.org/10.1038/s41598-020-61854-x

Abstract In many cancers, incidence, treatment efficacy and overall prognosis vary between geographic populations. Studies disentangling the contributing factors may help in both understanding cancer biology and tailoring therapeutic interventions. Ancestry estimation in such studies should preferably be driven by genomic data, due to frequently missing or erroneous self-reported or inferred metadata. While respective algorithms have been demonstrated for baseline genomes, such a strategy has not been shown for cancer genomes carrying a substantial somatic mutation load. We have developed a bioinformatics tool for the assignment of population groups from genome profiling data for both unaltered and cancer genomes.

Geographic assessment of cancer genome profiling studies

Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)

bioRxiv, 2020-11-01. DOI 10.1101/827683

Abstract Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome.

Leveraging European infrastructures to access 1 million human genomes by 2022

Gary Saunders, Michael Baudis, Regina Becker, Sergi Beltran, Christophe Béroud, Ewan Birney, Cath Brooksbank, Søren Brunak, Marc Van den Bulcke, Rachel Drysdale, Salvador Capella-Gutierrez, Paul Flicek, ..., Niklas Blomberg, and Serena Scollen

Nature Reviews Genetics volume 20, pages 693–701 (2019)

Abstract Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Bo Gao and Michael Baudis (2019)

bioRxiv, 2019-07-31. DOI 10.1101/720854

Abstract Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive “somatic” CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes.

Federated discovery and sharing of genomic data using Beacons

Miroslav Cupak, Stephen Keenan, Jordi Rambla, Sabela de la Torre, Stephanie Dyke, Anthony Brookes, Knox Carey, David Lloyd, Peter Goodhand, Maximilian Haeussler, Michael Baudis, Heinz Stockinger, Lena Dolman, Ilkka Lappalainen, Juha Törnroos, Mikael Linden, John Spalding, Saif Ur-Rehman, Angela Page, Paul Flicek, Susheel Varma, Gary Saunders, Serena Scollen, Stephen Sherry, David Haussler, Beacon Project Team

Nat Biotechnol (2019), accepted 2019-01-23

Abstract The Beacon Project (github.com/ga4gh-beacon/) is a GA4GH initiative that is developing an open specification for genetic variation discovery and sharing. The project is demonstrating the willingness of international organizations to work together to define standards for, and actively engage in, genomic data sharing. In the two years since the project’s inception, over 90 Beacons have been lit by 35 organizations serving over 200 datasets.

DNA copy number imbalances in primary cutaneous lymphomas (PCL)

Gug G, Huang Q, Chiticariu E, Solovan C and Baudis M (2019)

JEADV, 2019-01-19. doi.org/10.1111/jdv.15442

The article was published in the Journal of the European Academy of Dermatology and Venereology on January 19, 2019. A corresponding preprint is available on bioRxiv.

Background

Cutaneous lymphomas (CL) represent a clinically defined group of extranodal non-Hodgkin lymphomas harbouring heterogeneous and incompletely delineated molecular aberrations. Over the past decades, molecular studies have identified several chromosomal aberrations, but the interpretation of individual genomic studies can be challenging.

Objective

With a comprehensive meta-analysis, we aim to delineate genomic alterations for different types of CL and propose a more accurate classification in line with their various pathogenicity.

Enabling population assignment from cancer genomes with SNP2pop

Huang Q and Baudis M. (2019)

bioRxiv, 2019-01-14. doi.org/10.1101/368647 (first version 2018-07-14)

Abstract For a variety of human malignancies, incidence, treatment efficacy and overall prognosis show considerable variation between different populations and ethnic groups. Disentangling the effects related to particular population backgrounds can help in both understanding cancer biology and in tailoring therapeutic interventions. Because self-reported or inferred patient data can be incomplete or misleading due to migration and genomic admixture, a data-driven ancestry estimation should be preferred. While algorithms to analyze ancestry structure from healthy individuals have been developed, an easy-to-use tool to assign population groups based on genotyping data from SNP profiles is still missing and benchmarking for the validity of population assignment strategy for aberrant cancer genomes was not tested.

Registered access: authorizing data access

Dyke SOM, Linden M, Lappalainen I, De Argila JR, Carey K, Lloyd D, Spalding JD, Cabili MN, Kerry G, Foreman J, Cutts T, Shabani M, Rodriguez LL, Haeussler M, Walsh B, Jiang X, Wang S, Perrett D, Boughtwood T, ..., Rehm HL, Baudis M, Sherry ST, Kato K, Knoppers BM, Baker D, and Flicek P

European Journal of Human Genetics (2018)

Abstract The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model—“registered access”—to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.

Mountains and Chasms - Surveying the Oncogenomic Publication Landscape

Carrio Cordo P and Baudis M. (2018)

Preprints 2018, 2018070618 (doi: 10.20944/preprints201807.0618.v1).

Oncology (2018; online Oct 26)

Abstract Cancers arise from the accumulation of somatic genome mutations, with varying contributions of intrinsic (i.e. genetic predisposition) and extrinsic (i.e. environmental) factors. For the understanding of malignant clones, precise information about their genomic composition has to be correlated with morphological, clinical and individual features, in the context of the available medical knowledge. Continue reading

Population assignment from cancer genome profiling data

Huang Q and Baudis M. (2018)¶

bioRxiv, 2018-07-14. doi:10.1101/368647¶

Abstract For a variety of human malignancies, incidence, treatment efficacy and overall prognosis show considerable variation between different populations and ethnic groups. Disentangling the effects related to particular population backgrounds can help in both understanding cancer biology and in tailoring therapeutic interventions. Because self-reported or inferred patient data can be incomplete or misleading due to migration and genomic admixture, a data-driven ancestry estimation should be preferred. While tools to map and utilize ancestry information from healthy individuals have been introduced, a population assignment based on genotyping data from somatic variation profiling of cancer samples is still missing. Continue reading

A harmonized meta-knowledgebase of clinical interpretations of cancer genomic variants

Wagner AH, Walsh B, Mayfield G, Tamborero D, Sonkin D, Krysiak K, Deu Pons J, Duren R, Gao J, McMurry J, Patterson S, Del Vecchio Fitz C, Sezerman OU, Warner J, Rieke DT, Aittokallio T, Cerami E, Ritter D, Schriml LM, Haendel M, Raca G, Madhavan S, Baudis M, ..., Griffith M, Griffith OL, and Margolin A¶

bioRxiv. doi:10.1101/366856¶

Abstract Precision oncology relies on the accurate discovery and interpretation of genomic variants to enable individualized therapy selection, diagnosis, or prognosis. However, knowledgebases containing clinical interpretations of somatic cancer variants are highly disparate in interpretation content, structure, and supporting primary literature, reducing consistency and impeding consensus when evaluating variants and their relevance in a clinical setting. Continue reading

Krüppel-Like Factor 10 participates in cervical cancer immunoediting through transcriptional regulation of Pregnancy-Specific Beta-1 Glycoproteins

Marrero-Rodríguez D, Taniguchi-Ponciano K, Subramaniam M, Hawse JR, Pitel KS, Arreola-De la Cruz H, Huerta-Padilla V, Ponce-Navarrete G, Figueroa-Corona MDP, Gomez-Virgilio L, Martinez-Cuevas TI, Mendoza-Rodriguez M, Rodriguez-Esquivel M, Romero-Morelos P, Ramirez-Salcedo J, Baudis M, Meraz-Rios M, Jimenez-Vega F, Salcedo M.¶

Abstract Cervical cancer (CC) is associated with alterations in immune system balance, which is primarily due to a shift from Th1 to Th2 and the unbalance of Th17/Treg cells. Using in silico DNA copy number analysis, we have demonstrated that ~20% of CC samples exhibit gain of 8q22.3 and 19q13.31; the regions of the genome that encodes the KLF10 and PSG genes, respectively. Continue reading

segment_liftover...

segment_liftover : a Python tool to convert segments between genome assemblies.¶

Gao B, Huang Q and Baudis M¶

Abstract The process of assembling a species’ reference genome may be performed in a number of iterations, with subsequent genome assemblies differing in the coordinates of mapped elements. The conversion of genome coordinates between different assemblies is required for many integrative and comparative studies. While currently a number of bioinformatics tools are available to accomplish this task, most of them are tailored towards the conversion of single genome coordinates. When converting the boundary positions of segments spanning larger genome regions, segments may be mapped into smaller sub-segments if the original segment’s continuity is disrupted in the target assembly. Such a conversion may lead to a relevant degree of data loss in some circumstances such as copy number variation (CNV) analysis, where the quantitative representation of a genomic region takes precedence over base-specific accuracy. segment_liftover aims at continuity-preserving remapping of genome segments between assemblies and provides features such as approximate locus conversion, automated batch processing and comprehensive logging to facilitate processing of datasets containing large numbers of structural genome variation data.
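The idea of continuity-preserving remapping can be illustrated with a small sketch. This is not the segment_liftover implementation: the block table, the function names (`lift_position`, `lift_approx`, `lift_segment`) and the search parameters are hypothetical, chosen only to show how converting the two segment boundaries independently, with an approximate search when a boundary falls into an unmapped gap, keeps a segment in one piece.

```python
from typing import Optional, Tuple

# Hypothetical aligned blocks between an old and a new assembly on one
# chromosome: (old_start, old_end, new_start), half-open, same orientation.
# Positions 1000-1199 of the old assembly have no counterpart (a gap).
BLOCKS = [(0, 1000, 0), (1200, 3000, 1100)]


def lift_position(pos: int) -> Optional[int]:
    """Convert a single coordinate, or return None if it is unmapped."""
    for old_start, old_end, new_start in BLOCKS:
        if old_start <= pos < old_end:
            return new_start + (pos - old_start)
    return None


def lift_approx(pos: int, direction: int, max_shift: int = 500) -> Optional[int]:
    """Try the exact position first, then step in `direction` until a
    mappable position is found or `max_shift` is exceeded."""
    for shift in range(max_shift + 1):
        lifted = lift_position(pos + direction * shift)
        if lifted is not None:
            return lifted
    return None


def lift_segment(start: int, end: int) -> Optional[Tuple[int, int]]:
    """Remap a segment as ONE segment by converting only its boundaries.
    Unmappable boundaries are moved inward (start up, end down)."""
    new_start = lift_approx(start, direction=+1)
    new_end = lift_approx(end, direction=-1)
    if new_start is None or new_end is None or new_start >= new_end:
        return None  # nothing usable remains of the segment
    return (new_start, new_end)
```

In this toy setting, a segment that spans the gap is still returned as a single interval instead of two sub-segments, which is the property the abstract describes as relevant for CNV analysis.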

Integrated Molecular...

Integrated Molecular Meta-Analysis of 1,000 Pediatric High-Grade and Diffuse Intrinsic Pontine Glioma.¶

Mackay A, Burford A, Carvalho D, Izquierdo E, Fazal-Salom J, Taylor KR, Bjerke L, Clarke M, Vinci M, Nandhabalan M, Temelso S, Popov S, Molinari V, Raman P, Waanders AJ, Han HJ, Gupta S, Marshall L, Zacharoulis S, Vaidya S, Mandeville HC, Bridges LR, Martin AJ, Al-Sarraj S, Chandler C, Ng HK, Li X, Mu K, Trabelsi S, Brahim DH, Kisljakov AN, Konovalov DM, Moore AS, Carcaboso AM, Sunol M, de Torres C, Cruz O, Mora J, Shats LI, Stavale JN, Bidinotto LT, Reis RM, Entz-Werle N, Farrell M, Cryan J, Crimmins D, Caird J, Pears J, Monje M, Debily MA, Castel D, Grill J, Hawkins C, Nikbakht H, Jabado N, Baker SJ, Pfister SM, Jones DTW, Fouladi M, von Bueren AO, Baudis M, Resnick A, Jones C.¶

Abstract We collated data from 157 unpublished cases of pediatric high-grade glioma and diffuse intrinsic pontine glioma and 20 publicly available datasets in an integrated analysis of >1,000 cases. We identified co-segregating mutations in histone-mutant subgroups including loss of FBXW7 in H3.3G34R/V, TOP3A rearrangements in H3.3K27M, and BCOR mutations in H3.1K27M. Histone wild-type subgroups are refined by the presence of key oncogenic events or methylation profiles more closely resembling lower-grade tumors. Genomic aberrations increase with age, highlighting the infant population as biologically and clinically distinct. Uncommon pathway dysregulation is seen in small subsets of tumors, further defining the molecular diversity of the disease, opening up avenues for biological study and providing a basis for functionally defined future treatment stratification.

CNARA: reliability assessment for genomic copy number profiles

Ai N, Cai H, Solovan C, Baudis M.¶

Abstract DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. Besides, large derivative log ratio spread (DLRS) of the probes correlating with poor DNA quality is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly. Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform and may be adapted for those sequencing platforms from which copy number estimates could be computed as a piecewise constant signal. Further details can be found at github.com/baudisgroup/CNARA. We have developed a method for the assessment of genomic copy number profiling data, and suggest to apply the method in addition to and after other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and support the reliable functional attribution of copy number aberrations especially in cancer research.
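For readers unfamiliar with the DLRS measure mentioned above, the following is a minimal sketch of one commonly used definition: the interquartile range of the differences between adjacent probes' log ratios, rescaled to a standard deviation. Whether CNARA computes DLRS in exactly this way is an assumption; the function name `dlrs` and the example profiles are illustrative only.

```python
import math
import statistics


def dlrs(log_ratios):
    """Derivative log ratio spread of a probe-ordered log2 ratio profile.

    Assumed definition: IQR of adjacent-probe differences divided by
    1.349 * sqrt(2). The factor 1.349 converts an IQR to a standard
    deviation for normal data; sqrt(2) corrects for taking differences.
    """
    if len(log_ratios) < 3:
        raise ValueError("need at least three probes")
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    q1, _, q3 = statistics.quantiles(diffs, n=4)
    return (q3 - q1) / (1.349 * math.sqrt(2))


# A clean two-level profile: one real breakpoint, no probe-to-probe noise.
clean = [0.0] * 10 + [1.0] * 10
# Two profiles with alternating probe-to-probe noise of different size.
quiet = [0.01 * (-1) ** i for i in range(20)]
noisy = [0.5 * (-1) ** i for i in range(20)]
```

Because the measure is based on quartiles of adjacent differences, a single genuine copy number breakpoint (as in `clean`) does not inflate it, while probe-level noise does.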

PKCα and HMGB1 antagonistically control hydrogen peroxide-induced poly-ADP-ribose formation

PKCα and HMGB1 antagonistically control hydrogen peroxide-induced poly-ADP-ribose formation.¶

Andersson A, Bluwstein A, Kumar N, Teloni F, Traenkle J, Baudis M, Altmeyer M, Hottiger MO.¶

Abstract Harmful oxidation of proteins, lipids and nucleic acids is observed when reactive oxygen species (ROS) are produced excessively and/or the antioxidant capacity is reduced, causing 'oxidative stress'. Nuclear poly-ADP-ribose (PAR) formation is thought to be induced in response to oxidative DNA damage and to promote cell death under sustained oxidative stress conditions. However, what exactly triggers PAR induction in response to oxidative stress is incompletely understood. Using reverse phase protein array (RPPA) and in-depth analysis of key stress signaling components, we observed that PAR formation induced by H2O2 was mediated by the PLC/IP3R/Ca(2+)/PKCα signaling axis. Mechanistically, H2O2-induced PAR formation correlated with Ca(2+)-dependent DNA damage, which, however, was PKCα-independent. In contrast, PAR formation was completely lost upon knockdown of PKCα, suggesting that DNA damage alone was not sufficient for inducing PAR formation, but required a PKCα-dependent process. Intriguingly, the loss of PAR formation observed upon PKCα depletion was overcome when the chromatin structure-modifying protein HMGB1 was co-depleted with PKCα, suggesting that activation and nuclear translocation of PKCα releases the inhibitory effect of HMGB1 on PAR formation. Together, these results identify PKCα and HMGB1 as important co-regulators involved in H2O2-induced PAR formation, a finding that may have important relevance for oxidative stress-associated pathophysiological conditions.

The SIB Swiss Institute of Bioinformatics' resources: focus on curated databases

SIB Swiss Institute of Bioinformatics Members.¶

Abstract The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB's Bioinformatics resource portal ExPASy features over 150 resources, including UniProtKB/Swiss-Prot, ENZYME, PROSITE, neXtProt, STRING, UniCarbKB, SugarBindDB, SwissRegulon, EPD, arrayMap, Bgee, SWISS-MODEL Repository, OMA, OrthoDB and other databases, which are briefly described in this article.

A Biobank Supporting Rare Disease Research In Dermatopathology. Our Experience In Establishing A Biobank.

Beleut M, Seclaman E, Baudis M, Nicula A, and Solovan C¶

RoJCED 2, 202-206 (2015)¶

Abstract Biobanks of human patient sample tissues and blood fractions are increasingly recognized as major assets in disease research. We aim to identify DNA copy number and gene expression aberrations typical of different cutaneous pathologies. Another goal is the identification of circulating biomarkers both as prognostic, therapy- responsive and/or therapy-monitoring factors and as disease classifiers and subclassifiers.

Genomic instability of osteosarcoma cell lines in culture: impact on the prediction of metastasis relevant genes

Muff R, Rath P, Ram Kumar RM, Husmann K, Born W, Baudis M, Fuchs B.¶

Abstract Osteosarcoma is a rare but highly malignant cancer of the bone. As a consequence, the number of established cell lines used for experimental in vitro and in vivo osteosarcoma research is limited and the value of these cell lines relies on their stability during culture. Here we investigated the stability in gene expression by microarray analysis and array genomic hybridization of three low metastatic cell lines and derivatives thereof with increased metastatic potential using cells of different passages. The osteosarcoma cell lines showed altered gene expression during in vitro culture, and it was more pronounced in two metastatic cell lines compared to the respective parental cells. Chromosomal instability contributed in part to the altered gene expression in SAOS and LM5 cells with low and high metastatic potential. To identify metastasis-relevant genes in a background of passage-dependent altered gene expression, genes involved in "Pathways in cancer" that were consistently regulated under all passage comparisons were evaluated. Genes belonging to "Hedgehog signaling pathway" and "Wnt signaling pathway" were significantly up-regulated, and IHH, WNT10B and TCF7 were found up-regulated in all three metastatic compared to the parental cell lines. Considerable instability during culture in terms of gene expression and chromosomal aberrations was observed in osteosarcoma cell lines. The use of cells from different passages and a search for genes consistently regulated in early and late passages allows the analysis of metastasis-relevant genes despite the observed instability in gene expression in osteosarcoma cell lines during culture.

arrayMap 2014: an updated cancer genome resource

Cai H, Gupta S, Rath P, Ai N, Baudis M.¶

Abstract Somatic copy number aberrations (CNA) represent a mutation type encountered in the majority of cancer genomes. Here, we present the 2014 edition of arrayMap (www.arraymap.org), a publicly accessible collection of pre-processed oncogenomic array data sets and CNA profiles, representing a vast range of human malignancies. Since the initial release, we have enhanced this resource both in content and especially with regard to data mining support. The 2014 release of arrayMap contains more than 64,000 genomic array data sets, representing about 250 tumor diagnoses. Continue reading

Biopsying parapsoriasis: quo vadis? Are morphological stains enough or are ancillary tests needed?

Baderca F, Chiticariu E, Baudis M, Solovan C.¶

Abstract BACKGROUND: Parapsoriasis represents a group of cutaneous disorders that show variable clinical aspects somewhat resembling psoriasis, as reflected by its name. It was first named by Brocq, in 1902, as an entity with three components: pityriasis lichenoides, small plaque parapsoriasis and large plaque parapsoriasis. Nowadays, only the last two categories are included under the name parapsoriasis; they are considered disorders characterized by the presence of a mononuclear infiltrate in the dermis, composed of T-cells. Until now, no pathognomonic histopathological features have been established to diagnose parapsoriasis. Continue reading

Chromothripsis-like patterns are recurring but heterogeneously distributed features in a survey of 22,347 cancer genome screens

Cai H, Kumar N, Bagheri HC, von Mering C, Robinson MD, Baudis M.¶

Abstract BACKGROUND: Chromothripsis is a recently discovered phenomenon of genomic rearrangement, possibly arising during a single genome-shattering event. This could provide an alternative paradigm in cancer development, replacing the gradual accumulation of genomic changes with a "one-off" catastrophic event. However, the term has been used with varying operational definitions, with the minimal consensus being a large number of locally clustered copy number aberrations. The mechanisms underlying these chromothripsis-like patterns (CTLP) and their specific impact on tumorigenesis are still poorly understood. Continue reading

Progenetix: 12 years of oncogenomic data curation

Cai H, Kumar N, Ai N, Gupta S, Rath P, Baudis M.¶

Abstract DNA copy number aberrations (CNAs) can be found in the majority of cancer genomes and are crucial for understanding the potential mechanisms underlying tumor initiation and progression. Since the first release in 2001, the Progenetix project (www.progenetix.org) has provided a reference resource dedicated to provide the most comprehensive collection of genome-wide CNA profiles. Reflecting the application of comparative genomic hybridization techniques to tens of thousands of cancer genomes, over the past 12 years our data curation efforts have resulted in a more than 60-fold increase in the number of cancer samples presented through Progenetix. Continue reading

SIL1 mutations and clinical spectrum in patients with Marinesco-Sjogren syndrome

Krieger M, Roos A, Stendel C, Claeys KG, Sonmez FM, Baudis M, Bauer P, Bornemann A, de Goede C, Dufke A, Finkel RS, Goebel HH, Häussler M, Kingston H, Kirschner J, Medne L, Muschke P, Rivier F, Rudnik-Schöneborn S, Spengler S, Inzana F, Stanzial F, Benedicenti F, Synofzik M, Lia Taratuto A, Pirra L, Tay SK, Topaloglu H, Uyanik G, Wand D, Williams D, Zerres K, Weis J, Senderek J.¶

Abstract Marinesco-Sjögren syndrome is a rare autosomal recessive multisystem disorder featuring cerebellar ataxia, early-onset cataracts, chronic myopathy, variable intellectual disability and delayed motor development. More recently, mutations in the SIL1 gene, which encodes an endoplasmic reticulum resident co-chaperone, were identified as the main cause of Marinesco-Sjögren syndrome. Here we describe the results of SIL1 mutation analysis in 62 patients presenting with early-onset ataxia, cataracts and myopathy or combinations of at least two of these. We obtained a mutation detection rate of 60% (15/25) among patients with the characteristic Marinesco-Sjögren syndrome triad (ataxia, cataracts, myopathy) whereas the detection rate in the group of patients with more variable phenotypic presentation was below 3% (1/37). We report 16 unrelated families with a total of 19 different SIL1 mutations. Continue reading

Recurrent loss of heterozygosity in 1p36 associated with TNFRSF14 mutations in IRF4 translocation negative pediatric follicular lymphomas

Martin-Guerrero I, Salaverria I, Burkhardt B, Szczepanowski M, Baudis M, Bens S, de Leval L, Garcia-Orad A, Horn H, Lisfeld J, Pellissery S, Klapper W, Oschlies I, Siebert R.¶

Abstract Pediatric follicular lymphoma is a rare disease that differs genetically and clinically from its adult counterpart. With the exception of pediatric follicular lymphoma with IRF4-translocation, the genetic events associated with these lymphomas have not yet been defined. We applied array-comparative genomic hybridization and molecular inversion probe assay analyses to formalin-fixed paraffin-embedded tissues from 18 patients aged 18 years and under with IRF4 translocation negative follicular lymphoma. Continue reading

High resolution copy number analysis of IRF4 translocation-positive diffuse large B-cell and follicular lymphomas

Salaverria I, Martin-Guerrero I, Burkhardt B, Kreuz M, Zenz T, Oschlies I, Arnold N, Baudis M, Bens S, García-Orad A, Lisfeld J, Schwaenen C, Szczepanowski M, Wessendorf S, Pfreundschuh M, Trümper L, Klapper W, Siebert R.¶

Abstract Translocations affecting chromosome subband 6p25.3 containing the IRF4 gene have been recently described as characteristic alterations in a molecularly distinct subset of germinal center B-cell-derived lymphomas. Secondary changes have yet only been described in few of these lymphomas. Here, we performed array-comparative genomic hybridization and molecular inversion probe microarray analyses on DNA from 12 formalin-fixed paraffin-embedded and two fresh-frozen IRF4 translocation-positive lymphomas, which together with the previously published data on nine cases allowed the extension of copy number analyses to a total of 23 of these lymphomas. Continue reading

PKC signaling prevents irradiation-induced apoptosis of primary human fibroblasts

Bluwstein A, Kumar N, Léger K, Traenkle J, Oostrum Jv, Rehrauer H, Baudis M, Hottiger MO.¶

Abstract Primary cells respond to irradiation by activation of the DNA damage response and cell cycle arrest, which eventually leads to senescence or apoptosis. It is not clear in detail which signaling pathways or networks regulate the induction of either apoptosis or senescence. Primary human fibroblasts are able to withstand high doses of irradiation and to prevent irradiation-induced apoptosis. However, the underlying regulatory basis for this phenotype is not well understood. Continue reading

Molecular karyotyping as a relevant diagnostic tool in children with growth retardation with Silver-Russell features

Spengler S, Begemann M, Ortiz Brüchle N, Baudis M, Denecke B, Kroisel PM, Oehl-Jaschkowitz B, Schulze B, Raabe-Meyer G, Spaich C, Blümel P, Jauch A, Moog U, Zerres K, Eggermann T.¶

Abstract OBJECTIVE: To determine the contribution of submicroscopic chromosomal imbalances to the etiology of Silver-Russell syndrome (SRS) and SRS-like phenotypes. STUDY DESIGN: We performed molecular karyotyping in 41 patients with SRS or SRS-like features without known chromosome 7 and 11 defects using the Affymetrix SNP Array 6.0 system (Affymetrix, High Wycombe, United Kingdom). Continue reading

2p21 Deletions in hypotonia-cystinuria syndrome

Eggermann T, Spengler S, Venghaus A, Denecke B, Zerres K, Baudis M, Ensenauer R.¶

Abstract The significant role of the SLC3A1 gene in the aetiology of cystinuria is meanwhile well established and more than 130 point mutations have been reported. With the reports on genomic deletions including at least both SLC3A1 and the neighboured PREPL gene the spectrum of cystinuria mutations and of clinical symptoms could recently be enlarged: patients homozygous for these deletions suffer from a general neonatal hypotonia and growth retardation in addition to cystinuria. The hypotonia in these hypotonia-cystinuria (HCS) patients has been attributed to the total loss of the PREPL protein. Continue reading

DNA copy number alterations in central primitive neuroectodermal tumors and tumors of the pineal region: an international individual patient data meta-analysis

von Bueren AO, Gerss J, Hagel C, Cai H, Remke M, Hasselblatt M, Feuerstein BG, Pernet S, Delattre O, Korshunov A, Rutkowski S, Pfister SM, Baudis M.¶

Abstract Little is known about frequency, association with clinical characteristics, and prognostic impact of DNA copy number alterations (CNA) on survival in central primitive neuroectodermal tumors (CNS-PNET) and tumors of the pineal region. Searches of MEDLINE, PubMed, and EMBASE, covering the period from the original description of comparative genomic hybridization in 1992 to July 2010, identified 15 case series of patients with CNS-PNET and tumors of the pineal region whose tumors were investigated for genome-wide CNA. One additional case study was identified from contact with experts. Individual patient data were extracted from publications or obtained from investigators, and CNAs were converted to a digitized format suitable for data mining and subgroup identification. Summary profiles for genomic imbalances were generated from case-specific data. Overall survival (OS) was estimated using the Kaplan-Meier method, and by univariable and multivariable Cox regression models. Continue reading

Improved multiplex...

Improved multiplex ligation-dependent probe amplification analysis identifies a deleterious PMS2 allele generated by recombination with crossover between PMS2 and PMS2CL.¶

Wernstedt A, Valtorta E, Armelao F, Togni R, Girlando S, Baudis M, Heinimann K, Messiaen L, Staehli N, Zschocke J, Marra G, Wimmer K.¶

Abstract Heterozygous PMS2 germline mutations are associated with Lynch syndrome. Up to one third of these mutations are genomic deletions. Their detection is complicated by a pseudogene (PMS2CL), which--owing to extensive interparalog sequence exchange--closely resembles PMS2 downstream of exon 12. A recently redesigned multiplex ligation-dependent probe amplification (MLPA) assay identifies PMS2 copy number alterations with improved reliability when used with reference DNAs containing equal numbers of PMS2- and PMS2CL-specific sequences. We selected eight such reference samples--all publicly available--and used them with this assay to study 13 patients with PMS2-defective colorectal tumors. Continue reading

Integrative genome-wide expression profiling identifies three distinct molecular subgroups of renal cell carcinoma with different patient outcome

Beleut M, Zimmermann P, Baudis M, Bruni N, Bühlmann P, Laule O, Luu VD, Gruissem W, Schraml P, Moch H.¶

Abstract BACKGROUND: Renal cell carcinoma (RCC) is characterized by a number of diverse molecular aberrations that differ among individuals. Recent approaches to molecularly classify RCC were based on clinical, pathological as well as on single molecular parameters. As a consequence, gene expression patterns reflecting the sum of genetic aberrations in individual tumors may not have been recognized. In an attempt to uncover such molecular features in RCC, we used a novel, unbiased and integrative approach. Continue reading

Losses at chromosome...

Losses at chromosome 4q are associated with poor survival in operable ductal pancreatic adenocarcinoma.¶

Luebke AM, Baudis M, Matthaei H, Vashist YK, Verde PE, Hosch SB, Erbersdobler A, Klein CA, Izbicki JR, Knoefel WT, Stoecklein NH.¶

Abstract Here we tested the prognostic impact of genomic alterations in operable localized pancreatic ductal adenocarcinoma (PDAC). Fifty-two formalin-fixed and paraffin-embedded primary PDAC were laser micro-dissected and were investigated by comparative genomic hybridization after whole genome amplification using an adapter-linker PCR. Chromosomal gains and losses were correlated to clinico-pathological parameters and clinical follow-up data. The most frequent aberration was loss on chromosome 17p (65%) while the most frequent gains were detected at 2q (41%) and 8q (41%), respectively. The concomitant occurrence of losses at 9p and 17p was found to be statistically significant. Higher rates of chromosomal losses were associated with a more advanced primary tumor stage and losses at 9p and 18q were significantly associated with presence of lymphatic metastasis (chi-square: p = 0.03, p = 0.05, respectively). Deletions on chromosome 4 were of prognostic significance for overall survival and tumor recurrence (Cox-multivariate analysis: p = 0.026 and p = 0.021, respectively). In conclusion our data suggest the common alterations at chromosome 8q, 9p, 17p and 18q as well as the prognostic relevant deletions on chromosome 4q as relevant for PDAC progression. Our comprehensive data from 52 PDAC should provide a basis for future studies with a higher resolution to discover the relevant genes located within the chromosomal aberrations identified.

Specific genomic regions are differentially affected by copy number alterations across distinct cancer types, in aggregated cytogenetic data

Kumar N, Cai H, von Mering C, Baudis M.¶

Abstract BACKGROUND: Regional genomic copy number alterations (CNA) are observed in the vast majority of cancers. Besides specifically targeting well-known, canonical oncogenes, CNAs may also play more subtle roles in terms of modulating genetic potential and broad gene expression patterns of developing tumors. Any significant differences in the overall CNA patterns between different cancer types may thus point towards specific biological mechanisms acting in those cancers. In addition, differences among CNA profiles may prove valuable for cancer classifications beyond existing annotation systems. PRINCIPAL FINDINGS: We have analyzed molecular-cytogenetic data from 25579 tumors samples, which were classified into 160 cancer types according to the International Classification of Disease (ICD) coding system. Continue reading

arrayMap: a reference resource for genomic copy number imbalances in human malignancies

Cai H, Kumar N, Baudis M.¶

Abstract BACKGROUND: The delineation of genomic copy number abnormalities (CNAs) from cancer samples has been instrumental for identification of tumor suppressor genes and oncogenes and proven useful for clinical marker detection. An increasing number of projects have mapped CNAs using high-resolution microarray based techniques. So far, no single resource does provide a global collection of readily accessible oncogenomic array data. METHODOLOGY/PRINCIPAL FINDINGS: We here present arrayMap, a curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. Continue reading

Silver-Russell patients showing a broad range of ICR1 and ICR2 hypomethylation in different tissues

Begemann M, Spengler S, Kanber D, Haake A, Baudis M, Leisten I, Binder G, Markus S, Rupprecht T, Segerer H, Fricke-Otto S, Mühlenberg R, Siebert R, Buiting K, Eggermann T.¶

Abstract In all known congenital imprinting disorders an association with aberrant methylation or mutations at specific loci was well established. However, several patients with transient neonatal diabetes mellitus (TNDM), Silver-Russell syndrome (SRS) and Beckwith-Wiedemann syndrome (BWS) exhibiting multilocus hypomethylation (MLH) have meanwhile been described. Whereas TNDM patients with MLH show clinical symptoms different from carriers with isolated 6q24 aberrations, MLH carriers diagnosed as BWS or SRS present only the syndrome-specific features. Continue reading

CDCOCA: a statistical method to define complexity dependence of co-occuring chromosomal aberrations

Kumar N, Rehrauer H, Cai H, Baudis M.¶

Abstract BACKGROUND: Copy number alterations (CNA) play a key role in cancer development and progression. Since more than one CNA can be detected in most tumors, frequently co-occurring genetic CNA may point to cooperating cancer related genes. Existing methods for co-occurrence evaluation so far have not considered the overall heterogeneity of CNA per tumor, resulting in a preferential detection of frequent changes with limited specificity for each association due to the high genetic instability of many samples. Continue reading

MUC1 oncogene amplification correlates with protein overexpression in invasive breast carcinoma cells

Lacunza E, Baudis M, Colussi AG, Segal-Eiras A, Croce MV, Abba MC.¶

Abstract The MUC1 gene is aberrantly overexpressed in approximately 90% of human breast cancers. Several studies have shown that MUC1 overexpression is due to transcriptional regulatory events. However, the importance of gene amplification as a mechanism leading to the increase of MUC1 expression in breast cancer has been poorly characterized. The aim of this study was to evaluate the role of MUC1 gene amplification and protein expression in human breast cancer development. By means of real-time quantitative polymerase chain reaction and immunohistochemical methods, 83 breast tissue samples were analyzed for MUC1 gene amplification and protein expression. Continue reading

Chromosome 11p15 duplication in Silver-Russell syndrome due to a maternally inherited translocation t(11;15)

Eggermann T, Spengler S, Bachmann N, Baudis M, Mau-Holzmann UA, Singer S, Rossier E.¶

Abstract The role of 11p15 disturbances in the aetiology of Silver-Russell syndrome (SRS) is well established: in addition to hypomethylation of the H19/IGF2 differentially methylated regions, five patients with a duplication of maternal 11p15 material have been described. We report on a boy with SRS carrying a maternally inherited duplication of chromosome 11p15. Continue reading

Increased expression...

Increased expression of cellular retinol-binding protein 1 in laryngeal squamous cell carcinoma.¶

Peralta R, Baudis M, Vazquez G, Juárez S, Ortiz R, Decanini H, Hernandez D, Gallegos F, Valdivia A, Piña P, Salcedo M.¶

Abstract To investigate the genomic alterations in larynx carcinoma (LaCa) tissues and their prognostic value in predicting survival. To analyse the aberrations in the genome of LaCa patients, we used array comparative genomic hybridization in 19 human laryngeal tumour samples. DNA samples were also tested for human papillomavirus (HPV) sequences by polymerase chain reaction (PCR). Copy number gain was confirmed by real-time PCR. The cellular retinol-binding protein 1 (CRBP-1) gene expression was also confirmed by immunohistochemistry assay on LaCa tissues. To identify prognostic features, CRBP-1 gene gain was correlated to patient survival. The most common gains were detected for the CRBP-1 and EGFR genes, while DNA loss was detected in the RAF-1 gene. Immunohistochemistry revealed strong expression of CRBP1 protein in those cases with CRBP-1 gene gain. The CRBP-1 gene gain and its expression correlated significantly with survival (P = 0.003). Cox regression analysis indicated that CRBP-1 expression level was a factor of survival (P = 0.008). HPV sequences were detected in 42% of the samples, and did not show any relationship with specific gene alterations. Our data show that CRBP-1 gene gain can be determined by immunohistochemistry on routinely processed tissue specimens, and could serve as a potential novel marker for long-term survival in laryngeal squamous cell carcinoma.

Submicroscopic chromosomal imbalances in idiopathic Silver-Russell syndrome (SRS): the SRS phenotype overlaps with the 12q14 microdeletion syndrome

Spengler S, Schönherr N, Binder G, Wollmann HA, Fricke-Otto S, Mühlenberg R, Denecke B, Baudis M, Eggermann T.¶

Abstract Silver-Russell syndrome (SRS) is a heterogeneous disorder associated with intrauterine and postnatal growth restriction, body asymmetry, a relative macrocephaly, a characteristic triangular face and further dysmorphisms. In about 50% of patients, genetic/epigenetic alterations can be detected: >38% of patients show a hypomethylation of the IGF2/H19 imprinting region in 11p15, whereas the additional 10% carry a maternal uniparental disomy of chromosome 7.

Identification of a 21q22 duplication in a Silver-Russell syndrome patient further narrows down the Down syndrome critical region.¶

Eggermann T, Schönherr N, Spengler S, Jäger S, Denecke B, Binder G, Baudis M.¶

Abstract Several duplications of chromosome 21q helped to narrow down the Down syndrome (DS) critical region (DSCR) to chromosomal band 21q22 with an approximate length of 5.4 Mb. Recently, it has been suggested that the facial gestalt of DS has been linked to the distal part of the DSCR whereas the proximal region harboring DSCR1/RCAN and DSCAM should be associated with the cardiac abnormalities. Here, we report on a patient with Silver-Russell syndrome (SRS) and a paternally inherited 0.46 Mb duplication in 21q22 affecting the KCNE1 and DSCR1/RCAN genes. The identification of an involvement of KCNE1 was interesting because it encodes the beta-subunit of the KvLQT1 channel as the slow component of the cardiac delayed rectifier K(+) current. Since duplication of the KCNQ1 gene encoding the alpha-subunit of the same channel was reported recently in another SRS patient, we screened both genes for mutations in a cohort of SRS patients without detecting pathologic variants. We presume that the duplication of the two functionally linked genes in different patients with the same disorder is a coincidental finding. However, the lack of DS typical clinical features in our case allows us to further narrow down the DSCR in 21q22. We conclude that DSCR1/RCAN is not sufficient for generating phenotypic features associated with DS but our observation does not contradict a possible role for DSCR1/RCAN in mediating DYRK1A-based effects.

Quantifying cancer progression with conjunctive Bayesian networks

Gerstung M, Baudis M, Moch H, Beerenwinkel N.¶

Abstract MOTIVATION: Cancer is an evolutionary process characterized by accumulating mutations. However, the precise timing and the order of genetic alterations that drive tumor progression remain enigmatic. RESULTS: We present a specific probabilistic graphical model for the accumulation of mutations and their interdependencies. The Bayesian network models cancer progression by an explicit unobservable accumulation process in time that is separated from the observable but error-prone detection of mutations.

Inferring progression models for CGH data.¶

Liu J, Bandyopadhyay N, Ranka S, Baudis M, Kahveci T.¶

Abstract MOTIVATION: One of the mutational processes that has been monitored genome-wide is the occurrence of regional DNA copy number alterations (CNAs), which may lead to deletion or over-expression of tumor suppressors or oncogenes, respectively. Understanding the relationship between CNAs and different cancer types is a fundamental problem in cancer studies. RESULTS: This article develops an efficient method that can accurately model the progression of the cancer markers and reconstruct the evolutionary relationship between multiple types of cancers using comparative genomic hybridization (CGH) data. Such modeling can lead to better understanding of the commonalities and differences between multiple cancer types and potential therapies. We have developed an automatic method to infer a graph model for the markers of multiple cancers from a large population of CGH data. Our method identifies highly related markers across different cancer types. It then builds a directed acyclic graph that shows the evolutionary history of these markers based on how common each marker is in different cancer types. We demonstrated the use of this model in determining the importance of markers in cancer evolution. We have also developed a new method to measure the evolutionary distance between different cancers based on their markers. This method employs the graph model we developed for the individual markers to measure the distance between pairs of cancers. We used this measure to create an evolutionary tree for multiple cancers. Our experiments on the Progenetix database show that our markers are largely consistent with the reported hot-spot imbalances and most frequent imbalances. The results show that our distance measure can accurately reconstruct the evolutionary relationship between multiple cancer types.

Recurrent loss, but lack of mutations, of the SMARCB1 tumor suppressor gene in T-cell prolymphocytic leukemia with TCL1A-TCRAD juxtaposition.¶

Bug S, Dürig J, Oyen F, Klein-Hitpass L, Martin-Subero JI, Harder L, Baudis M, Arnold N, Kordes U, Dührsen U, Schneppenheim R, Siebert R.¶

Abstract In T-cell prolymphocytic leukemia (T-PLL), chromosomal imbalances affecting the long arm of chromosome 22 are regarded as typical chromosomal aberrations secondary to a TCRAD-TCL1A fusion due to inv(14) or t(14;14). We analyzed recently obtained data from conventional karyotyping, SNP-chip array copy number mapping, genome-wide expression profiling, and interphase fluorescence in situ hybridization (FISH) of inv(14)-positive T-PLL with respect to structural aberrations on chromosome 22. Combined gene chip and interphase FISH analyses revealed interstitial deletions on 22q in 4 of 12 cases, with one case additionally showing a terminal copy number gain. A minimally deleted region of approximately 9.1 Mb was delineated, from 16.2 Mb (22cen) to 25.3 Mb (22q12.1). The distal borders of copy number alterations spread over a region of approximately 8.8 Mb, from 25.2 Mb (22q12.1) to 34 Mb (22q12.3). Mutation screening of candidate tumor suppressor genes SMARCB1 and CHEK2 mapping to the minimally deleted and the breakpoint regions, respectively, in cases with hemizygous deletion, revealed no inactivating mutations. With gene expression profiling, no significantly downregulated genes were identified in the minimally deleted region. We therefore assume that haploinsufficiency or alternative pathomechanisms underlie chromosome 22 aberrations in T-PLL.

Translocations involving 8q24 in Burkitt lymphoma and other malignant lymphomas: a historical review of cytogenetics in the light of todays knowledge

Boerma EG, Siebert R, Kluin PM, Baudis M.¶

Abstract Burkitt lymphoma (BL) has a characteristic clinical presentation, morphology, immunophenotype and primary chromosomal aberration, that is, the translocation t(8;14)(q24;q32) or its variants. However, diagnostic dilemmas may arise in daily practice due to overlap of BL with subsets of other aggressive, mature B-cell lymphomas such as diffuse large B-cell lymphomas (DLBCL). Recently, two gene expression studies have described a distinct molecular profile for BL, but also showed the persistence of some cases intermediate between BL and DLBCL. An alternative approach to define BL is to consider (cyto)genetic data, in particular chromosomal abnormalities other than the t(8;14) or its variants. In this review the 'Mitelman Database of Chromosome Aberrations in Cancer,' harboring the majority of all published neoplasia-related karyotypes, was explored to define a cytogenetic profile of 'true' BL.

Chromosomal changes characterize head and neck cancer with poor prognosis

Bauer VL, Braselmann H, Henke M, Mattern D, Walch A, Unger K, Baudis M, Lassmann S, Huber R, Wienberg J, Werner M, Zitzelsberger HF.¶

Abstract It is well established that genetic alterations may be associated with prognosis in tumor patients. This study investigates chromosomal changes that predict the clinical outcome of head and neck squamous cell carcinoma (HNSCC) and correlate with characteristic clinicopathological parameters. We applied comparative genomic hybridization (CGH) to tissue samples from 117 HNSCC patients scheduled for radiotherapy. Genomic aberrations occurring in more than five patients were studied for impact on locoregional progression (LRP)-free survival.

A 10.7 Mb interstitial deletion of 13q21 without phenotypic effect defines a further non-pathogenic euchromatic variant

Roos A, Elbracht M, Baudis M, Senderek J, Schönherr N, Eggermann T, Schüler HM.¶

Abstract Chromosome 13 deletions are associated with widely varying phenotypes but the clinical picture nearly always includes mental and growth retardation, craniofacial dysmorphisms, and/or malformations. Several attempts have been made to link monosomy 13q intervals with specific clinical features, but a genotype-phenotype correlation could not be delineated. We report on a woman with a normal phenotype and intelligence referred for chromosomal analysis because of recurrent abortions followed by reproductive loss.

Comprehensive characterization of genomic aberrations in gangliogliomas by CGH, array-based CGH and interphase FISH

Hoischen A, Ehrler M, Fassunke J, Simon M, Baudis M, Landwehr C, Radlwimmer B, Lichter P, Schramm J, Becker AJ, Weber RG.¶

Abstract Gangliogliomas are generally benign neuroepithelial tumors composed of dysplastic neuronal and neoplastic glial elements. We screened 61 gangliogliomas [World Health Organization (WHO) grade I] for genomic alterations by chromosomal and array-based comparative genomic hybridization (CGH). Aberrations were detected in 66% of gangliogliomas (mean +/- SEM = 2.5 +/- 0.5 alterations/tumor). Frequent gains were on chromosomes 7 (21%), 5 (16%), 8 (13%), 12 (12%); frequent losses on 22q (16%), 9 (10%), 10 (8%). Recurrent partial imbalances comprised the minimal overlapping regions dim(10)(q25) and enh(12)(q13.3-q14.1).

Recurrent loss of the Y chromosome and homozygous deletions within the pseudoautosomal region 1: association with male predominance in mantle cell lymphoma

Nieländer I, Martín-Subero JI, Wagner F, Baudis M, Gesk S, Harder L, Hasenclever D, Klapper W, Kreuz M, Pott C, Martinez-Climent JA, Dreyling M, Arnold N, Siebert R.¶

Mantle cell lymphoma (MCL) is a B-cell lymphoproliferative disorder which predominantly affects men. In a large retrospective survey of the European MCL Network including 304 patients, median age was 63 years at first diagnosis with a male preponderance of 76% [1]. The genetic hallmark of MCL is the translocation t(11;14)(q13;q32) which leads to overexpression of the CCND1 gene encoding Cyclin D1. Although recent studies revealed a number of genomic alterations and differentially expressed genes in MCL, the causes for the male predominance are still unknown. Hormonal differences might contribute to this gender imbalance.

Combined single nucleotide polymorphism-based genomic mapping and global gene expression profiling identifies novel chromosomal imbalances, mechanisms and candidate genes important in the pathogenesis of T-cell prolymphocytic leukemia with inv(14)(q11q32)

Dürig J, Bug S, Klein-Hitpass L, Boes T, Jöns T, Martin-Subero JI, Harder L, Baudis M, Dührsen U, Siebert R.¶

Abstract T-cell prolymphocytic leukemia (T-PLL) is a rare aggressive lymphoma derived from mature T cells, which is, in most cases, characterized by the presence of an inv(14)(q11q32)/t(14;14)(q11;q32) and a characteristic pattern of secondary chromosomal aberrations. DNA microarray technology was employed to compare the transcriptomes of eight immunomagnetically purified CD3+ normal donor-derived peripheral blood cell samples, with five highly purified inv(14)/t(14;14)-positive T-PLL blood samples. Between the two experimental groups, 734 genes were identified as differentially expressed, including functionally important genes involved in lymphomagenesis, cell cycle regulation, apoptosis and DNA repair. Notably, the differentially expressed genes were found to be significantly enriched in genomic regions affected by recurrent chromosomal imbalances. Upregulated genes clustered on chromosome arms 6p and 8q, and downregulated genes on 6q, 8p, 10p, 11q and 18p. High-resolution copy-number determination using single nucleotide polymorphism chip technology in 12 inv(14)/t(14;14)-positive T-PLL including those analyzed for gene expression, refined chromosomal breakpoints as well as regions of imbalances. In conclusion, combined transcriptional and molecular cytogenetic profiling identified novel specific chromosomal loci and genes that are likely to be involved in disease progression and suggests a gene dosage effect as a pathogenic mechanism in T-PLL.

Genomic imbalances in 5918 malignant epithelial tumors: an explorative meta-analysis of chromosomal CGH data

Baudis M.¶

Abstract BACKGROUND: Chromosomal abnormalities have been associated with most human malignancies, with gains and losses on some genomic regions associated with particular entities. METHODS: Of the 15429 cases collected for the Progenetix molecular-cytogenetic database, 5918 malignant epithelial neoplasias analyzed by chromosomal Comparative Genomic Hybridization (CGH) were selected for further evaluation. For the 22 clinico-pathological entities with more than 50 cases, summary profiles for genomic imbalances were generated from case specific data and analyzed.
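The summary profiles mentioned in this abstract can be illustrated with a minimal sketch. It assumes each case is encoded as a list of per-interval status values (1 for gain, -1 for loss, 0 for no change); this encoding, the function name `imbalance_frequencies` and the example data are hypothetical and are not taken from the publication or from Progenetix.

```python
def imbalance_frequencies(cases):
    """Return one (gain %, loss %) tuple per genomic interval across all cases."""
    if not cases:
        raise ValueError("at least one case is required")
    n_intervals = len(cases[0])
    if any(len(case) != n_intervals for case in cases):
        raise ValueError("all cases must cover the same intervals")
    profile = []
    for i in range(n_intervals):
        # Count the cases showing a gain or a loss in interval i.
        gains = sum(1 for case in cases if case[i] > 0)
        losses = sum(1 for case in cases if case[i] < 0)
        profile.append((100.0 * gains / len(cases), 100.0 * losses / len(cases)))
    return profile


# Hypothetical data: four cases, three genomic intervals each.
example_cases = [
    [1, 0, -1],
    [1, 0, 0],
    [0, 0, -1],
    [1, -1, -1],
]
example_profile = imbalance_frequencies(example_cases)
```

Each tuple in `example_profile` gives the percentage of cases with a gain and with a loss in one interval, which is the kind of per-entity frequency profile the abstract refers to.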

ABCB1 over-expression and drug-efflux in acute lymphoblastic leukemia cell lines with t(17;19) and E2A-HLF expression

Baudis M, Prima V, Tung YH, Hunger SP.¶

Abstract BACKGROUND: The t(17;19)(q21;p13), which occurs in a small subset of acute lymphoblastic leukemias (ALLs) and is associated with a dismal prognosis, creates a chimeric E2A-HLF transcription factor with transforming properties. PROCEDURE: We used representational difference analysis to identify candidate E2A-HLF target genes. Transient transfection assays and an inducible expression model system were then used to evaluate the ability of E2A-HLF to modulate target gene expression. RESULTS: We identified ABCB1 (MDR1, P-glycoprotein) as a gene differentially expressed in ALL cell lines with and without E2A-HLF expression and demonstrated that t(17;19)+ ALL cell lines expressed high levels of ABCB1 protein and had a drug efflux-positive phenotype. Although ABCB1 transcription is regulated by C/EBPbeta via interaction with a DNA response element that shares significant homology with the optimal E2A-HLF binding site, E2A-HLF did not directly activate transcription of reporter genes under control of ABCB1 promoter elements in transient transfection assays. However, ABCB1 expression was induced in a DNA-binding independent manner by E2A-HLF, E2A-PBX1, and truncated E2A polypeptides consisting of those portions of E2A present in leukemic fusion proteins. CONCLUSIONS: E2A-HLF-mediated over-expression of ABCB1 may play a critical role in defining the clinical phenotype of ALLs with a t(17;19), suggesting pharmacologic modulation of ABCB1 activity as a rational therapeutic strategy for this chemotherapy resistant subtype of ALL.

Distance-based clustering of CGH data

Liu J, Mohammed J, Carter J, Ranka S, Kahveci T, Baudis M.¶

Abstract MOTIVATION: We consider the problem of clustering a population of Comparative Genomic Hybridization (CGH) data samples. The goal is to develop a systematic way of placing patients with similar CGH imbalance profiles into the same cluster. Our expectation is that patients with the same cancer types will generally belong to the same cluster as their underlying CGH profiles will be similar. RESULTS: We focus on distance-based clustering strategies. We do this in two steps. (1) Distances of all pairs of CGH samples are computed. (2) CGH samples are clustered based on this distance. We develop three pairwise distance/similarity measures, namely raw, cosine and sim. Raw measure disregards correlation between contiguous genomic intervals. It compares the aberrations in each genomic interval separately. The remaining measures assume that consecutive genomic intervals may be correlated. Cosine maps pairs of CGH samples into vectors in a high-dimensional space and measures the angle between them. Sim measures the number of independent common aberrations. We test our distance/similarity measures on three well known clustering algorithms, bottom-up, top-down and k-means with and without centroid shrinking. Our results show that sim consistently performs better than the remaining measures. This indicates that the correlation of neighboring genomic intervals should be considered in the structural analysis of CGH datasets. The combination of sim with top-down clustering emerged as the best approach. AVAILABILITY: All software developed in this article and all the datasets are available from the authors upon request. CONTACT: juliu@cise.ufl.edu.
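As a rough illustration of the cosine measure described in this abstract, the sketch below treats each CGH sample as a vector of per-interval status values and computes the cosine of the angle between two such vectors. The encoding (1 gain, -1 loss, 0 no change), the function name `cosine_similarity` and the sample data are assumptions made for illustration; the authors' implementation, and their raw and sim measures, are not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length status vectors."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomic intervals")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: a profile without any aberration is treated as
        # having zero similarity to every other profile.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical profiles over five genomic intervals.
sample_a = [1, 1, 0, -1, 0]
sample_b = [1, 0, 0, -1, 0]   # shares two aberrations with sample_a
sample_c = [-1, -1, 0, 1, 0]  # opposite aberrations to sample_a

similar = cosine_similarity(sample_a, sample_b)
opposite = cosine_similarity(sample_a, sample_c)
```

A pairwise matrix of such values (or of one minus the value, as a distance) is the kind of input that the clustering step in the abstract would consume.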

Allele-specific loss of heterozygosity in multiple colorectal adenomas: toward an integrated molecular cytogenetic map II

Mao X, Hamoudi RA, Talbot IC, Baudis M.¶

Abstract Colorectal cancer (CRC) remains a significant public health challenge despite our increased understanding of the genetic defects underlying the pathogenesis of this common disease. It has been thought that multiple mechanisms lead to the malignant phenotype, with familial predisposition syndromes accounting for only a small proportion of all CRC cases. To identify additional loci likely involved in CRC and to test the hypothesis of allele-specific loss of heterozygosity (LOH) for the localization of CRC susceptibility genes, we initially conducted a genome-wide allelotyping analysis of 48 adenomas from a patient with familial adenomatous polyposis coli (FAP) and 63 adenomas from 7 patients with sporadic CRC using 79 fluorescently tagged oligonucleotide primers amplifying microsatellite loci covering the human genome.

Online database and bioinformatics toolbox to support data mining in cancer cytogenetics

Baudis M.¶

Genetic losses in breast cancer: toward an integrated molecular cytogenetic map

Mao X, Hamoudi RA, Zhao P, Baudis M.¶

Abstract Breast cancer is the most common malignant disease in Caucasian women, but is less frequent in Chinese women. The molecular basis for this ethnic difference in disease pathogenesis remains unknown. To address this issue, we performed allelotyping analysis of formalin-fixed, paraffin-embedded samples from 21 Chinese patients with breast cancer using 59 fluorescently tagged oligonucleotide primers amplifying microsatellite loci. Loss of heterozygosity (LOH) was found in all tumor samples. Frequent allelic losses were identified at markers D3S1578 (56%); D7S507 (55%); D1S2766 (50%); D17S789 and D17S946 (43% each); D19S814 (35%); D2S162, D13S158 and D13S296 (33% each); D1S551 and D1S2800 (29% each); D3S1597 and D6S260 (22% each); and D1S1588 (21%). To compare our data to previous reports, we determined the band-specific frequency of chromosomal imbalances in breast cancer karyotypes reported in the Mitelman database, and from the CGH results of cases accessible through the Progenetix website. Furthermore, published LOH analyses of breast cancer cases were compared to our own LOH results, demonstrating the most common chromosomal regions affected by allelic losses. The combined results provide a comprehensive view of genetic losses in breast cancers, indicating the comparability of these different techniques and suggesting the presence of a distinct subset of breast cancers with high-frequency LOH at chromosomes 1 and 2p in Chinese patients.

Unequivocal delineation of clinicogenetic subgroups and development of a new model for improved outcome prediction in neuroblastoma

Vandesompele J, Baudis M, De Preter K, Van Roy N, Ambros P, Bown N, Brinkschmidt C, Christiansen H, Combaret V, Lastowska M, Nicholson J, O'Meara A, Plantaz D, Stallings R, Brichard B, Van den Broecke C, De Bie S, De Paepe A, Laureys G, Speleman F.¶

Abstract PURPOSE: Neuroblastoma is a genetically heterogeneous pediatric tumor with a remarkably variable clinical behavior ranging from widely disseminated disease to spontaneous regression. In this study, we aimed for comprehensive genetic subgroup discovery and assessment of independent prognostic markers based on genome-wide aberrations detected by comparative genomic hybridization (CGH).

Microarray comparative genomic hybridization detection of chromosomal imbalances in uterine cervix carcinoma

Hidalgo A, Baudis M, Petersen I, Arreola H, Piña P, Vázquez-Ortiz G, Hernández D, González J, Lazos M, López R, Pérez C, García J, Vázquez K, Alatorre B, Salcedo M.¶

Abstract BACKGROUND: Chromosomal Comparative Genomic Hybridization (CGH) has been applied to all stages of cervical carcinoma progression, defining a specific pattern of chromosomal imbalances in this tumor. However, given its limited spatial resolution, chromosomal CGH has offered only general information regarding the possible genetic targets of DNA copy number changes. METHODS: In order to further define specific DNA copy number changes in cervical cancer, we analyzed 20 cervical samples (3 pre-malignant lesions, 10 invasive tumors, and 7 cell lines), using the GenoSensor microarray CGH system to define particular genetic targets that suffer copy number changes.

Progenetix.net: an online repository for molecular cytogenetic aberration data.

Baudis M, Cleary ML.¶

Abstract Through sequencing projects and, more recently, array-based expression analysis experiments, a wealth of genetic data has become accessible via online resources. In contrast, few of the (molecular-) cytogenetic aberration data collected in the last decades are available in a format suitable for data mining procedures. www.progenetix.net is a new online repository for previously published chromosomal aberration data, allowing the addition of band-specific information about chromosomal imbalances to oncologic data analysis efforts.

Gain of chromosome arm 9p is characteristic of primary mediastinal B-cell lymphoma (MBL): comprehensive molecular cytogenetic analysis and presentation of a novel MBL cell line

Bentz M, Barth TF, Brüderlein S, Bock D, Schwerer MJ, Baudis M, Joos S, Viardot A, Feller AC, Müller-Hermelink HK, Lichter P, Döhner H, Möller P.¶

Abstract Primary mediastinal B-cell lymphoma (MBL) is an aggressive Non-Hodgkin's Lymphoma, which has been recognized as a distinct disease entity. We performed a comprehensive molecular cytogenetic study analyzing 43 MBLs. By comparative genomic hybridization (CGH), the most common aberrations were gains of chromosome arms 9p and Xq, which were present in 56% and 40% of cases, respectively. Based on the limited resolution of CGH, this technique may underestimate the real incidence of aberrations.

Comparative genomic hybridization for the analysis of leukemias and lymphomas

Baudis M, Bentz M.¶

Abstract Cytogenetic methods have become increasingly important tools for both research in hematological malignancies and for the diagnostic workup of leukemias and lymphomas. The knowledge about specific chromosomal aberrations has been an essential prerequisite for the identification of pathogenetically relevant genes. Important examples are molecular genetic analyses of the breakpoint regions in chromosomal translocations, which resulted in the detection of protooncogenes such as ABL in chronic myeloid leukemia (CML) and acute lymphoblastic leukemia (ALL) [t(9;22)(q34;q11)], or MYC in Burkitt's lymphoma [t(8;14)(q24;q32); for a review see refs. 1 and 2].

Potential of chromosomal and matrix-based comparative genomic hybridization for molecular diagnostics in lymphomas

Wessendorf S, Lichter P, Schwänen C, Fritz B, Baudis M, Walenta K, Kloess M, Döhner H, Bentz M.¶

Abstract Genome research led to one of the largest scientific achievements of the last decade. Apart from sequencing the human genome, there was also a huge increase of knowledge regarding genomic aberrations in cancer. In Non-Hodgkin’s lymphomas (NHL), some of these findings already are of clinical relevance; specific genomic aberrations are characteristic of distinct subtypes of NHL and correlate with certain morphological, immunological and clinical findings.

t(11;14)-positive mantle cell lymphomas exhibit complex karyotypes and share similarities with B-cell chronic lymphocytic leukemia

Bentz M, Plesch A, Bullinger L, Stilgenbauer S, Ott G, Müller-Hermelink HK, Baudis M, Barth TF, Möller P, Lichter P, Döhner H.¶

Abstract Until now, few data on additional chromosomal aberrations in t(11;14)-positive mantle cell lymphomas (MCLs) have been published. We analyzed 39 t(11;14)-positive MCLs by either comparative genomic hybridization (CGH; n = 8), fluorescence in situ hybridization (FISH) with a set of DNA probes detecting the most frequent aberrations in B-cell neoplasms (n = 12), or both techniques (n = 19). The t(11;14) was present in all cases. In 37 of 39 cases, chromosomal imbalances were found.

Analysis of genomic alterations in benign, atypical, and anaplastic meningiomas: toward a genetic model of meningioma progression

Weber RG, Boström J, Wolter M, Baudis M, Collins VP, Reifenberger G, Lichter P.¶

Abstract Nineteen benign [World Health Organization (WHO) grade I; MI], 21 atypical (WHO grade II; MII), and 19 anaplastic (WHO grade III; MIII) sporadic meningiomas were screened for chromosomal imbalances by comparative genomic hybridization (CGH). These data were supplemented by molecular genetic analyses of selected chromosomal regions and genes. With increasing malignancy grade, a marked accumulation of genomic aberrations was observed; i.e., the numbers (mean +/- SEM) of total alterations detected per tumor were 2.9 +/- 0.7 for MI, 9.2 +/- 1.2 for MII, and 13.3 +/- 1.9 for MIII.

High-level DNA amplifications are common genetic aberrations in B-cell neoplasms

Werner CA, Döhner H, Joos S, Trümper LH, Baudis M, Barth TF, Ott G, Möller P, Lichter P, Bentz M.¶

Abstract Gene amplification is one of the molecular mechanisms resulting in the up-regulation of gene expression. In non-Hodgkin's lymphomas, such gene amplifications have been identified rarely. Using comparative genomic hybridization, a technique that has proven to be very sensitive for the detection of high-level DNA amplifications, we analyzed 108 cases of B-cell neoplasms (42 chronic B-cell leukemias, 5 mantle cell lymphomas, and 61 aggressive B-cell lymphomas).

Efficacy of current molecular cytogenetic protocols for the diagnosis of chromosome aberrations in tumor specimens

Lichter P, Fischer K, Joos S, Fink T, Baudis M, Potkul RK, Ohl S, Solinas-Toldo S, Weber R, Stilgenbauer S, Bentz M, Döhner H.¶

Abstract Molecular cytogenetics provides a powerful link between molecular genetic analysis and chromosome morphology, allowing one to pinpoint structurally aberrant chromosome regions on the molecular level. Fluorescence in situ hybridization with selected DNA probes allows the design of efficient and sensitive tools for the diagnosis of chromosomal aberrations present in tumor cells. Comparative genomic hybridization (CGH) allows the identification of chromosomal imbalances in a comprehensive manner, and is applied to solid tumors and hematological malignancies in order to (i) identify clonal differences within a specimen, (ii) contribute to tumor classifications, (iii) identify recurrent chromosomal gains and losses as starting points for the characterization and isolation of pathogenetically relevant genes, such as proto-oncogenes and tumor suppressor genes respectively, (iv) identify imbalances of prognostic relevance, (v) detect high-copy-number amplification and other markers of genetic instability, and (vi) analyze chromosomal imbalances during tumor progression.

Chromosome imbalances in papillary renal cell carcinoma and first cytogenetic data of familial cases analyzed by comparative genomic hybridization

Bentz M, Bergerheim US, Li C, Joos S, Werner CA, Baudis M, Gnarra J, Merino MJ, Zbar B, Linehan WM, Lichter P.¶

Abstract We used comparative genomic hybridization to analyze 17 tumor samples from 11 patients with papillary renal cell carcinoma (RCC), including three patients with hereditary papillary RCC. Whereas the most frequent aberrations confirmed data obtained by banding analyses, copy number increases on 5q, which previously were considered characteristic of nonpapillary RCC, were identified in two cases.

Identification of genetic imbalances in malignant lymphoma using comparative genomic hybridization

Bentz M, Döhner H, Werner CA, Huck K, Baudis M, Joos S, Schlegelberger B, Trümper LH, Feller AC, Pfreundschuh M.¶

Abstract In comparison to leukemias, the clinical relevance of chromosomal aberrations in non-Hodgkin's lymphoma (NHL) is not as well understood. This is primarily due to limitations of chromosomal banding techniques which have been the central methods for cytogenetic analysis. These techniques depend on the availability of fresh tumor tissue and the examination of metaphase cells which may not be representative for the major cell clone in vivo. In contrast, the new technique of comparative genomic hybridization (CGH) allows researchers to obtain a comprehensive view of chromosomal gains and losses by analyzing tumor DNA, which can be prepared from archival tissue samples.

Twelve quick tips for deploying a Beacon

Some hints for Beacon developers & implementers

Lauren A Fromont, Mauricio Moldes, Michael Baudis, Anthony J Brookes, Arcadi Navarro and Jordi Rambla¶

PLoS Comput Biol. 2024 Mar 1;20(3):e1011817.¶

labelSeg: segment annotation for tumor copy number alteration profiles

A tool to assign relative SCNA levels to segments

Hangjia Zhao and Michael Baudis¶

Briefings in Bioinformatics (Oxford). 2024 Jan 31;2024:bbad541.¶

Short tandem repeat mutations regulate gene expression in colorectal cancer

Exploring STR patterns and their relation to expression changes in cancer

Max A Verbiest, Oxana Lundström, Feifei Xia, Michael Baudis, Tugce Bilgin Sonay, Maria Anisimova¶

doi: https://doi.org/10.1101/2023.11.29.569189¶

Phenopacket-tools: Building and validating GA4GH Phenopackets

Bioinformatics tools and examples for working with the Phenopackets standard

Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. (2023) Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One. 18:e0285433.¶

Candidate targets of copy number deletion events across 17 cancer types

Identifying cancer related genes against the background of somatic CNV events

Huang Q and Baudis M¶

doi: 10.3389/fgene.2022.1017657¶

Previous bioRxiv preprint (first posted 2022-06-29): doi.org/10.1101/2022.06.29.498080¶

GA4GH Phenopackets: A Practical Introduction

Phenopackets v2 introduction with practical examples

Ladewig MS, Jacobsen JO, Wagner AH, Danis D, Kassaby BE, Gargano M, Groza T, Baudis M, Steinhaus R, Seelow D, Bechrakis NE, Mungall CJ, Schofield PN, Elemento O, Smith L, McMurry JA, Munoz-Torres M, Haendel MA and Robinson PN¶

Advanced Genetics 2022, 2200016. LINK¶

The GA4GH Phenopacket schema defines a computable representation of clinical data

Phenopackets v2 publication

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Buske OJ, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Essenwanger A, Freimuth RR, ... , Haendel MA, Robinson PN, The GAGHPMC.¶

Nature Biotechnology. 2022;40:817-820. LINK | PMID:35705716¶

Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond

Beacon v2 publication

Rambla J, Baudis M, Ariosa R, Beck T, Fromont LA, Navarro A, Paloots R, Rueda M, Saunders G, Singh B, Spalding JD.¶

Human Mutation. 2022 Mar 17. PMID:35297548¶

The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine

Phenopackets v2 preprint

Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Callahan TJ, Chute CG, Courtot M, Danis D, Elemento O, Freimuth RR, ..., Haendel MA, Robinson PN.¶

medRxiv, 2021.11.27.21266944. doi:10.1101/2021.11.27.21266944¶

The GA4GH Variation Representation Specification (VRS): a Computational Framework for the Precise Representation and Federated Identification of Molecular Variation.

Alex H. Wagner, Lawrence Babb, Gil Alterovitz, Michael Baudis, Matthew Brush, Daniel L. Cameron, Melissa Cline, Malachi Griffith, Obi L. Griffith, ..., Melissa Konopko, Heidi L. Rehm, Andrew D. Yates, Robert R. Freimuth, Reece K. Hart¶

Wagner, Alex H. et al. Cell Genomics, Volume 1, Issue 2, 100027 doi:10.1016/j.xgen.2021.100027¶

bioRxiv preprint 2021.01.15.426843 (2021-01-15)¶

International federation of genomic medicine databases using GA4GH standards

Adrian Thorogood, Heidi L. Rehm, Peter Goodhand, Angela J.H. Page, Yann Joly, Michael Baudis, Jordi Rambla, Arcadi Navarro, Tommi H. Nyronen, Mikael Linden, Edward S. Dove, Marc Fiume, Michael Brudno, Melissa S. Cline, Ewan Birney¶

Thorogood, Adrian et al. Cell Genomics, Volume 1, Issue 2, 100032 doi:10.1016/j.xgen.2021.100032¶

GA4GH: International policies and standards for data sharing across genomic research and healthcare

Rehm, Heidi L. et al. Cell Genomics, Volume 1, Issue 2, 100029 doi:10.1016/j.xgen.2021.100029¶

The Progenetix oncogenomic resource in 2021

Article describing the current content & technical status of progenetix.org

Qingyao Huang, Paula Carrio Cordo, Bo Gao, Rahel Paloots, Michael Baudis¶

Database (Oxford). 2021 Jul 17;2021:baab043.¶

Signatures of Discriminative CNA in 31 Cancer Subtypes

Bo Gao and Michael Baudis (2021)¶

Published in Frontiers in Genetics, 2021-05-13¶

Copy number variant heterogeneity among cancer types reflects inconsistent concordance with diagnostic classifications

Paula Carrio Cordo and Michael Baudis¶

bioRxiv. doi: 10.1101/2021.03.01.433348¶

The Ubiquitin Ligase TRIP12 Limits PARP1 Trapping and Constrains PARP Inhibitor Efficiency

Marco Gatti, Ralph Imhof, Qingyao Huang, Michael Baudis, Matthias Altmeyer¶

Cell Rep. 2020 Aug 4. DOI: 10.1016/j.celrep.2020.107985¶

Oncology Informatics: Status Quo and Outlook - Review

Paul Martin Putora, Michael Baudis, Beth M. Beadle, Issam El Naqa, Frank A. Giordano and Nils H. Nicolay¶

Oncology, 2020-05-14. DOI 10.1159/000507586 (Review)¶

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Bo Gao and Michael Baudis (2020)¶

bioRxiv, 2019-07-31. DOI 10.1101/720854¶

Genomics, Volume 112, Issue 5, September 2020, Pages 3331-3341, accepted 2020-05-06 doi.org/10.1016/j.ygeno.2020.05.008.¶
