Federated discovery and sharing of genomic data using Beacons

Miroslav Cupak, Stephen Keenan, Jordi Rambla, Sabela de la Torre, Stephanie Dyke, Anthony Brookes, Knox Carey, David Lloyd, Peter Goodhand, Maximilian Haeussler, Michael Baudis, Heinz Stockinger, Lena Dolman, Ilkka Lappalainen, Juha Törnroos, Mikael Linden, John Spalding, Saif Ur-Rehman, Angela Page, Paul Flicek, Susheel Varma, Gary Saunders, Serena Scollen, Stephen Sherry, David Haussler, Beacon Project Team

Nat Biotechnol (2019), accepted 2019-01-23

Abstract

The Beacon Project (github.com/ga4gh-beacon/) is a GA4GH initiative that is developing an open specification for genetic variation discovery and sharing. The project demonstrates the willingness of international organizations to work together to define standards for, and actively engage in, genomic data sharing. In the two years since the project’s inception, over 90 Beacons have been lit by 35 organizations, serving over 200 datasets. These datasets are searchable individually or in aggregate via the Beacon Network (beacon-network.org), a federated search engine across the world’s public Beacons. Beacons serve large, diverse, and valuable collections of genomics datasets, showing that a global federated model for genomics data discovery and sharing is viable using a simple and securable technical protocol. With continued adoption, Beacons will produce a large network of searchable genomics datasets whose global representation and accessibility will unlock potential for new genomics-derived discoveries and applications in medicine.
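To make the idea of searching Beacons "in aggregate" concrete, the following is a minimal sketch of federated querying over in-memory stand-ins. It assumes that each Beacon answers a single allele-presence question with yes or no, which the abstract itself does not spell out. The names `MockBeacon`, `has_allele`, and `federated_query`, along with the query tuple layout, are hypothetical and are not taken from the Beacon specification or the Beacon Network implementation.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical query shape: (chromosome, position, alternate allele).
Allele = Tuple[str, int, str]


@dataclass(frozen=True)
class MockBeacon:
    """In-memory stand-in for one organization's Beacon (illustrative only)."""
    name: str
    alleles: FrozenSet[Allele]

    def has_allele(self, query: Allele) -> bool:
        # Assumption: a Beacon reveals only presence or absence,
        # never the underlying records.
        return query in self.alleles


def federated_query(beacons: Iterable[MockBeacon], query: Allele) -> Dict[str, bool]:
    """Ask every beacon the same question and collect per-beacon answers."""
    return {beacon.name: beacon.has_allele(query) for beacon in beacons}


# Three toy beacons with made-up contents.
beacons = [
    MockBeacon("beacon-a", frozenset({("1", 1000, "T")})),
    MockBeacon("beacon-b", frozenset({("1", 1000, "T"), ("2", 500, "G")})),
    MockBeacon("beacon-c", frozenset()),
]

responses = federated_query(beacons, ("1", 1000, "T"))
found_anywhere = any(responses.values())
print(responses, found_anywhere)
```

In this sketch the aggregator never sees the datasets themselves, only one boolean per Beacon, which is one way a federated search can remain simple while leaving access control with each data holder.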