open data


The OmicIDX is an ecosystem that treats omics metadata as data. We collect metadata from omics repositories, process it into computable forms, and make it available for search, analysis, bulk processing, and integration with additional data resources.

Symposium: Participants Share Data 2019

The NCI organized a symposium to broadly assess the landscape of personal control of genomic data. We focused the meeting on the patient, research participant, and personal perspectives. The Symposium drew 200 in-person participants and over 300 online.

Practical Data Science and Informatics Training

This talk compares and contrasts four formats for data science and informatics education. The discussion will highlight some approaches that I have found useful to facilitate the training process. I also present some practical and simple tips that I …

The cancer data ecosystem: data and cloud resources for cancer genomic data science

In this talk, I motivate the need for cloud-based cancer data resources. I provide an overview of the NCI Genomic Data Commons and how to interact with it both through a web portal and programmatically using the …

GenomicDataCommons Example: UUID to TCGA and TARGET Barcode Translation

One of the features of the NCI Genomic Data Commons is that everything has a unique identifier in the form of a UUID. However, because many legacy projects and much of the literature do not use UUIDs but instead use TCGA sample barcodes, one simple use case for the GenomicDataCommons package is to map from the UUID for a file or a set of files back to the associated TCGA barcode(s).
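The mapping described above can be sketched in a few lines. This is a minimal Python sketch against the public GDC REST API, not the GenomicDataCommons R package itself. The endpoint URL, the `files.file_id` filter field, and the `cases.samples.submitter_id` response field are assumptions based on the public GDC API conventions, and the function names are hypothetical, chosen only for illustration.

```python
import json
import urllib.request

# Assumed public GDC endpoint for file metadata queries.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_query(file_uuids):
    """Build the JSON body that asks for sample barcodes of the given files."""
    uuids = list(file_uuids)
    return {
        "filters": {
            "op": "in",
            "content": {"field": "files.file_id", "value": uuids},
        },
        # Assumed field names: the file UUID plus the sample-level barcode.
        "fields": "file_id,cases.samples.submitter_id",
        "format": "JSON",
        "size": len(uuids),
    }


def barcodes_from_response(response):
    """Map each file UUID to the sorted, de-duplicated barcodes in a response."""
    mapping = {}
    for hit in response.get("data", {}).get("hits", []):
        barcodes = {
            sample["submitter_id"]
            for case in hit.get("cases", [])
            for sample in case.get("samples", [])
            if "submitter_id" in sample
        }
        mapping[hit["file_id"]] = sorted(barcodes)
    return mapping


def uuid_to_barcode(file_uuids):
    """Query the GDC (needs network access) and return {uuid: [barcodes]}."""
    body = json.dumps(build_query(file_uuids)).encode("utf-8")
    request = urllib.request.Request(
        GDC_FILES_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return barcodes_from_response(json.load(reply))
```

Calling `uuid_to_barcode` with a list of file UUIDs would return a dictionary keyed by UUID. A file derived from several samples maps to several barcodes, which is why each value is a list rather than a single string.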