Background: One of the challenges of producing a performant build environment for Linux, such as one that lets developers test software in identical environments, is the need to compile R packages from source on Linux. If, however, every machine had an identical set of installed libraries, kernel version, compiler, and so on, binary packages could be used on Linux as well. Docker provides just such a shareable, identical environment for Linux.
The Bioconductor package ecosystem continues to grow at an exponential rate (check it; I am right). We have recently completed the BiocPkgtools package, which can mine package metadata, build reports, and dependencies, and can produce interesting plots of package dependencies. I was recently asked about the dependency structure of packages labeled by the package authors (using biocViews) as “SingleCell”. I am posting the code here, just for fun.
This talk presents a very quick overview of the Bioconductor project, focusing on its values of reproducibility, reuse, and openness.
This short post introduces the gdc_clinical() function recently added to the GenomicDataCommons package.
The rich data model at the NCI Genomic Data Commons (GDC) includes clinical and biospecimen details. A recently added feature of the NCI GDC Data Portal is the ability to download tab-delimited or JSON files containing the clinical and biospecimen details of samples. The details available in these simplified formats are also available via the GDC API.
The NCI Genomic Data Commons (GDC) is now the authoritative source of data from The Cancer Genome Atlas (TCGA) as well as several other projects of import to the cancer research community. One of the available assays produces somatic variant calls, which are identified by comparing tumor reads with normal reads to find variants, relative to the human reference genome, that are not present in the patient's normal genome.