R

Single cell packages and dependencies in Bioconductor using BiocPkgTools

The Bioconductor package ecosystem continues to grow at an exponential rate (check it–I am right). We have recently completed the BiocPkgTools package, which can mine package metadata, build reports, and dependencies, and can produce interesting plots of package dependencies. I was recently asked about the dependency structure of packages labeled by their authors (using biocViews) as “SingleCell”. I am posting the code here, just for fun.

Bioconductor: software for interpreting high-throughput biological data

This talk presents a very quick overview of the Bioconductor project, focusing on its values of reproducibility, reuse, and openness.

Practical Data Science and Informatics Training

This talk compares and contrasts four formats for data science and informatics education. The discussion will highlight some approaches that I have found useful to facilitate the training process. I also present some practical and simple tips that I …

Orchestrating a community-developed computational workshop and accompanying training materials

The importance of bioinformatics, computational biology, and data science in biomedical research continues to grow, driving a need for effective instruction and education. A workshop setting, with lectures and guided hands-on tutorials, is a common …

A computable Bioconductor build report

Bioconductor spends a substantial amount of effort building its catalog of software each day. Reporting these results is critical for developers, users, and project leaders to understand the software “health” of the project. The Bioconductor build reports are generally available as HTML pages that are navigable with bookmarks and link out to detailed reports of errors, etc. However, the build reports are not readily computable, so mining the reports, automating their processing by developers, and learning about failure modes automatically are all challenging.

Matched tumor/normal pairs--a use case for the GenomicDataCommons Bioconductor package

The NCI Genomic Data Commons (GDC) is a reboot of the approach that NCI uses to manage and expose genomic and associated clinical and experimental metadata. I have been working on a Bioconductor package that interfaces with the GDC API to provide search and data retrieval from within R. In the first of what will likely be a set of use cases for the GenomicDataCommons, I am going to address a question that came up on Twitter from @sleight82