Posts

Building R Binary Packages for Linux

Background: One of the challenges of producing a performant build environment for Linux, such as what might be used to have developers …

Experimenting with GitHub Actions

GitHub Actions allows flexible and potentially complicated actions, composed into workflows, that respond to events on GitHub. Continuous …

Elasticsearch Strings Runthrough

This little post is just a brain dump on Elasticsearch testing and searching. I am writing it to allow easy testing of …

OmicIDX on BigQuery

Availability: This IPython notebook is available at https://github.com/seandavi/omicidx_examples. OmicIDX is a project to democratize …

Single cell packages and dependencies in Bioconductor using BiocPkgTools

The Bioconductor package ecosystem continues to grow at an exponential rate (check it–I am right). We have recently completed the …

Using google cloud registry for private docker images

In this post, I will quickly build a Docker image containing the sra-toolkit and a key for dbGaP downloads. Because the key file is …

Using directory-local variables to customize the emacs project experience

I use Emacs for nearly all my editing and interactive analysis. As one typically does, more than one project is the norm, not the …

Infrastructure-as-Code: Building the Bioconductor Conference AMI With Packer

One of the main features of the annual Bioconductor Conference is the proportion of time spent working with code in the form of …

Extracting Clinical Information Using the GenomicDataCommons Package

This short post introduces the gdc_clinical() function recently added to the GenomicDataCommons package. The rich data model at the NCI …

Create a basic Apache Spark cluster in the cloud (in 5 minutes)

Apache Spark in a few words: Apache Spark is a software and data science platform that is purpose-built for large- to massive-scale data …