IT

Experimenting with Github Actions

GitHub actions allow flexible and potentially complicated `actions` that comprise `workflows` that respond to events on Github. Continuous integration, messaging Slack, greeting new contributors, deploying applications, and many other templates are ready for customization and integration into any repo.

Container Notes

General notes about using containers

Infrastructure-as-Code: Building the Bioconductor Conference AMI With Packer

One of the main features of the annual Bioconductor Conference is the proportion of time spent working with code in the form of workshops. To support these workshops, we ask workshop presenters to supply Rmarkdown materials which we collate into workshop materials. Using literate programming approaches like Rmarkdown ensures that the workflows are self-consistent and work as expected. In addition to the Rmarkdown workshop materials, we also need a consistent computing environment that can support reasonably large computation, provide high-performance network and file system access, and is essentially unlimited in scale (we expect to have >150 participants, each with his/her own machine).

Create a basic Apache Spark cluster in the cloud (in 5 minutes)

Apache Spark in a few words Apache Spark is a software and data science platform that is purpose-built for large- to massive-scale data processing. Spark supports processing of data in batch mode (run as a pipeline) or in interactive mode using command-line programming style or in popular notebook style of coding. While scala is the native language for Spark, language bindings exist for python, R, and Java as well.