This project is a work in progress. Both the web-based API and the R package are under active development and may change.
Documentation is currently hosted at:
The Sequence Read Archive (SRA) is NIH’s primary archive of high-throughput sequencing data and is part of the International Nucleotide Sequence Database Collaboration (INSDC), which comprises the NCBI Sequence Read Archive (SRA), the European Bioinformatics Institute (EBI), and the DNA Database of Japan (DDBJ). Data submitted to any of the three organizations are shared among them.
This package serves as a resource for searching and large-scale processing of SRA metadata. It is a complete re-imagining of the SRAdb package which, while still quite useful, requires a large download and is difficult to maintain. The SRAdbV2 package provides an R client to a high-performance web-based API (usable outside of R if needed).
The data served by the SRAdbV2 package are processed in the following steps.
The raw API is documented and queryable here: