R/buildPkgDependencyGraph.R
Bioconductor is built on an extensive set of
core capabilities and data structures, so package
developers routinely depend on other packages
for interoperability and functionality. This
function extracts package dependency information
from biocPkgList() and returns a tidy
data.frame that can be used for analysis
and to build graph structures of package dependencies.
buildPkgDependencyDataFrame(dependencies = c("strong", "most", "all"), ...)

dependencies: character() a vector listing the types of dependencies, a
subset of c("Depends", "Imports", "LinkingTo", "Suggests", "Enhances").
Character string "all" is shorthand for that vector, character string
"most" for the same vector without "Enhances", character string "strong"
(default) for the first three elements of that vector.

...: parameters passed along to biocPkgList()
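The shorthand values described for the dependencies argument can be illustrated with a small base R sketch. The helper expand_dependencies() below is hypothetical and is not part of BiocPkgTools; it only mirrors the mapping stated above and makes no claim about how the package implements it internally.

```r
# Hypothetical helper (not a BiocPkgTools function): expands the documented
# shorthand strings into the dependency types they stand for.
expand_dependencies <- function(dependencies = "strong") {
  all_types <- c("Depends", "Imports", "LinkingTo", "Suggests", "Enhances")
  if (identical(dependencies, "all")) {
    return(all_types)
  }
  if (identical(dependencies, "most")) {
    # same vector without "Enhances"; setdiff() keeps the original order
    return(setdiff(all_types, "Enhances"))
  }
  if (identical(dependencies, "strong")) {
    # first three elements of the full vector
    return(all_types[1:3])
  }
  # otherwise the argument must be an explicit subset of the known types
  unknown <- setdiff(dependencies, all_types)
  if (length(unknown) > 0) {
    stop("unknown dependency type(s): ", paste(unknown, collapse = ", "))
  }
  dependencies
}

expand_dependencies("strong")
expand_dependencies("most")
expand_dependencies(c("Imports", "Suggests"))
```

Under this reading, the default call includes only "Depends", "Imports", and "LinkingTo" edges, so "Suggests" and "Enhances" relationships appear only when "most", "all", or an explicit vector is requested.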
A data.frame (also a tbl_df) of
S3 class "biocDepDF" including columns "Package", "dependency",
and "edgetype".
This function requires network access.
# performs a network call, so must be online.
library(BiocPkgTools)
depdf <- buildPkgDependencyDataFrame()
#> 'getOption("repos")' replaces Bioconductor standard repositories, see
#> 'help("repositories", package = "BiocManager")' for details.
#> Replacement repositories:
#> CRAN: https://cran.rstudio.com
head(depdf)
#>   Package  dependency edgetype
#> 1      a4      a4Base  Depends
#> 2      a4   a4Preproc  Depends
#> 3      a4   a4Classif  Depends
#> 4      a4      a4Core  Depends
#> 5      a4 a4Reporting  Depends
#> 6  a4Base   a4Preproc  Depends
library(dplyr)
# filter to include only "Imports" type
# dependencies
imports_only <- depdf |> filter(edgetype == 'Imports')
# top ten most imported packages
imports_only |> select(dependency) |>
  group_by(dependency) |> tally() |>
  arrange(desc(n))
#> # A tibble: 1,877 × 2
#>    dependency               n
#>    <chr>                <int>
#>  1 stats                 1328
#>  2 methods               1260
#>  3 utils                 1083
#>  4 ggplot2                743
#>  5 S4Vectors              672
#>  6 grDevices              596
#>  7 graphics               595
#>  8 dplyr                  544
#>  9 SummarizedExperiment   498
#> 10 IRanges                451
#> # ℹ 1,867 more rows
# The Bioconductor packages with the
# largest number of imports
largest_importers <- imports_only |>
  select(Package) |>
  group_by(Package) |> tally() |>
  arrange(desc(n))
# The package names alone say little about
# what these packages do, so join them
# to their descriptions
biocPkgList() |> select(Package, Description) |>
  left_join(largest_importers) |> arrange(desc(n)) |>
  head()
#> 'getOption("repos")' replaces Bioconductor standard repositories, see
#> 'help("repositories", package = "BiocManager")' for details.
#> Replacement repositories:
#> CRAN: https://cran.rstudio.com
#> Joining with `by = join_by(Package)`
#> # A tibble: 6 × 3
#>   Package      Description                                                     n
#>   <chr>        <chr>                                                       <int>
#> 1 singleCellTK "The Single Cell Toolkit (SCTK) in the singleCellTK packag…    85
#> 2 ChromSCape   "ChromSCape - Chromatin landscape profiling for Single\nCe…    57
#> 3 signeR       "The signeR package provides an empirical Bayesian approac…    54
#> 4 metaseqR2    "Provides an interface to several normalization and\nstati…    52
#> 5 FLAMES       "Semi-supervised isoform detection and annotation from bot…    51
#> 6 scRNAseqApp  "The scRNAseqApp is a Shiny app package designed for\ninte…    50
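Since the description mentions building graph structures from the returned data frame, here is a minimal offline sketch of that idea in base R. It does not call buildPkgDependencyDataFrame(); the toy data frame depdf_toy is an assumption that simply copies the six rows shown in the head(depdf) output above, and the variable names are illustrative.

```r
# Toy stand-in for the result of buildPkgDependencyDataFrame(), copied from
# the head(depdf) output shown earlier so the example runs without network access.
depdf_toy <- data.frame(
  Package    = c("a4", "a4", "a4", "a4", "a4", "a4Base"),
  dependency = c("a4Base", "a4Preproc", "a4Classif", "a4Core",
                 "a4Reporting", "a4Preproc"),
  edgetype   = "Depends",
  stringsAsFactors = FALSE
)

# Adjacency list: each package mapped to its direct dependencies
adj <- split(depdf_toy$dependency, depdf_toy$Package)

# Reverse adjacency list: each dependency mapped to the packages that use it
rev_adj <- split(depdf_toy$Package, depdf_toy$dependency)

# In-degree: how many packages depend on each dependency, largest first
in_degree <- sort(table(depdf_toy$dependency), decreasing = TRUE)

adj[["a4"]]
rev_adj[["a4Preproc"]]
in_degree
```

Each row is treated as a directed edge from Package to dependency, with edgetype available as an edge attribute. If the igraph package is installed, passing the same data frame to igraph::graph_from_data_frame() should give a graph object with those edges, since that function reads the first two columns as the edge list.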