On the Dependency Heaviness of CRAN/Bioconductor Ecosystem


Author(s): Zuguang Gu

Affiliation(s): German Cancer Research Center



The R package ecosystem is expanding rapidly, and the dependencies among its packages are becoming more complex. I explored package dependencies from a new angle using a new metric named “dependency heaviness”, which measures the number of additional strong dependencies that a package uniquely contributes to its child or downstream packages. I systematically studied how dependency heaviness spreads from parent to child packages, and how it further spreads to remote downstream packages in the CRAN/Bioconductor ecosystem. I identified the top packages and key paths that mainly transmit heavy dependencies in the ecosystem. Additionally, the dependency heaviness analysis of the ecosystem has been implemented as a web-based database in a package named “pkgndep”, which provides comprehensive tools for querying the dependencies of individual R packages. In this short talk, I will introduce the major findings of this study and demonstrate how it helps developers optimize package dependencies. The study has been published in the Journal of Systems and Software (https://doi.org/10.1016/j.jss.2023.111610), and the paper is freely available at https://arxiv.org/abs/2208.11674.
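To make the metric concrete, here is a minimal Python sketch of one way to read the definition above. It takes the heaviness of a parent on a child to be the number of strong dependencies the child would lose if that parent were no longer a direct strong dependency. The toy graph, the package names, and the helper functions are hypothetical illustrations. This is not the pkgndep API, and the exact counting rule (for example, whether the parent itself is counted) is an assumption.

```python
# Toy strong-dependency graph: package -> its direct strong parents.
# All package names here are made up for illustration.
TOY_GRAPH = {
    "child": ["heavyParent", "lightParent"],
    "heavyParent": ["a", "b", "shared"],
    "lightParent": ["shared"],
}


def strong_dependencies(pkg, graph, skip_parent=None):
    """Return every package reachable from pkg through strong edges.

    If skip_parent is given, the direct edge pkg -> skip_parent is ignored,
    which mimics dropping that parent from the strong dependencies.
    """
    seen = set()
    stack = [p for p in graph.get(pkg, ()) if p != skip_parent]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, ()))
    return seen


def heaviness(parent, child, graph):
    """Strong dependencies that parent uniquely brings to child (assumed rule)."""
    with_parent = strong_dependencies(child, graph)
    without_parent = strong_dependencies(child, graph, skip_parent=parent)
    return len(with_parent) - len(without_parent)


print(heaviness("heavyParent", "child", TOY_GRAPH))  # 3: heavyParent, a, b
print(heaviness("lightParent", "child", TOY_GRAPH))  # 1: only lightParent itself
```

In this toy graph, the package `shared` is not counted in either result, because the child would still reach it through the other parent. Only dependencies that arrive through one parent alone add to that parent's heaviness.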