BiocPy: enabling Bioconductor workflows in Python

Author(s): Jayaram Kancherla, Aaron Lun

Affiliation(s): Genentech

Social media: https://twitter.com/jayaram

Analysts today use a variety of languages in their workflows, including R/Bioconductor for statistical analysis and Python for imaging or machine learning tasks. Python currently lacks an ecosystem that supports genomic interval-based analyses and data structures for managing genomic experiments. Although single-cell representations have become a de facto standard in Python, they are not appropriate for all types of genomic experiments, nor do they fully support genomic analysis. BiocPy aims to facilitate interoperability between R and Python by providing standardized data structures based on existing Bioconductor data structures. These include genomic ranges for interval-based operations, and summarized experiments and their derivatives for managing and analyzing genomic experiments. By adapting these mature data structures, BiocPy aims to make the transition between languages seamless and easy. To learn more, visit the BiocPy GitHub organization (https://github.com/biocpy).
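
To make "interval-based operations" concrete, here is a minimal sketch in plain Python of the kind of overlap query that a genomic ranges structure supports. It does not use the BiocPy API: the names Interval, overlaps and find_overlaps, and the half-open coordinate convention, are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical genomic interval; not a BiocPy class."""
    seqname: str  # chromosome or contig name
    start: int    # 0-based, inclusive (assumed convention)
    end: int      # 0-based, exclusive (assumed convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both intervals lie on the same sequence and share a position."""
    return a.seqname == b.seqname and a.start < b.end and b.start < a.end


def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs for every overlapping pair.

    Naive all-against-all comparison for clarity; real implementations
    typically use interval trees or sorted sweeps instead.
    """
    return [
        (i, j)
        for i, q in enumerate(query)
        for j, s in enumerate(subject)
        if overlaps(q, s)
    ]


genes = [
    Interval("chr1", 100, 200),
    Interval("chr1", 500, 600),
    Interval("chr2", 100, 200),
]
peaks = [
    Interval("chr1", 150, 250),
    Interval("chr2", 300, 400),
    Interval("chr1", 590, 700),
]

# Which peaks fall on which genes?
hits = find_overlaps(peaks, genes)
print(hits)
```

A dedicated genomic ranges container bundles this kind of query with strand handling and per-interval metadata, which is what makes a shared representation across R and Python useful.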