BiocPy: enabling Bioconductor workflows in Python
Author(s): Jayaram Kancherla, Aaron Lun
Affiliation(s): Genentech
Social media: https://twitter.com/jayaram
Analysts today use a variety of languages in their workflows, including R/Bioconductor for statistical analysis and Python for imaging or machine learning tasks. Python currently lacks an ecosystem that supports genomic interval-based analyses and data structures for managing genomic experiments. Although single-cell representations have become a de facto standard in Python, they are not appropriate for all types of genomic experiments, nor do they fully support genomic analysis. BiocPy aims to facilitate interoperability between R and Python by providing standardized data structures based on existing Bioconductor data structures. These include genomic ranges for interval-based operations, and summarized experiments and their derivatives for managing and analyzing genomic experiments. By adapting these mature data structures, BiocPy aims to provide a seamless transition and ease of use across languages. To learn more, visit the BiocPy GitHub organization (https://github.com/biocpy).
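To make "interval-based operations" concrete, here is a minimal pure-Python sketch of an overlap query between two sets of genomic intervals. It does not use the BiocPy API. The `Interval` class, the `overlaps` and `find_overlaps` helpers, and the example coordinates are all hypothetical and serve only to illustrate the kind of operation a genomic ranges data structure is meant to support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical genomic interval with 1-based, inclusive coordinates."""
    seqname: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """Two intervals overlap if they share a sequence and at least one position."""
    return a.seqname == b.seqname and a.start <= b.end and b.start <= a.end


def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs for every overlapping pair.

    This is a naive O(n * m) scan, chosen for clarity; dedicated libraries
    typically rely on sorted or tree-based indexes instead.
    """
    return [
        (i, j)
        for i, q in enumerate(query)
        for j, s in enumerate(subject)
        if overlaps(q, s)
    ]


# Made-up example data: gene annotations and experimental peaks.
genes = [
    Interval("chr1", 100, 200),
    Interval("chr1", 500, 600),
    Interval("chr2", 100, 200),
]
peaks = [
    Interval("chr1", 150, 160),
    Interval("chr2", 250, 300),
    Interval("chr1", 600, 700),
]

hits = find_overlaps(peaks, genes)
print(hits)  # [(0, 0), (2, 1)]
```

The first peak falls inside the first gene, the third peak touches the last base of the second gene, and the chr2 peak overlaps nothing. A standardized genomic ranges structure packages this kind of query, together with the associated metadata, behind one consistent interface.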