Voyager: Exploratory spatial data analysis from geospatial to spatial -omics
Author(s): Lambda Moses, Kayla Jackson, Laura Luebbert, Pétur Helgi Einarsson, Pall Melsted, Lior Pachter
Affiliation(s): California Institute of Technology
Social media: https://twitter.com/LambdaMoses
With the rise of spatial transcriptomics, many methods have been written for specialized tasks in spatial transcriptomics data analysis, such as finding spatially variable genes, finding spatial regions, deconvolving Visium spots, integrating data with other modalities and across multiple tissue sections, and identifying interactions between cell types. Some of these methods borrow from geospatial data analysis, since spatial data analysis, mostly in geographical space, existed for decades before the rise of spatial transcriptomics. However, the geospatial field has a rich tradition of exploratory spatial data analysis (ESDA) that is not yet well utilized in spatial -omics. The SpatialFeatureExperiment (SFE) package brings Simple Features to SingleCellExperiment to represent and operate on geometries, such as cell segmentation polygons and Visium spot polygons, bundled with gene expression data. Voyager performs ESDA and spatial data visualization on SFE objects. Voyager implements univariate ESDA methods beyond the commonly used Moran's I, including the correlogram, which is used to study length scales of spatial autocorrelation, and local spatial analysis methods that give a result for each cell, such as local Moran's I and local spatial heteroscedasticity, which are used to study local variations in spatial autocorrelation. These analyses can be performed on gene expression, cell metadata, and attributes of geometries. Voyager also implements multivariate spatial data analysis, such as a scalable implementation of MULTISPATI PCA, a form of spatially informed PCA previously used in ecology. MULTISPATI PCA can give more spatially coherent clustering and shed light on negative spatial autocorrelation, which is often neglected in spatial analyses. Both SFE and Voyager are available on Bioconductor. In addition, we have written comprehensive tutorials for Voyager, performing ESDA on data from technologies including Visium, Xenium, CosMx, Slide-seq, and MERFISH, with up to about 400,000 cells.
These tutorials are built on GitHub Actions to ensure reproducibility and scalability. Example datasets used in the tutorials are available in the Bioconductor package SFEData. Finally, we have a Python implementation of the core functionality and have written compatibility tests to ensure that the R and Python implementations give consistent results.
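
To make the univariate statistics mentioned above concrete, the following is a minimal pure-Python sketch of global and local Moran's I on a toy one-dimensional chain of locations. It is not Voyager's or SFE's API: the function names `chain_weights`, `morans_i`, and `local_morans_i` are hypothetical, the spatial weights are assumed to be a dense binary adjacency matrix, and the local statistic is assumed to be scaled by the variance computed with denominator n. Real analyses would use sparse neighborhood graphs and permutation or analytical inference, which this sketch omits.

```python
def chain_weights(n):
    """Binary adjacency for n locations on a line (hypothetical helper)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


def _centered(x):
    """Return the values of x with their mean subtracted."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def morans_i(x, w):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(x)
    z = _centered(x)
    s0 = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)


def local_morans_i(x, w):
    """Local Moran's I per location: (z_i / m2) * sum_j w_ij z_j,
    with m2 = sum_i z_i^2 / n (an assumed scaling convention)."""
    n = len(x)
    z = _centered(x)
    m2 = sum(v * v for v in z) / n
    return [z[i] / m2 * sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]


w = chain_weights(4)
smooth = [1.0, 2.0, 3.0, 4.0]          # neighbors have similar values
alternating = [1.0, -1.0, 1.0, -1.0]   # neighbors have opposite values

global_smooth = morans_i(smooth, w)            # positive autocorrelation
global_alternating = morans_i(alternating, w)  # negative autocorrelation
local_smooth = local_morans_i(smooth, w)
```

Under these conventions the local values sum to the global statistic multiplied by the total weight, which is one way to read local Moran's I as a per-location decomposition of the global value. The alternating example gives a negative value, illustrating the negative spatial autocorrelation that the abstract notes is often neglected.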