Overview

SpatialData is the scverse standard for spatial omics. A .spatialdata.zarr store is a Zarr directory that holds expression tables, spatial coordinates, shapes (polygons, circles), tissue images, and coordinate systems in a single hierarchy. The spatialdata-io package provides readers that convert the output of the major spatial platforms into this format (see the technology coverage section).
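To make the hierarchy concrete, the sketch below lists the elements of a store using only the file system. It assumes the top-level groups are named images, labels, points, shapes, and tables, matching the element types printed later in this document; list_elements is a hypothetical helper, not part of scConvert or spatialdata.

```python
from pathlib import Path

# Assumed top-level group names, matching the element types printed by spatialdata.
ELEMENT_TYPES = ("images", "labels", "points", "shapes", "tables")


def list_elements(store):
    """Map each element type present in the store to its sorted element names."""
    root = Path(store)
    found = {}
    for kind in ELEMENT_TYPES:
        group = root / kind
        if not group.is_dir():
            continue  # this element type is absent from the store
        # Each element is a sub-directory; hidden entries such as .zgroup are metadata.
        found[kind] = sorted(
            p.name
            for p in group.iterdir()
            if p.is_dir() and not p.name.startswith(".")
        )
    return found
```

Calling list_elements("../blobs.spatialdata.zarr") on the example store gives a quick inventory without loading any arrays.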

scConvert reads and writes SpatialData Zarr stores natively in R, with no Python dependency, and supports both the Zarr v2 and Zarr v3 formats.
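The two Zarr formats can be told apart by the metadata file at the root of the store: v2 groups carry a .zgroup file (arrays a .zarray file), while v3 uses a single zarr.json. The following is a minimal sketch of that check; detect_zarr_format is a hypothetical helper and is not meant to describe how scConvert detects the format internally.

```python
from pathlib import Path


def detect_zarr_format(store):
    """Return 3 or 2 based on the root metadata file, or None if neither is found."""
    root = Path(store)
    if (root / "zarr.json").is_file():
        return 3  # Zarr v3 keeps all node metadata in zarr.json
    if (root / ".zgroup").is_file() or (root / ".zarray").is_file():
        return 2  # Zarr v2 uses hidden .zgroup / .zarray files
    return None
```

A None result usually means the path does not point at the root of a Zarr store.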


1 Read a SpatialData store

readSpatialData() reads the tables/table AnnData from the store and returns a Seurat object. Spatial coordinates, images, and coordinate-system metadata are attached when present.

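library(scConvert)

# Resolve the example store to an absolute path
sd_path <- normalizePath("../blobs.spatialdata.zarr")

# Time the read; the "elapsed" entry of proc.time() is wall-clock seconds
t0 <- proc.time()
blobs <- readSpatialData(sd_path, verbose = FALSE)
elapsed <- (proc.time() - t0)[["elapsed"]]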

cat(sprintf("Loaded: %d cells x %d features in %.2fs\n",
            ncol(blobs), nrow(blobs), elapsed))
#> Loaded: 26 cells x 3 features in 0.05s
cat(sprintf("Metadata columns: %s\n",
            paste(names(blobs[[]]), collapse = ", ")))
#> Metadata columns: orig.ident, nCount_RNA, nFeature_RNA, instance_id, region
head(blobs[[]], 6)

The three features (channel-0-sum, channel-1-sum, channel-2-sum) correspond to summarised intensity channels from synthetic image blobs. This structure is representative of imaging-based spatial proteomics platforms (CODEX, IMC, CyCIF) where channels map to protein markers rather than gene transcripts.


2 Export to h5ad

SpatialDataToH5AD() extracts the expression table and spatial coordinates and writes a spec-compliant h5ad suitable for direct import into scanpy or squidpy.

h5ad_out <- file.path(tempdir(), "blobs.h5ad")

t0 <- proc.time()
SpatialDataToH5AD(sd_path, h5ad_out, overwrite = TRUE, verbose = FALSE)
elapsed <- (proc.time() - t0)[["elapsed"]]

cat(sprintf("SpatialData -> h5ad: %.2fs | %.1f KB\n",
            elapsed, file.size(h5ad_out) / 1e3))
#> SpatialData -> h5ad: 1.21s | 51.8 KB
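Since h5ad is an HDF5 container, a cheap sanity check on the exported file is to confirm that it starts with the 8-byte HDF5 signature. The sketch below does this with the standard library only; looks_like_hdf5 is a hypothetical helper, and it assumes the signature sits at byte offset 0, which is the usual case but not the only one the HDF5 format allows.

```python
# The fixed 8-byte signature that opens an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file begins with the HDF5 format signature."""
    with open(path, "rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

This only confirms the container type. It says nothing about whether the AnnData layout inside is complete, which is what the Python validation in the next section checks.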

3 Python validation with spatialdata

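import spatialdata
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no display is required

# r.sd_path is the R variable sd_path, exposed to Python by reticulate
sdata = spatialdata.read_zarr(r.sd_path)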
#> no parent found for <ome_zarr.reader.Label object at 0x12e9c2120>: None
print(sdata)
#> SpatialData object, with associated Zarr store: /Users/miana/Desktop/scConvert/vignettes/blobs.spatialdata.zarr
#> ├── Images
#> │     ├── 'blobs_image': DataArray[cyx] (3, 512, 512)
#> │     └── 'blobs_multiscale_image': DataTree[cyx] (3, 512, 512), (3, 256, 256), (3, 128, 128)
#> ├── Labels
#> │     ├── 'blobs_labels': DataArray[yx] (512, 512)
#> │     └── 'blobs_multiscale_labels': DataTree[yx] (512, 512), (256, 256), (128, 128)
#> ├── Points
#> │     └── 'blobs_points': DataFrame with shape: (<Delayed>, 4) (2D points)
#> ├── Shapes
#> │     ├── 'blobs_circles': GeoDataFrame shape: (5, 2) (2D shapes)
#> │     ├── 'blobs_multipolygons': GeoDataFrame shape: (2, 1) (2D shapes)
#> │     └── 'blobs_polygons': GeoDataFrame shape: (5, 1) (2D shapes)
#> └── Tables
#>       └── 'table': AnnData (26, 3)
#> with coordinate systems:
#>     ▸ 'global', with elements:
#>         blobs_image (Images), blobs_multiscale_image (Images), blobs_labels (Labels), blobs_multiscale_labels (Labels), blobs_points (Points), blobs_circles (Shapes), blobs_multipolygons (Shapes), blobs_polygons (Shapes)
print(f"\nTables : {list(sdata.tables.keys())}")
#> 
#> Tables : ['table']
print(f"Images : {list(sdata.images.keys())}")
#> Images : ['blobs_image', 'blobs_multiscale_image']
print(f"Shapes : {list(sdata.shapes.keys())}")
#> Shapes : ['blobs_circles', 'blobs_multipolygons', 'blobs_polygons']
print(f"Points : {list(sdata.points.keys())}")
#> Points : ['blobs_points']
import anndata
# r.h5ad_out is the h5ad file written from R in section 2
adata = anndata.read_h5ad(r.h5ad_out)
print(adata)
#> AnnData object with n_obs × n_vars = 26 × 3
#>     obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'instance_id', 'region'
#>     uns: '__spatialdata_labels__', '__spatialdata_version__', 'spatialdata_attrs_present'
print(f"obs columns: {list(adata.obs.columns)}")
#> obs columns: ['orig.ident', 'nCount_RNA', 'nFeature_RNA', 'instance_id', 'region']
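The printed summaries above can also be compared programmatically. The sketch below checks that the exported table has the same shape as the table in the store and that none of the original obs columns were lost. compare_tables is a hypothetical helper; it relies only on the .shape and .obs.columns attributes that AnnData objects provide.

```python
def compare_tables(original, exported):
    """Return a list of mismatches between two AnnData-like tables (empty if none)."""
    problems = []
    if tuple(original.shape) != tuple(exported.shape):
        problems.append(
            f"shape {tuple(original.shape)} != {tuple(exported.shape)}"
        )
    # Extra columns in the export are fine; columns that disappeared are not.
    exported_cols = set(exported.obs.columns)
    missing = [c for c in original.obs.columns if c not in exported_cols]
    if missing:
        problems.append(f"obs columns missing after export: {missing}")
    return problems
```

In this session it could be called as compare_tables(sdata.tables["table"], adata), with an empty list indicating that the two tables agree on these points.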

4 Format converters reference

All four SpatialData format converters are exported and can be called directly; three are shown below. The scConvert() dispatcher also recognises .spatialdata.zarr paths and selects the appropriate converter automatically.

SpatialDataToH5AD(
  source    = "sample.spatialdata.zarr",
  dest      = "sample.h5ad",
  overwrite = TRUE
)

H5ADToSpatialData(
  source    = "sample.h5ad",
  dest      = "sample.spatialdata.zarr",
  overwrite = TRUE
)

SpatialDataToH5Seurat(
  source    = "sample.spatialdata.zarr",
  dest      = "sample.h5seurat",
  overwrite = TRUE
)

scConvert("sample.spatialdata.zarr", dest = "sample.h5ad")
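The idea behind the dispatcher can be illustrated with a small lookup keyed on the source and destination extensions. This is a hypothetical sketch in Python for consistency with the validation code above, not scConvert's actual implementation; only the three converter names shown in this section are included, and detect_format and pick_converter are invented for the example.

```python
# Hypothetical lookup: (source format, destination format) -> converter name.
CONVERTERS = {
    ("spatialdata", "h5ad"): "SpatialDataToH5AD",
    ("h5ad", "spatialdata"): "H5ADToSpatialData",
    ("spatialdata", "h5seurat"): "SpatialDataToH5Seurat",
}


def detect_format(path):
    """Infer the format from the file name, handling the compound Zarr suffix."""
    name = path.rstrip("/").lower()
    if name.endswith(".spatialdata.zarr"):
        return "spatialdata"
    for ext in ("h5ad", "h5seurat"):
        if name.endswith("." + ext):
            return ext
    raise ValueError(f"unrecognised format: {path}")


def pick_converter(source, dest):
    """Return the name of the converter for this source/destination pair."""
    key = (detect_format(source), detect_format(dest))
    if key not in CONVERTERS:
        raise ValueError(f"no converter for {key[0]} -> {key[1]}")
    return CONVERTERS[key]
```

Checking the compound .spatialdata.zarr suffix first matters because a plain .zarr suffix could also belong to another Zarr-based format.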

5 Technology coverage

The following spatial technologies produce SpatialData stores via spatialdata-io. Because scConvert works on the resulting .spatialdata.zarr store rather than on vendor output, it interoperates with all of them.

Technology           Vendor                        Modality                spatialdata-io reader
Visium / Visium HD   10x Genomics                  transcriptomics         visium() / visium_hd()
Xenium               10x Genomics                  transcriptomics (FISH)  xenium()
CosMx                NanoString                    transcriptomics (FISH)  cosmx()
MERFISH              Vizgen                        transcriptomics (FISH)  merfish()
Slide-seq            Broad Institute               transcriptomics         slideseq()
CODEX / PhenoCycler  Akoya Biosciences             proteomics              codex()
IMC                  Fluidigm / Standard BioTools  proteomics              imc()

6 Cleanup

unlink(h5ad_out)