scConvert converts between single-cell data formats entirely in R, with no Python dependency. This vignette walks through installation, basic conversion, and file I/O in under five minutes.
Quick conversion with scConvert()
The scConvert() function is a universal dispatcher: give
it a source and a destination, and it picks the fastest conversion path
automatically. File formats are detected from extensions.
h5ad_file <- system.file("testdata", "pbmc_small.h5ad", package = "scConvert")
h5seurat_out <- file.path(tempdir(), "pbmc.h5seurat")
scConvert(h5ad_file, dest = h5seurat_out, overwrite = TRUE)
zarr_out <- file.path(tempdir(), "pbmc.zarr")
scConvert(h5ad_file, dest = zarr_out, overwrite = TRUE)
scConvert() works for any supported format pair – h5ad, h5Seurat, Zarr, Loom, RDS, and more.
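To make the extension-based detection concrete, here is a minimal sketch of how a path could be mapped to a format name. The function guess_format() and its lookup table are hypothetical, written only for illustration; the detection logic that scConvert() actually uses is not shown in this vignette. The supported-formats table lists .zarr for both Zarr and SpatialData, so an extension alone cannot tell those two apart, and the sketch simply reports Zarr.

```r
# Hypothetical lookup from file extension to format name.
# This is an illustration, not scConvert's own detection code.
guess_format <- function(path) {
  stopifnot(is.character(path), length(path) == 1)
  known <- c(
    h5ad     = "AnnData",
    h5seurat = "h5Seurat",
    h5mu     = "MuData",
    loom     = "Loom",
    zarr     = "Zarr",
    rds      = "RDS"
  )
  # file_ext() returns the text after the last dot ("" if there is none)
  ext <- tolower(tools::file_ext(path))
  fmt <- unname(known[ext])
  if (is.na(fmt)) {
    stop("Unrecognised extension: '", ext, "'")
  }
  fmt
}

guess_format("pbmc.h5ad")
guess_format(file.path(tempdir(), "pbmc.h5seurat"))
```

Matching on the lower-cased extension keeps the lookup case-insensitive, and failing loudly on an unknown extension is safer than guessing a format.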
Loading files into Seurat
When you need to work with the data in R, use the format-specific readers. Each returns a standard Seurat object.
From h5ad
obj <- readH5AD(h5ad_file)
obj
#> An object of class Seurat
#> 2000 features across 214 samples within 1 assay
#> Active assay: RNA (2000 features, 2000 variable features)
#> 2 layers present: counts, data
#> 2 dimensional reductions calculated: pca, umap
dim(obj)
#> [1] 2000 214
head(obj[[]], n = 4)
#> orig.ident nCount_RNA nFeature_RNA seurat_annotations percent.mt
#> AACCAGTGATACCG pbmc3k 1539 335 FCGR3A+ Mono 2.786458
#> AAGATTACCGCCTT pbmc3k 898 284 DC 1.763553
#> AAGCAAGAGCTTAG pbmc3k 668 201 NK 2.214022
#> AAGCCATGAACTGC pbmc3k 2329 435 DC 1.401472
#> RNA_snn_res.0.5 seurat_clusters
#> AACCAGTGATACCG 5 5
#> AAGATTACCGCCTT 7 7
#> AAGCAAGAGCTTAG 6 6
#> AAGCCATGAACTGC 7 7
From h5Seurat
obj2 <- readH5Seurat(h5seurat_out)
obj2
#> An object of class Seurat
#> 2000 features across 214 samples within 1 assay
#> Active assay: RNA (2000 features, 2000 variable features)
#> 2 layers present: counts, data
#> 2 dimensional reductions calculated: pca, umap
From Zarr
obj3 <- readZarr(zarr_out)
obj3
#> An object of class Seurat
#> 2000 features across 214 samples within 1 assay
#> Active assay: RNA (2000 features, 2000 variable features)
#> 2 layers present: counts, data
#> 2 dimensional reductions calculated: pca, umap
Writing files from Seurat
Starting from any Seurat object, write to the format your collaborators or downstream tools need.
h5ad_out <- file.path(tempdir(), "output.h5ad")
writeH5AD(obj, h5ad_out, verbose = FALSE)
h5s_out <- file.path(tempdir(), "output.h5seurat")
writeH5Seurat(obj, h5s_out, overwrite = TRUE, verbose = FALSE)
zarr_out2 <- file.path(tempdir(), "output.zarr")
writeZarr(obj, zarr_out2, verbose = FALSE)
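# Compare on-disk sizes. A Zarr store is a directory, so its size is the
# sum of the sizes of the files inside it.
sizes <- data.frame(
  Format = c("h5ad", "h5Seurat", "Zarr"),
  Size_MB = round(c(
    file.size(h5ad_out),
    file.size(h5s_out),
    sum(file.info(list.files(zarr_out2, recursive = TRUE, full.names = TRUE))$size)
  ) / 1024^2, 2)
)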
knitr::kable(sizes, col.names = c("Format", "Size (MB)"))
| Format | Size (MB) |
|---|---|
| h5ad | 0.54 |
| h5Seurat | 0.59 |
| Zarr | 0.34 |
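Because h5ad and h5Seurat are single files while a Zarr store is a directory tree, the size calculation differs between them. The sketch below wraps both cases in one helper. The name path_size_mb() is hypothetical. It also passes all.files = TRUE so that dot-prefixed files are counted, on the assumption that the store may contain hidden metadata files (as Zarr v2 stores do); whether that changes the figure for a given store depends on how it was written.

```r
# Hypothetical helper: size in MB of a single file or of a whole directory tree.
path_size_mb <- function(path) {
  if (dir.exists(path)) {
    # recursive = TRUE returns files only; all.files = TRUE keeps dotfiles
    files <- list.files(path, recursive = TRUE, full.names = TRUE,
                        all.files = TRUE)
    bytes <- sum(file.size(files))
  } else {
    bytes <- file.size(path)
  }
  round(bytes / 1024^2, 2)
}

# Small self-contained demonstration on a throwaway directory
demo_dir <- file.path(tempdir(), "size_demo.zarr")
dir.create(file.path(demo_dir, "X"), recursive = TRUE, showWarnings = FALSE)
writeBin(raw(1048576), file.path(demo_dir, "X", "0.0"))   # 1 MB chunk
writeBin(raw(524288), file.path(demo_dir, ".zattrs"))     # 0.5 MB dotfile
path_size_mb(demo_dir)
```

With the objects from this vignette, the same helper could be applied as path_size_mb(h5ad_out) or path_size_mb(zarr_out2).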
Supported formats
| Format | Extension | Ecosystem | Read | Write |
|---|---|---|---|---|
| AnnData | .h5ad | scanpy, CELLxGENE | yes | yes |
| h5Seurat | .h5seurat | Seurat | yes | yes |
| MuData | .h5mu | muon (multimodal) | yes | yes |
| Loom | .loom | loompy, HCA | yes | yes |
| Zarr | .zarr | cloud AnnData | yes | yes |
| TileDB-SOMA | soma:// | CELLxGENE Census | yes | yes |
| SpatialData | .zarr | scverse spatial | yes | yes |
| RDS | .rds | R native | yes | yes |
| SingleCellExperiment | in-memory | Bioconductor | yes | – |
When to use which format:
- h5ad – sharing with Python users or submitting to CELLxGENE.
- h5Seurat – archiving a Seurat object with selective loading support.
- Zarr – cloud-friendly, chunk-based access (e.g., S3 / GCS).
- Loom – interoperability with loompy, velocyto, or legacy pipelines.
- RDS – quick local save/load within R (no HDF5 dependency).
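Once a target format is chosen, the same scConvert() call can be applied to a whole folder of files. The following is a sketch under stated assumptions: swap_ext() is a hypothetical helper, the input folder is a placeholder, and the call assumes scConvert() is exported from the package and accepts the source, dest, and overwrite arguments exactly as used earlier in this vignette.

```r
# Hypothetical helper: swap each file's extension and place the result in out_dir.
swap_ext <- function(paths, new_ext, out_dir = tempdir()) {
  if (length(paths) == 0) {
    return(character(0))
  }
  stem <- tools::file_path_sans_ext(basename(paths))
  file.path(out_dir, paste0(stem, ".", new_ext))
}

# Placeholder input folder; point this at a directory of .h5ad files.
in_dir <- file.path(tempdir(), "h5ad_inputs")
sources <- list.files(in_dir, pattern = "\\.h5ad$", full.names = TRUE)
dests <- swap_ext(sources, "h5seurat")

# Convert only when scConvert is installed; the call mirrors the usage above.
if (requireNamespace("scConvert", quietly = TRUE)) {
  for (i in seq_along(sources)) {
    scConvert::scConvert(sources[i], dest = dests[i], overwrite = TRUE)
  }
}
```

Building the destination paths separately from the conversion loop makes it easy to inspect them before anything is written.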
Verifying a conversion
After converting, a quick sanity check confirms that dimensions and cell identifiers are preserved.
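The same kind of check can be wrapped in a reusable function that also compares feature names. The sketch below is an illustration: check_conversion() is a hypothetical name, and it only assumes that dim(), rownames(), and colnames() work on both inputs. It is demonstrated here on plain matrices so that it runs without any single-cell package.

```r
# Hypothetical helper: compare two matrix-like objects on dimensions
# and on (order-independent) feature and cell names.
check_conversion <- function(a, b) {
  checks <- c(
    dims     = identical(dim(a), dim(b)),
    cells    = identical(sort(colnames(a)), sort(colnames(b))),
    features = identical(sort(rownames(a)), sort(rownames(b)))
  )
  if (!all(checks)) {
    stop("Mismatch in: ", paste(names(checks)[!checks], collapse = ", "))
  }
  invisible(TRUE)
}

# Demonstration: same content with the columns in a different order
m1 <- matrix(0, nrow = 3, ncol = 2,
             dimnames = list(c("g1", "g2", "g3"), c("c1", "c2")))
m2 <- m1[, c("c2", "c1")]
check_conversion(m1, m2)
```

Sorting the names before comparing means a conversion that reorders cells or features still passes, as in the vignette's own check on the converted PBMC files, which follows.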
original <- readH5AD(h5ad_file)
converted <- readH5Seurat(h5seurat_out)
stopifnot(identical(dim(original), dim(converted)))
stopifnot(identical(sort(colnames(original)), sort(colnames(converted))))
cat("Dimensions match:", paste(dim(original), collapse = " x "), "\n")
#> Dimensions match: 2000 x 214
cat("Cell names match:", length(colnames(original)), "cells verified\n")
#> Cell names match: 214 cells verified
Next steps
- Convert Between Seurat and AnnData – layer mapping, Python interop, and round-trip verification.
- In-Memory vs On-Disk Conversion – hub path, streaming, and the C binary for large datasets.
- Direct H5AD Loading – selective loading and BPCells for atlas-scale data.
- Multimodal H5MU – CITE-seq and ATAC+RNA via MuData.
- Zarr Format – cloud-native storage and streaming converters.
- Spatial Technologies – Visium, MERFISH, and SpatialData support.
- CLI Usage – the standalone C binary for batch conversion without R.