Universal converter between single-cell file formats and object types. Supports arbitrary source/destination pairs by routing through Seurat as a hub format. Direct HDF5-level paths (h5ad <-> h5seurat) are used when available for memory efficiency.
Usage
scConvert(
source,
dest,
assay,
overwrite = FALSE,
verbose = TRUE,
standardize = FALSE,
...
)
# S3 method for class 'character'
scConvert(
source,
dest,
assay,
overwrite = FALSE,
verbose = TRUE,
standardize = FALSE,
...
)
# S3 method for class 'H5File'
scConvert(
source,
dest = "h5seurat",
assay = "RNA",
overwrite = FALSE,
verbose = TRUE,
...
)
# S3 method for class 'h5Seurat'
scConvert(
source,
dest = "h5ad",
assay = DefaultAssay(object = source),
overwrite = FALSE,
verbose = TRUE,
standardize = FALSE,
...
)
# S3 method for class 'Seurat'
scConvert(
source,
dest,
assay = DefaultAssay(object = source),
overwrite = FALSE,
verbose = TRUE,
standardize = FALSE,
...
)
# S3 method for class 'loom'
scConvert(
source,
dest,
assay = "RNA",
overwrite = FALSE,
verbose = TRUE,
standardize = FALSE,
...
)
# S3 method for class 'SingleCellExperiment'
scConvert(
source,
dest,
assay = NULL,
overwrite = FALSE,
verbose = TRUE,
standardize = FALSE,
...
)
Arguments
- source
Source dataset: a Seurat object, SingleCellExperiment, loom connection, filename path, or H5File connection
- dest
Name/path of destination file or format. Supported formats: h5seurat, h5ad, h5mu, loom, rds. Also accepts "sce" to return a SingleCellExperiment object (in-memory, no file created).
- assay
For h5Seurat -> other formats: name of assay to convert. For other formats -> h5Seurat: name to assign to the assay. Default is "RNA".
- overwrite
Logical; if TRUE, overwrite an existing destination file. Default is FALSE.
- verbose
Logical; if TRUE (default), show progress updates.
- standardize
Logical; if TRUE, convert Seurat-style metadata column names to scanpy/AnnData conventions when converting to h5ad format. For example, nCount_RNA becomes n_counts and nFeature_RNA becomes n_genes. Only applicable for conversions to h5ad format. Default is FALSE.
- ...
Arguments passed to specific conversion methods
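The renaming described for standardize = TRUE can be illustrated in base R. This is a minimal sketch, not the package's implementation: the helper standardize_names is hypothetical, and it assumes only the two mappings named above.

```r
# Hypothetical helper (not part of scConvert): rename Seurat-style metadata
# columns to the scanpy/AnnData names mentioned in the argument description.
standardize_names <- function(cols, assay = "RNA") {
  # Assumption: only the two documented mappings are applied
  mapping <- c("n_counts", "n_genes")
  names(mapping) <- paste0(c("nCount_", "nFeature_"), assay)
  hit <- cols %in% names(mapping)
  cols[hit] <- unname(mapping[cols[hit]])
  cols
}

# Columns without a documented mapping are left untouched
standardize_names(c("nCount_RNA", "nFeature_RNA", "percent.mt"))
```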
Value
For file destinations, invisibly returns the destination filename.
For dest = "sce", returns a SingleCellExperiment object.
Details
Supported Formats:
R objects: Seurat, SingleCellExperiment (requires SingleCellExperiment), loom (R6 connection)
File formats: h5seurat, h5ad, h5mu, loom, rds
Any source format can be converted to any destination format. Conversions
without a direct path go through Seurat as a universal hub:
Source -> Seurat -> Destination.
Direct Paths (memory-efficient, no full dataset loading):
h5ad <-> h5seurat: Direct HDF5-level copy
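The routing rule can be pictured with a small base R sketch. It is an illustration under assumptions, not the package's dispatch code: plan_route is a hypothetical helper, and formats are represented as plain strings.

```r
# Hypothetical helper: list the steps a conversion would take.
# The only direct pair assumed here is h5ad <-> h5seurat, as stated above.
plan_route <- function(src, dst) {
  direct <- list(c("h5ad", "h5seurat"), c("h5seurat", "h5ad"))
  pair <- c(src, dst)
  if (any(vapply(direct, identical, logical(1), pair))) return(pair)
  # Everything else goes through Seurat as the hub
  route <- c(src, "Seurat", dst)
  # Drop consecutive repeats (e.g. when the source is already Seurat)
  route[c(TRUE, route[-1] != route[-length(route)])]
}

plan_route("h5ad", "h5seurat")
plan_route("loom", "h5ad")
```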
Key Features:
Preserves expression matrices, metadata, and dimensional reductions
For Visium/spatial data: reconstructs images with scale factors
Handles multiple data layers (V5 compatibility)
AnnData/H5AD to h5Seurat
The AnnData/H5AD to h5Seurat conversion will try to automatically fill in datasets based on data presence. It works in the following manner:
Expression data
The expression matrices counts, data, and scale.data
are filled by /X and /raw/X in the following manner:
- counts will be filled with /raw/X if present; otherwise, it will be filled with /X
- data will be filled with /raw/X if /raw/X is present and /X is dense; otherwise, it will be filled with /X
- scale.data will be filled with /X if it is dense; otherwise, it will be empty
Feature names are taken from the feature-level metadata.
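The fill rules for the expression matrices amount to a small decision table. The base R sketch below encodes them; the helper name and its two logical inputs are hypothetical and stand in for checks the converter would perform on the file.

```r
# Hypothetical helper: which H5AD dataset fills each h5Seurat slot.
# has_raw:    is /raw/X present?
# x_is_dense: is /X stored as a dense matrix?
h5ad_to_h5seurat_slots <- function(has_raw, x_is_dense) {
  c(
    counts = if (has_raw) "/raw/X" else "/X",
    data = if (has_raw && x_is_dense) "/raw/X" else "/X",
    # NA marks a slot that stays empty
    scale.data = if (x_is_dense) "/X" else NA_character_
  )
}

h5ad_to_h5seurat_slots(has_raw = TRUE, x_is_dense = TRUE)
h5ad_to_h5seurat_slots(has_raw = FALSE, x_is_dense = FALSE)
```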
Feature-level metadata
Feature-level metadata is added to the meta.features datasets in each
assay. Feature names are taken from the dataset specified by the
“_index” attribute, the “_index” dataset, or the “index”
dataset, in that order. Metadata is populated with /raw/var if
present, otherwise with /var; if both /raw/var and /var
are present, then meta.features will be populated with /raw/var
first, then /var will be added to it. For columns present in both
/raw/var and /var, the values in /var will be used
instead. Note: it is possible for /var to have fewer features
than /raw/var; if this is the case, then only the features present in
/var will be overwritten, with the metadata for features not
present in /var remaining as they were in /raw/var or empty
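The overlay of /var on /raw/var can be sketched with two data frames. This is an illustration only: merge_feature_meta is a hypothetical helper, and it assumes every feature in var also appears in raw_var.

```r
# Hypothetical helper: start from /raw/var, then let /var overwrite shared
# columns and add new ones, only for the features /var contains.
# Both inputs are data frames with feature names as row names.
merge_feature_meta <- function(raw_var, var) {
  out <- raw_var
  for (col in names(var)) {
    # A column only present in /var starts empty for all features
    if (!col %in% names(out)) out[[col]] <- NA
    out[rownames(var), col] <- var[[col]]
  }
  out
}

raw_var <- data.frame(hv = c(TRUE, FALSE, TRUE), mean = c(1, 2, 3),
                      row.names = c("g1", "g2", "g3"))
var <- data.frame(mean = c(10, 30), disp = c(0.5, 0.7),
                  row.names = c("g1", "g3"))
merge_feature_meta(raw_var, var)
```

In this example g2 is absent from var, so its mean keeps the /raw/var value and its disp stays empty.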
Cell-level metadata
Cell-level metadata is added to meta.data; the row names of the
metadata (as determined by the value of the “_index” attribute, the
“_index” dataset, or the “index” dataset, in that order) are
added to the “cell.names” dataset instead. If the
“__categories” dataset is present, each dataset within
“__categories” will be stored as a factor group. Cell-level metadata
will be added as an HDF5 group unless factors are not present and
the scConvert.dtypes.dataframe_as_group option is FALSE
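The index lookup order and the group-versus-dataset rule can be written out as two small base R functions. Both helpers are hypothetical; the option value is passed in explicitly because its default is not stated here.

```r
# Hypothetical helper: find the dataset that holds the row names.
# attrs:    named list of attributes on the HDF5 group
# datasets: names of the datasets inside the group
resolve_index <- function(attrs, datasets) {
  if (!is.null(attrs[["_index"]])) return(attrs[["_index"]])
  if ("_index" %in% datasets) return("_index")
  if ("index" %in% datasets) return("index")
  NULL
}

# Hypothetical helper: metadata is written as a group unless factors are
# absent and the dataframe_as_group option is FALSE.
obs_as_group <- function(has_factors, as_group_option) {
  has_factors || as_group_option
}

resolve_index(list(`_index` = "obs_names"), c("obs_names", "index"))
obs_as_group(has_factors = FALSE, as_group_option = FALSE)
```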
Dimensional reduction information:
Cell embeddings are taken from /obsm; dimensional reductions are
named based on their names from obsm by removing the preceding
“X_”. For example, if a dimensional reduction is named “X_pca”
in /obsm, the resulting dimensional reduction information will be
named “pca”. The key will be set to one of the following:
- “PC_” if “pca” is present in the dimensional reduction name (grepl("pca", reduction.name, ignore.case = TRUE))
- “tSNE_” if “tsne” is present in the dimensional reduction name (grepl("tsne", reduction.name, ignore.case = TRUE))
- reduction.name_ for all other reductions
Remember that the preceding “X_” will be removed from the reduction
name before converting to a key. Feature loadings are taken from
/varm and placed in the associated dimensional reduction. The
dimensional reduction is determined from the loadings name in /varm:
- “PCs” will be added to a dimensional reduction named “pca”
- All other loadings in /varm will be added to a dimensional reduction named tolower(loading) (e.g. a loading named “ICA” will be added to a dimensional reduction named “ica”)
If a dimensional reduction cannot be found according to the rules above, the
loading will not be taken from the AnnData/H5AD file. Miscellaneous
information will be taken from /uns/reduction where reduction
is the name of the reduction in /obsm without the preceding
“X_”; if no dimensional reduction information is present, then
miscellaneous information will not be taken from the AnnData/H5AD file.
Standard deviations are taken from a dataset /uns/reduction/variance;
the variances will be converted to standard deviations and added to the
stdev dataset of a dimensional reduction
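The naming rules for reductions, keys, loadings, and standard deviations can be collected in a few base R helpers. All four function names are hypothetical; the grepl() calls mirror the ones quoted above.

```r
# Hypothetical helpers illustrating the naming rules above.

# "X_pca" in /obsm becomes the reduction name "pca"
reduction_name <- function(obsm_name) sub("^X_", "", obsm_name)

# Key derived from the reduction name (after "X_" has been removed)
reduction_key <- function(reduction.name) {
  if (grepl("pca", reduction.name, ignore.case = TRUE)) {
    "PC_"
  } else if (grepl("tsne", reduction.name, ignore.case = TRUE)) {
    "tSNE_"
  } else {
    paste0(reduction.name, "_")
  }
}

# Reduction that receives a loading matrix found in /varm
loading_target <- function(loading) {
  if (loading == "PCs") "pca" else tolower(loading)
}

# /uns/reduction/variance holds variances; stdev is their square root
stdev_from_variance <- function(variance) sqrt(variance)

reduction_key(reduction_name("X_umap"))
loading_target("ICA")
```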
Nearest-neighbor graph
If a nearest neighbor graph is present in /uns/neighbors/distances,
it will be added as a graph dataset in the h5Seurat file and associated with
assay; if a value is present in /uns/neighbors/params/method,
the name of the graph will be assay_method, otherwise, it will be
assay_anndata
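The graph naming rule is a simple paste of the assay and the method, with a fallback. A minimal sketch, using a hypothetical helper graph_name:

```r
# Hypothetical helper: name of the graph written to the h5Seurat file.
# method is the value of /uns/neighbors/params/method, or NULL if absent.
graph_name <- function(assay, method = NULL) {
  paste(assay, if (is.null(method)) "anndata" else method, sep = "_")
}

graph_name("RNA", "umap")
graph_name("RNA")
```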
h5Seurat to AnnData/H5AD
The h5Seurat to AnnData/H5AD conversion will try to automatically fill in
datasets based on data presence. Data presence is determined by the h5Seurat
index (source$index()). It works in the following manner:
Assay data
- X will be filled with scale.data if scale.data is present; otherwise, it will be filled with data
- var will be filled with meta.features only for the features present in X; for example, if X is filled with scale.data, then var will contain only features that have been scaled
- raw.X will be filled with data if X is filled with scale.data; otherwise, it will be filled with counts. If counts is not present, then raw will not be filled
- raw.var will be filled with meta.features with the features present in raw.X; if raw.X is not filled, then raw.var will not be filled
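The choice of source slots for X and raw.X can be sketched as a base R function. The helper and its logical inputs are hypothetical; in the package, presence is read from the h5Seurat index.

```r
# Hypothetical helper: which assay slot fills X and raw.X in the H5AD file.
h5seurat_to_h5ad_slots <- function(has_scale_data, has_counts) {
  x <- if (has_scale_data) "scale.data" else "data"
  raw_x <- if (has_scale_data) {
    "data"
  } else if (has_counts) {
    "counts"
  } else {
    NA_character_  # raw is not filled
  }
  c(X = x, raw.X = raw_x)
}

h5seurat_to_h5ad_slots(has_scale_data = TRUE, has_counts = TRUE)
h5seurat_to_h5ad_slots(has_scale_data = FALSE, has_counts = FALSE)
```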
Dimensional reduction information
Only dimensional reductions associated with assay or marked as
global will be transferred to the H5AD file. For
every reduction reduc:
- cell embeddings are placed in obsm and renamed to X_reduc
- feature loadings, if present, are placed in varm and renamed to “PCs” if reduc is “pca”; otherwise, to reduc in all caps
For example, if reduc is “ica”, then cell embeddings will be
“X_ica” in obsm and feature loadings, if present, will be
“ICA” in varm.
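The renaming on export is the mirror image of the import rules. A minimal base R sketch, with a hypothetical helper name:

```r
# Hypothetical helper: names used in the H5AD file for a reduction.
h5ad_reduction_names <- function(reduc) {
  c(
    obsm = paste0("X_", reduc),
    # "pca" loadings are stored as "PCs"; others use the upper-cased name
    varm = if (reduc == "pca") "PCs" else toupper(reduc)
  )
}

h5ad_reduction_names("ica")
h5ad_reduction_names("pca")
```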
Nearest-neighbor graphs
If a nearest-neighbor graph is associated with assay, it will be
added to uns/neighbors/distances; if more than one graph is present,
then only the last graph according to the index will be added.
Layers
Data from other assays can be added to layers if they have the same
shape as X (same number of cells and features). To determine this,
the shapes of each alternate assay's scale.data and data slots
are determined. If they are the same shape as X, then that slot
(scale.data is given priority over data) will be added as a
layer named after the assay (e.g. “SCT”). In addition, the
feature names will be added to var as assay_features
(e.g. “SCT_features”).
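The shape check for layers can be sketched as follows. The helper pick_layer_slot and its inputs are hypothetical; dimensions are given as c(n_features, n_cells).

```r
# Hypothetical helper: choose which slot of an alternate assay becomes a layer.
# slot_dims: named list with the dimensions of each available slot
# x_dim:     dimensions of X
pick_layer_slot <- function(slot_dims, x_dim) {
  # scale.data is checked first because it has priority over data
  for (slot in c("scale.data", "data")) {
    d <- slot_dims[[slot]]
    if (!is.null(d) && identical(as.integer(d), as.integer(x_dim))) {
      return(slot)
    }
  }
  NULL  # no slot matches the shape of X, so no layer is added
}

sct <- list(scale.data = c(2000, 500), data = c(3000, 500))
pick_layer_slot(sct, x_dim = c(3000, 500))
```

In this example scale.data has fewer features than X, so data is the slot that would be written as the “SCT” layer.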
See also
writeH5AD for direct Seurat to h5ad convenience function
writeH5Seurat to save Seurat objects
readH5Seurat to load h5Seurat files
readH5AD to directly load h5ad files
readH5MU to load h5mu files
scConnect to establish file connections
Examples
if (FALSE) { # \dontrun{
library(scConvert)
library(Seurat)
# --- Any format to any format ---
scConvert("data.h5ad", dest = "data.h5seurat") # h5ad -> h5seurat
scConvert("data.h5ad", dest = "data.rds") # h5ad -> RDS
scConvert("data.h5ad", dest = "data.loom") # h5ad -> loom
scConvert("data.h5mu", dest = "data.h5ad") # h5mu -> h5ad
scConvert("data.loom", dest = "data.h5seurat") # loom -> h5seurat
scConvert("data.rds", dest = "data.h5ad") # RDS -> h5ad
# --- From R objects ---
scConvert(seurat_obj, dest = "output.h5ad") # Seurat -> h5ad
scConvert(seurat_obj, dest = "output.loom") # Seurat -> loom
scConvert(seurat_obj, dest = "output.rds") # Seurat -> RDS
scConvert(sce_obj, dest = "output.h5ad") # SCE -> h5ad
sce <- scConvert(seurat_obj, dest = "sce") # Seurat -> SCE
} # }