Convert AnnData/H5AD files to h5Seurat files
Arguments
- source
Source dataset: a Seurat object, filename path, or H5File connection
- dest
Name/path of destination file. If only a file type is provided (e.g., "h5seurat", "h5ad", "loom"), the extension is appended to the source filename (for file sources) or the Seurat project name (for Seurat objects). Supported formats: h5seurat, h5ad, loom
- assay
For h5Seurat -> other formats: name of assay to convert. For other formats -> h5Seurat: name to assign to the assay. Default is "RNA".
- overwrite
Logical; if TRUE, overwrite an existing destination file. Default is FALSE.
- verbose
Logical; if TRUE (default), show progress updates
Value
Returns a handle to dest as an h5Seurat object
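A minimal usage sketch of the arguments above. The file name "pbmc3k.h5ad" is a hypothetical placeholder, and the call is guarded so that it only runs when SeuratDisk is installed and the file exists:

```r
# Hypothetical input path; replace with a real H5AD file
h5ad_path <- "pbmc3k.h5ad"

if (requireNamespace("SeuratDisk", quietly = TRUE) && file.exists(h5ad_path)) {
  # Passing only a file type as dest derives the destination name from the source
  h5s <- SeuratDisk::Convert(
    source = h5ad_path,
    dest = "h5seurat",
    assay = "RNA",
    overwrite = TRUE,
    verbose = TRUE
  )
  print(h5s)
  # Release the HDF5 handle once finished
  h5s$close_all()
}
```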
AnnData/H5AD to h5Seurat
The AnnData/H5AD to h5Seurat conversion will try to automatically fill in datasets based on data presence. It works in the following manner:
Expression data
The expression matrices counts, data, and scale.data
are filled by /X and /raw/X in the following manner:
- counts will be filled with /raw/X if present; otherwise, it will be filled with /X
- data will be filled with /raw/X if /raw/X is present and /X is dense; otherwise, it will be filled with /X
- scale.data will be filled with /X if it is dense; otherwise, it will be empty
Feature names are taken from the feature-level metadata
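The fill rules above can be sketched as a small decision function. This is an illustration of the documented rules only, not SeuratDisk's implementation; the function name `expression_sources` and its arguments are hypothetical:

```r
# Returns the H5AD path that would fill each expression matrix,
# or NA when the matrix is left empty
expression_sources <- function(has_raw_x, x_is_dense) {
  list(
    counts = if (has_raw_x) "/raw/X" else "/X",
    data = if (has_raw_x && x_is_dense) "/raw/X" else "/X",
    scale.data = if (x_is_dense) "/X" else NA_character_
  )
}

# /raw/X present and /X dense: /X is treated as scaled data
str(expression_sources(has_raw_x = TRUE, x_is_dense = TRUE))

# No /raw/X and sparse /X: counts and data both come from /X
str(expression_sources(has_raw_x = FALSE, x_is_dense = FALSE))
```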
Feature-level metadata
Feature-level metadata is added to the meta.features datasets in each
assay. Feature names are taken from the dataset specified by the
“_index” attribute, the “_index” dataset, or the “index”
dataset, in that order. Metadata is populated with /raw/var if
present, otherwise with /var; if both /raw/var and /var
are present, then meta.features will be populated with /raw/var
first, then /var will be added to it. For columns present in both
/raw/var and /var, the values in /var will be used
instead. Note: it is possible for /var to have fewer features
than /raw/var; if this is the case, then only the features present in
/var will be overwritten, with the metadata for features not
present in /var remaining as they were in /raw/var or empty
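The merge order described above can be illustrated with in-memory data frames. This is a sketch under assumptions: the helper `merge_feature_metadata` and the example columns are hypothetical, and every feature in `var` is assumed to also appear in `raw_var`:

```r
# Start from /raw/var, then let /var overwrite or extend it
merge_feature_metadata <- function(raw_var, var) {
  meta <- raw_var
  for (col in colnames(var)) {
    if (!col %in% colnames(meta)) {
      # New column: features missing from /var stay empty (NA)
      meta[[col]] <- NA
    }
    # Only rows for features present in /var are overwritten
    meta[rownames(var), col] <- var[[col]]
  }
  meta
}

raw_var <- data.frame(
  highly_variable = c(TRUE, FALSE, TRUE),
  mean = c(1, 2, 3),
  row.names = c("g1", "g2", "g3")
)
var <- data.frame(
  mean = c(10, 20),
  dispersion = c(0.5, 0.7),
  row.names = c("g1", "g2")
)

merged <- merge_feature_metadata(raw_var, var)
print(merged)
```

In this example, "g3" is absent from `var`, so it keeps its `mean` from `raw_var` and has an empty `dispersion`.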
Cell-level metadata
Cell-level metadata is added to meta.data; the row names of the
metadata (as determined by the value of the “_index” attribute, the
“_index” dataset, or the “index” dataset, in that order) are
added to the “cell.names” dataset instead. If the
“__categories” dataset is present, each dataset within
“__categories” will be stored as a factor group. Cell-level metadata
will be added as an HDF5 group unless factors are not present and
the SeuratDisk.dtypes.dataframe_as_group option is FALSE
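The lookup order for row names (used for both feature-level and cell-level metadata) can be sketched as follows. The helper `resolve_index` is hypothetical, and it works on a plain list of attributes and a character vector of dataset names rather than on a real HDF5 group:

```r
# Returns the name of the dataset that holds the row names, or NA if none is found
resolve_index <- function(attrs, datasets) {
  # 1. The "_index" attribute names the dataset to use
  if (!is.null(attrs[["_index"]])) {
    return(attrs[["_index"]])
  }
  # 2. Fall back to a dataset called "_index", then 3. one called "index"
  for (candidate in c("_index", "index")) {
    if (candidate %in% datasets) {
      return(candidate)
    }
  }
  NA_character_
}

resolve_index(list(`_index` = "obs_names"), c("obs_names", "n_genes"))
resolve_index(list(), c("index", "n_genes"))
```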
Dimensional reduction information
Cell embeddings are taken from /obsm; dimensional reductions are
named based on their names from obsm by removing the preceding
“X_”. For example, if a dimensional reduction is named “X_pca”
in /obsm, the resulting dimensional reduction information will be
named “pca”. The key will be set to one of the following:
- “PC_” if “pca” is present in the dimensional reduction name (grepl("pca", reduction.name, ignore.case = TRUE))
- “tSNE_” if “tsne” is present in the dimensional reduction name (grepl("tsne", reduction.name, ignore.case = TRUE))
- reduction.name_ for all other reductions
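The naming and key rules above can be sketched in base R. The function name `reduction_key` is hypothetical; the `grepl()` tests are the ones quoted in the list:

```r
# Derive the reduction key from a name found in /obsm
reduction_key <- function(obsm_name) {
  # Strip the preceding "X_" to get the reduction name
  reduction.name <- sub("^X_", "", obsm_name)
  if (grepl("pca", reduction.name, ignore.case = TRUE)) {
    "PC_"
  } else if (grepl("tsne", reduction.name, ignore.case = TRUE)) {
    "tSNE_"
  } else {
    paste0(reduction.name, "_")
  }
}

reduction_key("X_pca")
reduction_key("X_tsne")
reduction_key("X_umap")
```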
Remember that the preceding “X_” will be removed from the reduction
name before converting to a key. Feature loadings are taken from
/varm and placed in the associated dimensional reduction. The
dimensional reduction is determined from the loadings name in /varm:
- “PCs” will be added to a dimensional reduction named “pca”
- All other loadings in /varm will be added to a dimensional reduction named tolower(loading) (e.g., a loading named “ICA” will be added to a dimensional reduction named “ica”)
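A short sketch of how a loadings name maps to a reduction. The helper `loading_reduction` is hypothetical; it returns NA when no matching reduction exists, mirroring the case where the loading is skipped:

```r
# Map a loadings name from /varm to the reduction that receives it
loading_reduction <- function(loading, reductions) {
  target <- if (loading == "PCs") "pca" else tolower(loading)
  # Loadings without a matching reduction are not converted
  if (target %in% reductions) target else NA_character_
}

loading_reduction("PCs", reductions = c("pca", "umap"))
loading_reduction("ICA", reductions = c("ica"))
loading_reduction("ICA", reductions = c("pca", "umap"))
```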
If a dimensional reduction cannot be found according to the rules above, the
loading will not be taken from the AnnData/H5AD file. Miscellaneous
information will be taken from /uns/reduction where reduction
is the name of the reduction in /obsm without the preceding
“X_”; if no dimensional reduction information is present, then
miscellaneous information will not be taken from the AnnData/H5AD file.
Standard deviations are taken from a dataset /uns/reduction/variance;
the variances will be converted to standard deviations and added to the
stdev dataset of a dimensional reduction
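The variance-to-standard-deviation step is a square root. A worked example with made-up variance values:

```r
# Hypothetical values standing in for /uns/reduction/variance
variance <- c(9, 4, 1)

# Standard deviation is the square root of the variance
stdev <- sqrt(variance)
print(stdev)
```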
Nearest-neighbor graph
If a nearest neighbor graph is present in /uns/neighbors/distances,
it will be added as a graph dataset in the h5Seurat file and associated with
assay; if a value is present in /uns/neighbors/params/method,
the name of the graph will be assay_method; otherwise, it will be
assay_anndata
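The graph naming rule can be sketched as below. The helper `graph_name` is hypothetical, and a missing /uns/neighbors/params/method value is represented by NULL:

```r
# Build the graph name from the assay and the optional neighbor method
graph_name <- function(assay, method = NULL) {
  suffix <- if (is.null(method)) "anndata" else method
  paste(assay, suffix, sep = "_")
}

graph_name("RNA", method = "umap")
graph_name("RNA")
```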