Data preparation APIs
Dataset wrapper classes provide functionality for adding in-memory or local data objects to datasets when rendering Vitessce as a Jupyter widget.
We provide default wrapper class implementations for data formats used by popular single-cell and imaging packages.
To write your own custom wrapper class, create a subclass of the AbstractWrapper class, implementing the getter functions for the data types that can be derived from your object.
vitessce.wrappers
- class vitessce.wrappers.AbstractWrapper(**kwargs)[source]
An abstract class that can be extended when implementing custom dataset object wrapper classes.
Abstract constructor to be inherited by dataset wrapper classes.
- Parameters
- auto_view_config(vc)[source]
Auto view configuration is intended to be used internally by the VitessceConfig.from_object method. Each subclass of AbstractWrapper may implement this method which takes in a VitessceConfig instance and modifies it by adding datasets, visualization components, and view coordinations. Implementations of this method may create an opinionated view config based on inferred use cases.
- Parameters
vc (VitessceConfig) – The view config instance.
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
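The contract described above (a list of functions, each taking a base URL and returning a file definition) can be illustrated with a small standalone sketch. It does not import vitessce: the helper name make_csv_file_def_creator, the fileType value, and the URL layout are assumptions chosen for illustration, not the library's internals.

```python
# Standalone sketch of the "file definition creator" pattern.
# Nothing here is vitessce code: the helper name, the fileType value,
# and the URL layout are assumptions made for illustration.

def make_csv_file_def_creator(dataset_uid, obj_i, file_name):
    """Return a function that maps a base URL to a file definition dict."""
    def create_file_def(base_url):
        return {
            "fileType": "obsEmbedding.csv",  # assumed file type
            "url": f"{base_url}/{dataset_uid}/{obj_i}/{file_name}",
        }
    return create_file_def


# A wrapper would collect one creator per file it knows how to serve.
file_def_creators = []
file_def_creators.append(make_csv_file_def_creator("A", 0, "cells.csv"))

# Later, once the serving URL is known, every creator is called with it.
file_defs = [create("http://localhost:8000") for create in file_def_creators]
print(file_defs[0]["url"])
```

Deferring URL construction to a function means the same wrapper can produce file definitions for a local server, a static export, or a remote bucket without being rebuilt.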
- get_local_dir_route(dataset_uid, obj_i, local_dir_path, local_dir_uid)[source]
Obtain the Mount for some local directory
- Parameters
dataset_uid (str) – A dataset unique identifier for the Mount
obj_i (str) – An index of the current vitessce.wrappers.AbstractWrapper among all other wrappers in the view config
local_dir_path (str) – The path to the local directory to serve.
local_dir_uid (str) – The UID to include as the route path suffix.
- Returns
A starlette Mount of the local_dir_path
- Return type
list[starlette.routing.Mount]
- get_routes()[source]
Obtain the routes that have been created for this wrapper class.
- Returns
A list of server routes.
- Return type
list[starlette.routing.Route]
- class vitessce.wrappers.AnnDataWrapper(adata_path=None, adata_url=None, adata_store=None, adata_artifact=None, ref_path=None, ref_url=None, ref_artifact=None, obs_feature_matrix_path=None, feature_filter_path=None, initial_feature_filter_path=None, obs_set_paths=None, obs_set_names=None, obs_locations_path=None, obs_segmentations_path=None, obs_embedding_paths=None, obs_embedding_names=None, obs_embedding_dims=None, obs_spots_path=None, obs_points_path=None, feature_labels_path=None, obs_labels_path=None, convert_to_dense=True, coordination_values=None, obs_labels_paths=None, obs_labels_names=None, **kwargs)[source]
Wrap an AnnData object by creating an instance of the AnnDataWrapper class.
- Parameters
adata_path (str) – A path to an AnnData object written to a Zarr store containing single-cell experiment data.
adata_url (str) – A remote URL pointing to a Zarr-backed AnnData store.
adata_store (str or zarr.Storage) – A path to pass to zarr.DirectoryStore, or an existing store instance.
adata_artifact (lamindb.Artifact) – A lamindb Artifact corresponding to the AnnData object.
obs_feature_matrix_path (str) – Location of the expression (cell x gene) matrix, like X or obsm/highly_variable_genes_subset
feature_filter_path (str) – A string like var/highly_variable used in conjunction with obs_feature_matrix_path if obs_feature_matrix_path points to a subset of X of the full var list.
initial_feature_filter_path (str) – A string like var/highly_variable used in conjunction with obs_feature_matrix_path if obs_feature_matrix_path points to a subset of X of the full var list.
obs_set_paths (list[str]) – Column names like ['obs/louvain', 'obs/cellType'] for showing cell sets
obs_set_names (list[str]) – Names to display in place of those in obs_set_paths, like ['Louvain', 'Cell Type']
obs_locations_path (str) – Column name in obsm that contains centroid coordinates for displaying centroids in the spatial viewer
obs_segmentations_path (str) – Column name in obsm that contains polygonal coordinates for displaying outlines in the spatial viewer
obs_embedding_paths (list[str]) – Column names like ['obsm/X_umap', 'obsm/X_pca'] for showing scatterplots
obs_embedding_names (list[str]) – Overriding names like ['UMAP', 'PCA'] for displaying above scatterplots
obs_embedding_dims (list[str]) – Dimensions along which to get data for the scatterplot, like [[0, 1], [4, 5]], where [0, 1] selects the usual x and y (the first two dimensions) and [4, 5] would compare the fifth and sixth principal components, counting from zero.
obs_spots_path (str) – Column name in obsm that contains centroid coordinates for displaying spots in the spatial viewer
obs_points_path (str) – Column name in obsm that contains centroid coordinates for displaying points in the spatial viewer
feature_labels_path (str) – The name of a column containing feature labels (e.g., alternate gene symbols), instead of the default index in var of the AnnData store.
obs_labels_path (str) – (DEPRECATED) The name of a column containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store. Use obs_labels_paths and obs_labels_names instead. This arg will be removed in a future release.
obs_labels_paths (list[str]) – The names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store.
obs_labels_names (list[str]) – The optional display names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store.
convert_to_dense (bool) – Whether or not to convert X to dense in the Zarr store (dense is faster but takes more disk space).
coordination_values (dict or None) – Coordination values for the file definition.
**kwargs – Keyword arguments inherited from AbstractWrapper
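To make the obs_embedding_* parameters more concrete, the following standalone sketch shows what a pair of dimension indices selects from an embedding matrix. It assumes the indices are zero-based (as the [0, 1] example suggests) and that the paths, names, and dims lists are parallel. select_dims and the sample numbers are hypothetical and not part of vitessce.

```python
# Standalone sketch: what a pair of dimension indices selects from an embedding.
# `select_dims` is a hypothetical helper, not part of vitessce.

def select_dims(embedding, dims):
    """Keep only the two zero-based columns named in `dims` for every row."""
    first, second = dims
    return [(row[first], row[second]) for row in embedding]


# Three cells with six PCA components each (made-up numbers).
x_pca = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    [1.1, 1.2, 1.3, 1.4, 1.5, 1.6],
    [2.1, 2.2, 2.3, 2.4, 2.5, 2.6],
]

obs_embedding_paths = ["obsm/X_umap", "obsm/X_pca"]
obs_embedding_names = ["UMAP", "PCA"]
obs_embedding_dims = [[0, 1], [4, 5]]

# Assumed to be parallel lists: entry i of each describes the same scatterplot.
assert len(obs_embedding_paths) == len(obs_embedding_names) == len(obs_embedding_dims)

# Select the columns at indices 4 and 5 for the "PCA" scatterplot.
pca_points = select_dims(x_pca, obs_embedding_dims[1])
print(pca_points)
```

These three lists would then be passed as the obs_embedding_paths, obs_embedding_names, and obs_embedding_dims keyword arguments shown in the signature above.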
- auto_view_config(vc)[source]
Auto view configuration is intended to be used internally by the VitessceConfig.from_object method. Each subclass of AbstractWrapper may implement this method which takes in a VitessceConfig instance and modifies it by adding datasets, visualization components, and view coordinations. Implementations of this method may create an opinionated view config based on inferred use cases.
- Parameters
vc (VitessceConfig) – The view config instance.
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.CsvWrapper(csv_path=None, csv_url=None, data_type=None, options=None, coordination_values=None, **kwargs)[source]
Wrap a CSV file by creating an instance of the CsvWrapper class.
- Parameters
data_type (str) – The data type of the information contained in the file.
csv_path (str) – A local filepath to a CSV file.
csv_url (str) – A remote URL of a CSV file.
options (dict) – The file options.
coordination_values (dict) – The coordination values.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.ImageOmeTiffWrapper(img_path=None, img_url=None, img_artifact=None, offsets_path=None, offsets_url=None, offsets_artifact=None, coordinate_transformations=None, coordination_values=None, **kwargs)[source]
Wrap an OME-TIFF file by creating an instance of the ImageOmeTiffWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.
- Parameters
img_path (str) – A local filepath to an OME-TIFF file.
offsets_path (str) – A local filepath to an offsets.json file.
img_url (str) – A remote URL of an OME-TIFF file.
offsets_url (str) – A remote URL of an offsets.json file.
coordinate_transformations (list) – A column-major ordered matrix for transforming this image (see http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/#homogeneous-coordinates for more information).
coordination_values (dict) – Optional coordinationValues to be passed in the file definition.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.ImageOmeZarrWrapper(img_path=None, img_url=None, img_artifact=None, coordinate_transformations=None, coordination_values=None, **kwargs)[source]
Wrap an OME-NGFF Zarr store by creating an instance of the ImageOmeZarrWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.
- Parameters
img_path (str) – A local filepath to an OME-NGFF Zarr store.
img_url (str) – A remote URL of an OME-NGFF Zarr store.
img_artifact (lamindb.Artifact) – A lamindb Artifact corresponding to the image.
coordinate_transformations (list) – A column-major ordered matrix for transforming this image (see http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/#homogeneous-coordinates for more information).
coordination_values (dict) – Optional coordinationValues to be passed in the file definition.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.JsonWrapper(json_path=None, json_url=None, data_type=None, options=None, coordination_values=None, **kwargs)[source]
Wrap a JSON file by creating an instance of the JsonWrapper class.
- Parameters
data_type (str) – The data type of the information contained in the file.
json_path (str) – A local filepath to a JSON file.
json_url (str) – A remote URL of a JSON file.
options (dict) – The file options.
coordination_values (dict) – The coordination values.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.MultiImageWrapper(image_wrappers, use_physical_size_scaling=False, **kwargs)[source]
Wrap multiple imaging datasets by creating an instance of the MultiImageWrapper class.
- Parameters
image_wrappers (list) – A list of imaging wrapper classes (only OmeTiffWrapper supported now)
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.MultivecZarrWrapper(zarr_path=None, zarr_url=None, **kwargs)[source]
Abstract constructor to be inherited by dataset wrapper classes.
- Parameters
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.ObsSegmentationsOmeTiffWrapper(img_path=None, img_url=None, img_artifact=None, offsets_path=None, offsets_url=None, offsets_artifact=None, coordinate_transformations=None, obs_types_from_channel_names=None, coordination_values=None, **kwargs)[source]
Wrap an OME-TIFF file by creating an instance of the ObsSegmentationsOmeTiffWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.
- Parameters
img_path (str) – A local filepath to an OME-TIFF file.
img_url (str) – A remote URL of an OME-TIFF file.
img_artifact (lamindb.Artifact) – A lamindb Artifact corresponding to the image.
offsets_path (str) – A local filepath to an offsets.json file.
offsets_url (str) – A remote URL of an offsets.json file.
offsets_artifact (lamindb.Artifact) – A lamindb Artifact corresponding to the offsets JSON.
coordinate_transformations (list) – A column-major ordered matrix for transforming this image (see http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/#homogeneous-coordinates for more information).
obs_types_from_channel_names (bool) – Whether to use the channel names to determine the obs types. Optional.
coordination_values (dict) – Optional coordinationValues to be passed in the file definition.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.ObsSegmentationsOmeZarrWrapper(img_path=None, img_url=None, img_artifact=None, coordinate_transformations=None, coordination_values=None, obs_types_from_channel_names=None, **kwargs)[source]
Wrap an OME-NGFF Zarr store by creating an instance of the ObsSegmentationsOmeZarrWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.
- Parameters
img_path (str) – A local filepath to an OME-NGFF Zarr store.
img_url (str) – A remote URL of an OME-NGFF Zarr store.
img_artifact (lamindb.Artifact) – A lamindb Artifact corresponding to the image.
coordinate_transformations (list) – A column-major ordered matrix for transforming this image (see http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/#homogeneous-coordinates for more information).
coordination_values (dict) – Optional coordinationValues to be passed in the file definition.
obs_types_from_channel_names (bool) – Whether to use the channel names to determine the obs types. Optional.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.OmeTiffWrapper(img_path=None, offsets_path=None, img_url=None, offsets_url=None, name='', transformation_matrix=None, is_bitmask=False, **kwargs)[source]
Wrap an OME-TIFF file by creating an instance of the OmeTiffWrapper class.
- Parameters
img_path (str) – A local filepath to an OME-TIFF file.
offsets_path (str) – A local filepath to an offsets.json file.
img_url (str) – A remote URL of an OME-TIFF file.
offsets_url (str) – A remote URL of an offsets.json file.
name (str) – The display name for this OME-TIFF within Vitessce.
transformation_matrix (list[number]) – A column-major ordered matrix for transforming this image (see http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/#homogeneous-coordinates for more information).
is_bitmask (bool) – Whether or not this image is a bitmask.
**kwargs – Keyword arguments inherited from AbstractWrapper
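The column-major ordering mentioned for transformation_matrix is easy to get wrong, so here is a standalone sketch of the flattening step. It assumes the parameter expects the 16 entries of a 4x4 homogeneous matrix as a flat list (an assumption based on the list[number] type and the linked tutorial). to_column_major is a hypothetical helper, not part of vitessce.

```python
# Standalone sketch: flatten a 4x4 homogeneous transform in column-major order.
# `to_column_major` is a hypothetical helper, not part of vitessce.

def to_column_major(matrix):
    """Flatten a row-major nested 4x4 list column by column into 16 numbers."""
    return [matrix[row][col] for col in range(4) for row in range(4)]


# Scale x and y by 2, then translate by (10, 20), written the way it is
# usually drawn on paper, with the translation in the last column.
affine = [
    [2, 0, 0, 10],
    [0, 2, 0, 20],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

transformation_matrix = to_column_major(affine)
print(transformation_matrix)
```

In column-major order the translation components end up at positions 12 and 13 of the flat list rather than at positions 3 and 7, which is the usual source of confusion.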
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.OmeZarrWrapper(img_path=None, img_url=None, name='', is_bitmask=False, **kwargs)[source]
Wrap an OME-NGFF Zarr store by creating an instance of the OmeZarrWrapper class.
- Parameters
img_path (str) – A local filepath to an OME-NGFF Zarr store.
img_url (str) – A remote URL of an OME-NGFF Zarr store.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method is void, should not return anything.
- class vitessce.wrappers.SpatialDataWrapper(sdata_path: Optional[str] = None, sdata_url: Optional[str] = None, sdata_store: Optional[Union[str, zarr.storage.StoreLike]] = None, sdata_artifact: Optional[ln.Artifact] = None, image_path: Optional[str] = None, region: Optional[str] = None, coordinate_system: Optional[str] = None, affine_transformation: Optional[np.ndarray] = None, obs_spots_path: Optional[str] = None, labels_path: Optional[str] = None, table_path: str = 'tables/table', **kwargs)[source]
Wrap a SpatialData object.
- Parameters
sdata_path (Optional[str]) – SpatialData path, exclusive with other {sdata,adata}_xxxx arguments, by default None
sdata_url (Optional[str]) – SpatialData URL, exclusive with other {sdata,adata}_xxxx arguments, by default None
sdata_store (Optional[Union[str, zarr.storage.StoreLike]]) – SpatialData store, exclusive with other {sdata,adata}_xxxx arguments, by default None
sdata_artifact (Optional[ln.Artifact]) – Artifact that corresponds to a SpatialData object.
image_path (Optional[str]) – Path to the image element of interest. By default, None.
coordinate_system (Optional[str]) – Name of a target coordinate system.
affine_transformation (Optional[np.ndarray]) – Transformation to be applied to the image. By default, None. Prefer coordinate_system.
obs_spots_path (Optional[str]) – Location of shapes that should be interpreted as spot observations, by default None
labels_path (Optional[str]) – Location of the labels (segmentation bitmask image), by default None
- classmethod from_object(sdata: SpatialData, table_keys_to_image_elems: dict[str, Union[str, None]] = {}, table_keys_to_regions: dict[str, Union[str, None]] = {}, obs_type_label: str = 'spot') list[SpatialDataWrapperType] [source]
Instantiate a wrapper for SpatialData stores, one per table, directly from the SpatialData object. By default, we "show everything" that can reasonably be inferred from the available information. If you wish to have more control, consider instantiating the object directly. This function raises an error if something cannot be inferred, e.g., when the user does not specify regions explicitly but there is more than one for a given table.
- Parameters
cls (Type[SpatialDataWrapperType]) – The wrapper class on which this class method is called.
spatialdata (SpatialData) – The SpatialData object to wrap (sdata in the signature above).
table_keys_to_image_elems (dict[str, str], optional) – Which image paths to use for a given table for the visualization. By default, None for each table key.
table_keys_to_regions (dict[str, str], optional) – Which regions to use for a given table for the visualization. By default, None for each table key.
- Return type
list[SpatialDataWrapperType]
- Raises
ValueError
vitessce.export
- vitessce.export.export_to_files(config, base_url, out_dir='.')[source]
- Parameters
config (VitessceConfig) – The Vitessce view config to export to files.
out_dir (str) – The path to the output directory. By default, the current directory.
base_url (str) – The URL on which the files will be served.
- Returns
The config as a dict, with urls filled in.
- Return type
dict
- vitessce.export.export_to_s3(config, s3, bucket_name, prefix='')[source]
- Parameters
config (VitessceConfig) – The Vitessce view config to export to S3.
s3 (boto3.resource) – A boto3 S3 resource object with permission to upload to the specified bucket.
bucket_name (str) – The name of the bucket to which to upload.
prefix (str) – The prefix path for the bucket keys (think subdirectory).
- Returns
The config as a dict, with S3 urls filled in.
- Return type
dict
vitessce.data_utils
- vitessce.data_utils.ome.multiplex_img_to_ome_tiff(img_arr, channel_names, output_path, axes='CYX')[source]
Convert a multiplexed image to OME-TIFF.
- vitessce.data_utils.ome.multiplex_img_to_ome_zarr(img_arr, channel_names, output_path, img_name='Image', chunks=(1, 256, 256), axes='cyx', channel_colors=None)[source]
Convert a multiplexed image to OME-Zarr v0.3.
- Parameters
img_arr (np.array) – The image as a 3D, 4D, or 5D array.
channel_names (list[str]) – A list of channel names to include in the omero.channels[].label NGFF metadata field.
output_path (str) – The path to save the Zarr store.
img_name (str) – The name of the image to include in the omero.name NGFF metadata field.
chunks (tuple[int]) – The chunk sizes of each axis. By default, (1, 256, 256).
axes (str) – The array axis ordering. By default, “cyx”
channel_colors (dict or None) – Dict mapping channel names to color strings to use for the omero.channels[].color NGFF metadata field. If provided, keys should match channel_names. By default, None to use “FFFFFF” for all channels.
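Since the keys of channel_colors should match channel_names, it can help to build the dict from the channel list. The sketch below is standalone and does not call vitessce. fill_channel_colors and the channel names are hypothetical, and only the "FFFFFF" default comes from the description above.

```python
# Standalone sketch: build a channel_colors dict whose keys match channel_names.
# `fill_channel_colors` is a hypothetical helper, not part of vitessce.

DEFAULT_COLOR = "FFFFFF"  # the documented default for every channel


def fill_channel_colors(channel_names, overrides=None):
    """Map every channel name to a hex color string, defaulting to white."""
    overrides = overrides or {}
    unknown = set(overrides) - set(channel_names)
    if unknown:
        raise ValueError(f"colors given for unknown channels: {sorted(unknown)}")
    return {name: overrides.get(name, DEFAULT_COLOR) for name in channel_names}


channel_names = ["DAPI", "CD3", "CD20"]
channel_colors = fill_channel_colors(channel_names, {"DAPI": "0000FF"})
print(channel_colors)
```

The resulting dict has exactly one entry per channel, so it could be passed as the channel_colors argument alongside the same channel_names list.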
- vitessce.data_utils.ome.needs_bigtiff(img_arr_shape)[source]
Helper function to determine if an image array is too large for standard TIFF format.
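As a rough illustration of this kind of check, the sketch below compares the raw pixel data size against the 4 GiB limit of classic TIFF, which comes from its 32-bit offsets. This is a standalone approximation and not necessarily how needs_bigtiff decides. exceeds_classic_tiff_limit and its bytes_per_sample parameter are hypothetical.

```python
import math

# Standalone sketch of a size check; not the library's implementation.
# Classic TIFF stores 32-bit offsets, so files are limited to about 4 GiB.
CLASSIC_TIFF_LIMIT_BYTES = 2 ** 32


def exceeds_classic_tiff_limit(img_arr_shape, bytes_per_sample=1):
    """Return True if the raw pixel data alone would not fit in classic TIFF."""
    return math.prod(img_arr_shape) * bytes_per_sample > CLASSIC_TIFF_LIMIT_BYTES


small_rgb = exceeds_classic_tiff_limit((3, 1024, 1024))
large_multiplexed = exceeds_classic_tiff_limit((40, 20000, 20000))
print(small_rgb, large_multiplexed)
```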
- vitessce.data_utils.ome.rgb_img_to_ome_tiff(img_arr, output_path, img_name='Image', axes='CYX')[source]
Convert an RGB image to OME-TIFF.
- vitessce.data_utils.ome.rgb_img_to_ome_zarr(img_arr, output_path, img_name='Image', chunks=(1, 256, 256), axes='cyx', **kwargs)[source]
Convert an RGB image to OME-Zarr v0.3.
- Parameters
img_arr (np.array) – The image as a 3D array.
output_path (str) – The path to save the Zarr store.
img_name (str) – The name of the image to include in the omero.name NGFF metadata field.
chunks (tuple[int]) – The chunk sizes of each axis. By default, (1, 256, 256).
axes (str) – The array axis ordering. By default, “cyx”
- vitessce.data_utils.anndata.cast_arr(arr)[source]
Try to cast an array to a dtype that takes up less space.
- Parameters
arr (np.array) – The array to cast.
- Returns
The new array.
- Return type
np.array
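The idea of casting to a dtype that takes up less space can be sketched without NumPy by choosing the narrowest integer type name that covers a set of values. This is only an illustration of the principle. smallest_int_dtype is hypothetical, and cast_arr itself operates on arrays and may apply different rules.

```python
# Standalone sketch: choose the narrowest integer type name for some values.
# `smallest_int_dtype` is hypothetical; vitessce's cast_arr works on arrays
# and may use different rules.

INT_RANGES = [
    ("uint8", 0, 2**8 - 1),
    ("uint16", 0, 2**16 - 1),
    ("uint32", 0, 2**32 - 1),
    ("int8", -2**7, 2**7 - 1),
    ("int16", -2**15, 2**15 - 1),
    ("int32", -2**31, 2**31 - 1),
    ("int64", -2**63, 2**63 - 1),
]


def smallest_int_dtype(values):
    """Return the first type name in INT_RANGES whose range covers all values."""
    lo, hi = min(values), max(values)
    for name, type_min, type_max in INT_RANGES:
        if type_min <= lo and hi <= type_max:
            return name
    raise ValueError("values do not fit in a 64-bit signed integer")


counts_dtype = smallest_int_dtype([0, 3, 200])
signed_dtype = smallest_int_dtype([-5, 100])
print(counts_dtype, signed_dtype)
```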
- vitessce.data_utils.anndata.optimize_adata(adata, obs_cols=None, obsm_keys=None, var_cols=None, varm_keys=None, layer_keys=None, remove_X=False, optimize_X=False, to_dense_X=False, to_sparse_X=False)[source]
Given an AnnData object, optimize for usage with Vitessce and return a new object.
- Parameters
adata (anndata.AnnData) – The AnnData object to optimize.
obs_cols (list[str] or None) – Columns of adata.obs to optimize. Columns not specified will not be included in the returned object.
var_cols (list[str] or None) – Columns of adata.var to optimize. Columns not specified will not be included in the returned object.
obsm_keys (list[str] or None) – Arrays within adata.obsm to optimize. Keys not specified will not be included in the returned object.
varm_keys (list[str] or None) – Arrays within adata.varm to optimize. Keys not specified will not be included in the returned object.
layer_keys (list[str] or None) – Arrays within adata.layers to optimize. Keys not specified will not be included in the returned object.
remove_X (bool) – Should the returned object have its X matrix set to None? By default, False.
optimize_X (bool) – Should the returned object run optimize_arr on adata.X? By default, False.
to_dense_X (bool) – Should adata.X be cast to a dense array in the returned object? By default, False.
to_sparse_X (bool) – Should adata.X be cast to a sparse array in the returned object? By default, False.
- Returns
The new AnnData object.
- Return type
anndata.AnnData
- vitessce.data_utils.anndata.optimize_arr(arr)[source]
Try to cast an array to a dtype that takes up less space, and convert to dense.
- Parameters
arr (np.array) – The array to cast and convert.
- Returns
The new array.
- Return type
np.array
- vitessce.data_utils.anndata.sort_var_axis(adata_X, orig_var_index, full_var_index=None)[source]
Sort the var index by performing hierarchical clustering.
- Parameters
adata_X (np.array) – The matrix to use for clustering. For example, adata.X
orig_var_index (pandas.Index) – The original var index. For example, adata.var.index
full_var_index (pandas.Index or None) – Pass the full adata.var.index to append the var values excluded from sorting, if adata_X and orig_var_index are a subset of the full adata.X matrix. By default, None.
- Returns
The sorted elements of the var index.
- Return type
- vitessce.data_utils.anndata.to_dense(arr)[source]
Convert a sparse array to dense.
- Parameters
arr (np.array) – The array to convert.
- Returns
The converted array (or the original array if it was already dense).
- Return type
np.array
- vitessce.data_utils.anndata.to_diamond(x, y, r)[source]
Convert an (x, y) coordinate to a polygon (diamond) with a given radius.
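The geometry behind this conversion can be sketched in a few lines: a diamond is a square rotated by 45 degrees, with one vertex at distance r from the center along each axis. The function below is a standalone illustration. The name diamond, the vertex order, and the plain-list return type are assumptions and may differ from what to_diamond returns.

```python
# Standalone sketch of the geometry; not the library's implementation.
# The vertex order and the list-of-tuples return type are assumptions.

def diamond(x, y, r):
    """Return the four vertices of a diamond centered on (x, y) with radius r."""
    return [(x, y + r), (x + r, y), (x, y - r), (x - r, y)]


vertices = diamond(0, 0, 1)
print(vertices)
```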
- vitessce.data_utils.anndata.to_memory(arr)[source]
Try to load a backed AnnData array into memory.
- Parameters
arr (np.array) – The array to load.
- Returns
The loaded array.
- Return type
np.array