Data preparation APIs

Dataset wrapper classes provide functionality for adding in-memory or local data objects to datasets when rendering Vitessce as a Jupyter widget.

We provide default wrapper class implementations for data formats used by popular single-cell and imaging packages.

To write your own custom wrapper class, create a subclass of the AbstractWrapper class, implementing the getter functions for the data types that can be derived from your object.


class vitessce.wrappers.AbstractWrapper(**kwargs)[source]

An abstract class that can be extended when implementing custom dataset object wrapper classes.

Abstract constructor to be inherited by dataset wrapper classes.

  • out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.

  • request_init (dict) – Options to be passed along with every fetch request from the browser, like { "header": { "Authorization": "Bearer dsfjalsdfa1431" } }
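As a small illustration of the two constructor options, the keyword arguments can be assembled as a plain dictionary before being passed to a wrapper subclass. This is only a sketch: the directory prefix and the token are placeholders, and the "header" key simply mirrors the example in the parameter description above.

```python
import json
import tempfile

# Placeholder output directory; when out_dir is omitted, a temporary directory is used.
out_dir = tempfile.mkdtemp(prefix="vitessce_out_")

# Options forwarded with every fetch request from the browser.
# The token is the placeholder value from the parameter description.
request_init = {"header": {"Authorization": "Bearer dsfjalsdfa1431"}}

# Keyword arguments that a wrapper subclass would receive through **kwargs.
wrapper_kwargs = {"out_dir": out_dir, "request_init": request_init}

# Serializing to JSON is a quick check that the options are plain data.
print(json.dumps(wrapper_kwargs["request_init"], indent=2))
```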


Auto view configuration is intended to be used internally by the VitessceConfig.from_object method. Each subclass of AbstractWrapper may implement this method which takes in a VitessceConfig instance and modifies it by adding datasets, visualization components, and view coordinations. Implementations of this method may create an opinionated view config based on inferred use cases.


vc (VitessceConfig) – The view config instance.

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.
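The pattern described above (register functions that turn a base URL into a file definition) can be sketched with a stand-in class. This is not the real AbstractWrapper: the class name, the file_defs helper, the URL layout, and the file type value are all assumptions chosen for illustration.

```python
class SketchWrapper:
    """Stand-in that mimics the file_def_creators pattern; not the real AbstractWrapper."""

    def __init__(self, file_name):
        self.file_name = file_name    # hypothetical: name of the wrapped file
        self.file_def_creators = []   # functions mapping base_url -> file definition dict
        self.routes = []              # server routes; left empty in this sketch

    def convert_and_save(self, dataset_uid, obj_i, base_dir=None):
        # Returns nothing; it only registers a creator function.
        def make_file_def(base_url):
            # Hypothetical URL layout and file type, for illustration only.
            return {
                "fileType": "obsEmbedding.csv",
                "url": f"{base_url}/{dataset_uid}/{obj_i}/{self.file_name}",
            }

        self.file_def_creators.append(make_file_def)

    def file_defs(self, base_url):
        # Hypothetical helper: call each registered creator with the base URL.
        return [create(base_url) for create in self.file_def_creators]


wrapper = SketchWrapper("cells.csv")
wrapper.convert_and_save("A", 0)
print(wrapper.file_defs("http://localhost:8000"))
```

Deferring URL construction to a function means the same wrapper can produce file definitions for different base URLs, for example a local server during development and a remote host after export.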


Obtain the file definitions for this wrapper class.


base_url (str) – A base URL to prepend to relative URLs.


A list of file definitions.

Return type


get_local_dir_route(dataset_uid, obj_i, local_dir_path, local_dir_uid)[source]

Obtain the Mount for some local directory.

  • dataset_uid (str) – A dataset unique identifier for the Mount

  • obj_i (str) – The index of the current vitessce.wrappers.AbstractWrapper among all other wrappers in the view config

  • local_dir_path (str) – The path to the local directory to serve.

  • local_dir_uid (str) – The UID to include as the route path suffix.


A Starlette Mount of the local_dir_path

Return type


get_out_dir_route(dataset_uid, obj_i)[source]

Obtain the Mount for the out_dir.

  • dataset_uid (str) – A dataset unique identifier for the Mount

  • obj_i (str) – The index of the current vitessce.wrappers.AbstractWrapper among all other wrappers in the view config


A Starlette Mount of the out_dir

Return type



Obtain the routes that have been created for this wrapper class.


A list of server routes.

Return type



Obtain the stores that have been created for this wrapper class.


A dictionary that maps file URLs to Zarr Store objects.

Return type

dict[str, zarr.Store]

class vitessce.wrappers.AnnDataWrapper(adata_path=None, adata_url=None, adata_store=None, ref_path=None, ref_url=None, obs_feature_matrix_path=None, feature_filter_path=None, initial_feature_filter_path=None, obs_set_paths=None, obs_set_names=None, obs_locations_path=None, obs_segmentations_path=None, obs_embedding_paths=None, obs_embedding_names=None, obs_embedding_dims=None, obs_spots_path=None, obs_points_path=None, feature_labels_path=None, obs_labels_path=None, convert_to_dense=True, coordination_values=None, obs_labels_paths=None, obs_labels_names=None, **kwargs)[source]

Wrap an AnnData object by creating an instance of the AnnDataWrapper class.

  • adata_path (str) – A path to an AnnData object written to a Zarr store containing single-cell experiment data.

  • adata_url (str) – A remote URL pointing to a Zarr-backed AnnData store.

  • adata_store (str or zarr.Storage) – A path to pass to zarr.DirectoryStore, or an existing store instance.

  • obs_feature_matrix_path (str) – Location of the expression (cell x gene) matrix, like X or obsm/highly_variable_genes_subset.

  • feature_filter_path (str) – A string like var/highly_variable, used in conjunction with obs_feature_matrix_path when obs_feature_matrix_path points to a matrix that covers only a subset of the full var list.

  • initial_feature_filter_path (str) – A string like var/highly_variable, used in conjunction with obs_feature_matrix_path when obs_feature_matrix_path points to a matrix that covers only a subset of the full var list.

  • obs_set_paths (list[str]) – Column names like ['obs/louvain', 'obs/cellType'] for showing cell sets

  • obs_set_names (list[str]) – Names to display in place of those in obs_set_paths, like ['Louvain', 'Cell Type']

  • obs_locations_path (str) – Column name in obsm that contains centroid coordinates for displaying centroids in the spatial viewer

  • obs_segmentations_path (str) – Column name in obsm that contains polygonal coordinates for displaying outlines in the spatial viewer

  • obs_embedding_paths (list[str]) – Column names like ['obsm/X_umap', 'obsm/X_pca'] for showing scatterplots

  • obs_embedding_names (list[str]) – Overriding names like ['UMAP', 'PCA'] for displaying above scatterplots

  • obs_embedding_dims (list[str]) – Dimensions along which to get data for the scatterplot, like [[0, 1], [4, 5]] where [0, 1] is just the normal x and y but [4, 5] could be comparing the fifth and sixth principal components (zero-based indices 4 and 5), for example.

  • obs_spots_path (str) – Column name in obsm that contains centroid coordinates for displaying spots in the spatial viewer

  • obs_points_path (str) – Column name in obsm that contains centroid coordinates for displaying points in the spatial viewer

  • feature_labels_path (str) – The name of a column containing feature labels (e.g., alternate gene symbols), instead of the default index in var of the AnnData store.

  • obs_labels_path (str) – (DEPRECATED) The name of a column containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store. Use obs_labels_paths and obs_labels_names instead. This arg will be removed in a future release.

  • obs_labels_paths (list[str]) – The names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store.

  • obs_labels_names (list[str]) – The optional display names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store.

  • convert_to_dense (bool) – Whether or not to convert X to dense in the Zarr store (dense is faster but takes more disk space).

  • coordination_values (dict or None) – Coordination values for the file definition.

  • **kwargs – Keyword arguments inherited from AbstractWrapper
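To make the relationship between the paired list parameters concrete, the sketch below assembles keyword arguments for AnnDataWrapper as a plain dictionary and checks that lists meant to be read side by side have matching lengths. The store path, the helper functions, and the six-component embedding row are hypothetical; the wrapper itself is only referenced in a comment so that the sketch runs without the package installed.

```python
# Keyword arguments for AnnDataWrapper; the store path is a placeholder.
anndata_kwargs = {
    "adata_path": "my_dataset.h5ad.zarr",
    "obs_feature_matrix_path": "X",
    "obs_set_paths": ["obs/louvain", "obs/cellType"],
    "obs_set_names": ["Louvain", "Cell Type"],
    "obs_embedding_paths": ["obsm/X_umap", "obsm/X_pca"],
    "obs_embedding_names": ["UMAP", "PCA"],
    "obs_embedding_dims": [[0, 1], [4, 5]],
}


def check_paired_lists(kwargs, paths_key, *other_keys):
    """Hypothetical sanity check: lists paired with a *_paths list share its length."""
    n = len(kwargs[paths_key])
    return all(len(kwargs[k]) == n for k in other_keys if k in kwargs)


def select_dims(embedding_row, dims):
    """Pick the two coordinates named by a dims pair from one row of an embedding."""
    return [embedding_row[d] for d in dims]


assert check_paired_lists(anndata_kwargs, "obs_set_paths", "obs_set_names")
assert check_paired_lists(
    anndata_kwargs, "obs_embedding_paths", "obs_embedding_names", "obs_embedding_dims"
)

# One row of a hypothetical six-component PCA embedding.
pca_row = [0.9, -1.2, 0.3, 0.0, 2.5, -0.7]
print(select_dims(pca_row, [0, 1]))  # the usual x and y
print(select_dims(pca_row, [4, 5]))  # components at zero-based indices 4 and 5

# With the package installed, the wrapper would then be created as:
#   from vitessce.wrappers import AnnDataWrapper
#   wrapper = AnnDataWrapper(**anndata_kwargs)
```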


Auto view configuration is intended to be used internally by the VitessceConfig.from_object method. Each subclass of AbstractWrapper may implement this method which takes in a VitessceConfig instance and modifies it by adding datasets, visualization components, and view coordinations. Implementations of this method may create an opinionated view config based on inferred use cases.


vc (VitessceConfig) – The view config instance.

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.CsvWrapper(csv_path=None, csv_url=None, data_type=None, options=None, coordination_values=None, **kwargs)[source]

Wrap a CSV file by creating an instance of the CsvWrapper class.

  • data_type (str) – The data type of the information contained in the file.

  • csv_path (str) – A local filepath to a CSV file.

  • csv_url (str) – A remote URL of a CSV file.

  • options (dict) – The file options.

  • coordination_values (dict) – The coordination values.

  • **kwargs – Keyword arguments inherited from AbstractWrapper

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.ImageOmeTiffWrapper(img_path=None, offsets_path=None, img_url=None, offsets_url=None, coordinate_transformations=None, coordination_values=None, **kwargs)[source]

Wrap an OME-TIFF file by creating an instance of the ImageOmeTiffWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.


Abstract constructor to be inherited by dataset wrapper classes.

  • out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.

  • request_init (dict) – Options to be passed along with every fetch request from the browser, like { "header": { "Authorization": "Bearer dsfjalsdfa1431" } }

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.ImageOmeZarrWrapper(img_path=None, img_url=None, coordinate_transformations=None, coordination_values=None, **kwargs)[source]

Wrap an OME-NGFF Zarr store by creating an instance of the ImageOmeZarrWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.


Abstract constructor to be inherited by dataset wrapper classes.

  • out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.

  • request_init (dict) – Options to be passed along with every fetch request from the browser, like { "header": { "Authorization": "Bearer dsfjalsdfa1431" } }

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.MultiImageWrapper(image_wrappers, use_physical_size_scaling=False, **kwargs)[source]

Wrap multiple imaging datasets by creating an instance of the MultiImageWrapper class.


Abstract constructor to be inherited by dataset wrapper classes.

  • out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.

  • request_init (dict) – Options to be passed along with every fetch request from the browser, like { "header": { "Authorization": "Bearer dsfjalsdfa1431" } }

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.MultivecZarrWrapper(zarr_path=None, zarr_url=None, **kwargs)[source]

Abstract constructor to be inherited by dataset wrapper classes.

  • out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.

  • request_init (dict) – Options to be passed along with every fetch request from the browser, like { "header": { "Authorization": "Bearer dsfjalsdfa1431" } }

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.ObsSegmentationsOmeTiffWrapper(img_path=None, offsets_path=None, img_url=None, offsets_url=None, coordinate_transformations=None, obs_types_from_channel_names=None, coordination_values=None, **kwargs)[source]

Wrap an OME-TIFF file by creating an instance of the ObsSegmentationsOmeTiffWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.

  • img_path (str) – A local filepath to an OME-TIFF file.

  • offsets_path (str) – A local filepath to an offsets.json file.

  • img_url (str) – A remote URL of an OME-TIFF file.

  • offsets_url (str) – A remote URL of an offsets.json file.

  • coordinate_transformations (list) – A column-major ordered matrix for transforming this image.

  • obs_types_from_channel_names (bool) – Whether to use the channel names to determine the obs types. Optional.

  • coordination_values (dict) – Optional coordinationValues to be passed in the file definition.

  • **kwargs – Keyword arguments inherited from AbstractWrapper

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.ObsSegmentationsOmeZarrWrapper(img_path=None, img_url=None, coordinate_transformations=None, coordination_values=None, obs_types_from_channel_names=None, **kwargs)[source]

Wrap an OME-NGFF Zarr store by creating an instance of the ObsSegmentationsOmeZarrWrapper class. Intended to be used with the spatialBeta and layerControllerBeta views.


Abstract constructor to be inherited by dataset wrapper classes.

  • out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.

  • request_init (dict) – Options to be passed along with every fetch request from the browser, like { "header": { "Authorization": "Bearer dsfjalsdfa1431" } }

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.OmeTiffWrapper(img_path=None, offsets_path=None, img_url=None, offsets_url=None, name='', transformation_matrix=None, is_bitmask=False, **kwargs)[source]

Wrap an OME-TIFF file by creating an instance of the OmeTiffWrapper class.


Abstract constructor to be inherited by dataset wrapper classes.

  • out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.

  • request_init (dict) – Options to be passed along with every fetch request from the browser, like { "header": { "Authorization": "Bearer dsfjalsdfa1431" } }

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.

class vitessce.wrappers.OmeZarrWrapper(img_path=None, img_url=None, name='', is_bitmask=False, **kwargs)[source]

Wrap an OME-NGFF Zarr store by creating an instance of the OmeZarrWrapper class.

  • img_path (str) – A local filepath to an OME-NGFF Zarr store.

  • img_url (str) – A remote URL of an OME-NGFF Zarr store.

  • **kwargs – Keyword arguments inherited from AbstractWrapper

convert_and_save(dataset_uid, obj_i, base_dir=None)[source]

Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return anything.

  • dataset_uid (str) – A unique identifier for this dataset.

  • obj_i (int) – Within the dataset, the index of this data wrapper object.


vitessce.export.export_to_files(config, base_url, out_dir='.')[source]
  • config (VitessceConfig) – The Vitessce view config to export to files.

  • out_dir (str) – The path to the output directory. By default, the current directory.

  • base_url (str) – The URL on which the files will be served.


The config as a dict, with URLs filled in.

Return type


vitessce.export.export_to_s3(config, s3, bucket_name, prefix='')[source]
  • config (VitessceConfig) – The Vitessce view config to export to S3.

  • s3 (boto3.resource) – A boto3 S3 resource object with permission to upload to the specified bucket.

  • bucket_name (str) – The name of the bucket to which to upload.

  • prefix (str) – The prefix path for the bucket keys (think subdirectory).


The config as a dict, with S3 URLs filled in.

Return type



vitessce.data_utils.ome.multiplex_img_to_ome_tiff(img_arr, channel_names, output_path, axes='CYX')[source]

Convert a multiplexed image to OME-TIFF.

  • img_arr (np.array) – The image as a 3D, 4D, or 5D array.

  • channel_names (list[str]) – A list of channel names to include in the omero.channels[].label NGFF metadata field.

  • output_path (str) – The path at which to save the OME-TIFF file.

  • axes (str) – The array axis ordering. By default, “CYX”

vitessce.data_utils.ome.multiplex_img_to_ome_zarr(img_arr, channel_names, output_path, img_name='Image', chunks=(1, 256, 256), axes='cyx', channel_colors=None)[source]

Convert a multiplexed image to OME-Zarr v0.3.

  • img_arr (np.array) – The image as a 3D, 4D, or 5D array.

  • channel_names (list[str]) – A list of channel names to include in the omero.channels[].label NGFF metadata field.

  • output_path (str) – The path to save the Zarr store.

  • img_name (str) – The name of the image to include in the NGFF metadata field.

  • chunks (tuple[int]) – The chunk sizes of each axis. By default, (1, 256, 256).

  • axes (str) – The array axis ordering. By default, “cyx”

  • channel_colors (dict or None) – Dict mapping channel names to color strings to use for the omero.channels[].color NGFF metadata field. If provided, keys should match channel_names. By default, None to use “FFFFFF” for all channels.
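The chunks and channel_colors parameters can be reasoned about without writing any data. The sketch below is an illustration, not part of the library: it counts how many chunks a given array shape would be split into, and resolves a color for every channel. Both helper names, the example shape, the channel names, and the fallback for a partially filled dictionary are assumptions.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each axis: ceil(axis_length / chunk_length)."""
    if len(shape) != len(chunks):
        raise ValueError("shape and chunks must have the same number of axes")
    return tuple(math.ceil(n / c) for n, c in zip(shape, chunks))


def resolve_channel_colors(channel_names, channel_colors=None, default="FFFFFF"):
    """Map every channel name to a color string, falling back to the default."""
    channel_colors = channel_colors or {}
    return {name: channel_colors.get(name, default) for name in channel_names}


# A hypothetical 3-channel image in "cyx" order with the default chunk sizes.
shape = (3, 1000, 1500)
print(chunk_grid(shape, (1, 256, 256)))

# Hypothetical channel names; only one color is given explicitly.
print(resolve_channel_colors(["DAPI", "CD4", "CD8"], {"DAPI": "0000FF"}))
```

With the default chunks of (1, 256, 256), each channel is stored separately and the image plane is tiled, which suits viewers that load one channel and one region at a time.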


Helper function to determine if an image array is too large for standard TIFF format.


img_arr_shape (tuple[int]) – The shape of the image array.


True if the image array is too large for standard TIFF format, False otherwise.

Return type
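The idea behind such a size check can be sketched as follows. Classic TIFF stores offsets as 32-bit unsigned integers, so a file cannot address more than about 4 GiB. The function name, the bytes_per_element parameter, and the exact threshold are assumptions for illustration and may differ from the library's own helper.

```python
def exceeds_classic_tiff_limit(img_arr_shape, bytes_per_element=1):
    """Return True if the raw array would not fit within 32-bit offsets (hypothetical helper)."""
    num_bytes = bytes_per_element
    for axis_length in img_arr_shape:
        num_bytes *= axis_length
    # Classic TIFF offsets are 32-bit unsigned integers.
    return num_bytes > 2**32 - 1


# A small 3-channel image fits comfortably.
print(exceeds_classic_tiff_limit((3, 1000, 1000)))

# A large 5-channel image does not.
print(exceeds_classic_tiff_limit((5, 40000, 40000)))
```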


vitessce.data_utils.ome.rgb_img_to_ome_tiff(img_arr, output_path, img_name='Image', axes='CYX')[source]

Convert an RGB image to OME-TIFF.

  • img_arr (np.array) – The image as a 3D array.

  • output_path (str) – The path at which to save the OME-TIFF file.

  • img_name (str) – The name of the image to include in the NGFF metadata field.

  • axes (str) – The array axis ordering. By default, “CYX”

vitessce.data_utils.ome.rgb_img_to_ome_zarr(img_arr, output_path, img_name='Image', chunks=(1, 256, 256), axes='cyx', **kwargs)[source]

Convert an RGB image to OME-Zarr v0.3.

  • img_arr (np.array) – The image as a 3D array.

  • output_path (str) – The path to save the Zarr store.

  • img_name (str) – The name of the image to include in the NGFF metadata field.

  • chunks (tuple[int]) – The chunk sizes of each axis. By default, (1, 256, 256).

  • axes (str) – The array axis ordering. By default, “cyx”


Try to cast an array to a dtype that takes up less space.


arr (np.array) – The array to cast.


The new array.

Return type
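One way to think about casting to a smaller dtype is to find the narrowest type that still holds every value. The sketch below does this for unsigned integers only, working on a plain list and returning a dtype name. It is an assumption about the general technique, not a description of what the library function does internally, and the function name is hypothetical.

```python
def smallest_uint_dtype(values):
    """Name of the narrowest unsigned integer dtype that holds all values, or None."""
    values = list(values)
    if not values or any(v < 0 or v != int(v) for v in values):
        # Empty, negative, or fractional input: leave the dtype alone in this sketch.
        return None
    largest = max(values)
    for bits in (8, 16, 32, 64):
        if largest <= 2**bits - 1:
            return f"uint{bits}"
    return None


print(smallest_uint_dtype([0, 3, 200]))
print(smallest_uint_dtype([0, 3, 300]))
print(smallest_uint_dtype([0.5, 1.0]))
```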


vitessce.data_utils.anndata.optimize_adata(adata, obs_cols=None, obsm_keys=None, var_cols=None, varm_keys=None, layer_keys=None, remove_X=False, optimize_X=False, to_dense_X=False, to_sparse_X=False)[source]

Given an AnnData object, optimize for usage with Vitessce and return a new object.

  • adata (anndata.AnnData) – The AnnData object to optimize.

  • obs_cols (list[str] or None) – Columns of adata.obs to optimize. Columns not specified will not be included in the returned object.

  • var_cols (list[str] or None) – Columns of adata.var to optimize. Columns not specified will not be included in the returned object.

  • obsm_keys (list[str] or None) – Arrays within adata.obsm to optimize. Keys not specified will not be included in the returned object.

  • varm_keys (list[str] or None) – Arrays within adata.varm to optimize. Keys not specified will not be included in the returned object.

  • layer_keys (list[str] or None) – Arrays within adata.layers to optimize. Keys not specified will not be included in the returned object.

  • remove_X (bool) – Should the returned object have its X matrix set to None? By default, False.

  • optimize_X (bool) – Should the returned object run optimize_arr on adata.X? By default, False.

  • to_dense_X (bool) – Should adata.X be cast to a dense array in the returned object? By default, False.

  • to_sparse_X (bool) – Should adata.X be cast to a sparse array in the returned object? By default, False.


The new AnnData object.

Return type



Try to cast an array to a dtype that takes up less space, and convert to dense.


arr (np.array) – The array to cast and convert.


The new array.

Return type


vitessce.data_utils.anndata.sort_var_axis(adata_X, orig_var_index, full_var_index=None)[source]

Sort the var index by performing hierarchical clustering.

  • adata_X (np.array) – The matrix to use for clustering. For example, adata.X

  • orig_var_index (pandas.Index) – The original var index. For example, adata.var.index

  • full_var_index (pandas.Index or None) – Pass the full adata.var.index to append the var values excluded from sorting, if adata_X and orig_var_index are a subset of the full adata.X matrix. By default, None.


The sorted elements of the var index.

Return type



Convert a sparse array to dense.


arr (np.array) – The array to convert.


The converted array (or the original array if it was already dense).

Return type


vitessce.data_utils.anndata.to_diamond(x, y, r)[source]

Convert an (x, y) coordinate to a polygon (diamond) with a given radius.

  • x (int or float) – The x coordinate.

  • y (int or float) – The y coordinate.

  • r (int or float) – The radius.


The polygon vertices as an array of coordinate pairs, like [[x1, y1], [x2, y2], …]

Return type
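The geometry can be sketched in a few lines: a diamond is a square rotated by 45 degrees, so its four corners lie at distance r from the centre along the x and y axes. The function name and the vertex order below are assumptions; the library function may order the vertices differently.

```python
def diamond_vertices(x, y, r):
    """Vertices of a diamond centred on (x, y) with corners r away along each axis."""
    return [
        [x, y + r],  # corner in the +y direction
        [x + r, y],  # corner in the +x direction
        [x, y - r],  # corner in the -y direction
        [x - r, y],  # corner in the -x direction
    ]


print(diamond_vertices(10, 20, 5))
```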



Try to load a backed AnnData array into memory.


arr (np.array) – The array to load.


The loaded array.

Return type


vitessce.data_utils.anndata.to_uint8(arr, norm_along=None)[source]

Convert an array to uint8 dtype.

  • arr (np.array) – The array to convert.

  • norm_along (str or None) – How to normalize the array values. By default, None. Valid values are “global”, “var”, “obs”.


The converted array.

Return type
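The three norm_along values can be illustrated with a pure-Python sketch on a small obs-by-var table. It assumes that "global" scales by the maximum of the whole array, "obs" by the maximum of each row, and "var" by the maximum of each column, and that the values are non-negative. These semantics and the function name are assumptions; the library function may normalize differently.

```python
def scale_to_uint8(rows, norm_along=None):
    """Scale a 2D list (obs x var) of non-negative numbers into the 0..255 range."""

    def scale(values, peak):
        # Map [0, peak] onto [0, 255]; an all-zero group stays zero.
        return [int(round(255 * v / peak)) if peak else 0 for v in values]

    if norm_along is None:
        # No normalization: assume values already fit in 0..255 and only cast to int.
        return [[int(v) for v in row] for row in rows]
    if norm_along == "global":
        peak = max(max(row) for row in rows)
        return [scale(row, peak) for row in rows]
    if norm_along == "obs":
        return [scale(row, max(row)) for row in rows]
    if norm_along == "var":
        peaks = [max(col) for col in zip(*rows)]
        return [[scale([v], p)[0] for v, p in zip(row, peaks)] for row in rows]
    raise ValueError(f"unknown norm_along: {norm_along!r}")


# Two observations (rows) by two features (columns).
table = [[0, 1], [5, 1]]
for mode in (None, "global", "obs", "var"):
    print(mode, scale_to_uint8(table, mode))
```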
