Visualization of 3k PBMC reference

1. Import dependencies

Import the classes and functions used throughout this notebook.

[ ]:
import os
from os.path import join, isfile, isdir
from urllib.request import urlretrieve
from anndata import read_h5ad

from vitessce import (
    VitessceConfig,
    Component as cm,
    CoordinationType as ct,
    AnnDataWrapper,
)
from vitessce.data_utils import (
    optimize_adata,
    VAR_CHUNK_SIZE,
)

2. Download the dataset

Download pbmc3k_final.h5ad from https://seurat.nygenome.org/pbmc3k_final.h5ad into a local data directory. The download is skipped if the file already exists.

[ ]:
adata_filepath = join("data", "pbmc3k_final.h5ad")
if not isfile(adata_filepath):
    os.makedirs("data", exist_ok=True)
    urlretrieve('https://seurat.nygenome.org/pbmc3k_final.h5ad', adata_filepath)
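
One caveat with the check above: if the download is interrupted, a partial file stays at the target path and the isfile check skips the download on the next run. A minimal sketch of a more defensive variant follows. The helper name download_if_missing and the ".part" suffix are hypothetical choices made for this illustration; they are not part of any library.

```python
import os
from urllib.request import urlretrieve


def download_if_missing(url, dest_path):
    """Download url to dest_path unless the file already exists.

    Returns True if a download happened, False if the file was already present.
    """
    if os.path.isfile(dest_path):
        return False
    # Create the parent directory (if any) before writing.
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    # Download to a temporary name first, so that an interrupted download
    # never leaves a truncated file at dest_path.
    tmp_path = dest_path + ".part"
    urlretrieve(url, tmp_path)
    # Move the finished file into place.
    os.replace(tmp_path, dest_path)
    return True
```

With this helper, the cell above could be written as download_if_missing('https://seurat.nygenome.org/pbmc3k_final.h5ad', adata_filepath).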

3. Load the dataset

Load the dataset using AnnData’s read_h5ad function.

[ ]:
adata = read_h5ad(adata_filepath)

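3.1 Save the AnnData object to Zarr

Convert the AnnData object to a Zarr store that Vitessce can read. Before writing, optimize_adata prepares the object for use with Vitessce, given the obs columns and obsm keys that the visualization needs. The store is chunked so that each chunk holds every cell for a block of VAR_CHUNK_SIZE genes.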

[ ]:
zarr_filepath = join("data", "pbmc3k_final.zarr")
if not isdir(zarr_filepath):
    adata = optimize_adata(
        adata,
        obs_cols=["leiden"],
        obsm_keys=["X_umap", "X_pca"],
        optimize_X=True,
    )
    adata.write_zarr(zarr_filepath, chunks=[adata.shape[0], VAR_CHUNK_SIZE])
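
To see what the chunks argument implies, here is a small sketch in plain Python of how a chunk shape of [n_obs, VAR_CHUNK_SIZE] partitions the expression matrix. The matrix dimensions and the chunk size of 10 are placeholder values assumed for illustration; the real chunk size is whatever vitessce.data_utils.VAR_CHUNK_SIZE is set to.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each axis of an array with the given chunk shape."""
    return tuple(math.ceil(dim / c) for dim, c in zip(shape, chunks))


def chunk_index_for_feature(feature_idx, var_chunk_size):
    """Index (along the var axis) of the chunk that contains a given gene column."""
    return feature_idx // var_chunk_size


# Placeholder values, not taken from the dataset.
n_obs, n_vars = 3000, 2000
var_chunk_size = 10

# One chunk spans all rows (cells), so the obs axis is never split.
grid = chunk_grid((n_obs, n_vars), (n_obs, var_chunk_size))
print(grid)

# All values of a single gene live in exactly one chunk.
print(chunk_index_for_feature(25, var_chunk_size))
```

Because the obs axis is not split, reading the expression values of one gene across all cells touches a single chunk, which suits views that color cells by one gene at a time.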

4. Create a Vitessce view config

Define the data and views you would like to include in the widget.

For more details about how to configure data depending on where the files are located relative to the notebook execution, see https://python-docs.vitessce.io/data_options.html.

[ ]:
vc = VitessceConfig(schema_version="1.0.15", name='PBMC Reference')
dataset = vc.add_dataset(name='PBMC 3k').add_object(AnnDataWrapper(
    adata_store=zarr_filepath,
    obs_set_paths=["obs/leiden"],
    obs_set_names=["Leiden"],
    obs_embedding_paths=["obsm/X_umap", "obsm/X_pca"],
    obs_embedding_names=["UMAP", "PCA"],
    obs_feature_matrix_path="X"
))

umap = vc.add_view(cm.SCATTERPLOT, dataset=dataset, mapping="UMAP")
pca = vc.add_view(cm.SCATTERPLOT, dataset=dataset, mapping="PCA")
cell_sets = vc.add_view(cm.OBS_SETS, dataset=dataset)
genes = vc.add_view(cm.FEATURE_LIST, dataset=dataset)
heatmap = vc.add_view(cm.HEATMAP, dataset=dataset)

vc.layout((umap / pca) | ((cell_sets | genes) / heatmap));
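
In the layout expression, | places views side by side and / stacks them vertically. The toy classes below sketch how operator overloading can build such a nested layout. They are written for this illustration only and do not reproduce Vitessce's actual implementation; the names Node, View, HConcat, and VConcat are hypothetical.

```python
class Node:
    """Base class for layout elements that can be combined with | and /."""

    def __or__(self, other):
        # a | b: place a and b side by side (horizontal concatenation).
        return HConcat(self, other)

    def __truediv__(self, other):
        # a / b: stack a on top of b (vertical concatenation).
        return VConcat(self, other)


class View(Node):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return self.name


class HConcat(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def describe(self):
        return f"h({self.left.describe()}, {self.right.describe()})"


class VConcat(Node):
    def __init__(self, top, bottom):
        self.top, self.bottom = top, bottom

    def describe(self):
        return f"v({self.top.describe()}, {self.bottom.describe()})"


umap, pca = View("umap"), View("pca")
cell_sets, genes, heatmap = View("cell_sets"), View("genes"), View("heatmap")

# Same shape as the layout above: a left column with UMAP over PCA, and a
# right column with cell sets beside genes, both above the heatmap.
layout = (umap / pca) | ((cell_sets | genes) / heatmap)
print(layout.describe())
```

The printed structure makes the nesting explicit: the outermost operation is a horizontal split into two columns, each of which is itself a vertical stack.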

5. Create the Vitessce widget

[ ]:
vw = vc.widget()
vw