Visualization of single-cell RNA-seq data using vitessce.io
1. Import dependencies
We need to import the classes and functions that we will be using from the corresponding packages.
[ ]:
import os
from os.path import join
from urllib.request import urlretrieve
from anndata import read_h5ad
import scanpy as sc
from vitessce import (
    VitessceConfig,
    Component as cm,
    CoordinationType as ct,
    AnnDataWrapper,
)
2. Download the data
For this example, we download a dataset from the COVID-19 Cell Atlas (https://www.covid19cellatlas.org/index.healthy.html#habib17).
[ ]:
os.makedirs("data", exist_ok=True)
adata_filepath = join("data", "habib17.processed.h5ad")
urlretrieve('https://covid19.cog.sanger.ac.uk/habib17.processed.h5ad', adata_filepath)
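When the notebook is re-run, the download step above fetches the file again. A minimal sketch of one way to skip the download when the file is already present follows. The helper name download_if_missing and its fetch parameter are hypothetical and are not part of Vitessce or the original notebook.

```python
import os
from urllib.request import urlretrieve


def download_if_missing(url, filepath, fetch=urlretrieve):
    """Download `url` to `filepath` unless the file already exists.

    `fetch` is a hypothetical hook (defaulting to urlretrieve) so that the
    function can be exercised without network access.
    Returns True if a download was performed, False otherwise.
    """
    if os.path.exists(filepath):
        return False
    parent = os.path.dirname(filepath)
    if parent:
        # Create the target directory (e.g. "data") if it is missing.
        os.makedirs(parent, exist_ok=True)
    fetch(url, filepath)
    return True
```

In the notebook, this could be called as download_if_missing('https://covid19.cog.sanger.ac.uk/habib17.processed.h5ad', adata_filepath).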
3. Load the data
Note: read_h5ad may print a FutureWarning.
[ ]:
adata = read_h5ad(adata_filepath)
3.1. Preprocess the data for visualization
This dataset contains 25,587 genes. To visualize it efficiently, we convert it to CSC sparse format so that requests for the data of an individual gene are fast. We also flag the top 50 highly variable genes, ranked by normalized dispersion (the dispersions_norm column of adata.var), for display in the heatmap. Any boolean array can be used as the heatmap filter instead.
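To illustrate why a column-oriented layout makes per-gene requests fast, here is a small pure-Python sketch of the CSC (compressed sparse column) idea. It is a toy illustration only: the helper names dense_to_csc and get_column and the count values are made up, and in practice a sparse-matrix library would do this work.

```python
def dense_to_csc(rows):
    """Convert a dense row-major matrix (list of lists) to CSC arrays.

    Returns (data, indices, indptr): the nonzero values in column order,
    their row indices, and the offsets where each column starts in `data`.
    """
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    data, indices, indptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            value = rows[i][j]
            if value != 0:
                data.append(value)
                indices.append(i)
        indptr.append(len(data))
    return data, indices, indptr


def get_column(csc, j, n_rows):
    """Return column j as a dense list, reading only that column's entries."""
    data, indices, indptr = csc
    column = [0] * n_rows
    for k in range(indptr[j], indptr[j + 1]):
        column[indices[k]] = data[k]
    return column


# Toy cells-by-genes matrix: 3 cells (rows), 4 genes (columns).
counts = [
    [0, 2, 0, 0],
    [1, 0, 0, 3],
    [0, 4, 0, 0],
]
csc = dense_to_csc(counts)
gene_1 = get_column(csc, 1, n_rows=3)  # expression of gene 1 across all cells
```

Because the values of each gene are stored contiguously, fetching one gene reads only the slice between two neighbouring indptr offsets rather than scanning every cell's row.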
[ ]:
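# Find the 51st-largest normalized dispersion; genes strictly above it are the top 50.
# .iloc makes the positional lookup explicit (the index of adata.var holds gene names).
top_dispersion = adata.var["dispersions_norm"].iloc[
    sorted(
        range(len(adata.var["dispersions_norm"])),
        key=lambda k: adata.var["dispersions_norm"].iloc[k],
    )[-51:][0]
]
# Boolean column used later as the heatmap's feature filter.
adata.var["top_highly_variable"] = (
    adata.var["dispersions_norm"] > top_dispersion
)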
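The thresholding logic in the cell above can be easier to follow on a small list. The sketch below uses a hypothetical helper, top_k_mask, and made-up dispersion values. It takes the (k+1)-th largest value as a cutoff and keeps everything strictly above it, which is the same idea applied to plain Python lists.

```python
def top_k_mask(values, k):
    """Return a boolean list that is True for the k largest values.

    The (k+1)-th largest value serves as a cutoff and only values strictly
    above it are kept. If several values tie with the cutoff, fewer than k
    entries may be selected.
    """
    if k >= len(values):
        return [True] * len(values)
    cutoff = sorted(values)[-(k + 1)]
    return [v > cutoff for v in values]


# Made-up normalized dispersions for six genes.
dispersions = [0.3, 2.1, -0.5, 1.7, 0.9, 1.2]
mask = top_k_mask(dispersions, k=3)
```

Here the cutoff is the fourth-largest value, so the mask selects the three genes with the highest dispersion.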
4. Create the Vitessce widget configuration
Vitessce needs to know which pieces of data we are interested in visualizing, the visualization types we would like to use, and how we want to coordinate (or link) the views.
4.1. Instantiate a VitessceConfig object
Use the VitessceConfig(schema_version, name, description) constructor to create an instance.
[ ]:
vc = VitessceConfig(schema_version="1.0.15", name='Habib et al', description='COVID-19 Healthy Donor Brain')
4.2. Add a dataset to the VitessceConfig instance
In Vitessce, a dataset is a container for one file per data type. The .add_dataset(name) method on the vc instance sets up and returns a new dataset instance.
Then, we can call the dataset’s .add_object(wrapper_object) method to attach a “data wrapper” instance to our new dataset. For example, the AnnDataWrapper class knows how to convert AnnData objects to the corresponding Vitessce data types.
Dataset wrapper classes may require additional parameters to resolve ambiguities. For instance, AnnData objects may store multiple clusterings or cell type annotation columns in the adata.obs dataframe. We can use the obs_set_paths parameter to tell Vitessce which columns of the obs dataframe correspond to cell sets, and obs_set_names to give them display names.
[ ]:
dataset = vc.add_dataset(name='Brain').add_object(AnnDataWrapper(
        adata,
        obs_embedding_paths=["obsm/X_umap"],
        obs_embedding_names=["UMAP"],
        obs_set_paths=["obs/CellType"],
        obs_set_names=["Cell Type"],
        obs_feature_matrix_path="X",
        feature_filter_path="var/top_highly_variable"
    )
)
4.3. Add visualizations to the VitessceConfig instance
Now that we have added a dataset, we can configure visualizations. The .add_view(component_type, dataset=dataset) method adds a view (i.e. a visualization or controller component) to the configuration.
The Component enum class (which we have imported as cm here) can be used to fill in the component_type parameter.
For convenience, the SCATTERPLOT component type takes the extra mapping keyword argument, which specifies which embedding should be used for mapping cells to (x, y) points on the plot.
[ ]:
scatterplot = vc.add_view(cm.SCATTERPLOT, dataset=dataset, mapping="UMAP")
cell_sets = vc.add_view(cm.OBS_SETS, dataset=dataset)
genes = vc.add_view(cm.FEATURE_LIST, dataset=dataset)
heatmap = vc.add_view(cm.HEATMAP, dataset=dataset)
4.4. Define the visualization layout
The vc.layout(view_concat) method allows us to specify how our views will be arranged in the layout grid in the widget. The | and / operators are shorthand for hconcat(v1, v2) and vconcat(v1, v2), respectively.
[ ]:
vc.layout((scatterplot | cell_sets) / (heatmap | genes));
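The | and / layout syntax relies on Python operator overloading. The toy sketch below shows the general mechanism with hypothetical View and Concat classes. It is not Vitessce's implementation, only an illustration of how such operators can build a nested layout description.

```python
class View:
    """Toy stand-in for a view (hypothetical; not the Vitessce class)."""

    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        # view_a | view_b -> horizontal concatenation
        return Concat("|", self, other)

    def __truediv__(self, other):
        # view_a / view_b -> vertical concatenation
        return Concat("/", self, other)

    def describe(self):
        return self.name


class Concat(View):
    """A pair of views joined horizontally ("|") or vertically ("/")."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def describe(self):
        return f"({self.left.describe()} {self.op} {self.right.describe()})"


layout = (View("scatterplot") | View("cell_sets")) / (View("heatmap") | View("genes"))
description = layout.describe()
```

Note that in Python / binds more tightly than |, so a | b / c is parsed as a | (b / c). The parentheses in the layout expression are therefore needed to place each row side by side before stacking the rows.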
5. Launch the web application
The vc.web_app() method serves the processed data locally and opens a web browser to http://vitessce.io/?url={config_as_json}.
[ ]:
vc.web_app()
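As a rough illustration of the URL pattern mentioned above, the sketch below percent-encodes a JSON document into a url query parameter using only the standard library. The helper config_to_url and the placeholder dictionary are hypothetical. The dictionary is not a valid Vitessce configuration, and the exact encoding that vc.web_app() uses may differ.

```python
import json
from urllib.parse import parse_qs, quote, urlparse


def config_to_url(config, base_url="http://vitessce.io/"):
    """Serialize `config` to compact JSON and embed it as the `url` parameter."""
    config_json = json.dumps(config, separators=(",", ":"))
    # safe="" percent-encodes every reserved character, including "/" and ":".
    return base_url + "?url=" + quote(config_json, safe="")


# Placeholder content only; a real configuration has many more fields.
config = {"name": "Habib et al", "description": "COVID-19 Healthy Donor Brain"}
url = config_to_url(config)

# Decoding the query string recovers the original dictionary.
recovered = json.loads(parse_qs(urlparse(url).query)["url"][0])
```

Round-tripping through parse_qs is a quick way to confirm that nothing in the JSON was lost or altered by the encoding.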