Cell typing with CosMx® SMI Cell Profiles

cell typing
Author: Megan Vandenberg (GitHub: mgrout81)

Affiliation: Bruker Spatial Biology

Published: January 23, 2025

CosMx SMI Cell Profiles

Introduction

CosMx Spatial Molecular Imager (SMI) data can be cell typed in several ways. One is to supply a reference matrix of known cell type profiles to our Insitutype package (manuscript, GitHub repository, and GitHub FAQ). Spatially naive scRNA-seq-derived profiles (e.g., the Cell Profile Library) work well, but platform differences can prolong the iterative process of cell typing (Danaher et al. 2022). To shorten that process, we present CosMx SMI-derived cell profiles, available in the GitHub repository CosMx-Cell-Profiles.

Overview

The CosMx-Cell-Profiles repository contains a library of cell profile matrices with accompanying statistics and metadata. For each featured tissue, the profiles matrix gives the average expression of each target in a variety of relevant cell types. Each matrix in the library was derived from one or more CosMx SMI experiments run on a mix of panels. The library includes profiles from healthy and cancerous adult human samples as well as from mouse brain.

Each profile contains the following components:

  1. Cell profiles matrix
  2. Cell type annotations
  3. Basic statistics
  4. Target statistics
  5. Metadata

File Types

Cell Profiles Matrix

Each cell profiles matrix is a CSV file of targets by cell types. Each cell type’s profile is a unique column. Each target is a unique row. Where multiple experiments were combined, only the intersection of targets was used. The profiles were generated using InSituType::Estep(), which removes background readout (negative probes) when calculating the net expression profile for each cell type. For details, refer to the InSituType manual.
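As a minimal sketch of the layout described above, the following Python snippet (standard library only) reads a targets-by-cell-types CSV into a nested dictionary. The sample content, including the target names, cell type names, values, and the `target` header label, is made up for illustration and is not taken from the repository; the helper `read_profiles` is likewise hypothetical.

```python
import csv
import io

# Hypothetical stand-in for a profiles CSV: one row per target, one column per
# cell type. All names and values below are invented for illustration.
SAMPLE_CSV = """target,B cell,T cell,Macrophage
CD3E,0.02,1.85,0.01
MS4A1,2.10,0.03,0.00
CD68,0.01,0.04,3.40
"""


def read_profiles(handle):
    """Return (cell_types, profiles), where profiles[target][cell_type] is a float."""
    reader = csv.reader(handle)
    header = next(reader)
    cell_types = header[1:]  # assumes the first column holds the target names
    profiles = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        target, values = row[0], row[1:]
        profiles[target] = {ct: float(v) for ct, v in zip(cell_types, values)}
    return cell_types, profiles


cell_types, profiles = read_profiles(io.StringIO(SAMPLE_CSV))

# For each cell type, find the target with the highest average expression.
top_target = {ct: max(profiles, key=lambda t: profiles[t][ct]) for ct in cell_types}
```

For a downloaded file, the same function could be called on an open file handle in place of the `io.StringIO` wrapper, assuming the file follows the same layout.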

Snippet of a cell profiles matrix.

Cell Type Annotations

To put cell types in context, we offer both cell type hierarchies and ontology terms.

An R file defining a nested list object lets users group cell types into broader categories. The file is human-readable, so non-R users (e.g., Python users) can also parse and use it.

Note

Note that some inner nodes on the hierarchies are both a lower-granularity category and a final cell type included in the profiles themselves.
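To illustrate how a nested hierarchy can be used to group cell types, here is a small Python sketch. The hierarchy itself, its node names, and the helper functions are hypothetical; the repository's files are R nested lists, and this is only one possible way a non-R user might mirror that structure. The example deliberately includes an inner node ("T cell") that could also be a final cell type in the profiles.

```python
# Hypothetical hierarchy: keys are nodes, values are their children
# (an empty dict marks a terminal node). Names are invented for illustration.
HIERARCHY = {
    "Immune": {
        "Lymphoid": {
            "T cell": {"CD4 T cell": {}, "CD8 T cell": {}},
            "B cell": {},
        },
        "Myeloid": {"Macrophage": {}},
    },
    "Stromal": {"Fibroblast": {}},
}


def ancestors_by_node(tree, path=()):
    """Map every node (inner or terminal) to the tuple of its ancestors, root first."""
    lineage = {}
    for node, children in tree.items():
        lineage[node] = path
        lineage.update(ancestors_by_node(children, path + (node,)))
    return lineage


def collapse(cell_type, lineage, depth):
    """Return the label of cell_type at the given depth (0 = top-level category).

    If the cell type sits above the requested depth, it is returned unchanged.
    """
    full_path = lineage[cell_type] + (cell_type,)
    return full_path[min(depth, len(full_path) - 1)]


LINEAGE = ancestors_by_node(HIERARCHY)
```

With this structure, `collapse("CD4 T cell", LINEAGE, 1)` groups a fine-grained type under "Lymphoid", while `collapse("T cell", LINEAGE, 2)` returns "T cell" itself, which is how an inner node that is also a profiled cell type would behave.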

Snippet of a cell type hierarchy.

Cell Ontology annotations are also provided for all nodes on the hierarchies. Where applicable, the identified match and/or parent (more general) matches are provided. For other cell types, where no node or parent node matches are found, we instead provide the closest term. Finally, the column in_profiles indicates whether the table row corresponds to a cell type present in the profiles (independent of whether it is an internal or terminal node within the hierarchies).
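The in_profiles column can be used to keep only the rows that correspond to profiled cell types. The sketch below assumes a CSV layout; only the in_profiles column name comes from the description above. The other column names, the TRUE/FALSE encoding, the cell type names, and the placeholder ontology IDs are assumptions made for illustration.

```python
import csv
import io

# Hypothetical ontology table. Only "in_profiles" is a documented column name;
# everything else here (columns, encoding, IDs, names) is a placeholder.
SAMPLE_TABLE = """cell_type,ontology_id,match_type,in_profiles
Lymphoid,CL:placeholder1,exact,FALSE
T cell,CL:placeholder2,exact,TRUE
CD4 T cell,CL:placeholder3,exact,TRUE
Niche macrophage,CL:placeholder4,closest,TRUE
"""


def rows_in_profiles(handle):
    """Return the table rows whose in_profiles flag is true."""
    return [
        row
        for row in csv.DictReader(handle)
        if row["in_profiles"].strip().upper() == "TRUE"
    ]


profiled = rows_in_profiles(io.StringIO(SAMPLE_TABLE))
profiled_names = [row["cell_type"] for row in profiled]
```

Note that "T cell" is kept even though it has children in this hypothetical table, because the flag is independent of whether a node is internal or terminal.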

Basic Statistics

CSV files of basic statistics on the profiles: number of input cells of each type per profile, standard deviation for each target, etc.

Snippet of basic statistics for a set of profiles.

Snippet of input cell counts of each cell type for a set of profiles.

Target Statistics

CSV files giving the average and standard deviation of each target in the profiles, so that users can remove targets as desired. Unlike the cell profiles, the averages here are simple means per cell type and target; they are not adjusted using the negative probe values.
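One way to use these statistics for target removal is sketched below in Python. The filtering rule (keep a target if at least one cell type expresses it above a minimum mean with a bounded coefficient of variation), the thresholds, and all names and values are assumptions for illustration; the repository does not prescribe a particular rule.

```python
# Hypothetical statistics: TARGET_STATS[target][cell_type] = (mean, standard deviation).
# All values are invented for illustration.
TARGET_STATS = {
    "CD3E": {"T cell": (1.90, 0.80), "Macrophage": (0.02, 0.10)},
    "MS4A1": {"T cell": (0.05, 0.40), "Macrophage": (0.01, 0.05)},
    "CD68": {"T cell": (0.03, 0.20), "Macrophage": (0.60, 2.40)},
}


def informative_targets(stats, min_mean=0.1, max_cv=3.0):
    """Keep targets that at least one cell type expresses reliably.

    A target passes if, for some cell type, its mean is at least min_mean and
    its coefficient of variation (sd / mean) is at most max_cv. min_mean should
    be positive so that the division is never by zero.
    """
    kept = []
    for target, by_type in stats.items():
        for mean, sd in by_type.values():
            if mean >= min_mean and sd / mean <= max_cv:
                kept.append(target)
                break  # one qualifying cell type is enough
    return sorted(kept)


kept_default = informative_targets(TARGET_STATS)
kept_lenient = informative_targets(TARGET_STATS, max_cv=5.0)
```

Here the thresholds are arbitrary defaults; in practice they would be chosen per tissue and panel.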

Snippet of input cell standard deviations matrix of each cell type for a set of profiles.

Metadata

JSON file on experimental design and attribution, including collaborators (if applicable), species, tissue type/substructure, CosMx SMI instrument version, input panel, etc.
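As a rough sketch of reading such a metadata file in Python, the snippet below parses a JSON string and pulls out a few fields. Every key name and value in the sample is hypothetical; the actual files may use different keys, so check a real metadata file before relying on any of them.

```python
import json

# Hypothetical metadata content. Key names and values are placeholders and may
# not match the keys used in the repository's JSON files.
SAMPLE_METADATA = """
{
  "species": "Homo sapiens",
  "tissue": "example tissue",
  "instrument_version": "example version",
  "panel": "example panel",
  "collaborators": [],
  "citations": ["Example citation text"]
}
"""


def summarize(metadata_text):
    """Extract a few fields, falling back to defaults when a key is absent."""
    meta = json.loads(metadata_text)
    return {
        "species": meta.get("species", "unknown"),
        "tissue": meta.get("tissue", "unknown"),
        "panel": meta.get("panel", "unknown"),
        "citations": meta.get("citations", []),
    }


summary = summarize(SAMPLE_METADATA)
```

Using `dict.get` with defaults keeps the sketch tolerant of fields that are marked "if applicable" and may be missing for some tissues.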

Important note

If you use the cell profiles in your work, please include citations applicable for the relevant tissue(s).

Snippet of input metadata for a set of cell profiles.

Usage

Important note

If you use the cell profiles in your work, please include citations applicable for the relevant tissue(s). See the Metadata file for more information.

These matrices can be downloaded directly and loaded into environments for analysis with Insitutype.

Caution:

  • We do not recommend combining CosMx SMI-derived cell type profiles with scRNA-seq-derived profiles in cell typing. For example, we advise against combining the CosMx SMI IO profiles with scRNA-seq profiles for the same tissue type.
  • Note that some inner nodes on the hierarchies are both a lower-granularity category and a final cell type included in the profiles themselves.
  • If you choose to combine multiple CosMx SMI-derived profiles into a single hybrid reference, please see our Scratch Space post on this topic for guidance.

Methodology

All profiles were derived from CosMx SMI experiments. Projects with high-confidence cell typing were identified, and permission was obtained from collaborators/customers where necessary. Cell type names were corrected for consistent style. Where necessary, cells typed with poor confidence were removed, as were genes with high discordance between CosMx SMI-derived and scRNA-seq-derived profiles. In profiles built from multiple experiments, only the intersection of targets was used. InSituType::Estep() was run to generate mean expression profiles from the raw counts matrix (cells × targets), the negative probe counts, and the given cell types.
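The two computational steps above, intersecting targets across experiments and averaging background-corrected counts per cell type, can be illustrated with a simplified Python sketch. This is not the InSituType::Estep() implementation. It assumes, for illustration only, that the net profile is the per-cell-type mean count minus the mean negative probe level, floored at zero. All function names and data are hypothetical.

```python
from statistics import mean


def shared_targets(*target_lists):
    """Targets present in every experiment, in sorted order."""
    return sorted(set.intersection(*(set(t) for t in target_lists)))


def mean_profiles(counts, neg_means, cell_types, targets):
    """Background-subtracted mean expression per cell type (simplified).

    counts[i][target] : raw count of the target in cell i
    neg_means[i]      : mean negative probe count in cell i
    cell_types[i]     : assigned cell type of cell i
    Returns profiles[cell_type][target], floored at zero.
    """
    profiles = {}
    for cell_type in sorted(set(cell_types)):
        cells = [i for i, ct in enumerate(cell_types) if ct == cell_type]
        background = mean(neg_means[i] for i in cells)
        profiles[cell_type] = {
            t: max(mean(counts[i][t] for i in cells) - background, 0.0)
            for t in targets
        }
    return profiles


# Invented example: two panels that share two targets, and three cells.
targets = shared_targets(["CD3E", "CD68", "MS4A1"], ["CD3E", "CD68", "EPCAM"])
counts = [{"CD3E": 4, "CD68": 0}, {"CD3E": 6, "CD68": 1}, {"CD3E": 0, "CD68": 9}]
neg_means = [0.5, 0.5, 1.0]
cell_types = ["T cell", "T cell", "Macrophage"]

profiles = mean_profiles(counts, neg_means, cell_types, targets)
```

In this toy example the T cell profile for CD3E is the mean count of 5 minus a background of 0.5, and the macrophage value for CD3E is floored at zero because its background exceeds its mean count.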

Please note the profiles, while derived from CosMx SMI experiments, may not contain the exact suite of targets of current CosMx SMI panel products.

Contribution

If you would like to contribute to the CosMx Cell Profiles repository with data from your experiments, please contact us at support.spatial@bruker.com. Broadly speaking, the process involves a license agreement, finalized cell typing, and generation of the standardized file types of the repository as outlined above.

References

Danaher, Patrick, Edward Zhao, Zhi Yang, David Ross, Mark Gregory, Zach Reitz, Tae K. Kim, et al. 2022. “Insitutype: Likelihood-Based Cell Typing for Single Cell Spatial Transcriptomics.” bioRxiv. https://doi.org/10.1101/2022.10.19.512902.