1 Introduction

One of the most exciting aspects of CosMx^® Spatial Molecular Imager (SMI) data is the ability to directly observe gene expression in its spatial context at the single cell level. This is a great technological leap from previous single cell transcriptomics methods that lost spatial context while retrieving cells. For analysts looking to perform spatial data analysis, the Seurat R package has continually added features to support CosMx SMI data. Readers are encouraged to take a look at previous vignettes by the Seurat group (Spatial Vignette and Clustering Tutorial) as well as blog posts we’ve provided previously (scratch space). The blog post herein supplements these and provides you with some of the plotting configurations we find most helpful as you explore your CosMx SMI data. This vignette does not cover analysis of data in Seurat but rather tries to address frequently asked questions we’ve received from customers on getting started with their data in Seurat.

For this vignette, we use a Seurat object made from a mouse brain public data set. To download raw data for this dataset, go here. Two options to generate your own Seurat object from the AtoMx^® Spatial Informatics Portal (SIP) are described below.

Like other items in our CosMx Analysis Scratch Space, the usual caveats and license applies. This post will show you how to:

Section 2 Load data into a Seurat object
Section 3 Find important data in a Seurat object
Section 4 Plot CosMx SMI data in space
Section 5 Visualize dimension reduction data
Section 6 Concluding thoughts

2 Data Loading

Note

Many of the below functions require that you are working with Seurat v5 and may not work in earlier versions.

First, load needed libraries:

library(Seurat)
library(ggplot2)

Adjust globals option to avoid an error exceeding max allowed size. We’ve found this is necessary even with relatively small CosMx SMI datasets (30 - 40 FOVs).

options(future.globals.maxSize = 8000 * 1024^2)

There are two approaches to export CosMx SMI data from AtoMx SIP and load them into a Seurat object. Data can be exported from AtoMx SIP directly into a Seurat object. Alternatively, raw data can be exported as flat files and then loaded into a Seurat object. Both approaches are demonstrated below.

Export Seurat object
Export flat files

To export CosMx SMI data from AtoMx SIP directly into a Seurat object, select “Export a Seurat Object” from the export module as shown in Figure 1. Be sure to export the Seurat object with transcript coordinates and polygon coordinates included to access all of the functionality below.

Figure 1: Screenshot showing selection of Seurat object for export from AtoMx SIP.

Once export is complete, download the file using your SFTP application (e.g., cyberduck, FileZilla, WinSCP).

You can then load the Seurat object into R as follows. The Seurat object used for the demonstration here is available on Box.com here.

seu.obj <- readRDS("/path/to/file/seurat_object.Rds")

To export CosMx SMI data from AtoMx SIP into flat files, select all five flat files from the export module as shown in Figure 2. For more information about the flat file formats, please see this post. When taking this approach, some tertiary analyses run in AtoMx SIP may not be included. For example, any PCA or UMAP dimension reductions performed in AtoMx SIP do not get exported in the flat files and so will not be present in the Seurat object when generated from flat files.

Figure 2: Screenshot showing selection of flat files for export from AtoMx SIP.

Once export is complete, download the files using your SFTP application (e.g., cyberduck, FileZilla, WinSCP).

Load the flat files into a Seurat object using the LoadNanostring function from the Seurat package. Within the Seurat object, each slide is stored as a separate ‘image’ or ‘fov’. This is an unfortunate naming convention difference between CosMx SMI nomenclature and the Seurat package. What Seurat refers to as an ‘fov’ is what NanoString refers to as a slide. When you load flat files into a Seurat object, you need to name the Seurat fov / CosMx SMI slide with the argument fov =. This name must be a character, not a numeric. Here we use “slide1”, but more descriptive names are also possible. For an example of how to plot a single CosMx SMI FOV, see Section 4.

seu.obj <- LoadNanostring("/path/to/flatFileDirectory/", fov = "slide1")

There are two main additions that we recommend making after generating this initial Seurat object. First, this initial Seurat object will not have many of the metadata columns included in the cell metadata flat file. Second, this Seurat object will store the expression of all probes, including negative probes and system controls, in one assay named “Nanostring”. We modify the Seurat object from both of these defaults here.

# Add cell metadata
cell_meta <- read.csv("/path/to/flatFileDirectory/SlideName_metadata_file.csv")
row.names(cell_meta) <- paste0(cell_meta$cell_ID, "_", cell_meta$fov)
seu.obj <- AddMetaData(seu.obj, metadata = cell_meta)

# Move control probes to separate assays
if(length(grep("Neg", rownames(seu.obj@assays$Nanostring))) > 0){
  # Extract neg probes to a separate assay
  neg_probes <- grep("Neg", rownames(seu.obj@assays$Nanostring), value = T)
  neg_matrix <- seu.obj@assays$Nanostring$counts[neg_probes,]
  neg_assay <- CreateAssayObject(counts = neg_matrix)
  
  # Remove from original assay
  seu.obj <- subset(seu.obj, features = setdiff(rownames(seu.obj@assays$Nanostring), neg_probes))
  
  # Add to new assay
  seu.obj[["NegProbes"]] <- neg_assay 
  
}

if(length(grep("SystemControl", rownames(seu.obj@assays$Nanostring))) > 0){
  # Extract system control probes to a separate assay
  sysctrl_probes <- grep("SystemControl", rownames(seu.obj@assays$Nanostring), value = T)
  sysctrl_matrix <- seu.obj@assays$Nanostring$counts[sysctrl_probes,]
  sysctrl_assay <- CreateAssayObject(counts = sysctrl_matrix)
  
  # Remove from original assay
  seu.obj <- subset(seu.obj, features = setdiff(rownames(seu.obj@assays$Nanostring), sysctrl_probes))
  
  # Add to new assay
  seu.obj[["SystemControls"]] <- sysctrl_assay 
  
}

For convenience, you may want to save this Seurat object for future use.

saveRDS(seu.obj, file = "seurat_object.Rds")

If you have data from multiple CosMx SMI slides, you can join them all into a single Seurat object with multiple images.

# Make a list of multiple Seurat objects
all_seurat_objects <- list(
  "slide1" = seu.obj_1,
  "slide2" = seu.obj_2,
  "slide3" = seu.obj_3
)

# Merge objects together, appending a slide identifier to each cell ID
joined_seu.obj <- merge(all_seurat_objects[[1]], 
                        y = all_seurat_objects[2:length(all_seurat_objects)], 
                        add.cell.ids = names(all_seurat_objects), 
                        project = "CosMx")

3 Data Structure

Here we’ll show where various key data are stored in the Seurat object.

# Cell metadata
head(seu.obj@meta.data)

                       orig.ident nCount_Nanostring nFeature_Nanostring cell_ID
Run1000.S1.Half_1_1 SeuratProject               216                  95 c_2_1_1
Run1000.S1.Half_2_1 SeuratProject               325                 118 c_2_1_2
Run1000.S1.Half_3_1 SeuratProject               503                 284 c_2_1_3
Run1000.S1.Half_4_1 SeuratProject              1085                 329 c_2_1_4
Run1000.S1.Half_5_1 SeuratProject               935                 349 c_2_1_5
Run1000.S1.Half_6_1 SeuratProject              1705                 487 c_2_1_6
                    fov  Area AspectRatio Width Height Mean.Histone Max.Histone
Run1000.S1.Half_1_1   1  6073        0.47    66    141         7095       42463
Run1000.S1.Half_2_1   1  5675        0.72   101    140         9220       39045
Run1000.S1.Half_3_1   1 12896        1.26   153    121        16993       45967
Run1000.S1.Half_4_1   1  8234        0.51    81    160        12720       31967
Run1000.S1.Half_5_1   1  9852        0.88   117    133        11177       38479
Run1000.S1.Half_6_1   1 13372        0.90   171    191         6009       17648
                    Mean.Blank Max.Blank Mean.rRNA Max.rRNA Mean.GFAP Max.GFAP
Run1000.S1.Half_1_1         70      4044       376     2871        42     3313
Run1000.S1.Half_2_1         82       296       642     1486        36      527
Run1000.S1.Half_3_1         78      1652       109     1538        37     1797
Run1000.S1.Half_4_1        121      3074       664     3284        71     3625
Run1000.S1.Half_5_1         99      3173       444     2946        82     2957
Run1000.S1.Half_6_1        215      2482       687     2429      2775    35102
                    Mean.DAPI Max.DAPI Run_name Slide_name ISH.concentration
Run1000.S1.Half_1_1        65      233  Run1000 Run1000_S1               1nM
Run1000.S1.Half_2_1        88      287  Run1000 Run1000_S1               1nM
Run1000.S1.Half_3_1        35      249  Run1000 Run1000_S1               1nM
Run1000.S1.Half_4_1       219      540  Run1000 Run1000_S1               1nM
Run1000.S1.Half_5_1       251      628  Run1000 Run1000_S1               1nM
Run1000.S1.Half_6_1       255      702  Run1000 Run1000_S1               1nM
                    Beta tissue slide_ID_numeric Run_Tissue_name
Run1000.S1.Half_1_1   12   Half                2 Run1000_S1_Half
Run1000.S1.Half_2_1   12   Half                2 Run1000_S1_Half
Run1000.S1.Half_3_1   12   Half                2 Run1000_S1_Half
Run1000.S1.Half_4_1   12   Half                2 Run1000_S1_Half
Run1000.S1.Half_5_1   12   Half                2 Run1000_S1_Half
Run1000.S1.Half_6_1   12   Half                2 Run1000_S1_Half
                    log10totalcounts   IFcolor nb_clus leiden_clus
Run1000.S1.Half_1_1         2.334454 #BC077CFF     PVM           2
Run1000.S1.Half_2_1         2.511883 #FF06A1FF    VLMC           2
Run1000.S1.Half_3_1         2.705008 #3706FFFF    OLPC          19
Run1000.S1.Half_4_1         3.035430 #FF0CDEFF    VLMC           2
Run1000.S1.Half_5_1         2.970347 #DE0DC3FF    VSMC           2
Run1000.S1.Half_6_1         3.231470 #FFFF69FF     NGF           8
                    nb_clus_final                  id
Run1000.S1.Half_1_1          VLMC Run1000.S1.Half_1_1
Run1000.S1.Half_2_1          VLMC Run1000.S1.Half_2_1
Run1000.S1.Half_3_1          OLPC Run1000.S1.Half_3_1
Run1000.S1.Half_4_1          VLMC Run1000.S1.Half_4_1
Run1000.S1.Half_5_1          VSMC Run1000.S1.Half_5_1
Run1000.S1.Half_6_1           NGF Run1000.S1.Half_6_1

# Transcript counts. Here, transcript counts are in the 'Nanostring' assay but in other objects they may be stored in an 'RNA' assay.
seu.obj@assays$Nanostring$counts[1:5, 1:5]

Loading required package: Matrix

5 x 5 sparse Matrix of class "dgCMatrix"
       Run1000.S1.Half_1_1 Run1000.S1.Half_2_1 Run1000.S1.Half_3_1
Slc6a1                   .                   .                   1
Cd109                    .                   .                   .
Ldha                     .                   .                   1
Aldoc                    .                   .                   2
Drd1                     .                   .                   .
       Run1000.S1.Half_4_1 Run1000.S1.Half_5_1
Slc6a1                   1                   .
Cd109                    .                   .
Ldha                     1                   2
Aldoc                    .                   2
Drd1                     .                   .

# UMAP positions
seu.obj@reductions$umap@cell.embeddings[1:10,]

                        umap_1     umap_2
Run1000.S1.Half_1_1  -6.179202 -22.688357
Run1000.S1.Half_2_1  -6.470077 -23.498900
Run1000.S1.Half_3_1  -7.297677   4.227824
Run1000.S1.Half_4_1  -6.037831 -23.252535
Run1000.S1.Half_5_1  -2.250786 -21.083180
Run1000.S1.Half_6_1  14.308562  27.765420
Run1000.S1.Half_7_1  -6.235466 -22.308980
Run1000.S1.Half_8_1  -6.485635 -22.782622
Run1000.S1.Half_9_1  -7.263406 -21.322018
Run1000.S1.Half_10_1 -7.601895   4.258457

# Image names. Each slide is stored as a separate image within the object.
Images(seu.obj)

[1] "Run1000.S1.Half"    "Run5642.S3.Quarter"

# Positions in space, here shown for one image / slide
seu.obj@images[[Images(seu.obj)[1]]]$centroids@coords[1:10,] # In this object, this is equivalent to: seu.obj@images$Run1000.S1.Half$centroids@coords[1:10,]

              x     y
 [1,] -494161.3 10541
 [2,] -494201.3 10413
 [3,] -496227.3 10339
 [4,] -494275.3 10083
 [5,] -494221.3  9981
 [6,] -494216.3  9776
 [7,] -494375.3  9591
 [8,] -494697.3  9149
 [9,] -494748.3  8939
[10,] -494669.3  8799

4 Plot data in space

As noted above, within the Seurat object, each slide is stored as a separate ‘image’ or ‘fov’. This is an unfortunate naming convention difference between CosMx SMI nomenclature and the Seurat package. What Seurat refers to as an ‘fov’ is what NanoString refers to as a slide. When plotting cells in space, you need to specify the Seurat ‘fov’ to plot, and this is equivalent to choosing which CosMx SMI slide to plot.

Plot all cells on one slide in space, coloring by cell type.

# Get name of the first image
image1 <- Images(seu.obj)[1]

# Plot all cells.
# We recommend setting the border color to 'NA' as the default 'white' often masks all cells when zoomed out, leading to a fully white plot.
ImageDimPlot(seu.obj, fov = image1, axes = TRUE, border.color = NA)

Plot the location of individual transcripts with the ‘molecules’ option.

ImageDimPlot(seu.obj,
             fov = Images(seu.obj)[1],
             border.color = "black",
             alpha = 0.5, # Reduce alpha of cell fills to better visualize the overlaying molcules
             molecules = c("Slc17a7", "Gad1", "Plp1"),
             mols.size = 0.2,
             nmols = 100000, # Set the total number of molecules to visualize
             axes = FALSE)

Plot one CosMx SMI FOV. To do this, we set the cells we’d like to plot to be all those in our target FOV. A similar strategy could be used to plot a subset of FOVs or a subset of cells of interest.

ImageDimPlot(seu.obj,
             fov = Images(seu.obj)[1],
             border.color = "black",
             cells = row.names(seu.obj@meta.data)[which(seu.obj@meta.data$fov == 99)])

By default, cells are colored by the ‘Identity’ set in the Seurat object. We can change this by selecting another column to color by. Here we show coloring by leiden cluster, which we treat as a factor rather than an integer.

# Check the default identities
head(Idents(seu.obj))

# Plot by leiden cluster using the 'group.by' option
ImageDimPlot(seu.obj,
             fov = Images(seu.obj)[1],
             border.color = "black",
             group.by = "leiden_clus",
             cols = "glasbey", # Option to use a different palette for cell colors
             cells = row.names(seu.obj@meta.data)[which(seu.obj@meta.data$fov == 99)])

To color cells by a continuous value, such as the log10totalcounts, or by the expression of a transcript of interest, such as Slc17a7, we use the function ‘ImageFeaturePlot’.

# Color cells by log10totalcounts
ImageFeaturePlot(seu.obj,
             fov = Images(seu.obj)[1],
             border.color = NA,
            features = "log10totalcounts")

# Color cells by the expression of a gene of interest, Slc17a7
ImageFeaturePlot(seu.obj,
             fov = Images(seu.obj)[1],
             border.color = NA,
            features = "Slc17a7")

Seurat can plot cells with either cell shapes shown (‘segmentation’) or with a single point at the center of where they’re located (‘centroids’). Here we show the switch to plotting centroids for one FOV.

# Check what the current default boundary is
DefaultBoundary(seu.obj@images[[Images(seu.obj)[1]]])

# Change the default boundaries for the first slide
DefaultBoundary(seu.obj@images[[Images(seu.obj)[1]]]) <- "centroids"

# Plot one FOV from this slide. Note that cell shapes are no longer shown
ImageDimPlot(seu.obj,
             fov = Images(seu.obj)[1],
             size = 5,
             shuffle.cols = TRUE, # Option to randomly shuffle colors within the palette
             cells = row.names(seu.obj@meta.data)[which(seu.obj@meta.data$fov == 99)])

5 Dimension reduction plots

The CosMx SMI Seurat object contains coordinates for each cell for UMAP dimensional reduction.

Here, we color cells by cell type and overlay cell type labels.

DimPlot(seu.obj, 
        group.by = "nb_clus",
        label = TRUE) +
  theme(legend.position = "none") # Suppress the legend since labels are plotted on top of UMAP

Here, we color cells by a continuous value, using transcript expression for a transcript of interest.

FeaturePlot(seu.obj, 
        features = "Slc17a7",
        order = TRUE) # plots cells in order of expression

6 Conclusions

This vignette serves as an introduction to exploring CosMx SMI data in Seurat, with a primary focus on visualization. Mix and match the functions and options from above to generate new customized visualizations with your data. Once you’re comfortable visualizing your spatial data, you may want to proceed to refining your cell typing, performing differential expression, finding spatially correlated genes, or countless other analysis paths.