In the previous script, we loaded flat files from AtoMx. In most analyses, we’ll want to re-arrange our tissues and FOVs in a layout that’s cleaner or that arranges tissues according to their biological similarities.

Here we show how to rearrange our tissues by hand. We’ll also append the metadata with further columns specifying tissue ID (which for multi-tissue slides is more granular than mere slide ID). For an under-development toolkit to automate this process, see here.

Preliminaries

First, we load in only the data we need, which was output by the earlier script from the workflow vignette:

mydir <- ".."
# load cell metadata, output by the first vignette script, 0.-loading-flat-files.Rmd
metadata <- readRDS(paste0(mydir, "/processed_data/metadata_unfiltered.RDS")) 

Then break out cells’ xy positions in a distinct data object:

xy <- as.matrix(metadata[, c("CenterX_global_px", "CenterY_global_px")])
rownames(xy) <- metadata$cell_id
# rescale to mm:
thisinstrument_nanometers_per_pixel = 120.280945   
# The above value holds for most instruments. Your RunSummary file specifies the value for your instrument. 
xy <- xy * thisinstrument_nanometers_per_pixel / 1000000
colnames(xy) <- paste0(c("x", "y"), "_mm")

Now we make sure that no tissue overlaps another. (In studies with multiple slides, xy positions can overlap.) The script utils/condenseTissues.R holds a useful function for automating this process:

source("utils/condenseTissues.R")
#equivalently, source from github:
#source("https://raw.githubusercontent.com/Nanostring-Biostats/CosMx-Analysis-Scratch-Space/Main/_code/vignette/utils/condenseTissues.R")
set.seed(1)
xy <- condenseTissues(xy = xy, 
                      tissue = metadata$Run_Tissue_name, 
                      tissueorder = NULL,  # optional, specify the order tissues are tiled in
                      buffer = 1, # space between tissues
                      widthheightratio = 4/3) # desired shape of final xy locations 

Now we can see our tissues’ xy coordinates:

# to avoid unnecessarily plotting all the cells, just show a subset:
sub <- sample(1:nrow(xy), round(nrow(xy) / 20), replace = FALSE)
# define a color palette (using a Google palette):
plot(xy[sub, ], pch = 16, cex = 0.2, 
     asp = 1, # important: keep the true aspect ratio
     col = as.numeric(as.factor(metadata$Run_Tissue_name[sub])))
# label tissues:
for (slidename in unique(metadata$Run_Tissue_name)) {
  text(median(xy[metadata$Run_Tissue_name == slidename, 1]), max(xy[metadata$Run_Tissue_name == slidename, 2]), slidename)
}

Custom labeling of tissues

In this dataset we have 2 tissues on each slide. AtoMx didn’t know this, so we’ll have to enter that information ourselves. Using code to record/define this information is simple enough and, more importantly, reproducible. Here’s example code for assigning cells the right tissue ID. To show a more complex example, we’re pretending we have 2 tissues per slide.

# manually assign tissue IDs: 
# (Use whatever logic you need here. In this example, slide and tissue are synonymous, though they often won't be.)
metadata$tissue <- NA
metadata$tissue[(metadata$Run_Tissue_name == "MAR19_SlideAge_SkinCancer_RT_CP_slide2") & (xy[, 2] < 9)] <- "sample1"
metadata$tissue[(metadata$Run_Tissue_name == "MAR19_SlideAge_SkinCancer_RT_CP_slide2") & (xy[, 2] > 9)] <- "sample2"
metadata$tissue[(metadata$Run_Tissue_name == "MAR19_SlideAge_SkinCancer_4C_CP_slide4") & (xy[, 2] < 5.3)] <- "sample3"
metadata$tissue[(metadata$Run_Tissue_name == "MAR19_SlideAge_SkinCancer_4C_CP_slide4") & (xy[, 2] > 5.3)] <- "sample4"

# check for hiccups:
if (any(is.na(metadata$tissue))) {
  stop("some cells didn't get a tissue ID")
}

Now let’s see how our spatial plots look:

# to avoid unnecessarily plotting all the cells, just show a subset:
sub <- sample(1:nrow(xy), round(nrow(xy) / 20), replace = FALSE)
# define a color palette (using a Google palette):
cols <- c("#616161","#4285f4","#db4437","#f4b400","#0f9d58","#ab47bc",
"#00acc1","#ff7043","#9e9d24","#5c6bc0","#f06292","#00796b","#c2185b","#7e57c2",
"#03a9f4","#8bc34a","#fdd835","#fb8c00","#8d6e63","#9e9e9e","#607d8b")
# plot:
plot(xy[sub, ], pch = 16, cex = 0.2, 
     asp = 1, # important: keep the true aspect ratio
     col = cols[as.numeric(as.factor(metadata$tissue[sub])) %% length(cols) + 1])
# label tissues:
for (tiss in unique(metadata$tissue)) {
  text(median(xy[metadata$tissue == tiss, 1]), max(xy[metadata$tissue == tiss, 2]), tiss)
}

We can easily condense our tissues further. Let’s re-run the tissue condensing function using tissue ID instead of slide ID as we did previously:

# condense tissues:
set.seed(0)
xy <- condenseTissues(xy = xy, 
                      tissue = metadata$tissue, 
                      tissueorder = NULL,  # optional, specify the order tissues are tiled in
                      buffer = 1, # space between tissues
                      widthheightratio = 1) # desired shape of final xy locations 
# now how do they look?
# to avoid unnecessarily plotting all the cells, just show a subset:
sub <- sample(1:nrow(xy), round(nrow(xy) / 20), replace = FALSE)
# define a color palette (using a Google palette):
cols <- c("#616161","#4285f4","#db4437","#f4b400","#0f9d58","#ab47bc",
"#00acc1","#ff7043","#9e9d24","#5c6bc0","#f06292","#00796b","#c2185b","#7e57c2",
"#03a9f4","#8bc34a","#fdd835","#fb8c00","#8d6e63","#9e9e9e","#607d8b")
# plot:
plot(xy[sub, ], pch = 16, cex = 0.2, 
     asp = 1, # important: keep the true aspect ratio
     col = cols[as.numeric(as.factor(metadata$tissue[sub])) %% length(cols) + 1])
# label tissues:
for (tiss in unique(metadata$tissue)) {
  text(median(xy[metadata$tissue == tiss, 1]), max(xy[metadata$tissue == tiss, 2]), tiss)
}

Much better. And if we aren’t worried about preserving long-range distances, which is usually the case, we can consolidate sample 3 a little better:

isfarout <- (metadata$tissue == "sample3") & (xy[, 2] > 11)
xy[isfarout, 2] <- xy[isfarout, 2] - 2
sub <- sample(1:nrow(xy), round(nrow(xy) / 20), replace = FALSE)
plot(xy[sub, ], pch = 16, cex = 0.2, 
     asp = 1, # important: keep the true aspect ratio
     col = cols[as.numeric(as.factor(metadata$tissue[sub])) %% length(cols) + 1])

Finally, we save our new coordinate system and move on to the next module.

Please note that these are intermediate results, and not official file formats supported by NanoString.

saveRDS(xy, file = paste0(mydir, "/processed_data/xy_unfiltered.RDS"))
saveRDS(metadata, file = paste0(mydir, "/processed_data/metadata_unfiltered.RDS"))