get_neighborhood_content — get_neighborhood

Find neighbor cells whose transcripts are direct neighbors of each cell in chosen_cells, check the tLLRv2 score of the query cell under each neighbor cell type, and return the neighborhood information.

Usage

get_neighborhood_content(
  chosen_cells = NULL,
  score_GeneMatrix,
  score_baseline = NULL,
  neighbor_distance_xy = NULL,
  distance_cutoff = 2.7,
  transcript_df,
  cellID_coln = "CellId",
  celltype_coln = "cell_type",
  transID_coln = "transcript_id",
  transGene_coln = "target",
  transSpatLocs_coln = c("x", "y", "z")
)

Arguments

chosen_cells: the cell_ID of the chosen cells that need to be evaluated for re-segmentation
score_GeneMatrix: the gene x cell-type matrix of log-like score of each gene in each cell type
score_baseline: a named vector of score baselines for all cell types listed in score_GeneMatrix
neighbor_distance_xy: maximum cell-to-cell distance in x, y between the center of a query cell and the centers of neighbor cells in direct contact, in the same unit as the input spatial coordinates. The default, NULL, uses 2 times the average 2D cell diameter.
distance_cutoff: maximum molecule-to-molecule distance within a connected transcript group, in the same unit as the input spatial coordinates (default = 2.7 micron). If set to NULL, the pipeline first randomly chooses no more than 2500 cells from up to 10 randomly picked ROIs, using a search radius of 5 times neighbor_distance_xy, and then calculates the minimal molecular distance between the picked cells. It then uses 5 times the 90% quantile of the minimal molecular distance as distance_cutoff. This calculation is slow and is not recommended for a large transcript data.frame.
transcript_df: the data.frame with transcript_id, target/geneName, x, y and cell_id
cellID_coln: the column name of cell_ID in transcript_df
celltype_coln: the column name of cell_type in transcript_df
transID_coln: the column name of transcript_ID in transcript_df
transGene_coln: the column name of target or gene name in transcript_df
transSpatLocs_coln: the column names of the 1st, 2nd, and optional 3rd spatial dimension of each transcript in transcript_df
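
The fallback used when distance_cutoff = NULL (5 times the 90% quantile of the minimal molecular distance) can be illustrated with a minimal base-R sketch on toy data. This is not the package's implementation: it skips the random ROI and cell sampling, it assumes "minimal molecular distance" means, for each cell, the smallest 3D distance from any of its molecules to a molecule of another cell, and the helper min_molecular_distance() is hypothetical.

```r
# Toy transcript table using the default column names from Usage.
set.seed(1)
transcript_df <- data.frame(
  transcript_id = paste0("t", 1:60),
  CellId = rep(c("c1", "c2", "c3"), each = 20),
  x = c(runif(20, 0, 5), runif(20, 6, 11), runif(20, 12, 17)),
  y = runif(60, 0, 5),
  z = runif(60, 0, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): smallest 3D distance
# between any molecule of cell_a and any molecule of cell_b.
min_molecular_distance <- function(df, cell_a, cell_b,
                                   coords = c("x", "y", "z")) {
  a <- as.matrix(df[df$CellId == cell_a, coords])
  b <- as.matrix(df[df$CellId == cell_b, coords])
  d <- as.matrix(dist(rbind(a, b)))
  # Keep only the cross block: rows from cell_a, columns from cell_b.
  min(d[seq_len(nrow(a)), nrow(a) + seq_len(nrow(b)), drop = FALSE])
}

# Assumption: one minimal distance per cell, taken over all other cells.
cells <- unique(transcript_df$CellId)
min_dist_per_cell <- vapply(cells, function(cell) {
  others <- setdiff(cells, cell)
  min(vapply(others, function(other) {
    min_molecular_distance(transcript_df, cell, other)
  }, numeric(1)))
}, numeric(1))

# 5 times the 90% quantile, as described for distance_cutoff = NULL.
distance_cutoff <- 5 * quantile(min_dist_per_cell, probs = 0.9, names = FALSE)
```

On real data the all-pairs distance computation above grows quickly with the number of molecules, which is consistent with the note that this estimate is slow for a large transcript data.frame.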

Value

a data.frame with the following columns:

CellId, original cell id of chosen cells
cell_type, original cell type of chosen cells
transcript_num, number of transcripts in chosen cells
self_celltype, cell type that gives the maximum score for the query cell only
score_under_self, score in query cell under its own maximum celltype
neighbor_CellId, cell id of neighbor cell whose cell type gives maximum score in query cell among all neighbors, not including query cell itself
neighbor_celltype, cell type that gives maximum score in query cell among all non-self neighbor cells
score_under_neighbor, score in query cell under neighbor_celltype
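
As a rough illustration of how the returned columns relate to each other, the sketch below builds a data.frame with the same column names and compares score_under_neighbor with score_under_self. All values are invented for illustration, and the comparison is an assumed downstream use, not a rule documented for this function. A row whose neighbor_CellId equals its own CellId stands for a query cell with no neighbor found.

```r
# Invented example of the returned data.frame (values are not real output).
neighborhood_df <- data.frame(
  CellId = c("c1", "c2", "c3"),
  cell_type = c("A", "B", "A"),
  transcript_num = c(120, 85, 40),
  self_celltype = c("A", "B", "A"),
  score_under_self = c(35.2, 12.1, 4.0),
  neighbor_CellId = c("c2", "c1", "c3"),
  neighbor_celltype = c("B", "A", "A"),
  score_under_neighbor = c(-10.5, 15.8, 4.0),
  stringsAsFactors = FALSE
)

# Query cells with no neighbor are filled with their own cell id.
no_neighbor <- neighborhood_df$neighbor_CellId == neighborhood_df$CellId

# Assumed screening rule: the query cell's transcripts score higher under
# a neighbor's cell type than under the cell's own best-scoring cell type.
better_under_neighbor <- !no_neighbor &
  neighborhood_df$score_under_neighbor > neighborhood_df$score_under_self

flagged_cells <- neighborhood_df$CellId[better_under_neighbor]
isolated_cells <- neighborhood_df$CellId[no_neighbor]
```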

Details

Neighbor cells of each query cell are located in two steps: first by cell-to-cell distance in the 2D plane, keeping cells within neighbor_distance_xy, and then by molecule-to-molecule 3D distance, keeping cells within distance_cutoff. If no neighbor cell is found for a query cell, the cell id and cell type of the query cell itself are used to fill in the neighbor columns of the returned data.frame.
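
The two-step search can be sketched in base R on a small toy table. This is a simplified illustration rather than the package's code: it assumes cell centers are the mean x, y of each cell's transcripts, treats a cell as a direct neighbor when its closest molecule lies within distance_cutoff of a query-cell molecule, and the helper find_direct_neighbors() is hypothetical.

```r
# Toy transcripts: c2 touches c1, c3 is nearby but not in contact, c4 is far.
transcript_df <- data.frame(
  CellId = c("c1", "c1", "c1", "c2", "c2", "c2", "c3", "c3", "c4", "c4"),
  x = c(0, 1, 2, 3, 4, 5, 2, 2, 50, 51),
  y = c(0, 0, 0, 0, 0, 0, 6, 7, 50, 50),
  z = 0,
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package).
find_direct_neighbors <- function(df, query, neighbor_distance_xy,
                                  distance_cutoff = 2.7) {
  # Step 1: candidate cells whose 2D center is within neighbor_distance_xy.
  centers <- aggregate(cbind(x, y) ~ CellId, data = df, FUN = mean)
  centers$CellId <- as.character(centers$CellId)
  q <- centers[centers$CellId == query, ]
  d_xy <- sqrt((centers$x - q$x)^2 + (centers$y - q$y)^2)
  candidates <- centers$CellId[d_xy <= neighbor_distance_xy &
                                 centers$CellId != query]

  # Step 2: keep candidates with a molecule within distance_cutoff (3D)
  # of any molecule of the query cell.
  q_mol <- as.matrix(df[df$CellId == query, c("x", "y", "z")])
  is_direct <- vapply(candidates, function(cell) {
    c_mol <- as.matrix(df[df$CellId == cell, c("x", "y", "z")])
    d <- as.matrix(dist(rbind(q_mol, c_mol)))
    cross <- d[seq_len(nrow(q_mol)), nrow(q_mol) + seq_len(nrow(c_mol)),
               drop = FALSE]
    min(cross) <= distance_cutoff
  }, logical(1))
  neighbors <- candidates[is_direct]

  # No neighbor found: fall back to the query cell itself.
  if (length(neighbors) == 0) query else neighbors
}

neighbors_c1 <- find_direct_neighbors(transcript_df, "c1", neighbor_distance_xy = 8)
neighbors_c4 <- find_direct_neighbors(transcript_df, "c4", neighbor_distance_xy = 8)
```

Here c3 passes the 2D center filter for c1 but fails the molecular distance check, so only c2 is kept; c4 has no candidates at all and falls back to its own id.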