Find neighbor cells with transcripts that are direct neighbors of the chosen cell, check the tLLRv2 score under each neighbor cell type, and return the neighborhood information.

Usage

get_neighborhood_content(
  chosen_cells = NULL,
  score_GeneMatrix,
  score_baseline = NULL,
  neighbor_distance_xy = NULL,
  distance_cutoff = 2.7,
  transcript_df,
  cellID_coln = "CellId",
  celltype_coln = "cell_type",
  transID_coln = "transcript_id",
  transGene_coln = "target",
  transSpatLocs_coln = c("x", "y", "z")
)
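As a rough illustration of the inputs this signature expects, the sketch below builds toy objects with the default column names. All gene names, cell types, and score values are made up, and the final call is left commented out because it requires the package that provides get_neighborhood_content to be installed.

```r
# Toy inputs; every name and value here is hypothetical.
genes <- c("GeneA", "GeneB", "GeneC")
cell_types <- c("typeX", "typeY")

# gene x cell-type score matrix: rows are genes, columns are cell types
score_GeneMatrix <- matrix(
  c( 0.0, -2.0,
    -1.5,  0.0,
    -0.5, -0.3),
  nrow = length(genes), byrow = TRUE,
  dimnames = list(genes, cell_types)
)

# named vector with one baseline per cell type in score_GeneMatrix
score_baseline <- c(typeX = -1.0, typeY = -1.2)

# one row per transcript, using the default column names from the signature
transcript_df <- data.frame(
  transcript_id = paste0("t", 1:6),
  target = c("GeneA", "GeneC", "GeneA", "GeneB", "GeneB", "GeneC"),
  x = c(1.0, 1.5, 2.0, 4.0, 4.5, 5.0),
  y = c(1.0, 1.2, 0.8, 1.0, 1.1, 0.9),
  z = c(0.0, 0.8, 0.0, 0.8, 0.0, 0.8),
  CellId = rep(c("c1", "c2"), each = 3),
  cell_type = rep(c("typeX", "typeY"), each = 3),
  stringsAsFactors = FALSE
)

# Hypothetical call, commented out because it needs the package installed:
# res <- get_neighborhood_content(
#   chosen_cells = "c1",
#   score_GeneMatrix = score_GeneMatrix,
#   score_baseline = score_baseline,
#   transcript_df = transcript_df
# )
```

Every gene in transcript_df should appear as a row name of score_GeneMatrix, and the names of score_baseline should match its column names.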

Arguments

chosen_cells

the cell IDs of the chosen cells that need to be evaluated for re-segmentation

score_GeneMatrix

the gene x cell-type matrix of log-like scores of each gene in each cell type

score_baseline

a named vector of score baselines for all cell types listed in score_GeneMatrix

neighbor_distance_xy

maximum cell-to-cell distance in x, y between the center of a query cell and the centers of neighbor cells in direct contact, in the same unit as the input spatial coordinates. Default is NULL, which uses 2 times the average 2D cell diameter.

distance_cutoff

maximum molecule-to-molecule distance within a connected transcript group, in the same unit as the input spatial coordinates (default = 2.7 micron). If set to NULL, the pipeline first randomly chooses no more than 2500 cells from up to 10 randomly picked ROIs, using a search radius of 5 times neighbor_distance_xy, and then calculates the minimal molecular distance between the picked cells. It then uses 5 times the 90% quantile of the minimal molecular distance as distance_cutoff. This calculation is slow and is not recommended for a large transcript data.frame.

transcript_df

the data.frame with transcript_id, target/geneName, x, y and cell_id

cellID_coln

the column name of cell_ID in transcript_df

celltype_coln

the column name of cell_type in transcript_df

transID_coln

the column name of transcript_ID in transcript_df

transGene_coln

the column name of target or gene name in transcript_df

transSpatLocs_coln

the column names of the 1st, 2nd, and optional 3rd spatial dimensions of each transcript in transcript_df
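The automatic distance_cutoff described above (5 times the 90% quantile of the minimal molecular distance between cells) can be illustrated with a small base-R sketch. The helper name estimate_distance_cutoff is hypothetical and this is not the package's implementation: it skips the random ROI and cell sampling and simply compares every pair of cells in a toy data.frame.

```r
# Hypothetical helper: only the arithmetic of the heuristic, with no
# ROI sampling and no search radius; every pair of cells is compared.
estimate_distance_cutoff <- function(df, cell_col = "CellId",
                                     xyz_cols = c("x", "y", "z")) {
  coords <- as.matrix(df[, xyz_cols, drop = FALSE])
  d <- as.matrix(dist(coords))          # all molecule-to-molecule distances
  cells <- unique(df[[cell_col]])
  pairs <- utils::combn(cells, 2)       # one column per pair of cells
  # minimal molecular distance between the two cells of each pair
  min_d <- apply(pairs, 2, function(p) {
    min(d[df[[cell_col]] == p[1], df[[cell_col]] == p[2]])
  })
  5 * stats::quantile(min_d, probs = 0.9, names = FALSE)
}

# Toy data: three cells with two transcripts each, laid out along the x axis.
toy <- data.frame(
  CellId = rep(c("a", "b", "c"), each = 2),
  x = c(0, 1, 2, 3, 5, 6),
  y = 0,
  z = 0
)
cutoff <- estimate_distance_cutoff(toy)
```

Comparing all cell pairs scales poorly with the number of transcripts, which is consistent with the note that the automatic calculation is slow on large inputs.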

Value

a data.frame with the following columns:

  1. CellId, original cell id of chosen cells

  2. cell_type, original cell type of chosen cells

  3. transcript_num, number of transcripts in chosen cells

  4. self_celltype, cell type that gives the maximum score for the query cell alone

  5. score_under_self, score in query cell under its own maximum celltype

  6. neighbor_CellId, cell id of neighbor cell whose cell type gives maximum score in query cell among all neighbors, not including query cell itself

  7. neighbor_celltype, cell type that gives maximum score in query cell among all non-self neighbor cells

  8. score_under_neighbor, score in query cell under neighbor_celltype

Details

Neighbor cells of each query cell are located in two steps: first by cell-to-cell distance in the 2D plane, keeping cells within neighbor_distance_xy, and then by molecule-to-molecule 3D distance, keeping cells within distance_cutoff. If no neighbor cells are found for a query cell, the cell id and cell type of the query cell are used to fill in the neighbor columns of the returned data.frame.
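The two-step neighbor search can be sketched in base R as follows. The helper find_direct_neighbors is hypothetical and simplified, not the package's implementation. In particular, it assumes that a cell's center is the mean x, y of its transcripts, and it hard-codes the column names CellId, x, y, and z.

```r
# Hypothetical helper illustrating the two-step search on a small data.frame.
find_direct_neighbors <- function(df, query_cell, neighbor_distance_xy,
                                  distance_cutoff = 2.7) {
  # Step 1: 2D distance between cell centers.
  # Assumption: the center is the mean x, y of the cell's transcripts.
  centers <- aggregate(df[, c("x", "y")], by = list(CellId = df$CellId), FUN = mean)
  is_query <- centers$CellId == query_cell
  d_xy <- sqrt((centers$x - centers$x[is_query])^2 +
               (centers$y - centers$y[is_query])^2)
  candidates <- as.character(centers$CellId[!is_query & d_xy <= neighbor_distance_xy])

  # Step 2: keep candidates with at least one transcript within
  # distance_cutoff (3D) of a transcript in the query cell.
  q_xyz <- as.matrix(df[df$CellId == query_cell, c("x", "y", "z")])
  n_q <- nrow(q_xyz)
  in_contact <- vapply(candidates, function(cell) {
    n_xyz <- as.matrix(df[df$CellId == cell, c("x", "y", "z")])
    d <- as.matrix(dist(rbind(q_xyz, n_xyz)))
    min(d[seq_len(n_q), n_q + seq_len(nrow(n_xyz))]) <= distance_cutoff
  }, logical(1))
  neighbors <- unname(candidates[in_contact])

  # Fallback: with no neighbors, report the query cell itself.
  if (length(neighbors) == 0) query_cell else neighbors
}

# Toy data: c2 touches c1, c3 is close in 2D but its molecules are too far,
# and c4 is isolated.
toy_cells <- data.frame(
  CellId = c("c1", "c1", "c2", "c2", "c3", "c3", "c4"),
  x = c(0, 1, 2, 3, 1, 2, 20),
  y = c(0, 0, 0, 0, 3, 3, 20),
  z = 0
)
nb_c1 <- find_direct_neighbors(toy_cells, "c1", neighbor_distance_xy = 4)
nb_c4 <- find_direct_neighbors(toy_cells, "c4", neighbor_distance_xy = 4)
```

Here c3 passes the center-distance filter for c1 but is dropped in the second step because its closest transcript is 3 units away, beyond the 2.7 cutoff, while the isolated c4 falls back to its own cell id.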