runTranscriptErrorDetection • FastReseg

Modular wrapper to identify spatial groups of transcripts that fit poorly to their current cell segments.

Usage

runTranscriptErrorDetection(
  chosen_cells,
  score_GeneMatrix,
  transcript_df,
  cellID_coln = "CellId",
  transID_coln = "transcript_id",
  transGene_coln = "target",
  score_coln = "score",
  spatLocs_colns = c("x", "y", "z"),
  model_cutoff = 50,
  score_cutoff = -2,
  svm_args = list(kernel = "radial", scale = FALSE, gamma = 0.4),
  groupTranscripts_method = c("dbscan", "delaunay"),
  distance_cutoff = "auto",
  config_spatNW_transcript = NULL,
  seed_transError = NULL
)
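
The signature above lists `groupTranscripts_method = c("dbscan", "delaunay")`, while the argument description gives "dbscan" as the default. This matches the common R idiom in which a vector of allowed choices is resolved to a single value with `match.arg()`, so the first element acts as the default. The sketch below illustrates that idiom only; `pick_method` is a hypothetical helper, and whether FastReseg uses `match.arg()` internally is an assumption.

```r
# Hypothetical helper, not part of FastReseg: shows how a vector default
# is typically resolved to a single choice with match.arg().
pick_method <- function(groupTranscripts_method = c("dbscan", "delaunay")) {
  match.arg(groupTranscripts_method)
}

pick_method()            # no value supplied: returns "dbscan"
pick_method("delaunay")  # explicit choice: returns "delaunay"
```

Under this idiom, a value outside the listed choices raises an error instead of being silently accepted.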

Arguments

chosen_cells: the cell IDs of the chosen cells
score_GeneMatrix: the gene x cell-type matrix of log-like scores for each gene in each cell type
transcript_df: the data.frame of transcript_ID, cell_ID, score, spatial coordinates
cellID_coln: the column name of cell_ID in transcript_df
transID_coln: the column name of transcript_ID in transcript_df
transGene_coln: the column name of target or gene name in transcript_df
score_coln: the column name of score in transcript_df
spatLocs_colns: the column names of the 1st, 2nd, and optional 3rd spatial dimensions of each transcript in transcript_df
model_cutoff: the cutoff on transcript number for performing spatial modeling (default = 50)
score_cutoff: the score cutoff separating high-score from low-score transcripts (default = -2)
svm_args: a list of arguments to pass to the svm function, typically including kernel, gamma, and scale
groupTranscripts_method: the method used to group transcripts in space, either "dbscan" or "delaunay" (default = "dbscan")
distance_cutoff: maximum molecule-to-molecule distance within the same transcript group (default = "auto")
config_spatNW_transcript: configuration list for creating the spatial network at the transcript level; see the manual for createSpatialDelaunayNW_from_spatLocs for details. Set to NULL to use the default config.
seed_transError: seed for the transcript error detection step; the default, NULL, skips setting a seed
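
To make the input layout and the two cutoffs concrete, here is a minimal base-R sketch. It builds a toy transcript_df with the default column names from the signature and applies model_cutoff and score_cutoff as simple filters. All values are made up. The comparison directions (`>=` for model_cutoff, `<` for score_cutoff) and the `score_class` column name are assumptions for illustration, not the package's documented behavior. The sketch does not call FastReseg and omits the SVM and spatial grouping steps.

```r
# Toy transcript table using the default column names from the signature.
# All values are invented for illustration.
set.seed(1)
n_a <- 60  # cell "c1": more transcripts than the default model_cutoff of 50
n_b <- 10  # cell "c2": fewer transcripts than the cutoff
n   <- n_a + n_b

transcript_df <- data.frame(
  transcript_id = paste0("t", seq_len(n)),
  CellId        = rep(c("c1", "c2"), times = c(n_a, n_b)),
  target        = sample(c("GeneA", "GeneB", "GeneC"), n, replace = TRUE),
  score         = round(runif(n, min = -5, max = 1), 2),
  x             = runif(n),
  y             = runif(n),
  z             = rep(0, n),
  stringsAsFactors = FALSE
)

model_cutoff <- 50
score_cutoff <- -2

# Cells with enough transcripts for spatial modeling
# (assumed rule: transcript count >= model_cutoff).
counts <- table(transcript_df$CellId)
cells_to_model <- names(counts)[as.vector(counts) >= model_cutoff]

# Label each transcript by its score
# (assumed rule: score < score_cutoff is "low", otherwise "high").
transcript_df$score_class <- ifelse(
  transcript_df$score < score_cutoff, "low", "high"
)

# Optional seeding: NULL means no seed is set.
seed_transError <- NULL
if (!is.null(seed_transError)) set.seed(seed_transError)
```

In this toy data only "c1" passes the transcript-number filter, so under the assumed rule only that cell would go on to spatial modeling.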

Value

A data frame covering only the transcripts in chosen_cells, containing transcript score classifications, spatial group assignments, and new cell/group IDs for downstream resegmentation.