Evaluate neighborhood information against score and transcript number cutoffs to decide the resegmentation operations. Use either Leiden clustering or geometry statistics to determine whether a merge event is allowed.

Usage

decide_ReSegment_Operations(
  neighborhood_df,
  selfcellID_coln = "CellId",
  transNum_coln = "transcript_num",
  selfCellType_coln = "self_celltype",
  selfScore_coln = "score_under_self",
  neighborcellID_coln = "neighbor_CellId",
  neighborCellType_coln = "neighbor_celltype",
  neighborScore_coln = "score_under_neighbor",
  score_baseline = NULL,
  lowerCutoff_transNum = NULL,
  higherCutoff_transNum = NULL,
  transcript_df,
  cellID_coln = "CellId",
  transID_coln = "transcript_id",
  transSpatLocs_coln = c("x", "y", "z"),
  spatialMergeCheck_method = c("leidenCut", "geometryDiff"),
  cutoff_spatialMerge = 0.5,
  leiden_config = list(objective_function = c("CPM", "modularity"),
    resolution_parameter = 1, beta = 0.01, n_iterations = 200),
  config_spatNW_transcript = NULL
)

Arguments

neighborhood_df

the data.frame containing neighborhood information for each query cell, expected to be the output of the get_neighborhood_content function.

selfcellID_coln

the column name of cell_ID of query cell in neighborhood_df

transNum_coln

the column name of transcript number of query cell in neighborhood_df

selfCellType_coln

the column name of cell_type under query cell in neighborhood_df

selfScore_coln

the column name of average transcript score under query cell in neighborhood_df

neighborcellID_coln

the column name of cell_ID of neighbor cell in neighborhood_df

neighborCellType_coln

the column name of cell_type under neighbor cell in neighborhood_df

neighborScore_coln

the column name of average transcript score under neighbor cell in neighborhood_df

score_baseline

a named vector of score baselines for all cell types listed in neighborhood_df; a per-cell transcript score higher than the baseline is required to call a cell type with high enough confidence

lowerCutoff_transNum

a named vector of transcript number cutoffs, one per cell type; a transcript number higher than the cutoff is required to keep the query cell as it is

higherCutoff_transNum

a named vector of transcript number cutoffs, one per cell type; a transcript number lower than the cutoff is required to keep the query cell as it is when there is a neighbor cell of consistent cell type.

transcript_df

the data.frame with transcript_id, target/geneName, x, y and cell_id

cellID_coln

the column name of cell_ID in transcript_df

transID_coln

the column name of transcript_ID in transcript_df

transSpatLocs_coln

the column names of the 1st, 2nd, and optional 3rd spatial dimensions of each transcript in transcript_df

spatialMergeCheck_method

use either "leidenCut" (in 2D or 3D) or "geometryDiff" (in 2D only) method to determine whether a cell pair merging event is allowed in space (default = "leidenCut")

cutoff_spatialMerge

spatial constraint on a valid merging event between two source transcript groups; default = 0.5 for a 50% cutoff, set to 0 to skip the spatial constraint evaluation for merging. For spatialMergeCheck_method = "leidenCut", this is the minimum percentage of transcripts with shared membership between the query cell and neighbor cells in the Leiden clustering results for a valid merging event. For spatialMergeCheck_method = "geometryDiff", this is the maximum percentage of white space change upon merging of the query cell and neighbor cell for a valid merging event.

leiden_config

(leidenCut) a list of configuration values to pass to reticulate and the igraph::cluster_leiden function, including objective_function, resolution_parameter, beta, n_iterations.

config_spatNW_transcript

(leidenCut) configuration list used to create the spatial network at transcript level; see the manual for createSpatialDelaunayNW_from_spatLocs for more details. Set to NULL to use the default config.
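
The cutoff arguments above are named vectors keyed by cell type, and leiden_config is a plain list. A minimal sketch of how such inputs might be assembled is shown here; the cell type names and all numeric values are hypothetical and chosen only for illustration, not recommended settings.

```r
# Hypothetical cell types and cutoff values, for illustration only.
cell_types <- c("a", "b", "c")

# Per-cell-type baseline for the average transcript score of a cell.
score_baseline <- setNames(c(-2.0, -1.5, -1.8), cell_types)

# Per-cell-type lower and higher transcript number cutoffs.
lowerCutoff_transNum  <- setNames(c(20, 30, 25), cell_types)
higherCutoff_transNum <- setNames(c(60, 80, 70), cell_types)

# Leiden settings with a single objective function selected;
# the remaining values repeat the defaults shown in Usage.
leiden_config <- list(
  objective_function   = "CPM",
  resolution_parameter = 1,
  beta                 = 0.01,
  n_iterations         = 200
)

# Look up the cutoffs that would apply to a cell of type "b".
score_baseline[["b"]]
lowerCutoff_transNum[["b"]]
higherCutoff_transNum[["b"]]
```

Keying every vector by the same set of cell type names keeps the per-cell-type lookups aligned with the cell types listed in neighborhood_df.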

Value

a list

  1. cells_to_discard, a vector of cell IDs that should be discarded during resegmentation.

  2. cells_to_update, a named vector of cell IDs in which the cell ID given in the name is replaced with the cell ID given in the value.

  3. cells_to_keep, a vector of cell IDs that should be kept as they are.

  4. reseg_full_converter, a single named vector of cell IDs used to update the original cell IDs, with NA assigned for cells_to_discard.
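
As a rough illustration of how the returned reseg_full_converter could be applied, the sketch below maps original cell IDs in a toy transcript table to their updated IDs. The list contents, the toy table, and the updated_CellId column are hypothetical; the sketch also assumes that cell IDs absent from the converter should keep their original ID.

```r
# Hypothetical return value with the four documented elements.
reseg_actions <- list(
  cells_to_discard     = "c3",
  cells_to_update      = c(c2 = "c1"),
  cells_to_keep        = c("c1", "c4"),
  reseg_full_converter = c(c1 = "c1", c2 = "c1", c3 = NA, c4 = "c4")
)

# Toy transcript table; column names follow the defaults shown in Usage.
transcript_df <- data.frame(
  transcript_id = paste0("t", 1:6),
  CellId = c("c1", "c2", "c2", "c3", "c4", "c5"),
  stringsAsFactors = FALSE
)

converter <- reseg_actions$reseg_full_converter
ids <- transcript_df$CellId

# Assumption: cells not listed in the converter keep their original ID.
in_converter <- ids %in% names(converter)
updated <- ids
updated[in_converter] <- unname(converter[ids[in_converter]])
transcript_df$updated_CellId <- updated

# Drop transcripts whose cell was discarded (converter value is NA).
transcript_kept <- transcript_df[!is.na(transcript_df$updated_CellId), ]
```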

Details

Evaluate neighborhood information against score and transcript number cutoffs to decide the resegmentation operations, as follows:

  • merge the query cell into a neighbor if the neighbor has a consistent cell type and the query has fewer transcripts than the average transcript number cutoff, higherCutoff_transNum;

  • keep the query as a new cell ID if there is no neighbor of consistent cell type, but the query has a high self score and more transcripts than the minimal transcript number, lowerCutoff_transNum;

  • discard the rest of the query cells, i.e. those that have no neighbor of consistent cell type and have too few transcripts based on lowerCutoff_transNum and/or a low self score.

The function uses network component analysis to resolve any conflict due to merging multiple query cells into one. When cutoff_spatialMerge > 0, the function applies an additional spatial constraint on a valid merging event of a query cell into a neighbor cell.

  • In case of spatialMergeCheck_method = "leidenCut", the function builds a spatial network at transcript level, performs Leiden clustering on the spatial network, and then decides whether the merge should be allowed based on the observed shared Leiden membership of the two source transcript groups for a putative merging event; the provided cutoff_spatialMerge gives the minimal value of shared Leiden membership for a valid merging event.

  • In case of spatialMergeCheck_method = "geometryDiff", the function first calculates the white space, i.e. the area difference between the convex and concave hulls, for the query cell, the neighbor cell, and the corresponding merged cell. It then calculates the white space difference between the merged cell and the two separate cells and normalizes that value with respect to the concave area of the query and neighbor cells, respectively. Lastly, it allows the merge when the normalized white space differences upon merging for both the query and neighbor cells are smaller than the provided cutoff_spatialMerge.
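
The merge, keep, and discard rules above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: classify_query_cells is a hypothetical helper, "consistent" is assumed to mean that the self and neighbor cell types are equal, query cells with a consistent neighbor that are not below higherCutoff_transNum are assumed to fall through to the keep/discard test, and the spatial constraint and network component analysis are left out.

```r
# Classify each query cell as "merge", "keep", or "discard".
# Hypothetical helper; column names follow the defaults shown in Usage.
classify_query_cells <- function(neighborhood_df, score_baseline,
                                 lowerCutoff_transNum, higherCutoff_transNum) {
  self_type     <- as.character(neighborhood_df$self_celltype)
  neighbor_type <- as.character(neighborhood_df$neighbor_celltype)
  n_trans       <- neighborhood_df$transcript_num

  # Assumption: consistency means identical self and neighbor cell types.
  consistent <- !is.na(neighbor_type) & self_type == neighbor_type

  # Compare each cell against the cutoffs of its own cell type.
  below_higher <- n_trans < unname(higherCutoff_transNum[self_type])
  above_lower  <- n_trans > unname(lowerCutoff_transNum[self_type])
  high_score   <- neighborhood_df$score_under_self >
    unname(score_baseline[self_type])

  # Default is discard; keep and merge overwrite it in that order,
  # so merge takes precedence whenever its conditions are met.
  action <- rep("discard", nrow(neighborhood_df))
  action[high_score & above_lower]  <- "keep"
  action[consistent & below_higher] <- "merge"
  action
}

# Toy inputs with hypothetical values.
neighborhood_df <- data.frame(
  CellId            = c("q1", "q2", "q3", "q4"),
  transcript_num    = c(10, 50, 5, 15),
  self_celltype     = c("a", "a", "b", "b"),
  score_under_self  = c(-1, -1, -3, -1),
  neighbor_celltype = c("a", "b", "a", "a"),
  stringsAsFactors  = FALSE
)
score_baseline        <- c(a = -2, b = -2)
lowerCutoff_transNum  <- c(a = 20, b = 20)
higherCutoff_transNum <- c(a = 60, b = 60)

actions <- classify_query_cells(neighborhood_df, score_baseline,
                                lowerCutoff_transNum, higherCutoff_transNum)
```

In this toy input, q1 has a consistent neighbor and few transcripts, q2 has no consistent neighbor but passes both the score and lower transcript number checks, q3 fails the score check, and q4 fails the lower transcript number check.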