Check input formats for the transcript data.frame and file list, and load the first FOV
Usage
checkTransFileInputsAndLoadFirst(
  transcript_df = NULL,
  transDF_fileInfo = NULL,
  filepath_coln = "file_path",
  prefix_colns = c("slide", "fov"),
  fovOffset_colns = c("stage_X", "stage_Y"),
  pixel_size = 0.18,
  zstep_size = 0.8,
  transID_coln = NULL,
  transGene_coln = "target",
  cellID_coln = "CellId",
  spatLocs_colns = c("x", "y", "z"),
  extracellular_cellID = NULL
)
Arguments
- transcript_df
the data.frame of transcript-level information with unique CellId; set to NULL if transcripts are read from the files listed in transDF_fileInfo
- transDF_fileInfo
a data.frame with one row per file of per-FOV transcript data.frame, within which the coordinates and CellId are unique; columns include the file path of each per-FOV transcript data.frame file and annotation columns such as slide and fov, which are used as prefixes when creating unique cell_ID across the entire data set; when NULL, the provided transcript_df is used directly
- filepath_coln
the column name in transDF_fileInfo that holds the file path of each per-FOV transcript data.frame
- prefix_colns
the column names of annotation in transDF_fileInfo, to be added to the CellId as prefix when creating unique cell_ID for the entire data set; set to NULL to use the original transID_coln or cellID_coln
- fovOffset_colns
the column names of the coordinate offsets in the 1st and 2nd dimensions for each per-FOV transcript data.frame in transDF_fileInfo, in microns. Note that some assays, such as SMI, have the XY axes swapped between the stage and each FOV, in which case fovOffset_colns should be c("stage_Y", "stage_X").
- pixel_size
the size of an image pixel in micrometers, applied to the 1st and 2nd dimensions of spatLocs_colns of each transcript_df
- zstep_size
the size of a z-step in micrometers, applied to the optional 3rd dimension of spatLocs_colns of each transcript_df
- transID_coln
the column name of transcript_ID in transcript_df; the default NULL uses the row index of each transcript in each transcript_df; when prefix_colns != NULL, a unique transcript_id is generated from prefix_colns and transID_coln in each transcript_df
- transGene_coln
the column name of target or gene name in transcript_df
- cellID_coln
the column name of cell_ID in transcript_df; when prefix_colns != NULL, a unique cell_ID is generated from prefix_colns and cellID_coln in each transcript_df
- spatLocs_colns
column names for the 1st, 2nd and optional 3rd dimensions of spatial coordinates in transcript_df
- extracellular_cellID
a vector of cell_ID values for extracellular transcripts, which are removed from the resegmentation pipeline (default = NULL)
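To illustrate how these arguments relate, the following base R sketch builds a hypothetical transDF_fileInfo table and applies the unit conversion and ID prefixing described above to one toy per-FOV data.frame. The file paths, the toy values, the helper name toGlobalMicrons, the "scale then add offset" conversion, and the underscore-joined ID scheme are all assumptions made for illustration; this is not the package's internal implementation.

```r
# Hypothetical file table: one row per per-FOV transcript file.
# Paths and stage offsets (in microns) are made-up values.
transDF_fileInfo <- data.frame(
  file_path = c("slide1_fov1_transcripts.csv", "slide1_fov2_transcripts.csv"),
  slide = c(1, 1),
  fov = c(1, 2),
  stage_X = c(0, 500),
  stage_Y = c(0, 0),
  stringsAsFactors = FALSE
)

# Toy per-FOV transcript table: x/y in pixels, z as a z-step index.
fov2_df <- data.frame(
  x = c(100, 200), y = c(50, 150), z = c(0, 2),
  target = c("GeneA", "GeneB"),
  CellId = c(1, 0),
  stringsAsFactors = FALSE
)

# Assumed conversion (hypothetical helper): scale pixels to microns,
# then add the FOV offset taken from the matching row of the file table.
toGlobalMicrons <- function(df, info_row,
                            spatLocs_colns = c("x", "y", "z"),
                            fovOffset_colns = c("stage_X", "stage_Y"),
                            pixel_size = 0.18, zstep_size = 0.8) {
  for (i in 1:2) {
    df[[spatLocs_colns[i]]] <- df[[spatLocs_colns[i]]] * pixel_size +
      info_row[[fovOffset_colns[i]]]
  }
  # The 3rd dimension is optional and has no FOV offset in this sketch.
  if (length(spatLocs_colns) == 3) {
    df[[spatLocs_colns[3]]] <- df[[spatLocs_colns[3]]] * zstep_size
  }
  df
}

fov2_um <- toGlobalMicrons(fov2_df, transDF_fileInfo[2, ])

# Assumed ID scheme: prefix columns and the original ID joined with "_".
# With transID_coln = NULL, the row index stands in for the transcript ID.
prefix <- paste(transDF_fileInfo$slide[2], transDF_fileInfo$fov[2], sep = "_")
fov2_um$UMI_cellID <- paste(prefix, fov2_um$CellId, sep = "_")
fov2_um$UMI_transID <- paste(prefix, seq_len(nrow(fov2_um)), sep = "_")
```

For an assay with swapped stage axes, the same helper would be called with fovOffset_colns = c("stage_Y", "stage_X"), so that the first spatial dimension receives the stage_Y offset.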
Value
a list containing the transcript data.frame for downstream processing and the extracellular transcript data.frame
- intraC
a data.frame of intracellular transcripts, with UMI_transID and UMI_cellID as column names for unique transcript_id and cell_id, and target as the column name for the target gene name
- extraC
a data.frame of extracellular transcripts, with the same structure as the intraC data.frame in the returned list
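The shape of the returned list can be pictured with a small base R sketch. The toy data below, and the assumption that extracellular_cellID is matched against the original CellId values, are illustrative only; the function's own logic may differ.

```r
# Toy transcript table with the documented output columns plus the
# original CellId; CellId 0 is assumed here to mark extracellular transcripts.
transcript_df <- data.frame(
  UMI_transID = c("1_1_1", "1_1_2", "1_1_3", "1_1_4"),
  UMI_cellID = c("1_1_5", "1_1_0", "1_1_7", "1_1_0"),
  target = c("GeneA", "GeneB", "GeneA", "GeneC"),
  CellId = c(5, 0, 7, 0),
  stringsAsFactors = FALSE
)
extracellular_cellID <- 0

# Flag extracellular rows, then split into the two list elements.
is_extra <- transcript_df$CellId %in% extracellular_cellID
res <- list(
  intraC = transcript_df[!is_extra, , drop = FALSE],
  extraC = transcript_df[is_extra, , drop = FALSE]
)

# Both elements share the same columns, as described above.
str(res, max.level = 1)
```

With extracellular_cellID = NULL, the %in% test flags no rows in this sketch, so every transcript would stay in intraC.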