FindRascalSolutions() finds the best-fitting ploidy and cellularity pairs for each sample from its relative copy number profile. It uses the find_best_fit_solutions() function from rascal.

Usage

FindRascalSolutions(
  cnobj,
  min_ploidy = 1.5,
  max_ploidy = 5.5,
  ploidy_step = 0.01,
  min_cellularity = 0.2,
  max_cellularity = 1,
  cellularity_step = 0.01,
  distance_function = c("MAD", "RMSD"),
  distance_filter_scale_factor = 1.25,
  max_proportion_zero = 0.05,
  min_proportion_close_to_whole_number = 0.5,
  max_distance_from_whole_number = 0.15,
  solution_proximity_threshold = 5,
  keep_all = FALSE
)

Arguments

cnobj

An S4 object of type QDNAseqCopyNumbers.

min_ploidy, max_ploidy

the lower and upper bounds of the ploidy range to search.

ploidy_step

the stepwise increment of ploidies along the grid.

min_cellularity, max_cellularity

the lower and upper bounds of the cellularity range to search.

cellularity_step

the stepwise increment of cellularities along the grid.
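
As a rough guide to the cost of the search, the sketch below builds the ploidy and cellularity grid implied by the default arguments. It assumes the grid is a plain Cartesian product of the two sequences; how rascal constructs the grid internally is not stated here, and the variable names are illustrative only.

```r
# Illustrative only: enumerate the grid implied by the default arguments.
# Assumption: every ploidy value is paired with every cellularity value.
ploidies <- seq(1.5, 5.5, by = 0.01)      # min_ploidy, max_ploidy, ploidy_step
cellularities <- seq(0.2, 1, by = 0.01)   # min_cellularity, max_cellularity, cellularity_step

grid <- expand.grid(ploidy = ploidies, cellularity = cellularities)

length(ploidies)       # 401 ploidy values
length(cellularities)  # 81 cellularity values
nrow(grid)             # 32481 candidate (ploidy, cellularity) pairs per sample
```

Larger step sizes shrink the grid quickly: doubling both steps to 0.02 cuts the number of candidate pairs to roughly a quarter.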

distance_function

the distance function to use, either "MAD" for the mean absolute difference or "RMSD" for the root mean square difference, where differences are between the fitted absolute copy number values and the nearest whole number.
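
The two distance functions can be illustrated on a small vector of made-up fitted absolute copy numbers. This is a sketch of the definitions given above, not rascal's own code; the values and variable names are hypothetical.

```r
# Hypothetical fitted absolute copy numbers for a handful of segments.
absolute_cn <- c(1.96, 2.05, 3.10, 0.98, 2.45)

# Difference between each value and its nearest whole number.
deviation <- abs(absolute_cn - round(absolute_cn))

# "MAD": mean absolute difference.
mad_distance <- mean(deviation)

# "RMSD": root mean square difference; penalises large deviations more heavily.
rmsd_distance <- sqrt(mean(deviation^2))

c(MAD = mad_distance, RMSD = rmsd_distance)
```

Because RMSD squares each difference before averaging, a single poorly fitting value (2.45 here) raises it more than it raises MAD.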

distance_filter_scale_factor

the distance threshold above which solutions will be discarded, expressed as a multiple of the smallest distance found for any solution.
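
A minimal sketch of this filter, using a made-up table of candidate solutions and the default scale factor of 1.25. The data frame and its values are hypothetical, and the sketch assumes that a solution exactly at the threshold is kept.

```r
# Hypothetical candidate solutions for one sample.
solutions <- data.frame(
  ploidy      = c(2.1, 3.0, 3.9),
  cellularity = c(0.60, 0.45, 0.35),
  distance    = c(0.040, 0.048, 0.060)
)

distance_filter_scale_factor <- 1.25

# Threshold is a multiple of the smallest distance among the candidates.
threshold <- distance_filter_scale_factor * min(solutions$distance)

# Keep solutions at or below the threshold; the third row (0.060) is dropped.
kept <- solutions[solutions$distance <= threshold, ]
kept
```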

max_proportion_zero

the maximum proportion of fitted absolute copy number values in the zero copy number state.

min_proportion_close_to_whole_number

the minimum proportion of fitted absolute copy number values sufficiently close to a whole number.

max_distance_from_whole_number

the maximum distance from a whole number that a fitted absolute copy number can be to be considered sufficiently close.
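
The three acceptance criteria above (max_proportion_zero, min_proportion_close_to_whole_number and max_distance_from_whole_number) can be sketched together as follows. The copy number values are made up, and treating a value as being in the zero copy number state when it rounds to zero is an assumption made for illustration; rascal may define that state differently.

```r
# Hypothetical fitted absolute copy numbers for one candidate solution.
absolute_cn <- c(0.02, 1.05, 1.95, 2.10, 2.40, 3.08, 2.02, 1.50, 2.97, 4.12)

max_proportion_zero <- 0.05
min_proportion_close_to_whole_number <- 0.5
max_distance_from_whole_number <- 0.15

# Assumption: a value is in the zero copy number state if it rounds to 0.
proportion_zero <- mean(round(absolute_cn) == 0)

# Proportion of values within the allowed distance of a whole number.
deviation <- abs(absolute_cn - round(absolute_cn))
proportion_close <- mean(deviation <= max_distance_from_whole_number)

# A solution is acceptable only if it passes both proportion checks.
acceptable <- proportion_zero <= max_proportion_zero &&
  proportion_close >= min_proportion_close_to_whole_number

c(proportion_zero = proportion_zero, proportion_close = proportion_close)
acceptable
```

In this example enough values lie close to a whole number, but one value in ten falls in the zero state, which exceeds the 0.05 limit, so the solution would be rejected.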

solution_proximity_threshold

how close two solutions can be before one of them is filtered out; reduces the number of best fit solutions where there are many minima in close proximity.

keep_all

set to TRUE to return all solutions, with an additional best_fit column indicating which are the local minima that are acceptable solutions (may be useful to avoid computing the distance grid twice).

Value

A data frame containing the calculated solutions. Each solution includes: sample_id, ploidy, cellularity, and distance.
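
Since several solutions may be returned per sample, a common next step is to reduce the table to one row per sample. The sketch below does this on a made-up data frame with the columns listed above, picking the solution with the smallest distance. The sample names and values are hypothetical, and taking the minimum distance is only one possible selection rule.

```r
# Hypothetical output with the documented columns.
solutions <- data.frame(
  sample_id   = c("S1", "S1", "S2", "S2", "S2"),
  ploidy      = c(2.10, 3.95, 1.90, 2.80, 3.70),
  cellularity = c(0.62, 0.35, 0.80, 0.55, 0.41),
  distance    = c(0.041, 0.047, 0.052, 0.038, 0.044)
)

# For each sample, keep the row with the smallest distance.
best <- do.call(
  rbind,
  lapply(split(solutions, solutions$sample_id), function(d) {
    d[which.min(d$distance), ]
  })
)
rownames(best) <- NULL
best
```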