
Fit Mixture Models for each CN-Feature
FitMixtureModels.Rd
Perform mixture modelling on CN-features using either a mixture of Gaussians or a mixture of Poissons.
Usage
FitMixtureModels(
  CN_features,
  seed = 77777,
  min_comp = 2,
  max_comp = 10,
  min_prior = 0.001,
  model_selection = "BIC",
  nrep = 1,
  niter = 1000,
  cores = 1,
  featsToFit = seq(1, 6),
  multi_seed = FALSE,
  num_seed = 100
)
Arguments
- CN_features
A list. The output from either the ExtractRelativeCopyNumberFeatures or ExtractCopyNumberFeatures functions.
- seed
Integer. (flexmix param) The random seed to use while modelling.
- min_comp
Integer. (flexmix param) The minimum number of components for each CN-feature to consider.
- max_comp
Integer. (flexmix param) The maximum number of components for each CN-feature to consider.
- min_prior
Numeric. (flexmix param) Minimum prior probability of clusters. Components falling below this threshold are removed during the iteration.
- model_selection
Integer or character. (flexmix param) Which fitted model to select, given either as a number or as the name of an information criterion (for example "BIC", the default).
- nrep
Integer. (flexmix param) The number of times flexmix is run for each k (number of components).
- niter
Integer. (flexmix param) The maximum number of iterations for the EM-algorithm.
- cores
Integer. The number of cores to use for parallel processing.
- featsToFit
Integer vector. The CN-features to fit.
- multi_seed
Logical. If TRUE, the function is run multiple times over different seeds to find the best mixtures. It is highly recommended to use multiple cores.
- num_seed
Integer. The number of seeds to use when multi_seed = TRUE.
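The multi_seed and num_seed arguments describe repeating the fit over several seeds and comparing the results. The following is a minimal sketch of that idea written directly against flexmix, not the internal implementation of FitMixtureModels. The toy data, the three seeds, and the range of components are all assumptions made for illustration.

```r
library(flexmix)

# Hypothetical toy data: two well-separated Gaussian clusters.
set.seed(1)
dat <- c(rnorm(150, mean = 0, sd = 1), rnorm(150, mean = 8, sd = 1))

seeds   <- c(11, 22, 33)  # stands in for num_seed different seeds
k_range <- 1:3            # stands in for min_comp:max_comp

# Fit every candidate number of components once per seed and keep the BICs.
# Result: one row per seed, one column per number of components.
bic_by_seed <- t(sapply(seeds, function(s) {
  set.seed(s)
  fits <- stepFlexmix(dat ~ 1, model = FLXMCnorm1(), k = k_range,
                      nrep = 1, verbose = FALSE,
                      control = list(iter.max = 1000, minprior = 0.001))
  BIC(fits)
}))
rownames(bic_by_seed) <- seeds

print(bic_by_seed)
```

Each seed is an independent fit, which is why spreading the seeds over several cores pays off when num_seed is large.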
Value
A list containing flexmix objects for each CN-feature. If multi_seed = TRUE, the function will return a nested list with two components: one containing the flexmix objects and the other containing BIC values for each CN-feature. The BIC values are organized such that the rows represent different seeds and the columns represent the number of components. This structure allows for easy identification of the optimal seed and component configuration.
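As a sketch of how a seeds-by-components BIC matrix of this shape could be searched for the best configuration, the base R snippet below locates the smallest BIC. The matrix values, seed labels, and object names are hypothetical and do not come from the package; how the BIC matrices are named inside the returned list is not shown here.

```r
# Hypothetical BIC matrix for one CN-feature:
# rows are seeds, columns are numbers of components.
bic <- matrix(c(1510.2, 1432.7, 1440.1,
                1510.2, 1429.3, 1441.8,
                1510.2, 1433.0, 1438.6),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("11", "22", "33"), c("2", "3", "4")))

# Lower BIC is better; find the row (seed) and column (components) of the minimum.
idx <- which(bic == min(bic), arr.ind = TRUE)
best_seed <- rownames(bic)[idx[1, 1]]
best_k    <- as.integer(colnames(bic)[idx[1, 2]])

best_seed  # "22"
best_k     # 3
```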
Details
The segment size, changepoint copy number, and segment copy-number value CN-features are modelled with a mixture of Gaussians. For the breakpoint count per 10 Mb, length of segments with oscillating copy-number, and breakpoint count per chromosome, a mixture of Poissons is used instead. Mixture modelling is done using the flexmix package.
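The distinction between the two model families can be sketched with flexmix directly. This is an illustration under assumptions, not the package's own code: the two simulated vectors are hypothetical stand-ins for a continuous and a count-valued CN-feature, and the control values simply mirror the defaults listed under Usage.

```r
library(flexmix)

set.seed(77777)
# Hypothetical continuous feature (stand-in for e.g. segment size).
cont_feat  <- c(rnorm(200, mean = 2, sd = 0.5), rnorm(200, mean = 10, sd = 1))
# Hypothetical count feature (stand-in for e.g. a breakpoint count).
count_feat <- c(rpois(200, lambda = 1), rpois(200, lambda = 15))

# Mirrors niter = 1000 and min_prior = 0.001.
ctrl <- list(iter.max = 1000, minprior = 0.001)

# Continuous feature: mixture of univariate Gaussians, model chosen by BIC.
gauss_fit <- getModel(
  stepFlexmix(cont_feat ~ 1, model = FLXMCnorm1(), k = 1:3,
              nrep = 1, verbose = FALSE, control = ctrl),
  which = "BIC")

# Count feature: mixture of Poissons, model chosen by BIC.
pois_fit <- getModel(
  stepFlexmix(count_feat ~ 1, model = FLXMCmvpois(), k = 1:3,
              nrep = 1, verbose = FALSE, control = ctrl),
  which = "BIC")

parameters(gauss_fit)  # component means and standard deviations
parameters(pois_fit)   # component rates (lambda)
```

A Poisson mixture suits the count-valued features because they take non-negative integer values, whereas the Gaussian mixture is used for the continuous ones.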