
Extract genome-wide copy-number features for one or more samples from either a list of dataframes or a QDNAseq S4 object. This function is intended to be run on relative copy-number (CN) data.

Usage

ExtractRelativeCopyNumberFeatures(
  CN_data,
  genome,
  cores = 1,
  log_features = FALSE,
  extra_features = FALSE
)

Arguments

CN_data

List of dataframes or S4 QDNAseq object. Segmented relative copy-number data for one or more samples. If the input is a list of dataframes, the columns should be:

  1. chromosome

  2. start

  3. end

  4. segVal
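As a minimal sketch of the list-of-dataframes input described above, the following builds toy data for two samples and checks the column layout. The sample names, the segment values, the use of a named list, and the bare chromosome labels ("1" rather than "chr1") are all assumptions made for illustration, not requirements stated by this page.

```r
# Hypothetical toy input: one data frame of segments per sample.
segs_a <- data.frame(
  chromosome = c("1", "1", "1", "2", "2"),
  start      = c(1, 50000001, 120000001, 1, 90000001),
  end        = c(50000000, 120000000, 249250621, 90000000, 243199373),
  segVal     = c(1.00, 1.35, 0.80, 1.00, 1.20),
  stringsAsFactors = FALSE
)
segs_b <- data.frame(
  chromosome = c("1", "2"),
  start      = c(1, 1),
  end        = c(249250621, 243199373),
  segVal     = c(1.05, 0.95),
  stringsAsFactors = FALSE
)

# Assumption: the list is named by sample ID.
CN_data <- list(sample_A = segs_a, sample_B = segs_b)

# Verify that every data frame has the four documented columns, in order.
expected_cols <- c("chromosome", "start", "end", "segVal")
cols_ok <- vapply(
  CN_data,
  function(df) identical(colnames(df), expected_cols),
  logical(1)
)
stopifnot(all(cols_ok))

# With the package loaded, the call would then look like this
# (left commented out so the sketch runs without the package):
# features <- ExtractRelativeCopyNumberFeatures(CN_data, genome = "hg19")
```

Checking the column layout up front is a cheap way to catch reordered or renamed columns before feature extraction.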

genome

Character string. The reference genome used for alignment.
Options: 'hg19', 'hg38'

cores

Integer. The number of cores to use for parallel processing.

log_features

FALSE or character vector. If a vector of feature names is provided, log1p is applied to the extracted values of those CN features.
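For reference, log1p is the base R function computing log(1 + x), which keeps zero values at zero. The short sketch below illustrates this on made-up counts; the choice of which features to log in the commented call is hypothetical, with the names taken from the feature list in Details.

```r
# Made-up feature values, including a zero.
counts <- c(0, 1, 9, 99)

# log1p(x) is log(1 + x), so a zero input stays at zero
# instead of becoming -Inf as it would under log(x).
logged <- log1p(counts)
same_as_manual <- isTRUE(all.equal(logged, log(1 + counts)))

# Hypothetical call, commented out because it needs the package loaded:
# features <- ExtractRelativeCopyNumberFeatures(
#   CN_data,
#   genome = "hg19",
#   log_features = c("segsize", "bp10MB")
# )
```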

extra_features

Logical. If TRUE, extracts CN-feature data for two additional features: nc50 and cdist.

Value

A list. Each list element contains feature data for a single feature.

Details

This function is identical to its absolute copy-number equivalent except for three features: osCN, changepoint, and copynumber each require slightly different modelling at the relative scale.

The extracted copy-number features are:

  1. Breakpoint count per 10MB - bp10MB

  2. Copy-number value of each segment - copynumber

  3. Copy-number difference between adjacent segments - changepoint

  4. Breakpoint count per chromosome arm - bpchrarm

  5. Lengths of oscillating CN segment chains - osCN

  6. Size of copy-number segments in base-pairs - segsize
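To make three of the features above concrete, here is a simplified base R sketch computing segsize, copynumber, and changepoint for one toy sample. It is an illustration only, not the package's implementation: the use of end minus start for segment size, the absolute value for changepoints, and the restriction of changepoints to segments on the same chromosome are all assumptions.

```r
# Hypothetical segment table for a single sample.
segs <- data.frame(
  chromosome = c("1", "1", "1", "2", "2"),
  start      = c(1, 50000001, 120000001, 1, 90000001),
  end        = c(50000000, 120000000, 249250621, 90000000, 243199373),
  segVal     = c(1.00, 1.35, 0.80, 1.00, 1.20),
  stringsAsFactors = FALSE
)

# segsize: segment length in base pairs (assumed here to be end - start).
segsize <- segs$end - segs$start

# copynumber: the segment value itself.
copynumber <- segs$segVal

# changepoint: absolute difference between adjacent segments,
# computed separately within each chromosome (an assumption).
changepoint <- unlist(
  lapply(split(segs$segVal, segs$chromosome), function(v) abs(diff(v))),
  use.names = FALSE
)
```

A chromosome with n segments contributes n - 1 changepoint values, so the toy sample yields three values from its five segments.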

Extra features:

  7. Minimum number of chromosomes (a count) needed to account for 50% of CN changes in a sample - nc50

  8. Distance in base pairs of each breakpoint to the centromere - cdist
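The nc50 definition can be read as a cumulative-sum calculation. The sketch below is one plausible reading under stated assumptions, not the package's code: it takes made-up per-chromosome counts of CN changes, sorts them from largest to smallest, and finds how many chromosomes are needed before the running total reaches half of all changes.

```r
# Hypothetical counts of CN changes per chromosome for one sample.
changes_per_chrom <- c(chr1 = 6, chr2 = 1, chr3 = 0, chr8 = 9, chr17 = 4)

# Rank chromosomes from most to fewest changes.
sorted <- sort(changes_per_chrom, decreasing = TRUE)

# Running fraction of all changes covered by the top-ranked chromosomes.
cum_frac <- cumsum(sorted) / sum(sorted)

# Smallest number of chromosomes whose changes reach at least 50%.
nc50 <- unname(which(cum_frac >= 0.5)[1])
```

In this toy sample the top chromosome holds 9 of 20 changes (45%), so a second chromosome is needed and nc50 is 2. A sample with no changes at all would need separate handling, since the fraction is then undefined.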