
Removes each chromosome's centromere and telomeres. The positions of these centromeres and telomeres are specified in the given arrays: the start positions of the 23 centromeres are in centromere_starts, the end positions of the 23 centromeres are in centromere_ends, and so on. 'Telomere 2' means the second (bottom) telomere, while telomere 1 is the first (top) one. Note that there are parameters for telomere 2 (telomere_2_starts and telomere_2_ends), but not for telomere 1. This is because, in the algorithm, the start and end positions of telomere 1 are the same for all chromosomes (base 0 up to 10000).
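To make the three regions concrete, here is a minimal sketch of an overlap check for a single chromosome. The helper overlaps_masked_region is hypothetical and not part of the package, the coordinates are made up, and the sketch only flags overlapping segments; it does not claim to reproduce how RemoveCentromereTelomeres itself handles them (for example, whether segments are dropped or trimmed).

```r
# Hypothetical helper (not part of the package): reports whether a segment
# overlaps telomere 1, the centromere, or telomere 2 of one chromosome.
overlaps_masked_region <- function(seg_start, seg_end,
                                   centromere_start, centromere_end,
                                   telomere_2_start, telomere_2_end) {
  # Telomere 1 is fixed at base 0 up to 10000 for every chromosome.
  region_starts <- c(0, centromere_start, telomere_2_start)
  region_ends   <- c(10000, centromere_end, telomere_2_end)
  # Two closed intervals overlap when each one starts before the other ends.
  any(seg_start <= region_ends & seg_end >= region_starts)
}

# Made-up coordinates, for illustration only.
hits_telomere_1 <- overlaps_masked_region(5000, 20000,
                                          1.0e6, 1.5e6, 4.99e6, 5.0e6)
clear_segment   <- overlaps_masked_region(20000, 900000,
                                          1.0e6, 1.5e6, 4.99e6, 5.0e6)
```

With these values, the first segment starts inside telomere 1 and is flagged, while the second lies between telomere 1 and the centromere and is not.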

Usage

RemoveCentromereTelomeres(
  df,
  include_chr_X,
  centromere_starts,
  centromere_ends,
  telomere_2_starts,
  telomere_2_ends
)

Arguments

df

A Data Frame with segment data. Requirements:

  1. Column 1 is chromosome number

  2. Column 2 is the start position of the segment

  3. Column 3 is the end position of the segment

include_chr_X

A boolean: TRUE if you want chromosome X's spurious regions (its centromere and telomeres) removed, FALSE otherwise.

centromere_starts

An array: contains the start positions of all 23 chromosomes' centromeres IN ORDER from 1 to 23.

centromere_ends

An array: contains the end positions of all 23 chromosomes' centromeres IN ORDER from 1 to 23.

telomere_2_starts

An array: contains the start positions of all 23 chromosomes' second telomeres IN ORDER from 1 to 23.

telomere_2_ends

An array: contains the end positions of all 23 chromosomes' second telomeres IN ORDER from 1 to 23.
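Examples

The following sketch shows one way the inputs might be assembled. All coordinates are placeholders rather than real genome positions, and the column names of segments are arbitrary, since only the column order is stated above. The call is guarded so the snippet still runs when the package is not loaded.

```r
# Segment data: columns 1 to 3 are chromosome, start, end, in that order.
segments <- data.frame(
  chromosome = c(1, 1, 2),
  start      = c(20000, 1200000, 50000),
  end        = c(900000, 1300000, 800000)
)

# Placeholder coordinates (NOT real positions): one entry per chromosome,
# ordered from chromosome 1 to chromosome 23.
n_chr <- 23
centromere_starts <- rep(1.0e6, n_chr)
centromere_ends   <- rep(1.5e6, n_chr)
telomere_2_starts <- rep(4.99e6, n_chr)
telomere_2_ends   <- rep(5.0e6, n_chr)

# Basic sanity checks before calling the function.
stopifnot(
  length(centromere_starts) == n_chr,
  length(centromere_ends)   == n_chr,
  length(telomere_2_starts) == n_chr,
  length(telomere_2_ends)   == n_chr,
  all(centromere_starts < centromere_ends),
  all(telomere_2_starts < telomere_2_ends)
)

# Only call the function if it is available in the current session.
if (exists("RemoveCentromereTelomeres")) {
  cleaned <- RemoveCentromereTelomeres(
    df = segments,
    include_chr_X = TRUE,
    centromere_starts = centromere_starts,
    centromere_ends = centromere_ends,
    telomere_2_starts = telomere_2_starts,
    telomere_2_ends = telomere_2_ends
  )
}
```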