
We add the small segments from small_segments to large_segments in one of 6 cases, depending on where the small segment falls relative to the large ones. We merge a small segment into a large segment if the difference in ratio_median between the two is <= threshold. Otherwise, we keep the small segment as its own segment in large_segments.

Visually, these are the 6 cases:

  1. |--small-| |-----large----|

  2. |----- previous large -----| |----small----| |---------- large segment----------|

  3. |------------- large ---------------| |----small----|

  4. |---------large----------| |------------- next large--------| |--------small--------|

  5. |-------large--------| |--------------next large-------------| |----small----|

  6. chr N |#| chr N+1 |-------- large --------| |#| |-------- next_large --------| |------ small ------| |#|
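
The merge rule that applies in every case above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name should_merge is hypothetical, and we assume that the "ratio difference" means the absolute difference between the two ratio_median values.

```r
# Hypothetical helper (not part of the package): returns TRUE when a small
# segment should be merged into a neighbouring large segment.
# Assumption: the comparison uses the absolute difference of ratio_median.
should_merge <- function(small_ratio, large_ratio, threshold) {
  abs(small_ratio - large_ratio) <= threshold
}

threshold <- 0.05  # illustrative value; in practice it is estimated via KDE

merge_close <- should_merge(0.12, 0.10, threshold)  # difference below threshold
merge_far   <- should_merge(0.30, 0.10, threshold)  # difference above threshold

print(merge_close)
print(merge_far)
```

When the helper returns FALSE, the small segment would be kept as its own row in large_segments instead of being merged.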

Usage

FinalizeSmallSegments(large_segments, small_segments, threshold, granges_obj)

Arguments

large_segments

A data frame. Segments with size >= 3Mb.

small_segments

A data frame. Segments with 3Mb >= size >= 0.1Mb.

threshold

A float: the threshold for the ratio_median difference, estimated via KDE. Used to decide whether a small segment is merged into a large segment or inserted as its own segment.

granges_obj

A GRanges object: used as a reference to detect overlapping segments and to get the ratio_median of the overlap.
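
As a rough sketch of how the two data frame arguments might be prepared, the example below splits one table of segments by size using the limits given above. The column names (chr, start, end, ratio_median, size), the inclusive size calculation, and the choice to place a segment of exactly 3Mb in large_segments are all assumptions made for illustration; the package may expect a different layout.

```r
# Hypothetical input: one row per segment. Column names are assumptions.
segments <- data.frame(
  chr          = c(1, 1, 1, 2),
  start        = c(1, 5000001, 5200001, 1),
  end          = c(5000000, 5200000, 9000000, 50000),
  ratio_median = c(0.10, 0.12, 0.11, 0.40)
)

# Assumption: coordinates are 1-based and inclusive.
segments$size <- segments$end - segments$start + 1

mb <- 1e6

# Segments with size >= 3Mb.
large_segments <- segments[segments$size >= 3 * mb, ]

# Segments between 0.1Mb and 3Mb. A segment of exactly 3Mb goes to
# large_segments here so that the two sets do not overlap (assumption).
small_segments <- segments[segments$size >= 0.1 * mb &
                             segments$size < 3 * mb, ]

print(nrow(large_segments))
print(nrow(small_segments))
```

The last row (50 kb) is below 0.1Mb and therefore ends up in neither data frame.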