Ranges and GenomicRanges Module
This module provides classes for working with numeric and genomic intervals, supporting operations such as overlap detection, subtraction, and coverage calculation.
Classes Overview
Range
: Represents a simple interval with a start and end.RangesList
: Manages a list ofRange
objects and provides basic operations.RangesDict
: A dictionary-like container mapping keys toRangesList
objects.GenomicRange
: ExtendsRange
with chromosome and strand information for genomic data.GenomicRangesList
: Manages a list ofGenomicRange
objects, supporting chromosome and strand-aware operations.GenomicRangesDict
: A dictionary-like container mapping chromosomes (and optionally strands) toGenomicRangesList
objects.
Range
Represents a closed interval [start, end] (inclusive).
Example:
from benchmate.ranges.ranges import Range
r1 = Range(10, 20)
r2 = Range(15, 25)
print(len(r1)) # 11
print(r1.overlaps(r2)) # True
print(r1.distance(r2)) # 0 (overlapping)
print(r1.merge(r2)) # Range(10, 25)
print(r1.split(n)) # split into 2 equal parts return a RangesList: [Range(10, 15), Range(15, 20)]
RangesList
A collection of Range
objects with basic operations.
Example:
from benchmate.ranges.ranges import RangesList, Range
ranges = RangesList([Range(1, 5), Range(3, 7), Range(10, 12)])
# Check for overlaps
for r in ranges:
print(r)
# Calculate coverage: number of ranges covering each position from min(start) to max(end)
coverage = ranges.coverage() # e.g., [1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1]
Additional operations include: pop, append, extend, remove, find_overlaps, as well as some basic list operations like len, index, contains string, and iterating, seting, and getting items and in/equality checks. Reduce method is also available to reduce the list of ranges to a single range.
RangesDict
A dictionary-like container mapping keys (e.g., names, categories) to RangesList
objects.
Example:
from benchmate.ranges.ranges import RangesDict, Range, RangesList
rdict = RangesDict()
rdict["A"] = RangesList([Range(1, 10), Range(15, 20)])
rdict["B"] = RangesList([Range(5, 12)])
# Access ranges for a key
print(rdict["A"]) # RangesList([Range(1, 10), Range(15, 20)])
# Iterate over keys
for key in rdict:
print(key, rdict[key])
GenomicRange
Represents a genomic interval with chromosome and strand.
Example:
from benchmate.ranges.genomicranges import GenomicRange
gr = GenomicRange("chr1", 100, 200, "+")
print(gr.chrom) # "chr1"
print(gr.start) # 100
print(gr.end) # 200
print(gr.strand) # "+"
GenomicRangesList
A collection of GenomicRange
objects, supporting chromosome and strand-aware operations.
Example:
from benchmate.ranges.genomicranges import GenomicRangesList, GenomicRange
granges = GenomicRangesList([
GenomicRange("chr1", 100, 200, "+"),
GenomicRange("chr1", 150, 250, "+"),
GenomicRange("chr1", 175, 300, "-"),
GenomicRange("chr2", 100, 200, "+")
])
# Coverage per chromosome and strand
cov = granges.coverage(ignore_strand=False)
# {
# "chr1": {
# "+": [...], # coverage array for + strand
# "-": [...] # coverage array for - strand
# },
# "chr2": {
# "+": [...],
# "-": []
# }
# }
# Coverage per chromosome, ignoring strand
cov_all = granges.coverage(ignore_strand=True)
# {
# "chr1": [...], # combined coverage array
# "chr2": [...]
# }
GenomicRangesDict
A dictionary-like container mapping chromosomes (and optionally strands) to GenomicRangesList
objects.
Example:
from benchmate.ranges.genomicranges import GenomicRangesDict, GenomicRange, GenomicRangesList
gdict = GenomicRangesDict()
gdict["chr1"] = GenomicRangesList([
GenomicRange("chr1", 100, 200, "+"),
GenomicRange("chr1", 300, 400, "-")
])
gdict["chr2"] = GenomicRangesList([
GenomicRange("chr2", 500, 600, "+")
])
# Access ranges for a chromosome
print(gdict["chr1"]) # GenomicRangesList([...])
# Iterate over chromosomes
for chrom in gdict:
print(chrom, gdict[chrom])
Functionality Summary
- Range: Overlap, merge, subtract, length, distance.
- RangesList: Store and operate on multiple
Range
objects, compute coverage. - RangesDict: Map keys to
RangesList
objects for grouped operations. - GenomicRange: Chromosome and strand-aware intervals.
- GenomicRangesList: Chromosome/strand-aware storage and coverage (with or without strand).
- GenomicRangesDict: Map chromosomes (and optionally strands) to
GenomicRangesList
objects.
Notes
- All intervals are 1-based and inclusive.
- There is no check for chromosome lengths; we assume that you know how big your chromosomes are.
- Coverage arrays start at the minimum start and end at the maximum end for each chromosome (and strand, if applicable).
- For large datasets, consider memory usage when using coverage methods.