About Interval Trees
Binary trees are simple data structures. However, multiple enhancements can be done to make them a lot more interesting. Interval trees only require a simple binary tree augmentation, but can be used to solve non-trivial problems.
##The problem
One can define an interval as a structure containing a low and a high value.
class Interval:
def __init__(self, low, high):
self.low = low
self.high = high
def overlaps(self, other):
return self.low <= other.high and other.low <= self.high
Given a set of intervals, how can one find all the intervals overlapping with a specific interval?
A naive solution, which would work in simple cases, is to go through all the intervals and check if they overlap. One can then return a list of overlapping intervals.
def search_overlapping_intervals(intervals, interval):
return [i for i in intervals if interval.overlaps(i)]
However, this solution has a query time in O(n), which is not efficient if a lot of queries are required.
The solution
It is instead possible to augment a self-balancing binary search tree to maintain the set of intervals so that the search can be done in O(log n).
To do so, in addition to a left and a right node, each node must contain an interval and the maximum value that is under it.
class INode:
def __init__(self, interval):
self.interval = interval
self.max = interval.high
self.left = None
self.right = None
We can therefore represent the interval set in the following way:
Nodes can be inserted into the tree in an analogous way as how we add them in a BST. If the lowest value of the new interval is lower than the lowest value of the node’s interval, the new interval must go to the node’s left, otherwise, to its right. The max value of the nodes must then be updated. This property will be useful for the querying process.
class IntervalTree:
def __init__(self):
self.root = None
def insert(self, interval):
self.root = self._insert(self.root, interval)
def _insert(self, root, interval):
if root is None:
return INode(interval)
low = root.interval.low
if interval.low < low:
root.left = self._insert(root.left, interval)
else:
root.right = self._insert(root.right, interval)
# Update max value
root.max = max(root.max, interval.high)
return root
Once the interval set is built, it is Query Time. The search is done in a classic binary search fashion. There are 4 cases:
- The node is null: The search down this branch is done
- The node overlaps with the interval: We found an overlapping node
- The node’s left node’s max is higher than the lowest value of the interval: The left branch must be searched
- The node’s right nodes’ low is lower than the highest value of the interval: The right branch must be searched
These 4 cases result in the following method:
def _overval_search(self, root, interval, overlaps):
if root is None:
return
if root.interval.overlaps(interval):
overlaps.append(root.interval)
if root.left is not None and root.left.max >= interval.low:
self._overval_search(root.left, interval, overlaps)
if root.right is not None and root.right.interval.low <= interval.high:
self._overval_search(root.right, interval, overlaps)
Interval trees are convenient as they are easy to understand since they are extensions of BST. Furthermore, adding and removing intervals, as well as searching for overlaps can all be done in O(log n).