Multi-Dimensional Data Study Guide
Author: Kartik Kapur

Check-in Exercise

Linked here.

Overview

Additional Set Operations

There are many other operations we might be interested in supporting on a set. For example, we might have a select(int i) method that returns the ith smallest item in the set. Or we might have a subSet(T from, T to) operation that returns all items in the set between from and to. Or if we have some notion of distance, we might have a nearest(T x) method that finds the closest item in the set to x.
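As a rough illustration of these three operations on 1D data, here is a minimal sketch built on java.util.TreeSet. The class name OneDSetOps and the choice of Integer items are assumptions made for this example, not part of the course code. Note that this select walks the sorted order in linear time; a tree that stores subtree sizes could do better.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper class: 1D versions of select, subSet, and nearest.
class OneDSetOps {
    /** Returns the i-th smallest item (0-indexed) by walking the sorted order. */
    static int select(TreeSet<Integer> set, int i) {
        if (i < 0 || i >= set.size()) {
            throw new IndexOutOfBoundsException("i = " + i);
        }
        int seen = 0;
        for (int x : set) {
            if (seen == i) {
                return x;
            }
            seen++;
        }
        throw new AssertionError("unreachable");
    }

    /** Returns all items between from and to, inclusive on both ends. */
    static List<Integer> subSet(TreeSet<Integer> set, int from, int to) {
        return new ArrayList<>(set.subSet(from, true, to, true));
    }

    /** Returns the item closest to x, or null if the set is empty. Ties go to the smaller item. */
    static Integer nearest(TreeSet<Integer> set, int x) {
        Integer below = set.floor(x);    // largest item <= x, or null
        Integer above = set.ceiling(x);  // smallest item >= x, or null
        if (below == null) {
            return above;
        }
        if (above == null) {
            return below;
        }
        // Compare in long arithmetic to avoid int overflow.
        return ((long) x - below <= (long) above - x) ? below : above;
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>(List.of(1, 4, 9, 16));
        System.out.println(select(set, 2));      // 9
        System.out.println(subSet(set, 4, 10));  // [4, 9]
        System.out.println(nearest(set, 13));    // 16
    }
}
```

In 1D, nearest only needs to look at the two neighbors of x in sorted order, which is exactly what stops working once items have more than one coordinate.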

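On 1D data, it is relatively straightforward to support such operations efficiently, because the items have a single sorted order that a search tree can follow. Multi-dimensional data (e.g. points with both an X and a Y coordinate) is harder. If we use only one of the coordinates (e.g. X or Y coordinate) to organize the items, the structure of our data will fail to reflect the full ordering of the data, so two items that look close in the structure may be far apart in the other coordinate.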

QuadTrees

A natural approach is to make a new type of tree: the QuadTree. Each node in a QuadTree has 4 children, one for each quadrant around the node's item: Northwest, Northeast, Southwest, and Southeast. As you move your way down the tree to support queries, it is possible to prune branches that cannot contain a useful result.
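To make the pruning idea concrete, here is a minimal sketch of a QuadTree over 2D points with a rectangle query. The class name QuadTreeSketch, the use of double coordinates, and the tie-breaking rule (equal coordinates go to the North or East side) are all assumptions chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical QuadTree: each node holds one point and four children.
class QuadTreeSketch {
    private static class Node {
        final double x, y;
        Node nw, ne, sw, se;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    private Node root;

    void insert(double x, double y) {
        root = insert(root, x, y);
    }

    private Node insert(Node n, double x, double y) {
        if (n == null) return new Node(x, y);
        if (x == n.x && y == n.y) return n;  // ignore exact duplicates
        boolean east = x >= n.x;             // assumed tie rule: equal x goes east
        boolean north = y >= n.y;            // assumed tie rule: equal y goes north
        if (north && east) n.ne = insert(n.ne, x, y);
        else if (north) n.nw = insert(n.nw, x, y);
        else if (east) n.se = insert(n.se, x, y);
        else n.sw = insert(n.sw, x, y);
        return n;
    }

    /** Returns every point with xMin <= x <= xMax and yMin <= y <= yMax. */
    List<double[]> range(double xMin, double xMax, double yMin, double yMax) {
        List<double[]> out = new ArrayList<>();
        range(root, xMin, xMax, yMin, yMax, out);
        return out;
    }

    private void range(Node n, double xMin, double xMax,
                       double yMin, double yMax, List<double[]> out) {
        if (n == null) return;
        if (xMin <= n.x && n.x <= xMax && yMin <= n.y && n.y <= yMax) {
            out.add(new double[] {n.x, n.y});
        }
        // A quadrant is worth visiting only if the query rectangle reaches into it.
        boolean west = xMin < n.x;    // west children hold x < n.x
        boolean east = xMax >= n.x;   // east children hold x >= n.x
        boolean south = yMin < n.y;   // south children hold y < n.y
        boolean north = yMax >= n.y;  // north children hold y >= n.y
        if (north && east) range(n.ne, xMin, xMax, yMin, yMax, out);
        if (north && west) range(n.nw, xMin, xMax, yMin, yMax, out);
        if (south && east) range(n.se, xMin, xMax, yMin, yMax, out);
        if (south && west) range(n.sw, xMin, xMax, yMin, yMax, out);
    }

    public static void main(String[] args) {
        QuadTreeSketch t = new QuadTreeSketch();
        double[][] pts = {{0, 0}, {-1, -1}, {2, 2}, {-2, 2}, {1, -1}, {3, 3}};
        for (double[] p : pts) t.insert(p[0], p[1]);
        System.out.println(t.range(1, 4, 1, 4).size());  // 2: the points (2, 2) and (3, 3)
    }
}
```

In the query above, the rectangle lies entirely to the northeast of the root (0, 0), so the Northwest, Southwest, and Southeast subtrees of the root are never visited.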

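K-D Trees

One final data structure that we have for dealing with 2-dimensional data is the K-D Tree. Essentially, the idea of a K-D Tree is that it's a normal Binary Search Tree, except we alternate which coordinate we're looking at as we traverse down the tree. For example, at the root, everything to the left has an X value less than the root's and everything to the right has an X value greater than the root's (ties are usually sent to the right by convention). Then on the next level, every item to the left of some node has a Y value less than that node's and everything to the right has a Y value greater than it. Somewhat surprisingly, K-D Trees are quite efficient: because each comparison lets a query rule out an entire side of a node, operations like nearest typically visit only a small part of the tree when the points are reasonably spread out, though in the worst case they may still visit every node.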
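The alternating comparison and the pruning it allows can be sketched as follows for 2D points. This is a minimal sketch, not the course's reference implementation: the class name KdTreeSketch, the double coordinates, the rule that ties go right, and the use of squared Euclidean distance are assumptions made for this example.

```java
// Hypothetical 2D K-D Tree: even depths compare X, odd depths compare Y.
class KdTreeSketch {
    private static class Node {
        final double x, y;
        Node left, right;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    private Node root;

    void insert(double x, double y) {
        root = insert(root, x, y, 0);
    }

    private Node insert(Node n, double x, double y, int depth) {
        if (n == null) return new Node(x, y);
        if (x == n.x && y == n.y) return n;  // ignore exact duplicates
        boolean compareX = depth % 2 == 0;
        double key = compareX ? x : y;
        double nodeKey = compareX ? n.x : n.y;
        if (key < nodeKey) n.left = insert(n.left, x, y, depth + 1);
        else n.right = insert(n.right, x, y, depth + 1);  // assumed: ties go right
        return n;
    }

    /** Returns the stored point closest to (x, y), or null if the tree is empty. */
    double[] nearest(double x, double y) {
        Node best = nearest(root, x, y, 0, null);
        return best == null ? null : new double[] {best.x, best.y};
    }

    private Node nearest(Node n, double x, double y, int depth, Node best) {
        if (n == null) return best;
        if (best == null || distSq(n, x, y) < distSq(best, x, y)) best = n;
        // Signed distance from the query to this node's splitting line.
        double diff = (depth % 2 == 0) ? x - n.x : y - n.y;
        Node goodSide = diff < 0 ? n.left : n.right;  // side containing the query
        Node badSide = diff < 0 ? n.right : n.left;
        best = nearest(goodSide, x, y, depth + 1, best);
        // The bad side can only help if the splitting line is closer than the best so far.
        if (diff * diff < distSq(best, x, y)) {
            best = nearest(badSide, x, y, depth + 1, best);
        }
        return best;
    }

    private static double distSq(Node n, double x, double y) {
        double dx = n.x - x;
        double dy = n.y - y;
        return dx * dx + dy * dy;
    }

    public static void main(String[] args) {
        KdTreeSketch t = new KdTreeSketch();
        double[][] pts = {{2, 3}, {4, 2}, {4, 5}, {3, 3}, {1, 5}, {4, 4}};
        for (double[] p : pts) t.insert(p[0], p[1]);
        double[] p = t.nearest(0, 7);
        System.out.println(p[0] + ", " + p[1]);  // 1.0, 5.0
    }
}
```

The search always explores the side of the splitting line that contains the query first, so a good candidate is usually found early and the check on the other side can fail often, which is where the savings over scanning every point come from.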