Opentopia Directory Encyclopedia Tools

Kd-tree

Encyclopedia : K : KD : KDT : Kd-tree


The correct title of this } is }}}. The initial letter is capitalized due to [Naming conventions #Lower case first lettertechnical restrictions].
right
Enlarge
right

In computer science, a kd-tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. kd-trees are a special case of BSP trees.

A kd-tree uses only splitting planes that are perpendicular to one of the coordinate system axes. This differs from BSP trees, in which arbitrary splitting planes can be used. In addition, every node of a kd-tree, from the root to the leaves, stores a point. This differs from BSP trees, in which leaves are typically the only nodes that contain points (or other geometric primitives). As a consequence, each splitting plane must go through one of the points in the kd-tree. kd-tries are a variant that store data only in leaf nodes.

Nomenclature

Technically, the letter k refers to the number of dimensions. A 3-dimensional kd-tree would be called a 3d-tree. However, the phrase "3-dimensional kd-tree" is more commonly used. (It is also more descriptive, since a 3-dimensional tree could be any of a variety of different things, but the term kd-tree refers to a specific type of space-partitioning tree.) The letters k and d are both lowercase, even when the term comes at the beginning of a sentence, and the k is in italics. However, variant spellings are common, such as KD-tree and Kd-tree.

Operations on kd-trees

Constructing a kd-tree

Since there are many possible ways to choose axis-aligned splitting planes, there are many different ways to construct kd-trees. The canonical method of kd-tree construction has the following constraints: This method leads to a balanced kd-tree, in which each leaf node is about the same distance from the root. However, balanced trees are not necessarily optimal for all applications.

Given a list of n points, the following algorithm will construct a balanced kd-tree containing those points.

function kdtree (list of points pointList, int depth)

}
It is common that points "after" the median include only ones that are greater than or equal to the median. Another approach is to define a "superkey" function that compares the points in other dimensions. Lastly, it may be acceptable to let points equal to the median lie on either side.

This algorithm implemented in the Python programming language is as follows:

class Node:
pass

def kdtree(pointList, depth=0): if not pointList: return

# Select axis based on depth so that axis cycles through all valid values k = len(pointList[0]) # Assumes all points have the same dimension axis = depth % k

# Sort point list to select median pointList.sort(cmp=lambda x, y: cmp(x[axis], y[axis])) median = len(pointList)//2 # Choose median

# Create node and construct subtrees node = Node() node.location = pointList[median] node.leftChild = kdtree(pointList[:median], depth+1) node.rightChild = kdtree(pointList[median+1:], depth+1) return node

Example usage would be:

pointList = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
tree = kdtree(pointList)
The tree generated is shown on the right.

This algorithm creates the invariant that for any node, all the nodes in the left subtree are on one side of a splitting plane, and all the nodes in the right subtree are on the other side. Points that lie on the splitting plane may appear on either side. The splitting plane of a node goes through the point associated with that node (referred to in the code as node.location).

Adding elements to a kd-tree

One adds a new point to a kd-tree in the same way as one adds an element to any other tree. First, traverse the tree, starting from the root and moving to either the left or the right child depending on whether the point to be inserted is on the "left" or "right" side of the splitting plane. Once you get to a leaf node, add the new point as either the left or right child of the leaf node, again depending on which side of the node's splitting plane contains the new point.

Removing elements from a kd-tree

Remove a point from an existing kd-tree, without breaking the invariant. (Undone)

Balancing a kd-tree

Balancing a kd-tree requires care. Because kd-trees are sorted in multiple dimensions, the tree rotation technique cannot be used to balance them — this may break the invariant.

Uses for kd-trees

Orthogonal range search in a kd-tree

Use a kd-tree to find all the points that lie within a given rectangle (or higher-dimensional region analogous to a rectangle). This operation is also called an orthogonal range search. (Undone)

Determining where to evaluate a surface

In local regression, it is common to evaluate the fitted surface directly only at the vertices of a kd-tree and to interpolate elsewhere. This use, which is pictured in the image above, is to ensure that only as many direct evaluations are performed as are necessary. Since the kd-tree "adapts" itself to the space of predictor variables, this method can provide an excellent approximation to the true loess surface. If the approximation is poor, it can be improved by further subdivision of the tree's cells.

Complexity

External links

References

 


From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.

Search Titles
0123456789
ABCDEFGHIJ
KLMNOPQRST
UVWXYZ?

E-mail this article to:

Personal Message: