Introduction to Trees

September 22, 2019 4 minutes

A tree is an abstract data structure that represents hierarchal data structure. The individual nodes of a tree maintain the parent-child hierarchal relationship. The tree data structure is built of individual nodes where each node is made up of:

  1. value: the value represented by the node
  2. children: list of children of this node. For binary trees as there are at most 2 children, the common practice is to refer to those as left and right children.

Some common terms

  1. Root Node: The top node of the tree that has no parent node.
  2. Leaf Node: A node with no child.
  3. Siblings: All children nodes with same parent node.
  4. Branch/Internal node: A node which is neither a root node, nor a leaf node.
  5. Degree: The degree of a node denotes the number of its children. Thus a leaf node has a degree 0.
  6. Level: The level of a node is 1 greater than its parent node with the level of the root node defaulting to 1.
  7. Height: The height of a tree is equal to the maximum level of any node in the tree.
  8. Size: Denoted by number of nodes in the tree.

Tree Variants

A tree can be categorized into of the following specific types based on its features and properties.

Binary Trees

A tree where each node has a maximum of two children nodes. Following are thre sub categories of this:

  1. Full Binary Tree: Every node has either 0 or 2 children. In other words, every node except the leaf nodes have 2 children.
  2. Complete Binary Tree: A binary tree with all levels completely filled but (not necessarily) the last level. Also, for the last level, the tree maintains the property that all nodes are as left as possible.
  3. Perfect Binary Tree: All nodes have two children except for the last level. Also, all leaves are at the same level.
  4. Degenerate Binary Tree: An unbalanced tree in which each node has only 1 child and thus essentially behaving as a linked list.
  5. Binary Search Tree: A binary tree with additional features that state that the value present in the left child will always be less than or equal to the value at parent node. Similarly, the value present in the right child will always be greater than the root node’s value.


Balanced Trees

Height balanced trees in which the left and right subtrees of every node differ no more than 1 in their heights. The height of such trees is essentially $O(log_2 {n})$

  1. Red Black Tree: A self-balancing tree (BST to be precise) that maintains the order of common operations like addition, deletion and searching as $O(log_2{n})$
  2. AVL Tree: Another binary search tree where the subtrees of every node differ in height by a maximum of 1. Due to this property these are optimized version of plain BSTs keeping the order of common operations like addition, deletion and searching as $O(log_2{n})$. Commonly used in databases to support fast search operations due to more rigid balancing
  3. Splay Tree: These are also self-balancing binary search tree with an additional property that the most frequently accessed nodes will move closer to the root of the node for faster retrieval.

N-ary Tree

A tree where each node can have at a maximum of N nodes. As clear from the definition, a binary tree is a special case of a N-ary tree with 2 nodes.

Prefix Tree

A prefix tree (also known as Trie) is a N-ary tree where the existing nodes present in the tree are used to define the prefixes for the newly created nodes. These are commonly used for building dictionaries and reference caches.

Suffix Tree

In simple words, a suffix tree is a Trie that for a given word will contain the nodes for all possible suffixes of the word (in addition to containing the node representing a word)

Heap Structures

An almost complete-tree, a heap is a special case of tree data structure holding the following invariants for specific types:

  1. MaxHeap: For any parent node P, it will have greater than or equal value than all of its children nodes.
  2. MinHeap: For any parent node P, it will have less than or equal value than all of its children nodes.

The same is used to implement some other common data structures like PriorityQueues and algorithms like HeapSort.


comments powered by Disqus