Permutations, Substrings and Subsequences

June 18, 2019 5 minutes

Let’s start this article by defining these terms:

1. Permutation: A permutation, also called an “arrangement number” or “order” is a re-arrangement of the elements of an ordered list S into a one-to-one correspondence with S itself.
2. Substring: A substring is a contiguous sequence of characters within a string. The word contiguous is important here as you cannot skip words or characters between the selected words/characters in a substring.
3. Subsequence: A subsequence is a sequence that can be derived from another sequence (sequence of characters in case of a string) by deleting some or no elements without changing the order of the remaining elements.
As strings can be represented as char arrays as well, the terminology is equially applicable to arrays as well.

Examples

1. Permutations of the word ABC: {“ABC”, “ACB”, “BAC”, “BCA”, “CBA”, “CAB”}
2. Substrings of the word ABC are represented as : {“A”, “B”, “C”, “AB”, “BC”, “ABC”}.
3. Subsequences of the word ABC are represented as : {“A”, “B”, “C”, “AB”, “BC”, “AC”, “ABC”, “”}

Fact Check

1. The number of permutations of a String of length n is given equal to: $n!$
2. The permutations of a string are also known as Anagrams of that string.
3. The number of non-empty substrings of a string of length n is equal to: $\frac{n(n+1)}{2}$.
4. A prefix is a substring that is present at the begening of the string and a suffix is a substring present at the end of a string.
5. The number of non-empty subsequences of a string of length n is equal to: $2^{n}-1$
6. All the subsequences of a string (including the empty subsequence) together form the power set of that string.

Java Implementation

The following is a java implementation to generate all the permutations of a string:

//total number of permutations returned is input.length()!
public List<String> permutations(String input) {
List<String> permutations = new ArrayList<>();
printPermutations(input, "", permutations);
return permutations;
}

private void printPermutations(String input, String prefix, List<String> permutations) {
if (input.length() == 0) {
//System.out.println(prefix);
return;
}
for (int i = 0; i < input.length(); i++) {
char ch = input.charAt(i);
String ros = input.substring(0, i) +
input.substring(i + 1);
printPermutations(ros, prefix + ch, permutations);
}
}

The following is a java implementation to generate all substrings of a string:

// the total number of substrings of a string of length n is n(n+1)/2
// so the best case complexity will be n^2; even to list all the substrings
public List<String> generateSubstrings(String input) {
List<String> substrings = new ArrayList<>();
for (int i = 0; i < input.length(); i++) {
for (int j = i + 1; j <= input.length(); j++) {
}
}
return substrings;
}

There are multiple approaches to generate the possible subsequences of a string/array. The following are a few java implementations to generate all subsequences of a string:

// approach-1
// number of substrings for a string of length n is 2^n including blank substring
public List<String> listSubSequences(String input) {
return generate(input, -1, "", new ArrayList<>());
}
private List<String> generate(String input, int currentIndex, String prefix, List<String> subsequences){
if (currentIndex == input.length()) {
return subsequences;
}
for (int i = currentIndex + 1; i < input.length(); i++) {
prefix += input.charAt(i);
generateSubSequences(input, i, prefix, subsequences);
// Once all subsets beginning with initial "curr" are printed, remove
// last character to consider a different prefix of subsets.
prefix = prefix.substring(0, prefix.length() - 1);
}
return subsequences;
}

The second approachh is also a recursive approach and is based on the condition to include and not include the current character:

//approach-2
public List<String> calculatePowerSet(String input) {
List<String> powerSet = new ArrayList<>();
powerSet(input, 0, "", powerSet);
return powerSet;
}
private void powerSet(String input, int index, String prefix, List<String> powerSet) {
int stringLength = input.length();
// base case
if (index == stringLength) {
return;
}
// Two cases for every character :
// 1. We consider the character as part of current subset
// 2. We do not consider current character as part of current subset
powerSet(input, index + 1, prefix + input.charAt(index), powerSet);
powerSet(input, index + 1, prefix, powerSet);
}

There is another approach to generate the subsequences using bitwise operations. As it can be observed that different subsequences of a string are in accordance with the location of setbits in their decimal equivalent numbers generated from 0 to $2^{n}$.

Let’s consider the an example to better understand this. For input=“ABC” with length 3, there are 8 ($2^3$) possible subsequences. These can be represented in terms of the corresponding binary representation of numbers from 0 to 7.

# Binary Representation Subsequence
0 000
1 001 A
2 010 B
3 100 C
4 011 AB
5 101 AC
6 110 BC
7 111 ABC

A java program to implement the same can be written as follows:

//approach-3
private Set<Set<Integer>> generate(int[] input, Set<Set<Integer>> subsets) {
//as number of elements in power set is 2^n
//loop through all the numbers from 0 to 2^n - 1
int numberOfElementsInInputSet = input.length;
for (int currentNumber = 0; currentNumber < (1 << numberOfElementsInInputSet); currentNumber++) {
Set<Integer> subset = new LinkedHashSet<>();
//for every number generated from the first loop,
//check if ith bit is set where 0 <= i < n (n being the size of input array)
for (int currentIndex = 0; currentIndex < numberOfElementsInInputSet; currentIndex++) {
// (1<<j) is a number with jth bit set as 1
// anding the same with i will resultin > 0 only if jth bit is set in i
if ((currentNumber & (1 << currentIndex)) > 0) {