How to solve Edit Distance Problem using dynamic programming?
💻 coding

How to solve Edit Distance Problem using dynamic programming?

3 min read 533 words
3 min read
ShareWhatsAppPost on X
  • 1Edit distance quantifies the dissimilarity between two strings by counting the minimum operations needed to transform one into the other.
  • 2The Levenshtein distance is the most common metric for edit distance, involving insertion, deletion, or substitution of characters.
  • 3Dynamic programming optimizes the edit distance calculation, reducing the time complexity significantly compared to a recursive approach.

AI-generated summary · May not capture all nuances

Key Insight
AskGif

"Edit distance quantifies the dissimilarity between two strings by counting the minimum operations needed to transform one into the other."

How to solve Edit Distance Problem using dynamic programming?

In computational linguistics and computer science, edit distance is a way of quantifying how dissimilar two strings (e.g., words) are to one another by counting the minimum number of operations required to transform one string into the other. Edit distances find applications in natural language processing, where automatic spelling correction can determine candidate corrections for a misspelt word by selecting words from a dictionary that have a low distance to the word in question. In bioinformatics, it can be used to quantify the similarity of DNA sequences, which can be viewed as strings of the letters A, C, G and T.

Different definitions of an edit distance use different sets of string operations. The Levenshtein distance operations are the removal, insertion, or substitution of a character in the string. Being the most common metric, the Levenshtein distance is usually what is meant by "edit distance".

Will First Solve the problem using Recursion :

package dp;

public class EditDistance {

	public static void main(String[] args) {
		String str1 = "sunday";
	 String str2 = "saturday";
	 
	 long startTime = System.nanoTime();
	 
	 System.out.println(EditDistanceCount(str1,str2, str1.length(), str2.length()));
	 
	 long endTime = System.nanoTime();
		long totalTime = endTime - startTime;
		System.out.println("Total Time (nanoseconds) : " + (totalTime));

	}

	private static int EditDistanceCount(String str1, String str2, int length1, int length2) {
		if(length1 == 0)
			return length2;
		if(length2 == 0)
			return length1;
		
		if(str1.charAt(length1-1) == str2.charAt(length2-1))
			return EditDistanceCount(str1, str2, length1-1, length2-1);
		
		return 1 + min(
				EditDistanceCount(str1, str2, length1, length2 -1 ), //insert
				EditDistanceCount(str1, str2, length1-1, length2), //Remove
				EditDistanceCount(str1, str2, length1-1, length2-1) //Replace
				);
	}

	private static int min(int a, int b, int c) {
		if( a < b)
		{
			return a < c ? a : c;
		}
		else
		{
			return b < c ? b : c;
		}
	}

}
output:

3
Total Time (nanoseconds) : 419115

The time complexity of the above solution is exponential. in the worst case, it will be 3^n.

In the above approach, we can see that many subproblems are solved again and again. We can use the previously solved subproblem to be used in solving a new problem which will reduce the time complexity considerably.

Let's solve the problem using dynamic programming :

package dp;

public class EditDistance {

	public static void main(String[] args) {
		String str1 = "sunday";
	 String str2 = "saturday";
	 
	 long startTime = System.nanoTime();
	 
	 System.out.println(EditDistanceCount(str1,str2, str1.length(), str2.length()));
	 
	 long endTime = System.nanoTime();
		long totalTime = endTime - startTime;
		System.out.println("Total Time (nanoseconds) : " + (totalTime));

	}

	private static int EditDistanceCount(String str1, String str2, int length1, int length2) {
		if(length1 == 0)
			return length2;
		if(length2 == 0)
			return length1;
		
		int[][] dp = new int[length1+1][length2+1];
		for(int i=0;i<length1+1;i++) {
			for(int j=0;j<length2+1;j++) {
				if(i==0)
					dp[i][j] = j;
				else if(j==0)
					dp[i][j] = i;
				else if(str1.charAt(i-1) == str2.charAt(j-1))
					dp[i][j]=dp[i-1][j-1]; // if same characters do nothing
				else
					dp[i][j]= 1 + min(dp[i][j-1], dp[i-1][j], dp[i-1][j-1]);
				
			}
		}
		
		return dp[length1][length2];
	}

	private static int min(int a, int b, int c) {
		if( a < b)
		{
			return a < c ? a : c;
		}
		else
		{
			return b < c ? b : c;
		}
	}

}
output:

3
Total Time (nanoseconds) : 177840

The time complexity of the above solution is O(mxn) where m and n are the length of respective strings to be compared.

Enjoyed this article?

Share it with someone who'd find it useful.

ShareWhatsAppPost on X

AskGif

Published on 31 July 2018 · 3 min read · 533 words

Part of AskGif Blog · coding

You might also like

How to solve Edit Distance Problem using dynamic programming? | AskGif Blog