Determine the minimum edit distance between two strings, a source and a target: the smallest number of edit operations that transforms the source into the target. The three allowable operations (insert, delete, and change) obey a "forward cursor model": a cursor begins on the first character of the source string and always advances forward to the next character after an edit.
Let the source string be represented by an array of characters S[1..n] and the target string be represented by an array of characters T[1..m]. Then, let the alignment table A be a table with n+1 rows and m+1 columns [0..n, 0..m] filled according to the following rules:
A[i,j] = 0                                    if i=0 and j=0
       = A[i,j-1]+1                           if i=0 and j>0
       = A[i-1,j]+1                           if j=0 and i>0
       = min{A[i,j-1]+1,
             A[i-1,j]+1,
             A[i-1,j-1]+match(S[i],T[j])}     otherwise

match(S[i],T[j]) = 0   if S[i] = T[j]
                 = 1   otherwise
The value A[i,j] is interpreted as the shortest edit distance between the substrings S[1..i] and T[1..j]. When i=0 or j=0, the corresponding substring is interpreted as empty.
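The recurrence above can be sketched in Java as follows. This is a minimal illustration, not a required design: the class name EditDistanceTable and the method names buildTable and distance are made up for this example. The only subtlety is that S and T are indexed from 1 in the text, while Java strings are indexed from 0.

```java
// Sketch only: class and method names are illustrative, not part of the assignment.
class EditDistanceTable {

    // Builds the (n+1) x (m+1) alignment table A for source s and target t.
    static int[][] buildTable(String s, String t) {
        int n = s.length();
        int m = t.length();
        int[][] a = new int[n + 1][m + 1];

        // Fill left to right, top to bottom, so every cell we read is already set.
        for (int i = 0; i <= n; i++) {
            for (int j = 0; j <= m; j++) {
                if (i == 0 && j == 0) {
                    a[i][j] = 0;
                } else if (i == 0) {
                    a[i][j] = a[i][j - 1] + 1;
                } else if (j == 0) {
                    a[i][j] = a[i - 1][j] + 1;
                } else {
                    // S[i] in the text is s.charAt(i - 1) here; likewise for T[j].
                    int match = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                    a[i][j] = Math.min(
                            Math.min(a[i][j - 1] + 1, a[i - 1][j] + 1),
                            a[i - 1][j - 1] + match);
                }
            }
        }
        return a;
    }

    // The edit distance between the full strings is the bottom-right entry A[n,m].
    static int distance(String s, String t) {
        return buildTable(s, t)[s.length()][t.length()];
    }
}
```

For example, EditDistanceTable.distance("kitten", "sitting") evaluates to 3 under this definition.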
As with the knapsack algorithm, we can keep a table B of back pointers alongside A. Unlike the knapsack table, a back pointer here can point into either of two rows and either of two columns, so it needs its own encoding. There are only three possibilities, though: the cell above, the cell to the left, or the cell diagonally up and to the left. One encoding is as follows:
B[i,j] = 1 if the value of A[i,j] was taken as A[i-1,j] + 1
       = 2 if the value of A[i,j] was taken as A[i,j-1] + 1
       = 3 if the value of A[i,j] was taken as A[i-1,j-1] + match(S[i],T[j])
Fill in the B table concurrently with the A table. If there are multiple possible ways to fill in an entry in the B table, favor 3 over 1, and favor 1 over 2 (i.e. favor changes over deletes, and deletes over inserts). The fill order, again, is left to right, top to bottom.
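One way to apply the tie-breaking rule while filling both tables is sketched below. The class name BackPointerTable and the constant names are hypothetical. The sketch also makes two assumptions the text does not spell out: B[0,0] is left as 0 (no pointer), and the border cells take the only pointer available to them (2 along row 0, 1 along column 0).

```java
// Sketch only: names are illustrative; border handling is an assumption.
class BackPointerTable {
    static final int DELETE = 1;  // value came from A[i-1,j] + 1
    static final int INSERT = 2;  // value came from A[i,j-1] + 1
    static final int CHANGE = 3;  // value came from A[i-1,j-1] + match

    // Fills A and B together and returns B.
    static int[][] fill(String s, String t) {
        int n = s.length();
        int m = t.length();
        int[][] a = new int[n + 1][m + 1];
        int[][] b = new int[n + 1][m + 1];  // b[0][0] stays 0: nothing to point to

        for (int i = 0; i <= n; i++) {
            for (int j = 0; j <= m; j++) {
                if (i == 0 && j == 0) {
                    a[i][j] = 0;
                } else if (i == 0) {
                    a[i][j] = a[i][j - 1] + 1;
                    b[i][j] = INSERT;
                } else if (j == 0) {
                    a[i][j] = a[i - 1][j] + 1;
                    b[i][j] = DELETE;
                } else {
                    int match = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                    int del = a[i - 1][j] + 1;
                    int ins = a[i][j - 1] + 1;
                    int chg = a[i - 1][j - 1] + match;
                    int best = Math.min(chg, Math.min(del, ins));
                    a[i][j] = best;
                    // Checking in this order favors 3 over 1, and 1 over 2.
                    if (best == chg) {
                        b[i][j] = CHANGE;
                    } else if (best == del) {
                        b[i][j] = DELETE;
                    } else {
                        b[i][j] = INSERT;
                    }
                }
            }
        }
        return b;
    }
}
```

Testing the candidates in the order change, delete, insert is what implements the preference: the first candidate that equals the minimum wins, so ties never fall through to a less favored pointer.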
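As with the knapsack algorithm, we follow the pointers in the B table, starting at B[n,m] and continuing until we reach B[0,0]. Each move corresponds to an edit operation: a pointer of 1 moves up one row (delete S[i]), a pointer of 2 moves left one column (insert T[j]), and a pointer of 3 moves diagonally (change S[i] to T[j], or no edit at all when the two characters already match).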
Following the pointers backwards from B[n,m] to B[0,0] yields the edit operations in reverse (i.e. the first pointer you follow is the last edit operation, the second is the second-to-last, etc.). You can also correlate the edit operations to the line numbers from the original files by using the "forward cursor model" described above. Keep two line number counters, one for the source and one for the target. Both start at 1.
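The traceback and the two counters might be combined as in the sketch below. Everything about the output is an assumption made for illustration: the class name EditScript, the method name edits, and the strings such as "delete S[2]" are hypothetical. The counter rules are also an interpretation of the forward cursor model rather than something stated above: a delete advances only the source counter, an insert advances only the target counter, and a change or a match advances both.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: names, output format, and counter rules are assumptions.
class EditScript {

    // Returns the edit operations in forward order (first edit first).
    static List<String> edits(String s, String t) {
        int n = s.length();
        int m = t.length();
        int[][] a = new int[n + 1][m + 1];
        int[][] b = new int[n + 1][m + 1];

        // Fill A and B together, favoring 3 over 1 and 1 over 2 on ties.
        for (int i = 0; i <= n; i++) {
            for (int j = 0; j <= m; j++) {
                if (i == 0 && j == 0) {
                    a[i][j] = 0;
                } else if (i == 0) {
                    a[i][j] = a[i][j - 1] + 1;
                    b[i][j] = 2;
                } else if (j == 0) {
                    a[i][j] = a[i - 1][j] + 1;
                    b[i][j] = 1;
                } else {
                    int match = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                    int del = a[i - 1][j] + 1;
                    int ins = a[i][j - 1] + 1;
                    int chg = a[i - 1][j - 1] + match;
                    a[i][j] = Math.min(chg, Math.min(del, ins));
                    b[i][j] = (a[i][j] == chg) ? 3 : (a[i][j] == del) ? 1 : 2;
                }
            }
        }

        // Follow the pointers from B[n,m] back to B[0,0]; this sees the last edit first.
        List<Integer> moves = new ArrayList<>();
        int i = n;
        int j = m;
        while (i > 0 || j > 0) {
            int p = b[i][j];
            moves.add(p);
            if (p == 1) {
                i--;
            } else if (p == 2) {
                j--;
            } else {
                i--;
                j--;
            }
        }
        Collections.reverse(moves);  // now in forward order

        // Replay forward with two counters, both starting at 1.
        List<String> out = new ArrayList<>();
        int src = 1;
        int tgt = 1;
        for (int p : moves) {
            if (p == 1) {
                out.add("delete S[" + src + "]");
                src++;
            } else if (p == 2) {
                out.add("insert T[" + tgt + "]");
                tgt++;
            } else {
                // A diagonal move is an edit only when the characters differ.
                if (s.charAt(src - 1) != t.charAt(tgt - 1)) {
                    out.add("change S[" + src + "] to T[" + tgt + "]");
                }
                src++;
                tgt++;
            }
        }
        return out;
    }
}
```

Reversing the move list before replaying it keeps the counter logic simple: the counters only ever move forward, which matches the cursor's behavior.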
If you forget how to use, or need a quick reference for, any functions and classes in Java and C++, many good resources are available on the Internet. For Java programmers, I highly recommend always having the JDK1.3 API Reference handy at any given time. For C++, I personally like cplusplus.com's online reference, but your tastes may vary.