Understanding Diff, Merge, and Patch
There are several terms used in document comparisons. Here's a simple guide to help you understand them:
Diff
A "diff" results from comparing two or three documents. It encompasses the original document and all the changes required to transform it into the other document. It comprises three basic elements: keep, insert, remove. For example:
The quick red fox jumps over the sleeping dog.
The quick brown fox jumps over the lazy dog.
The Diff (removals shown as [-text-], insertions as {+text+}):
The quick [-red-]{+brown+} fox jumps over the [-sleeping-]{+lazy+} dog.
In other words, the diff is like a document with annotations showing all differences to another version.
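To make the keep/insert/remove idea concrete, here is a minimal word-level sketch in Python. It uses the standard library's difflib rather than any particular product's algorithm, and the function names (word_diff, render) and the [-...-] / {+...+} notation are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def word_diff(old: str, new: str) -> list[tuple[str, str]]:
    """Return a list of (operation, word) pairs: keep, remove, or insert."""
    a, b = old.split(), new.split()
    ops: list[tuple[str, str]] = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops += [("keep", w) for w in a[i1:i2]]
        else:
            # "replace", "delete" and "insert" all reduce to removing the
            # old words (if any) and inserting the new words (if any).
            ops += [("remove", w) for w in a[i1:i2]]
            ops += [("insert", w) for w in b[j1:j2]]
    return ops


def render(ops: list[tuple[str, str]]) -> str:
    """Render the diff as one annotated line (hypothetical notation)."""
    marks = {"keep": "{}", "remove": "[-{}-]", "insert": "{{+{}+}}"}
    return " ".join(marks[op].format(word) for op, word in ops)


ops = word_diff(
    "The quick red fox jumps over the sleeping dog.",
    "The quick brown fox jumps over the lazy dog.",
)
print(render(ops))
# The quick [-red-] {+brown+} fox jumps over the [-sleeping-] {+lazy+} dog.
```

The result contains every word of the original plus the changes, which is what makes a diff an annotated document rather than a separate list of instructions.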
Merge
When two authors independently edit a shared initial document, a merge combines their updates into a single document by automatically accepting all changes from both authors into the initial document.
The changes that the merge automatically accepts are always calculated using a three-way diff between the initial document and both authors' changes. (Learn more about three-way diffs here.)
If conflicts arise, such as when both authors change the same passage, a reviewer must resolve the conflicts by deciding which version to accept.
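The three-way logic described above can be sketched roughly as follows. This is a simplified, word-level illustration built on Python's difflib, not the algorithm of any specific merge tool; merge3 and its helper are hypothetical names, and conflicting passages are simply left as the initial text and reported to the caller.

```python
from difflib import SequenceMatcher


def _matches(base: list[str], other: list[str]) -> dict[int, int]:
    """Map each base index that survives unchanged to its index in `other`."""
    mapping: dict[int, int] = {}
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


def merge3(base: list[str], a: list[str], b: list[str]):
    """Merge two edited versions of `base`; return (merged, conflicts)."""
    ma, mb = _matches(base, a), _matches(base, b)
    merged: list[str] = []
    conflicts: list[tuple[list[str], list[str], list[str]]] = []
    i = ia = ib = 0
    # Sync points are base words that both authors left untouched.
    sync_points = [k for k in range(len(base)) if k in ma and k in mb]
    for s in sync_points + [len(base)]:
        ea, eb = ma.get(s, len(a)), mb.get(s, len(b))
        chunk_base, chunk_a, chunk_b = base[i:s], a[ia:ea], b[ib:eb]
        if chunk_a == chunk_b:          # same change (or no change) on both sides
            merged += chunk_a
        elif chunk_a == chunk_base:     # only author B changed this passage
            merged += chunk_b
        elif chunk_b == chunk_base:     # only author A changed this passage
            merged += chunk_a
        else:                           # both changed it differently
            conflicts.append((chunk_base, chunk_a, chunk_b))
            merged += chunk_base        # assumption: keep the initial text for review
        if s < len(base):
            merged.append(base[s])
        i, ia, ib = s + 1, ea + 1, eb + 1
    return merged, conflicts


base = "The quick red fox jumps over the sleeping dog.".split()
author_a = "The quick brown fox jumps over the sleeping dog.".split()
author_b = "The quick red fox jumps over the lazy dog.".split()
merged, conflicts = merge3(base, author_a, author_b)
print(" ".join(merged), conflicts)
```

If both authors had replaced "red" with different words, the conflicts list would contain that passage in all three versions, and a reviewer would pick one.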
Merging is typically used in software programming, where the two authors work on different parts of the software and want to combine their work.
While redlining can be a related process, there is typically no need for automatically accepting others' changes, and merging is not usually relevant there. (Learn more about redlining here.)
Patch
Like diffs, patches are another way to represent changes between documents.
They are lists of instructions on how to go from one document version to another, and they can describe content to be inserted, removed, styled, and even relocated.
Unlike diffs, which are integrated into a document version, patches are stored separately from the documents and are applied to the document when needed. Therefore, they always contain pointers to the locations where they apply.
With patches, the jumping fox example from above would roughly look like this:
- Replace word 3 ("red") with "brown"
- Replace word 8 ("sleeping") with "lazy"
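As a rough illustration, the two instructions above could be stored as plain data and applied to Version 1 like this. The ReplaceWord structure and its field names are assumptions made for this sketch and do not reflect any actual patch format.

```python
from dataclasses import dataclass


@dataclass
class ReplaceWord:
    position: int      # pointer: 1-based word number in Version 1
    expected: str      # the word the patch expects to find there
    replacement: str   # the word to put in its place


def apply_replacements(text: str, patches: list[ReplaceWord]) -> str:
    words = text.split()
    for patch in patches:
        index = patch.position - 1
        # Guard against applying a patch to the wrong document version.
        if words[index] != patch.expected:
            raise ValueError(
                f"word {patch.position} is {words[index]!r}, "
                f"expected {patch.expected!r}"
            )
        # Replacing one word with one word never shifts later positions.
        words[index] = patch.replacement
    return " ".join(words)


patches = [
    ReplaceWord(position=3, expected="red", replacement="brown"),
    ReplaceWord(position=8, expected="sleeping", replacement="lazy"),
]
version_2 = apply_replacements(
    "The quick red fox jumps over the sleeping dog.", patches
)
print(version_2)
# The quick brown fox jumps over the lazy dog.
```

Note that the patch list holds only the two changed words and their positions, not the unchanged text around them.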
This approach offers several advantages, including reduced storage requirements and the ability to easily represent moves and other custom changes.
For these reasons, TreeDiff comparisons generate patches, enabling your document management system to keep documents editable during patch application and simplifying the integration of new features like patch counting or filtering.
In other words, patches list the changes needed to transform a (separate) document Version 1 into Version 2.
When implementing patches, first set markers at all update locations and only then start applying the patches, because pointers become outdated as patches are applied.
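The following sketch shows why this ordering matters, assuming a simple word-index pointer. The Patch and Token types are hypothetical; the markers here are just object references resolved against the untouched document before any change is made, and each original word is assumed to be removed at most once.

```python
from dataclasses import dataclass


@dataclass(eq=False)          # eq=False: tokens compare by identity
class Token:
    text: str


@dataclass
class Patch:
    op: str                   # "replace", "remove" or "insert_before"
    index: int                # 0-based word index in the ORIGINAL document
    text: str = ""


def apply_naively(words: list[str], patches: list[Patch]) -> list[str]:
    """Wrong on purpose: uses stale indices after earlier patches shift words."""
    words = list(words)
    for p in patches:
        if p.op == "replace":
            words[p.index] = p.text
        elif p.op == "remove":
            del words[p.index]
        elif p.op == "insert_before":
            words.insert(p.index, p.text)
    return words


def apply_with_markers(words: list[str], patches: list[Patch]) -> list[str]:
    tokens = [Token(w) for w in words]
    # Step 1: set markers by resolving every pointer before changing anything.
    marked = [(p, tokens[p.index]) for p in patches]
    # Step 2: apply each patch at its marker's current position.
    for p, marker in marked:
        if p.op == "replace":
            marker.text = p.text
        elif p.op == "remove":
            del tokens[tokens.index(marker)]
        elif p.op == "insert_before":
            tokens.insert(tokens.index(marker), Token(p.text))
    return [t.text for t in tokens]


words = "The quick red fox jumps over the sleeping dog.".split()
patches = [
    Patch("remove", 1),              # remove "quick"
    Patch("replace", 2, "brown"),    # "red" -> "brown"
    Patch("replace", 7, "lazy"),     # "sleeping" -> "lazy"
]
print(" ".join(apply_naively(words, patches)))
print(" ".join(apply_with_markers(words, patches)))
```

In the naive version, removing "quick" shifts every later word by one, so the two replacements land on the wrong words; the marker version is unaffected because each target was located before the first change.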