Note: There are many changes to the HtmlDiff class in this release. A new AbstractDiff class was created, which HtmlDiff now extends, and many of the common functions have been moved into AbstractDiff in preparation for upcoming features and enhancements to the diffing of lists and tables.
Isolated Diffing of Special HTML Tags
Purpose
These changes isolate certain HTML elements in the old and new texts and diff each one against its matching element (if one exists). This prevents matches between the old and new text from overlapping the boundaries of those elements.
The Problem
As an example, let's say you have the following inputs:
Old Text
Testing text with <sup>superscript</sup>
New Text
Testing text with superscript
Output Before These Changes
Testing text with <ins class="mod">superscript</ins>
Output After These Changes
Testing text with <sup class="diffmod"><del class="diffmod">superscript</del></sup><ins class="diffmod">superscript</ins>
As this example shows, before these changes the diff matched the word superscript in both texts but did not account for the fact that the word was wrapped in a sup element in the old text and is plain text in the new text.
Solution
The solution implemented in this release is to encapsulate certain HTML elements and diff them separately from the rest of the text, so that a match can no longer span the boundary of one of these elements.
Currently, the elements to isolate and diff are: ol, ul, dl, sup, sub, and table.
These are set on a protected property on the HtmlDiff class: $isolatedDiffTags. In this release, this property is not exposed for configuration, but upcoming updates will allow configuring which elements are handled in this manner.
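The isolation step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the library's actual implementation: the function isolateElements and the [[ISOLATED_n]] placeholder format are hypothetical names chosen for this example, and the simple regular expression does not handle an isolated tag nested inside another element of the same name.

```php
<?php
// Hypothetical helper, NOT part of the HtmlDiff API: replaces each isolated
// element with a placeholder token and returns the rewritten text together
// with a map of placeholder => original element.
function isolateElements(string $html, array $tags): array
{
    $isolated = [];

    // Match <tag ...>...</tag> for any of the given tag names.
    // Assumption: elements of the same name are not nested.
    $names = implode('|', array_map('preg_quote', $tags));
    $pattern = '#<(' . $names . ')\b[^>]*>.*?</\1>#is';

    $text = preg_replace_callback(
        $pattern,
        function (array $match) use (&$isolated): string {
            $key = '[[ISOLATED_' . count($isolated) . ']]';
            $isolated[$key] = $match[0]; // keep the element for a separate diff
            return $key;
        },
        $html
    );

    return [$text, $isolated];
}

// The tag list named in this section.
$tags = ['ol', 'ul', 'dl', 'sup', 'sub', 'table'];

[$oldText, $oldIsolated] = isolateElements('Testing text with <sup>superscript</sup>', $tags);
[$newText, $newIsolated] = isolateElements('Testing text with superscript', $tags);

// $oldText now ends in a placeholder token, while $newText still ends in
// the plain word, so a word-level diff treats them as different tokens.
```

Because the old text now holds a placeholder where the new text holds the plain word, a word-level diff can no longer match superscript across the sup boundary. The stored element can then be diffed on its own against its counterpart, if one exists.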