package patience_diff

type elt = string
val get_matching_blocks : transform:('a -> elt) -> ?big_enough:int -> prev:'a array -> next:'a array -> unit -> Patience_diff_lib__.Matching_block.t list

get_matching_blocks not only aggregates the data from matches a b into blocks but also attempts to remove random, semantically meaningless matches ("semantic cleanup"). The value of big_enough governs how aggressively it does so. See get_hunks below for more details.

val matches : elt array -> elt array -> (int * int) list

matches a b returns a list of pairs (i,j) such that a.(i) = b.(j) and such that the list is strictly increasing in both its first and second coordinates. This is essentially an "unfolded" version of what get_matching_blocks returns. Instead of grouping consecutive matches into blocks with a length, this function returns every individual pair (prev_start * next_start).
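The relationship between blocks and unfolded pairs can be sketched as follows. This is only an illustration: the record type block is a hypothetical stand-in defined locally (its field names follow the prev_start, next_start and length wording above), and unfold is not part of the library.

```ocaml
(* Hypothetical stand-in for a matching block: [length] consecutive matches
   starting at index [prev_start] in one array and [next_start] in the other.
   Defined here for illustration only; it is not the library's type. *)
type block = { prev_start : int; next_start : int; length : int }

(* Expand each block into its individual (i, j) index pairs. *)
let unfold (blocks : block list) : (int * int) list =
  List.concat_map
    (fun { prev_start; next_start; length } ->
      List.init length (fun k -> (prev_start + k, next_start + k)))
    blocks

let pairs =
  unfold
    [ { prev_start = 0; next_start = 0; length = 2 };
      { prev_start = 3; next_start = 2; length = 1 } ]
(* pairs = [(0, 0); (1, 1); (3, 2)] *)
```

The resulting list is strictly increasing in both coordinates, which is the shape described for the result of matches.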

val match_ratio : elt array -> elt array -> float

match_ratio a b computes the ratio defined as:

2 * len (matches a b) / (len a + len b)

It is an indication of how much alike a and b are. A ratio close to 1.0 indicates that the number of matches is close to the number of elements that could potentially match, and is thus a sign that a and b are very much alike. On the other hand, a low ratio means there are very few matches.
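As a worked instance of the formula, the following sketch computes the ratio from a match count and the two array lengths. The function ratio and its labelled arguments are hypothetical names for this example, and returning 0.0 when both arrays are empty is an assumption of the sketch, not documented behavior of match_ratio.

```ocaml
(* Ratio from the formula above: 2 * matches / (len a + len b).
   Assumption: when both lengths are zero we return 0.0 to avoid
   dividing by zero; the library may behave differently. *)
let ratio ~num_matches ~len_a ~len_b =
  let total = len_a + len_b in
  if total = 0 then 0.0
  else 2.0 *. float_of_int num_matches /. float_of_int total

(* Two arrays of 4 elements each with 3 matching pairs:
   2 * 3 / (4 + 4) = 0.75 *)
let r = ratio ~num_matches:3 ~len_a:4 ~len_b:4
```

When every element of both arrays is matched (for example 4 matches between two arrays of length 4), the same computation gives 1.0.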

val get_hunks : transform:('a -> elt) -> context:int -> ?big_enough:int -> prev:'a array -> next:'a array -> unit -> 'a Patience_diff_lib__.Hunk.t list

get_hunks ~transform ~context ~prev ~next will compare the arrays prev and next and produce a list of hunks. (The hunks will contain Same ranges of at most context elements.) Negative context is equivalent to infinity (producing a singleton hunk list). The value of big_enough governs how aggressively we try to clean up spurious matches, by restricting our attention to only matches of length less than big_enough. Thus, setting big_enough to a higher value results in more aggressive cleanup, and the default value of 1 results in no cleanup at all. When this function is called by Patdiff_core, the value of big_enough is 3 at the line level, and 7 at the word level.

type 'a segment =
  | Same of 'a array
  | Different of 'a array array
type 'a merged_array = 'a segment list
val merge : elt array array -> elt merged_array