Import Upstream version 3a1upstream/3a1

author: Bryan Newbold <bnewbold@robocracy.org> 2017-02-20 00:05:29 -0800
committer: Bryan Newbold <bnewbold@robocracy.org> 2017-02-20 00:05:29 -0800
commit: 8466d8cfa486fb30d1755c4261b781135083787b (patch)
tree: c8c12c67246f543c3cc4f64d1c07e003cb1d45ae /differ.txi
parent: 87b82b5822ca54228cfa6df29be3ad9d4bc47d16 (diff)
download: slib-8466d8cfa486fb30d1755c4261b781135083787b.tar.gz
slib-8466d8cfa486fb30d1755c4261b781135083787b.zip
1 files changed, 105 insertions, 0 deletions
diff --git a/differ.txi b/differ.txi
new file mode 100644
index 0000000..c46fc2a
--- /dev/null
+++ b/differ.txi
@@ -0,0 +1,105 @@
+@noindent
+@code{diff:edit-length} implements the algorithm:
+
+@ifinfo
+@example
+S. Wu, E. Myers, U. Manber, and W. Miller,
+   "An O(NP) Sequence Comparison Algorithm,"
+   Information Processing Letters 35, 6 (1990), 317-323.
+   @url{http://www.cs.arizona.edu/people/gene/vita.html}
+@end example
+@end ifinfo
+@ifset html
+S. Wu, <A HREF="http://www.cs.arizona.edu/people/gene/vita.html">
+E. Myers,</A> U. Manber, and W. Miller,
+<A HREF="http://www.cs.arizona.edu/people/gene/PAPERS/np_diff.ps">
+"An O(NP) Sequence Comparison Algorithm,"</A>
+Information Processing Letters 35, 6 (1990), 317-323.
+@end ifset
+
+@noindent
+The values returned by @code{diff:edit-length} can be used to gauge
+the degree of match between two sequences.
+
+@noindent
+Surprisingly, "An O(NP) Sequence Comparison Algorithm" does not
+derive the edit sequence; only the sequence length.  Developing this
+linear-space sub-quadratic-time algorithm for computing the edit
+sequence required hundreds of hours of work.  I have submitted a
+paper describing the algorithm to the Journal of Computational
+Biology.
+
+@noindent
+If the items being sequenced are text lines, then the computed
+edit-list is equivalent to the output of the @dfn{diff} utility
+@cindex diff
+program.  If the items being sequenced are words, then it is like the
+lesser known @dfn{spiff} program.
+@cindex spiff
+
+
+@defun diff:longest-common-subsequence array1 array2 =? p-lim
+
+
+@defunx diff:longest-common-subsequence array1 array2 =?
+@var{array1} and @var{array2} are one-dimensional arrays.  The procedure @var{=?} is used
+to compare sequence tokens for equality.
+
+The non-negative integer @var{p-lim}, if provided, is maximum number of
+deletions of the shorter sequence to allow.  @code{diff:longest-common-subsequence} will return @code{#f}
+if more deletions would be necessary.
+
+@code{diff:longest-common-subsequence} returns a one-dimensional array of length @code{(quotient (- (+
+len1 len2) (diff:edit-length @var{array1} @var{array2})) 2)} holding the longest sequence
+common to both @var{array}s.
+@end defun
+
+@defun diff:edits array1 array2 =? p-lim
+
+
+@defunx diff:edits array1 array2 =?
+@var{array1} and @var{array2} are one-dimensional arrays.  The procedure @var{=?} is used
+to compare sequence tokens for equality.
+
+The non-negative integer @var{p-lim}, if provided, is maximum number of
+deletions of the shorter sequence to allow.  @code{diff:edits} will return @code{#f}
+if more deletions would be necessary.
+
+@code{diff:edits} returns a vector of length @code{(diff:edit-length @var{array1} @var{array2})} composed
+of a shortest sequence of edits transformaing @var{array1} to @var{array2}.
+
+Each edit is an integer:
+@table @asis
+@item @var{k} > 0
+Inserts @code{(array-ref @var{array1} (+ -1 @var{j}))} into the sequence.
+@item @var{k} < 0
+Deletes @code{(array-ref @var{array2} (- -1 @var{k}))} from the sequence.
+@end table
+@end defun
+
+@defun diff:edit-length array1 array2 =? p-lim
+
+
+@defunx diff:edit-length array1 array2 =?
+@var{array1} and @var{array2} are one-dimensional arrays.  The procedure @var{=?} is used
+to compare sequence tokens for equality.
+
+The non-negative integer @var{p-lim}, if provided, is maximum number of
+deletions of the shorter sequence to allow.  @code{diff:edit-length} will return @code{#f}
+if more deletions would be necessary.
+
+@code{diff:edit-length} returns the length of the shortest sequence of edits transformaing
+@var{array1} to @var{array2}.
+@end defun
+@example
+(diff:longest-common-subsequence "fghiejcklm" "fgehijkpqrlm" eqv?)
+@result{} "fghijklm"
+
+(diff:edit-length "fghiejcklm" "fgehijkpqrlm" eqv?)
+@result{} 6
+
+(diff:edits "fghiejcklm" "fgehijkpqrlm" eqv?)
+@result{} #As32(3 -5 -7 8 9 10)
+       ; e  c  h p q  r
+@end example
+
author	Bryan Newbold <bnewbold@robocracy.org>	2017-02-20 00:05:29 -0800
committer	Bryan Newbold <bnewbold@robocracy.org>	2017-02-20 00:05:29 -0800
commit	8466d8cfa486fb30d1755c4261b781135083787b (patch)
tree	c8c12c67246f543c3cc4f64d1c07e003cb1d45ae /differ.txi
parent	87b82b5822ca54228cfa6df29be3ad9d4bc47d16 (diff)
download	slib-8466d8cfa486fb30d1755c4261b781135083787b.tar.gz slib-8466d8cfa486fb30d1755c4261b781135083787b.zip