aboutsummaryrefslogtreecommitdiffstats
path: root/differ.txi
diff options
context:
space:
mode:
authorSteve Langasek <vorlon@debian.org>2005-01-10 08:53:33 +0000
committerBryan Newbold <bnewbold@robocracy.org>2017-02-20 00:05:30 -0800
commite33f9eb9cf5cc29c36ce2aa7e10cd0f37ae0cc8e (patch)
treeabbf06041619e445f9d0b772b0d58132009d8234 /differ.txi
parentf559c149c83da84d0b1c285f0298c84aec564af9 (diff)
parent8466d8cfa486fb30d1755c4261b781135083787b (diff)
downloadslib-e33f9eb9cf5cc29c36ce2aa7e10cd0f37ae0cc8e.tar.gz
slib-e33f9eb9cf5cc29c36ce2aa7e10cd0f37ae0cc8e.zip
Import Debian changes 3a1-4.2debian/3a1-4.2
slib (3a1-4.2) unstable; urgency=low * Non-maintainer upload. * Add guile.init.local for use within the build dir, since otherwise we have an (earlier unnoticed) circular build-dep due to a difference between scm and guile. slib (3a1-4.1) unstable; urgency=low * Non-maintainer upload. * Build-depend on guile-1.6 instead of scm, since the new version of scm is wedged in unstable (closes: #281809). slib (3a1-4) unstable; urgency=low * Also check for expected creation on slibcat. (Closes: #240096) slib (3a1-3) unstable; urgency=low * Also check for /usr/share/guile/1.6/slib before installing for guile 1.6. (Closes: #239267) slib (3a1-2) unstable; urgency=low * Add format.scm back into slib until gnucash stops using it. * Call guile-1.6 new-catalog (Closes: #238231) slib (3a1-1) unstable; urgency=low * New upstream release * Remove Info section from doc-base file (Closes: #186950) * Remove period from end of description (linda, lintian) * html gen fixed upstream (Closes: #111778) slib (2d4-2) unstable; urgency=low * Fix url for upstream source (Closes: #144981) * Fix typo in slib.texi (enquque->enqueue) (Closes: #147475) * Add build depends. slib (2d4-1) unstable; urgency=low * New upstream. slib (2d3-1) unstable; urgency=low * New upstream. * Remove texi2html call in debian/rules. Now done upstream. Add make html instead. * Changes to rules and doc-base to conform to upstream html gen * Clean up upstream makefile to make sure it cleans up after itself.
Diffstat (limited to 'differ.txi')
-rw-r--r--differ.txi90
1 files changed, 50 insertions, 40 deletions
diff --git a/differ.txi b/differ.txi
index f7b1f75..c46fc2a 100644
--- a/differ.txi
+++ b/differ.txi
@@ -1,5 +1,5 @@
@noindent
-This package implements the algorithm:
+@code{diff:edit-length} implements the algorithm:
@ifinfo
@example
@@ -18,6 +18,18 @@ Information Processing Letters 35, 6 (1990), 317-323.
@end ifset
@noindent
+The values returned by @code{diff:edit-length} can be used to gauge
+the degree of match between two sequences.
+
+@noindent
+Surprisingly, "An O(NP) Sequence Comparison Algorithm" does not
+derive the edit sequence; only the sequence length. Developing this
+linear-space sub-quadratic-time algorithm for computing the edit
+sequence required hundreds of hours of work. I have submitted a
+paper describing the algorithm to the Journal of Computational
+Biology.
+
+@noindent
If the items being sequenced are text lines, then the computed
edit-list is equivalent to the output of the @dfn{diff} utility
@cindex diff
@@ -25,71 +37,69 @@ program. If the items being sequenced are words, then it is like the
lesser known @dfn{spiff} program.
@cindex spiff
-@noindent
-The values returned by @code{diff:edit-length} can be used to gauge
-the degree of match between two sequences.
-@noindent
-I believe that this algorithm is currently the fastest for these
-tasks, but genome sequencing applications fuel extensive research in
-this area.
+@defun diff:longest-common-subsequence array1 array2 =? p-lim
-@defun diff:longest-common-subsequence array1 array2 =?
+@defunx diff:longest-common-subsequence array1 array2 =?
+@var{array1} and @var{array2} are one-dimensional arrays. The procedure @var{=?} is used
+to compare sequence tokens for equality.
+The non-negative integer @var{p-lim}, if provided, is maximum number of
+deletions of the shorter sequence to allow. @code{diff:longest-common-subsequence} will return @code{#f}
+if more deletions would be necessary.
-@defunx diff:longest-common-subsequence array1 array2
-@var{array1} and @var{array2} are one-dimensional arrays. The procedure @var{=?} is used
-to compare sequence tokens for equality. @var{=?} defaults to @code{eqv?}.
@code{diff:longest-common-subsequence} returns a one-dimensional array of length @code{(quotient (- (+
-len1 len2) (fp:edit-length @var{array1} @var{array2})) 2)} holding the longest sequence
+len1 len2) (diff:edit-length @var{array1} @var{array2})) 2)} holding the longest sequence
common to both @var{array}s.
@end defun
-@defun diff:edits array1 array2 =?
+@defun diff:edits array1 array2 =? p-lim
-@defunx diff:edits array1 array2
+@defunx diff:edits array1 array2 =?
@var{array1} and @var{array2} are one-dimensional arrays. The procedure @var{=?} is used
-to compare sequence tokens for equality. @var{=?} defaults to @code{eqv?}.
-@code{diff:edits} returns a list of length @code{(fp:edit-length @var{array1} @var{array2})} composed of
-a shortest sequence of edits transformaing @var{array1} to @var{array2}.
+to compare sequence tokens for equality.
-Each edit is a list of an integer and a symbol:
+The non-negative integer @var{p-lim}, if provided, is maximum number of
+deletions of the shorter sequence to allow. @code{diff:edits} will return @code{#f}
+if more deletions would be necessary.
+
+@code{diff:edits} returns a vector of length @code{(diff:edit-length @var{array1} @var{array2})} composed
+of a shortest sequence of edits transformaing @var{array1} to @var{array2}.
+
+Each edit is an integer:
@table @asis
-@item (@var{j} insert)
-Inserts @code{(array-ref @var{array1} @var{j})} into the sequence.
-@item (@var{k} delete)
-Deletes @code{(array-ref @var{array2} @var{k})} from the sequence.
+@item @var{k} > 0
+Inserts @code{(array-ref @var{array1} (+ -1 @var{j}))} into the sequence.
+@item @var{k} < 0
+Deletes @code{(array-ref @var{array2} (- -1 @var{k}))} from the sequence.
@end table
@end defun
-@defun diff:edit-length array1 array2 =?
+@defun diff:edit-length array1 array2 =? p-lim
-@defunx diff:edit-length array1 array2
+@defunx diff:edit-length array1 array2 =?
@var{array1} and @var{array2} are one-dimensional arrays. The procedure @var{=?} is used
-to compare sequence tokens for equality. @var{=?} defaults to @code{eqv?}.
+to compare sequence tokens for equality.
+
+The non-negative integer @var{p-lim}, if provided, is maximum number of
+deletions of the shorter sequence to allow. @code{diff:edit-length} will return @code{#f}
+if more deletions would be necessary.
+
@code{diff:edit-length} returns the length of the shortest sequence of edits transformaing
@var{array1} to @var{array2}.
@end defun
@example
-(diff:longest-common-subsequence '#(f g h i e j c k l m)
- '#(f g e h i j k p q r l m))
- @result{} #(f g h i j k l m)
+(diff:longest-common-subsequence "fghiejcklm" "fgehijkpqrlm" eqv?)
+@result{} "fghijklm"
-(diff:edit-length '#(f g h i e j c k l m)
- '#(f g e h i j k p q r l m))
+(diff:edit-length "fghiejcklm" "fgehijkpqrlm" eqv?)
@result{} 6
-(pretty-print (diff:edits '#(f g h i e j c k l m)
- '#(f g e h i j k p q r l m)))
-@print{}
-((3 insert) ; e
- (4 delete) ; c
- (6 delete) ; h
- (7 insert) ; p
- (8 insert) ; q
- (9 insert)) ; r
+(diff:edits "fghiejcklm" "fgehijkpqrlm" eqv?)
+@result{} #As32(3 -5 -7 8 9 10)
+ ; e c h p q r
@end example