
6. Database Packages

6.1 Relational Database  'relational-database
6.2 Relational Infrastructure  
6.3 Weight-Balanced Trees  'wt-tree



6.1 Relational Database

(require 'relational-database)

This package implements a database system inspired by the Relational Model (E. F. Codd, A Relational Model of Data for Large Shared Data Banks). An SLIB relational database implementation can be created from any 6.2.1 Base Table implementation.

Why relational database? For motivations and design issues see
http://swissnet.ai.mit.edu/~jaffer/DBManifesto.html.

6.1.1 Using Databases  'databases
6.1.2 Table Operations  
6.1.3 Database Interpolation  'database-interpolate
6.1.4 Embedded Commands  'database-commands
6.1.5 Database Macros  'within-database
6.1.6 Database Browser  'database-browse



6.1.1 Using Databases

(require 'databases)

This enhancement wraps a utility layer on relational-database which provides the facilities described under the headings below.

Database Sharing

Auto-sharing refers to a call to the procedure open-database returning an already open database (procedure), rather than opening the database file a second time.

Note: Databases returned by open-database do not include wrappers applied by packages like 6.1.4 Embedded Commands. But wrapped databases do work as arguments to these functions.

When a database is created, it is mutable by the creator and not auto-sharable. A database opened mutably is also not auto-sharable. But any number of readers can open and share a non-mutable database file.

This next set of procedures mirrors the whole-database methods in 6.2.4 Database Operations. Except for create-database, each procedure will accept either a filename or database procedure for its first argument.

Function: create-database filename base-table-type

filename should be a string naming a file; or #f. base-table-type must be a symbol naming a feature which can be passed to require. create-database returns a new, open relational database (with base-table type base-table-type) associated with filename, or a new ephemeral database if filename is #f.

create-database is the only run-time use of require in SLIB which crosses module boundaries. When base-table-type is required by create-database, it adds an association of base-table-type with its relational-system procedure to mdbm:*databases*.

alist-table is the default base-table type:

 
(require 'databases)
(define my-rdb (create-database "my.db" 'alist-table))

Only alist-table and base-table modules which have been required will dispatch correctly from the open-database procedures. Therefore, either pass two arguments to open-database, or require the base-table type your database file uses before calling open-database with one argument.

Procedure: open-database! rdb base-table-type

Returns mutable open relational database or #f.

Function: open-database rdb base-table-type

Returns an open relational database associated with rdb. The database will be opened with base-table type base-table-type.

Function: open-database rdb
Returns an open relational database associated with rdb. open-database will attempt to deduce the correct base-table-type.

Function: write-database rdb filename

Writes the mutable relational-database rdb to filename.

Function: sync-database rdb

Writes the mutable relational-database rdb to the filename it was opened with.

Function: solidify-database rdb

Syncs rdb and makes it immutable.

Function: close-database rdb

rdb will only be closed when the number of open-database calls minus the number of close-database calls for rdb (and its filename) reaches 0. close-database returns #t if successful, and #f otherwise.

Function: mdbm:report

Prints a table of open database files. The columns are the base-table type, number of opens, `!' for mutable, the filename, and the lock certificate (if locked).

 
(mdbm:report)
-|
  alist-table 003   /usr/local/lib/slib/clrnamdb.scm
  alist-table 001 ! sdram.db jaffer@aubrey.jaffer.3166:1038628199

Opening Tables

Function: open-table rdb table-name

rdb must be a relational database and table-name a symbol.

open-table returns a "methods" procedure for an existing relational table in rdb if it exists and can be opened for reading, otherwise returns #f.

Procedure: open-table! rdb table-name

rdb must be a relational database and table-name a symbol.

open-table! returns a "methods" procedure for an existing relational table in rdb if it exists and can be opened in mutable mode, otherwise returns #f.

Defining Tables

Function: define-domains rdb row5 ...

Adds the domain rows row5 ... to the `*domains-data*' table in rdb. The format of the row is given in 6.2.2 Catalog Representation.

 
(define-domains rdb '(permittivity #f complex? c64 #f))

Function: add-domain rdb row5

Use define-domains instead.

Function: define-tables rdb spec-0 ...

Adds tables as specified in spec-0 ... to the open relational-database rdb. Each spec has the form:

 
(<name> <descriptor-name> <descriptor-name> <rows>)
or
 
(<name> <primary-key-fields> <other-fields> <rows>)

where <name> is the table name, <descriptor-name> is the symbol name of a descriptor table, <primary-key-fields> and <other-fields> describe the primary keys and other fields respectively, and <rows> is a list of data rows to be added to the table.

<primary-key-fields> and <other-fields> are lists of field descriptors of the form:

 
(<column-name> <domain>)
or
 
(<column-name> <domain> <column-integrity-rule>)

where <column-name> is the column name, <domain> is the domain of the column, and <column-integrity-rule> is an expression whose value is a procedure of one argument (which returns #f to signal an error).

If <domain> is not a defined domain name and it matches the name of this table or an already defined (in one of spec-0 ...) single key field table, a foreign-key domain will be created for it.

Listing Tables

Function: list-table-definition rdb table-name

If symbol table-name exists in the open relational-database rdb, then returns a list of the table-name, its primary key names and domains, its other key names and domains, and the table's records (as lists). Otherwise, returns #f.

The list returned by list-table-definition, when passed as an argument to define-tables, will recreate the table.



6.1.2 Table Operations

These are the descriptions of the methods available from an open relational table. A method is retrieved from a table by calling the table with the symbol name of the operation. For example:

 
((plat 'get 'processor) 'djgpp) => i386

Some operations described below require primary key arguments. Primary key arguments are denoted key1 key2 .... It is an error to call an operation which takes primary key arguments with the wrong number of primary keys for that table.

Operation: relational-table get column-name
Returns a procedure of arguments key1 key2 ... which returns the value for the column-name column of the row associated with primary keys key1, key2 ... if that row exists in the table, or #f otherwise.

 
((plat 'get 'processor) 'djgpp) => i386
((plat 'get 'processor) 'be-os) => #f
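
The calling convention above can be illustrated without SLIB. The following is a minimal sketch, not SLIB's implementation: make-toy-table is a hypothetical helper that keeps its rows in a list keyed on the first column and dispatches on the operation symbol. It assumes a single primary key and returns #f for operations it does not support.

```scheme
;; Toy model of a table "methods" procedure (hypothetical; not SLIB code).
;; COLUMNS is a list of column-name symbols; the first column is the
;; single primary key.  ROWS is a list of rows (lists of values).
(define (make-toy-table columns rows)
  ;; Position of column NAME in COLUMNS, or #f if there is no such column.
  (define (column-index name)
    (let loop ((cols columns) (i 0))
      (cond ((null? cols) #f)
            ((eq? name (car cols)) i)
            (else (loop (cdr cols) (+ i 1))))))
  ;; Calling the table with an operation symbol returns a procedure,
  ;; or #f when the operation is not supported.
  (lambda (operation . args)
    (case operation
      ((get)
       (let ((i (column-index (car args))))
         (lambda (key)
           (let ((row (assq key rows)))
             (and row (list-ref row i))))))
      ((row:retrieve)
       (lambda (key) (assq key rows)))
      (else #f))))

(define toy-plat
  (make-toy-table '(name processor os compiler)
                  '((djgpp i386 ms-dos gcc)
                    (linux i386 linux gcc))))

((toy-plat 'get 'processor) 'djgpp)   ; => i386
((toy-plat 'get 'processor) 'be-os)   ; => #f
((toy-plat 'row:retrieve) 'linux)     ; => (linux i386 linux gcc)
```

The two-step call, first selecting a method and then applying it to keys, lets a caller fetch a method once and reuse it for many lookups.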

6.1.2.1 Single Row Operations  
6.1.2.2 Match-Keys  
6.1.2.3 Multi-Row Operations  
6.1.2.4 Indexed Sequential Access Methods  
6.1.2.5 Sequential Index Operations  
6.1.2.6 Table Administration  



6.1.2.1 Single Row Operations

The term row used below refers to a Scheme list of values (one for each column) in the order specified in the descriptor (table) for this table. Missing values appear as #f. Primary keys must not be missing.

Operation: relational-table row:insert
Returns a procedure of one argument, row, which adds the row, row, to this table. If a row for the primary key(s) specified by row already exists in this table, an error is signaled. The value returned is unspecified.

 
(define telephone-table-desc
        ((my-database 'create-table) 'telephone-table-desc))
(define ndrp (telephone-table-desc 'row:insert))
(ndrp '(1 #t name #f string))
(ndrp '(2 #f telephone
          (lambda (d)
            (and (string? d) (> (string-length d) 2)
                 (every
                  (lambda (c)
                    (memv c '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9
                                  #\+ #\( #\  #\) #\-)))
                  (string->list d))))
          string))

Operation: relational-table row:update
Returns a procedure of one argument, row, which adds the row, row, to this table. If a row for the primary key(s) specified by row already exists in this table, it will be overwritten. The value returned is unspecified.

Operation: relational-table row:retrieve
Returns a procedure of arguments key1 key2 ... which returns the row associated with primary keys key1, key2 ... if it exists, or #f otherwise.

 
((plat 'row:retrieve) 'linux) => (linux i386 linux gcc)
((plat 'row:retrieve) 'multics) => #f

Operation: relational-table row:remove
Returns a procedure of arguments key1 key2 ... which removes and returns the row associated with primary keys key1, key2 ... if it exists, or #f otherwise.

Operation: relational-table row:delete
Returns a procedure of arguments key1 key2 ... which deletes the row associated with primary keys key1, key2 ... if it exists. The value returned is unspecified.



6.1.2.2 Match-Keys

The (optional) match-key1 ... arguments are used to restrict actions of a whole-table operation to a subset of that table. Those procedures (returned by methods) which accept match-key arguments will accept any number of match-key arguments between zero and the number of primary keys in the table. Any unspecified match-key arguments default to #f.

The match-key1 ... restrict the actions of the table command to those records whose primary keys each satisfy the corresponding match-key argument. The arguments and their actions are:

#f
The false value matches any key in the corresponding position.
an object of type procedure
This procedure must take a single argument, the key in the corresponding position. Any key for which the procedure returns a non-false value is a match; any key for which the procedure returns #f is not.
other values
Any other value matches only those keys equal? to it.
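
The three rules above can be sketched in a few lines of portable Scheme. This is an illustration only; key-matches?, keys-match?, and starts-with-a? are hypothetical helpers, not part of SLIB.

```scheme
;; Sketch of the match-key rules (hypothetical helpers, not SLIB code).

;; Does KEY satisfy MATCH-KEY?
(define (key-matches? match-key key)
  (cond ((not match-key) #t)                       ; #f matches any key
        ((procedure? match-key)                    ; predicate decides
         (if (match-key key) #t #f))
        (else (equal? match-key key))))            ; other values: equal?

;; Do the primary keys KEYS satisfy MATCH-KEYS?  Trailing match-keys
;; that were not supplied default to #f, so they match anything.
(define (keys-match? match-keys keys)
  (cond ((null? keys) #t)
        ((null? match-keys) #t)
        ((key-matches? (car match-keys) (car keys))
         (keys-match? (cdr match-keys) (cdr keys)))
        (else #f)))

(define (starts-with-a? key)
  (char=? #\a (string-ref (symbol->string key) 0)))

(keys-match? '() '(aix))                       ; => #t
(keys-match? (list #f) '(aix))                 ; => #t
(keys-match? (list starts-with-a?) '(aix))     ; => #t
(keys-match? (list starts-with-a?) '(linux))   ; => #f
(keys-match? (list 12 #f) '(12 b 1))           ; => #t
(keys-match? (list 13) '(12 b 1))              ; => #f
```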

Operation: relational-table get* column-name
Returns a procedure of optional arguments match-key1 ... which returns a list of the values for the specified column for all rows in this table. The optional match-key1 ... arguments restrict actions to a subset of the table.

 
((plat 'get* 'processor)) =>
(i386 i8086 i386 i8086 i386 i386 i8086 m68000
 m68000 m68000 m68000 m68000 powerpc)

((plat 'get* 'processor) #f) =>
(i386 i8086 i386 i8086 i386 i386 i8086 m68000
 m68000 m68000 m68000 m68000 powerpc)

(define (a-key? key)
   (char=? #\a (string-ref (symbol->string key) 0)))

((plat 'get* 'processor) a-key?) =>
(m68000 m68000 m68000 m68000 m68000 powerpc)

((plat 'get* 'name) a-key?) =>
(atari-st-turbo-c atari-st-gcc amiga-sas/c-5.10
 amiga-aztec amiga-dice-c aix)



6.1.2.3 Multi-Row Operations

Operation: relational-table row:retrieve*
Returns a procedure of optional arguments match-key1 ... which returns a list of all rows in this table. The optional match-key1 ... arguments restrict actions to a subset of the table. For details see section 6.1.2.2 Match-Keys.

 
((plat 'row:retrieve*) a-key?) =>
((atari-st-turbo-c m68000 atari turbo-c)
 (atari-st-gcc m68000 atari gcc)
 (amiga-sas/c-5.10 m68000 amiga sas/c)
 (amiga-aztec m68000 amiga aztec)
 (amiga-dice-c m68000 amiga dice-c)
 (aix powerpc aix -))

Operation: relational-table row:remove*
Returns a procedure of optional arguments match-key1 ... which removes and returns a list of all rows in this table. The optional match-key1 ... arguments restrict actions to a subset of the table.

Operation: relational-table row:delete*
Returns a procedure of optional arguments match-key1 ... which deletes all rows from this table. The optional match-key1 ... arguments restrict deletions to a subset of the table. The value returned is unspecified. The descriptor table and catalog entry for this table are not affected.

Operation: relational-table for-each-row
Returns a procedure of arguments proc match-key1 ... which calls proc with each row in this table. The optional match-key1 ... arguments restrict actions to a subset of the table. For details see section 6.1.2.2 Match-Keys.

Note that row:insert* and row:update* do not use match-keys.

Operation: relational-table row:insert*
Returns a procedure of one argument, rows, which adds each row in the list of rows, rows, to this table. If a row for the primary key specified by an element of rows already exists in this table, an error is signaled. The value returned is unspecified.

Operation: relational-table row:update*
Returns a procedure of one argument, rows, which adds each row in the list of rows, rows, to this table. If a row for the primary key specified by an element of rows already exists in this table, it will be overwritten. The value returned is unspecified.
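
The difference between the insert and update operations is only in how an existing primary key is treated. The sketch below models that difference on a plain list of rows; the toy-row-... procedures are hypothetical, assume a single primary key in the first field, and return a new list rather than mutating a table as SLIB does.

```scheme
;; Toy model of insert versus update on a list of rows whose first
;; field is the single primary key (hypothetical helpers, not SLIB code).

;; Insert: an existing primary key is an error.
(define (toy-row-insert rows row)
  (if (assoc (car row) rows)
      (error "toy-row-insert: primary key already present" (car row))
      (append rows (list row))))

;; Update: an existing row with the same primary key is overwritten.
(define (toy-row-update rows row)
  (cond ((null? rows) (list row))
        ((equal? (caar rows) (car row)) (cons row (cdr rows)))
        (else (cons (car rows) (toy-row-update (cdr rows) row)))))

;; The starred variant applies the single-row step to each row in turn.
(define (toy-row-update* rows new-rows)
  (if (null? new-rows)
      rows
      (toy-row-update* (toy-row-update rows (car new-rows))
                       (cdr new-rows))))

(define toy-rows '((linux i386) (aix powerpc)))

(toy-row-update* toy-rows '((linux x86-64) (be-os powerpc)))
;; => ((linux x86-64) (aix powerpc) (be-os powerpc))
```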



6.1.2.4 Indexed Sequential Access Methods

Indexed Sequential Access Methods are a way of arranging database information so that records can be accessed both by key and by key sequence (ordering). ISAM is not part of Codd's relational model. Hardcore relational programmers might use some least-upper-bound join for every row to get them into an order.

Associative memory in B-Trees is an example of a database implementation which can support a native key ordering. SLIB's alist-table implementation uses sort to implement for-each-row-in-order, but does not support isam-next and isam-prev.

The multi-primary-key ordering employed by these operations is the lexicographic collation of those primary-key fields in their given order. For example:

 
(12 a 34) < (12 a 36) < (12 b 1) < (13 a 0)
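
As an illustration of this collation, here is a minimal sketch of a comparison over key-lists. field<? and key-list<? are hypothetical helpers, not SLIB procedures; the sketch assumes each field is a number or a symbol and that both keys have the same length and the same type in each position.

```scheme
;; Sketch of lexicographic ordering of multi-field keys
;; (hypothetical helpers, not SLIB code).

;; Order two fields of the same type: numbers numerically,
;; symbols by their names.
(define (field<? a b)
  (if (number? a)
      (< a b)
      (string<? (symbol->string a) (symbol->string b))))

;; The leftmost differing field decides; equal keys are not less.
(define (key-list<? k1 k2)
  (cond ((or (null? k1) (null? k2)) #f)
        ((field<? (car k1) (car k2)) #t)
        ((field<? (car k2) (car k1)) #f)
        (else (key-list<? (cdr k1) (cdr k2)))))

(key-list<? '(12 a 34) '(12 a 36))   ; => #t
(key-list<? '(12 a 36) '(12 b 1))    ; => #t
(key-list<? '(12 b 1)  '(13 a 0))    ; => #t
(key-list<? '(13 a 0)  '(12 b 1))    ; => #f
(key-list<? '(12 a 34) '(12 a 34))   ; => #f
```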



6.1.2.5 Sequential Index Operations

The following procedures are individually optional depending on the base-table implementation. If an operation is not supported, then calling the table with that operation symbol will return false.

Operation: relational-table for-each-row-in-order
Returns a procedure of arguments proc match-key1 ... which calls proc with each row in this table in the (implementation-dependent) natural, repeatable ordering for rows. The optional match-key1 ... arguments restrict actions to a subset of the table. For details see section 6.1.2.2 Match-Keys.

Operation: relational-table isam-next
Returns a procedure of arguments key1 key2 ... which returns the key-list identifying the lowest record higher than key1 key2 ... which is stored in the relational-table; or false if no higher record is present.

Operation: relational-table isam-next column-name
The symbol column-name names a key field. In the list returned by isam-next, that field, or a field to its left, will be changed. This allows one to skip over less significant key fields.

Operation: relational-table isam-prev
Returns a procedure of arguments key1 key2 ... which returns the key-list identifying the highest record less than key1 key2 ... which is stored in the relational-table; or false if no lower record is present.

Operation: relational-table isam-prev column-name
The symbol column-name names a key field. In the list returned by isam-prev, that field, or a field to its left, will be changed. This allows one to skip over less significant key fields.

For example, if a table has key fields:

 
(col1 col2)
(9 5)
(9 6)
(9 7)
(9 8)
(12 5)
(12 6)
(12 7)

Then:

 
((table 'isam-next)       9 5)          => (9 6)
((table 'isam-next 'col2) 9 5)          => (9 6)
((table 'isam-next 'col1) 9 5)          => (12 5)
((table 'isam-prev)       12 7)         => (12 6)
((table 'isam-prev 'col2) 12 7)         => (12 6)
((table 'isam-prev 'col1) 12 7)         => (9 8)
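
The effect of the optional column-name argument can be modeled on a sorted list of key-lists. The following sketch is not how any SLIB base table implements these operations; toy-isam-next and toy-isam-prev are hypothetical, take the key as a list, take the number of significant key fields in place of a column name, and assume numeric fields.

```scheme
;; Toy model of isam-next and isam-prev over a sorted list of numeric
;; key-lists (hypothetical helpers, not SLIB code).

;; The first N fields of KEY.
(define (key-prefix key n)
  (if (or (= n 0) (null? key))
      '()
      (cons (car key) (key-prefix (cdr key) (- n 1)))))

;; Lexicographic "less than" on equal-length lists of numbers.
(define (prefix<? k1 k2)
  (cond ((null? k1) #f)
        ((< (car k1) (car k2)) #t)
        ((> (car k1) (car k2)) #f)
        (else (prefix<? (cdr k1) (cdr k2)))))

;; Lowest stored key whose first N fields sort above those of KEY.
(define (toy-isam-next sorted-keys n key)
  (let ((target (key-prefix key n)))
    (let loop ((keys sorted-keys))
      (cond ((null? keys) #f)
            ((prefix<? target (key-prefix (car keys) n)) (car keys))
            (else (loop (cdr keys)))))))

;; Highest stored key whose first N fields sort below those of KEY.
(define (toy-isam-prev sorted-keys n key)
  (let ((target (key-prefix key n)))
    (let loop ((keys sorted-keys) (best #f))
      (cond ((null? keys) best)
            ((prefix<? (key-prefix (car keys) n) target)
             (loop (cdr keys) (car keys)))
            (else best)))))

(define stored-keys '((9 5) (9 6) (9 7) (9 8) (12 5) (12 6) (12 7)))

(toy-isam-next stored-keys 2 '(9 5))    ; => (9 6)
(toy-isam-next stored-keys 1 '(9 5))    ; => (12 5)
(toy-isam-prev stored-keys 2 '(12 7))   ; => (12 6)
(toy-isam-prev stored-keys 1 '(12 7))   ; => (9 8)
(toy-isam-next stored-keys 2 '(12 7))   ; => #f
```

Comparing only the first N fields is what makes the col1 case skip every remaining key that shares the same col1 value.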



6.1.2.6 Table Administration

Operation: relational-table column-names
Operation: relational-table column-foreigns
Operation: relational-table column-domains
Operation: relational-table column-types
Return a list of the column names, foreign-key table names, domain names, or type names respectively for this table. These 4 methods are different from the others in that the list is returned, rather than a procedure to obtain the list.

Operation: relational-table primary-limit
Returns the number of primary key fields in the relations in this table.

Operation: relational-table close-table
Subsequent operations to this table will signal an error.



6.1.3 Database Interpolation

(require 'database-interpolate)

Indexed sequential access methods allow finding the keys (having associations) closest to a given value. This facilitates the interpolation of associations between those in the table.

Function: interpolate-from-table table column
table should be a relational table with one numeric primary key field which supports the isam-prev and isam-next operations. column should be a symbol or exact positive integer designating a numerically valued column of table.

interpolate-from-table calculates and returns a value proportionally intermediate between its values in the next and previous key records contained in table. For keys larger than all the stored keys the value associated with the largest stored key is used. For keys smaller than all the stored keys the value associated with the smallest stored key is used.
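
The arithmetic described above can be sketched independently of SLIB. toy-interpolate below is a hypothetical helper that works on a sorted association list of (key . value) pairs instead of a relational table, and assumes linear interpolation between neighboring keys; it is meant only to show the proportional calculation and the clamping at both ends.

```scheme
;; Sketch of linear interpolation with clamping over a sorted
;; association list of (key . value) pairs (hypothetical helper; it
;; does not use SLIB tables or the isam operations).
(define (toy-interpolate sorted-alist x)
  (let loop ((prev #f) (rest sorted-alist))
    (cond ((null? rest)
           ;; X is above every stored key: use the largest key's value.
           (and prev (cdr prev)))
          ((= x (caar rest)) (cdar rest))           ; exact hit
          ((< x (caar rest))
           (if prev
               ;; X lies between PREV and the next entry: interpolate.
               (let ((x0 (car prev)) (y0 (cdr prev))
                     (x1 (caar rest)) (y1 (cdar rest)))
                 (+ y0 (* (- y1 y0) (/ (- x x0) (- x1 x0)))))
               ;; X is below every stored key: use the smallest key's value.
               (cdar rest)))
          (else (loop (car rest) (cdr rest))))))

(define samples '((0 . 10) (10 . 20) (20 . 40)))

(toy-interpolate samples 5)     ; => 15
(toy-interpolate samples 15)    ; => 30
(toy-interpolate samples -3)    ; => 10
(toy-interpolate samples 99)    ; => 40
```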



6.1.4 Embedded Commands

(require 'database-commands)

This enhancement wraps a utility layer on relational-database which runs commands stored in the database's *commands* table, as described below.

When an enhanced relational-database is called with a symbol which matches a name in the *commands* table, the associated procedure expression is evaluated and applied to the enhanced relational-database. A procedure should then be returned which the user can invoke on (optional) arguments.

The command *initialize* is special. If present in the *commands* table, open-database or open-database! will return the value of the *initialize* command. Notice that arbitrary code can be run when the *initialize* procedure is automatically applied to the enhanced relational-database.

Note also that you may wish to shadow or hide from the user the relational-database methods described in 6.2.4 Database Operations. If the underlying methods should remain accessible to code in the *commands* table, do this with a dispatch in the closure returned by the *initialize* expression rather than with entries in the *commands* table.
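
The command dispatch described at the start of this section can be modeled in plain Scheme. The sketch below is not SLIB's wrapper: toy-wrap-commands, toy-base, and the greeting and describe names are all hypothetical, and an association list of already-evaluated procedures stands in for the procedure expressions stored in the *commands* table.

```scheme
;; Toy model of command dispatch (hypothetical; not SLIB code).
;; BASE is a "methods" procedure.  COMMANDS is an association list
;; mapping command names to procedures of one argument, the wrapped
;; database.  SLIB stores expressions and evaluates them first; here
;; procedures are stored directly.
(define (toy-wrap-commands base commands)
  (define (wrapped name . args)
    (let ((entry (assq name commands)))
      (if entry
          ((cdr entry) wrapped)        ; apply command to wrapped database
          (apply base name args))))    ; otherwise use the base methods
  wrapped)

;; A stand-in for an unwrapped database with one method.
(define (toy-base name . args)
  (case name
    ((greeting) (lambda () "hello"))
    (else #f)))

(define toy-db
  (toy-wrap-commands
   toy-base
   (list (cons 'describe
               (lambda (rdb)
                 ;; The command returns the procedure the user invokes.
                 (lambda (prefix)
                   (string-append prefix ((rdb 'greeting)))))))))

((toy-db 'describe) "db says: ")   ; => "db says: hello"
((toy-db 'greeting))               ; => "hello"
```

Because the command receives the wrapped database, code in a command can call both other commands and the underlying methods.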

6.1.4.1 Database Extension  
6.1.4.2 Command Intrinsics  
6.1.4.3 Define-tables Example  
6.1.4.4 The *commands* Table  
6.1.4.5 Command Service  
6.1.4.6 Command Example  



6.1.4.1 Database Extension

Function: wrap-command-interface rdb
Returns relational database rdb wrapped with additional commands defined in its *commands* table.

Function: add-command-tables rdb
The relational database rdb must be mutable. add-command-tables adds a *commands* table to rdb; then returns (wrap-command-interface rdb).

Function: define-*commands* rdb spec-0 ...

Adds commands to the *commands* table as specified in spec-0 ... to the open relational-database rdb. Each spec has the form:

 
((<name> <rdb>) "comment" <expression1> <expression2> ...)
or
 
((<name> <rdb>) <expression1> <expression2> ...)

where <name> is the command name, <rdb> is a formal parameter bound to the calling relational database, "comment" describes the command, and <expression1>, <expression2>, ... are the body of the procedure.

define-*commands* adds to the *commands* table a command <name>:

 
(lambda (<name> <rdb>) <expression1> <expression2> ...)

Function: open-command-database filename
Function: open-command-database filename base-table-type
Returns an open enhanced relational database associated with filename. The database will be opened with base-table type base-table-type if supplied. If base-table-type is not supplied, open-command-database will attempt to deduce the correct base-table-type. If the database cannot be opened or if it lacks the *commands* table, #f is returned.

Function: open-command-database! filename
Function: open-command-database! filename base-table-type
Returns mutable open enhanced relational database ...

Function: open-command-database database
Returns database if it is an immutable relational database; #f otherwise.

Function: open-command-database! database
Returns database if it is a mutable relational database; #f otherwise.



6.1.4.2 Command Intrinsics

Some commands are defined in all extended relational-databases. They are called just like 6.2.4 Database Operations.

Operation: relational-database add-domain domain-row
Adds domain-row to the domains table if there is no row in the domains table associated with key (car domain-row) and returns #t. Otherwise returns #f.

For the fields and layout of the domain table, see section 6.2.2 Catalog Representation. Each domain row in the example below supplies five fields.

The following example adds 3 domains to the `build' database. `Optstring' is either a string or #f. filename is a string and build-whats is a symbol.

 
(for-each (build 'add-domain)
          '((optstring #f
                       (lambda (x) (or (not x) (string? x)))
                       string
                       #f)
            (filename #f #f string #f)
            (build-whats #f #f symbol #f)))

Operation: relational-database delete-domain domain-name
Removes and returns the domain-name row from the domains table.

Operation: relational-database domain-checker domain
Returns a procedure to check an argument for conformance to domain domain.



6.1.4.3 Define-tables Example

The following example shows a new database with the name of `foo.db' being created with tables describing processor families and processor/os/compiler combinations. The database is then solidified: saved and made immutable.

 
(require 'databases)
(define my-rdb (create-database "foo.db" 'alist-table))
(define-tables my-rdb
  '(processor-family
    ((family    atom))
    ((also-ran  processor-family))
    ((m68000           #f)
     (m68030           m68000)
     (i386             i8086)
     (i8086            #f)
     (powerpc          #f)))

  '(platform
    ((name      symbol))
    ((processor processor-family)
     (os        symbol)
     (compiler  symbol))
    ((aix              powerpc aix     -)
     (amiga-dice-c     m68000  amiga   dice-c)
     (amiga-aztec      m68000  amiga   aztec)
     (amiga-sas/c-5.10 m68000  amiga   sas/c)
     (atari-st-gcc     m68000  atari   gcc)
     (atari-st-turbo-c m68000  atari   turbo-c)
     (borland-c-3.1    i8086   ms-dos  borland-c)
     (djgpp            i386    ms-dos  gcc)
     (linux            i386    linux   gcc)
     (microsoft-c      i8086   ms-dos  microsoft-c)
     (os/2-emx         i386    os/2    gcc)
     (turbo-c-2        i8086   ms-dos  turbo-c)
     (watcom-9.0       i386    ms-dos  watcom))))

(solidify-database my-rdb)



6.1.4.4 The *commands* Table

The table *commands* in an enhanced relational-database has the fields (with domains):

 
PRI name        symbol
    parameters  parameter-list
    procedure   expression
    documentation string

The parameters field is a foreign key (domain parameter-list) of the *catalog-data* table and should have the value of a table described by *parameter-columns*. This parameter-list table describes the arguments suitable for passing to the associated command. The intent of this table is to be of a form such that different user-interfaces (for instance, pull-down menus or plain-text queries) can operate from the same table. A parameter-list table has the following fields:

 
PRI index       ordinal
    name        symbol
    arity       parameter-arity
    domain      domain
    defaulter   expression
    expander    expression
    documentation string

The arity field can take the values:

single
Requires a single parameter of the specified domain.
optional
A single parameter of the specified domain or zero parameters is acceptable.
boolean
A single boolean parameter or zero parameters (in which case #f is substituted) is acceptable.
nary
Any number of parameters of the specified domain are acceptable. The argument passed to the command function is always a list of the parameters.
nary1
One or more of parameters of the specified domain are acceptable. The argument passed to the command function is always a list of the parameters.
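
The arity values above amount to a constraint on how many values may be supplied for a parameter. A minimal sketch of that constraint follows; arity-accepts? is a hypothetical helper, not an SLIB procedure, and it ignores defaulting.

```scheme
;; Sketch: does a parameter of arity ARITY accept COUNT supplied values?
;; (Hypothetical helper, not SLIB code; defaults are not considered.)
(define (arity-accepts? arity count)
  (case arity
    ((single) (= count 1))              ; exactly one value
    ((optional boolean) (<= count 1))   ; zero or one value
    ((nary) #t)                         ; any number of values
    ((nary1) (>= count 1))              ; one or more values
    (else (error "unknown arity" arity))))

(arity-accepts? 'single 1)     ; => #t
(arity-accepts? 'single 0)     ; => #f
(arity-accepts? 'optional 0)   ; => #t
(arity-accepts? 'boolean 2)    ; => #f
(arity-accepts? 'nary 0)       ; => #t
(arity-accepts? 'nary1 0)      ; => #f
(arity-accepts? 'nary1 3)      ; => #t
```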

The domain field specifies the domain which a parameter or parameters in the indexth field must satisfy.

The defaulter field is an expression whose value is either #f or a procedure of one argument (the parameter-list) which returns a list of the default value or values as appropriate. Note that since the defaulter procedure is called every time a default parameter is needed for this column, sticky defaults can be implemented using shared state with the domain-integrity-rule.



6.1.4.5 Command Service

Function: make-command-server rdb table-name
Returns a procedure of 2 arguments, a (symbol) command and a call-back procedure. When this returned procedure is called, it looks up command in table table-name and calls the call-back procedure with arguments:
command
The command
command-value
The result of evaluating the expression in the procedure field of table-name and calling it with rdb.
parameter-name
A list of the official name of each parameter. Corresponds to the name field of the command's parameter-table.
positions
A list of the positive integer index of each parameter. Corresponds to the index field of the command's parameter-table.
arities
A list of the arities of each parameter. Corresponds to the arity field of the command's parameter-table. For a description of arity see table above.
types
A list of the type name of each parameter. Corresponds to the type-id field of the contents of the domain of the command's parameter-table.
defaulters
A list of the defaulters for each parameter. Corresponds to the defaulter field of the command's parameter-table.
domain-integrity-rules
A list of procedures (one for each parameter) which tests whether a value for a parameter is acceptable for that parameter. The procedure should be called with each datum in the list for nary arity parameters.
aliases
A list of lists of (alias parameter-name). There can be more than one alias per parameter-name.

For information about parameters, see section 4.4.4 Parameter lists.



6.1.4.6 Command Example

Here is an example of setting up a command with arguments and parsing those arguments from a getopt style argument list (see section 4.4.1 Getopt).

 
(require 'database-commands)
(require 'databases)
(require 'getopt-parameters)
(require 'parameters)
(require 'getopt)
(require 'fluid-let)
(require 'printf)

(define my-rdb (add-command-tables (create-database #f 'alist-table)))

(define-tables my-rdb
  '(foo-params
    *parameter-columns*
    *parameter-columns*
    ((1 single-string single string
        (lambda (pl) '("str")) #f "single string")
     (2 nary-symbols nary symbol
        (lambda (pl) '()) #f "zero or more symbols")
     (3 nary1-symbols nary1 symbol
        (lambda (pl) '(symb)) #f "one or more symbols")
     (4 optional-number optional ordinal
        (lambda (pl) '()) #f "zero or one number")
     (5 flag boolean boolean
        (lambda (pl) '(#f)) #f "a boolean flag")))
  '(foo-pnames
    ((name string))
    ((parameter-index ordinal))
    (("s" 1)
     ("single-string" 1)
     ("n" 2)
     ("nary-symbols" 2)
     ("N" 3)
     ("nary1-symbols" 3)
     ("o" 4)
     ("optional-number" 4)
     ("f" 5)
     ("flag" 5)))
  '(my-commands
    ((name symbol))
    ((parameters parameter-list)
     (parameter-names parameter-name-translation)
     (procedure expression)
     (documentation string))
    ((foo
      foo-params
      foo-pnames
      (lambda (rdb) (lambda args (print args)))
      "test command arguments"))))

(define (dbutil:serve-command-line rdb command-table command argv)
  (set! *argv* (if (vector? argv) (vector->list argv) argv))
  ((make-command-server rdb command-table)
   command
   (lambda (comname comval options positions
                    arities types defaulters dirs aliases)
     (apply comval (getopt->arglist options positions
                    arities types defaulters dirs aliases)))))

(define (cmd . opts)
  (fluid-let ((*optind* 1))
    (printf "%-34s => "
            (call-with-output-string
             (lambda (pt) (write (cons 'cmd opts) pt))))
    (set! opts (cons "cmd" opts))
    (force-output)
    (dbutil:serve-command-line
     my-rdb 'my-commands 'foo opts)))

(cmd)                              => ("str" () (symb) () #f)
(cmd "-f")                         => ("str" () (symb) () #t)
(cmd "--flag")                     => ("str" () (symb) () #t)
(cmd "-o177")                      => ("str" () (symb) (177) #f)
(cmd "-o" "177")                   => ("str" () (symb) (177) #f)
(cmd "--optional" "621")           => ("str" () (symb) (621) #f)
(cmd "--optional=621")             => ("str" () (symb) (621) #f)
(cmd "-s" "speciality")            => ("speciality" () (symb) () #f)
(cmd "-sspeciality")               => ("speciality" () (symb) () #f)
(cmd "--single" "serendipity")     => ("serendipity" () (symb) () #f)
(cmd "--single=serendipity")       => ("serendipity" () (symb) () #f)
(cmd "-n" "gravity" "piety")       => ("str" () (piety gravity) () #f)
(cmd "-ngravity" "piety")          => ("str" () (piety gravity) () #f)
(cmd "--nary" "chastity")          => ("str" () (chastity) () #f)
(cmd "--nary=chastity" "")         => ("str" () ( chastity) () #f)
(cmd "-N" "calamity")              => ("str" () (calamity) () #f)
(cmd "-Ncalamity")                 => ("str" () (calamity) () #f)
(cmd "--nary1" "surety")           => ("str" () (surety) () #f)
(cmd "--nary1=surety")             => ("str" () (surety) () #f)
(cmd "-N" "levity" "fealty")       => ("str" () (fealty levity) () #f)
(cmd "-Nlevity" "fealty")          => ("str" () (fealty levity) () #f)
(cmd "--nary1" "surety" "brevity") => ("str" () (brevity surety) () #f)
(cmd "--nary1=surety" "brevity")   => ("str" () (brevity surety) () #f)
(cmd "-?")
-|
Usage: cmd [OPTION ARGUMENT ...] ...

  -f, --flag
  -o, --optional[=]<number>
  -n, --nary[=]<symbols> ...
  -N, --nary1[=]<symbols> ...
  -s, --single[=]<string>

ERROR: getopt->parameter-list "unrecognized option" "-?"



6.1.5 Database Macros

(require 'within-database)

The object-oriented programming interface to SLIB relational databases has failed to support clear, understandable, and modular code-writing for database applications.

This seems to be a failure of the object-oriented paradigm where the type of an object is not manifest (or even traceable) in source code.

within-database, along with the `databases' package, reorganizes high-level database functions toward a more declarative style. Using this package, one can tag database table and command declarations for emacs:

 
etags -lscheme -r'/ *(define-\(command\|table\) (\([^; \t]+\)/\2/' \
      source1.scm ...

6.1.5.1 Within-database Example  

Function: within-database database statement-1 ...

within-database creates a lexical scope in which the commands define-table and define-command create tables and *commands*-table entries respectively in open relational database database.

within-database returns database.

Syntax: define-command (<name> <rdb>) "comment" <expression1> <expression2> ...
Syntax: define-command (<name> <rdb>) <expression1> <expression2> ...

Adds to the *commands* table a command <name>:

 
(lambda (<name> <rdb>) <expression1> <expression2> ...)

Syntax: define-table <name> <descriptor-name> <descriptor-name> <rows>
Syntax: define-table <name> <primary-key-fields> <other-fields> <rows>

where <name> is the table name, <descriptor-name> is the symbol name of a descriptor table, <primary-key-fields> and <other-fields> describe the primary keys and other fields respectively, and <rows> is a list of data rows to be added to the table.

<primary-key-fields> and <other-fields> are lists of field descriptors of the form:

 
(<column-name> <domain>)
or
 
(<column-name> <domain> <column-integrity-rule>)

where <column-name> is the column name, <domain> is the domain of the column, and <column-integrity-rule> is an expression whose value is a procedure of one argument (which returns #f to signal an error).

If <domain> is not a defined domain name and it matches the name of this table or an already defined (in one of spec-0 ...) single key field table, a foreign-key domain will be created for it.
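A column integrity rule is an ordinary one-argument procedure. As a small sketch in portable Scheme (the name percentage? and the field descriptor mentioned in the comment are hypothetical illustrations, not taken from SLIB, and a number domain is assumed to be defined):

```scheme
;; Hypothetical rule: accept only exact integers from 0 through 100.
;; Returning #f signals an integrity error for the column value.
(define (percentage? x)
  (and (integer? x) (exact? x) (<= 0 x 100)))

;; Such a rule would appear as the third element of a field descriptor,
;; for example (load number percentage?) in a define-table form.
(percentage? 42)      ; => #t
(percentage? 250)     ; => #f
(percentage? 'high)   ; => #f
```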



6.1.5.1 Within-database Example

Here is an example of within-database macros:

 
(require 'within-database)

(define my-rdb
  (add-command-tables
   (create-database "foo.db" 'alist-table)))

(within-database my-rdb
  (define-command (*initialize* rdb)
    "Print Welcome"
    (display "Welcome")
    (newline)
    rdb)
  (define-command (without-documentation rdb)
    (display "without-documentation called")
    (newline))
  (define-table (processor-family
                 ((family   atom))
                 ((also-ran processor-family)))
    (m68000  #f)
    (m68030  m68000)
    (i386    i8086)
    (i8086   #f)
    (powerpc #f))
  (define-table (platform
                 ((name symbol))
                 ((processor processor-family)
                  (os        symbol)
                  (compiler  symbol)))
    (aix              powerpc aix     -)
    ;; ...
    (amiga-aztec      m68000  amiga   aztec)
    (amiga-sas/c-5.10 m68000  amiga   sas/c)
    (atari-st-gcc     m68000  atari   gcc)
    ;; ...
    (watcom-9.0       i386    ms-dos  watcom))
  (define-command (get-processor rdb)
    "Get processor for given platform."
    (((rdb 'open-table) 'platform #f) 'get 'processor)))

(close-database my-rdb)

(set! my-rdb (open-command-database! "foo.db"))
-|
Welcome

(my-rdb 'without-documentation)
-|
without-documentation called

((my-rdb 'get-processor) 'amiga-sas/c-5.10)
=> m68000

(close-database my-rdb)



6.1.6 Database Browser

(require 'database-browse)

Procedure: browse database

Prints the names of all the tables in database and sets browse's default to database.

Procedure: browse

Prints the names of all the tables in the default database.

Procedure: browse table-name

For each record of the table named by the symbol table-name, prints a line composed of all the field values.

Procedure: browse pathname

Opens the database named by the string pathname, prints the names of all its tables, and sets browse's default to the database.

Procedure: browse database table-name

Sets browse's default to database and prints the records of the table named by the symbol table-name.

Procedure: browse pathname table-name

Opens the database named by the string pathname and sets browse's default to it; browse prints the records of the table named by the symbol table-name.



6.2 Relational Infrastructure

6.2.1 Base Table  
6.2.2 Catalog Representation  
6.2.3 Relational Database Objects  
6.2.4 Database Operations  



6.2.1 Base Table

A base-table is the primitive database layer upon which SLIB relational databases are built. At the minimum, it must support the types integer, symbol, string, and boolean. The base-table may restrict the size of integers, symbols, and strings it supports.

A base table implementation is available as the value of the identifier naming it (e.g., alist-table) after requiring the symbol of that name.

Feature: alist-table
(require 'alist-table)

Association-list base tables support all Scheme types and are suitable for small databases. In order to be retrieved after being written to a file, the data stored should include only objects which are readable and writeable in the Scheme implementation.

The alist-table base-table implementation is included in the SLIB distribution.

WB is a B-tree database package with SCM interfaces. Being disk-based, WB databases readily store and access hundreds of megabytes of data. WB comes with two base-table embeddings.

Feature: wb-table
(require 'wb-table)

wb-table supports Scheme expressions for keys and values whose text representations are less than 255 characters in length. See section `wb-table' in WB.

Feature: rwb-isam
(require 'rwb-isam)

rwb-isam is a sophisticated base-table implementation built on WB and SCM which uses binary numerical formats for key and non-key fields. It supports IEEE floating-point and fixed-precision integer keys with the correct numerical collation order.

The rest of this section documents the interface for a base table implementation from which the 6.1 Relational Database package constructs a Relational system. It will be of interest primarily to those wishing to port or write new base-table implementations.

Variable: *base-table-implementations*
To support automatic dispatch for open-database, each base-table module adds an association to *base-table-implementations* when loaded. This association is the list of the base-table symbol and the value returned by (make-relational-system base-table).

6.2.1.1 The Base  
6.2.1.2 Base Tables  
6.2.1.3 Base Field Types  
6.2.1.4 Composite Keys  
6.2.1.5 Base Record Operations  
6.2.1.6 Match Keys  
6.2.1.7 Aggregate Base Operations  
6.2.1.8 Base ISAM Operations  



6.2.1.1 The Base

All of these functions are accessed through a single procedure by calling that procedure with the symbol name of the operation. A procedure will be returned if that operation is supported and #f otherwise. For example:

 
(require 'alist-table)
(define my-base (alist-table 'make-base))
my-base         => *a procedure*
(define foo (alist-table 'foo))
foo             => #f

Operation: base-table make-base filename key-dimension column-types
Returns a new, open, low-level database (collection of tables) associated with filename. This returned database has an empty table associated with catalog-id. The positive integer key-dimension is the number of keys composed to make a primary-key for the catalog table. The list of symbols column-types describes the types of each column for that table. If the database cannot be created as specified, #f is returned.

Calling the close-base method on this database and possibly other operations will cause filename to be written to. If filename is #f, a temporary, non-disk-based database will be created if such can be supported by the base table implementation.

Operation: base-table open-base filename mutable
Returns an open low-level database associated with filename. If mutable is #t, this database will have methods capable of effecting change to the database. If mutable is #f, only methods for inquiring the database will be available. If the database cannot be opened as specified #f is returned.

Calling the close-base (and possibly other) method on a mutable database will cause filename to be written to.

Operation: base-table write-base lldb filename
Causes the low-level database lldb to be written to filename. If the write is successful, also causes lldb to henceforth be associated with filename. Calling the close-database (and possibly other) method on lldb may cause filename to be written to. If filename is #f, this database will be changed to a temporary, non-disk-based database if such can be supported by the underlying base table implementation. If the operations completed successfully, #t is returned. Otherwise, #f is returned.

Operation: base-table sync-base lldb
Causes the file associated with the low-level database lldb to be updated to reflect its current state. If the associated filename is #f, no action is taken and #f is returned. If this operation completes successfully, #t is returned. Otherwise, #f is returned.

Operation: base-table close-base lldb
Causes the low-level database lldb to be written to its associated file (if any). If the write is successful, subsequent operations to lldb will signal an error. If the operations complete successfully, #t is returned. Otherwise, #f is returned.



6.2.1.2 Base Tables

Operation: base-table make-table lldb key-dimension column-types
Returns the ordinal base-id for a new base table, otherwise returns #f. The base table can then be opened using (open-table lldb base-id). The positive integer key-dimension is the number of keys composed to make a primary-key for this table. The list of symbols column-types describes the types of each column.

Operation: base-table open-table lldb base-id key-dimension column-types
Returns a handle for an existing base table in the low-level database lldb if that table exists and can be opened in the mode indicated by mutable, otherwise returns #f.

As with make-table, the positive integer key-dimension is the number of keys composed to make a primary-key for this table. The list of symbols column-types describes the types of each column.

Operation: base-table kill-table lldb base-id key-dimension column-types
Returns #t if the base table associated with base-id was removed from the low level database lldb, and #f otherwise.

Operation: base-table catalog-id
A constant base-id ordinal suitable for passing as a parameter to open-table. catalog-id will be used as the base table for the system catalog.



6.2.1.3 Base Field Types

Operation: base-table supported-type? symbol
Returns #t if symbol names a type allowed as a column value by the implementation, and #f otherwise. At a minimum, an implementation must support the types integer, ordinal, symbol, string, and boolean.

Operation: base-table supported-key-type? symbol
Returns #t if symbol names a type allowed as a key value by the implementation, and #f otherwise. At a minimum, an implementation must support the types ordinal and symbol.

An ordinal is an exact positive integer. The other types are standard Scheme.



6.2.1.4 Composite Keys

Operation: base-table make-keyifier-1 type
Returns a procedure which accepts a single argument which must be of type type. This returned procedure returns an object suitable for being a key argument in the functions whose descriptions follow.

Any 2 arguments of the supported type passed to the returned function which are not equal? must result in returned values which are not equal?.

Operation: base-table make-list-keyifier key-dimension types
The list of symbols types must have at least key-dimension elements. Returns a procedure which accepts a list of length key-dimension whose element types must correspond to the types named by types. This returned procedure combines the elements of its list argument into an object suitable for being a key argument in the functions whose descriptions follow.

Any 2 lists of supported types (which must at least include symbols and non-negative integers) passed to the returned function which are not equal? must result in returned values which are not equal?.

Operation: base-table make-key-extractor key-dimension types column-number
Returns a procedure which accepts objects produced by application of the result of (make-list-keyifier key-dimension types). This procedure returns a key which is equal? to the column-numberth element of the list which was passed to create composite-key. The list types must have at least key-dimension elements.

Operation: base-table make-key->list key-dimension types
Returns a procedure which accepts objects produced by application of the result of (make-list-keyifier key-dimension types). This procedure returns a list of keys which are elementwise equal? to the list which was passed to create composite-key.
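
The contract of these four operations can be sketched in portable Scheme. This is only an illustration: it assumes a hypothetical implementation in which a composite key is simply a fresh list of the first key-dimension elements, and that column-number counts from 1 as in the descriptor tables. The sketch-... names are not part of SLIB, and real base tables may encode keys quite differently.

```scheme
;; Hypothetical stand-ins for the base-table operations; not SLIB code.

;; Copy the first N elements of LST into a fresh list.
(define (take-n lst n)
  (if (= n 0)
      '()
      (cons (car lst) (take-n (cdr lst) (- n 1)))))

;; make-keyifier-1: for a single key, the object itself serves as the key.
(define (sketch-make-keyifier-1 type)
  (lambda (obj) obj))

;; make-list-keyifier: combine KEY-DIMENSION elements into one key object.
(define (sketch-make-list-keyifier key-dimension types)
  (lambda (key-list) (take-n key-list key-dimension)))

;; make-key-extractor: recover the COLUMN-NUMBERth (1-based) component.
(define (sketch-make-key-extractor key-dimension types column-number)
  (lambda (composite-key) (list-ref composite-key (- column-number 1))))

;; make-key->list: recover the whole key list.
(define (sketch-make-key->list key-dimension types)
  (lambda (composite-key) (take-n composite-key key-dimension)))

(define types '(symbol ordinal string))
(define keyify (sketch-make-list-keyifier 2 types))
(define second-key (sketch-make-key-extractor 2 types 2))
(define key->list (sketch-make-key->list 2 types))

(define k (keyify '(amiga 3)))
(second-key k)                          ; => 3
(key->list k)                           ; => (amiga 3)
(equal? k (keyify '(amiga 4)))          ; => #f
```

Key lists that are not equal? yield composite keys that are not equal?, which is the property the operations above require.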



6.2.1.5 Base Record Operations

In the following functions, the key argument can always be assumed to be the value returned by a call to a keyify routine.

Operation: base-table present? handle key
Returns a non-#f value if there is a row associated with key in the table opened in handle and #f otherwise.

Operation: base-table make-getter key-dimension types
Returns a procedure which takes arguments handle and key. This procedure returns a list of the non-primary values of the relation (in the base table opened in handle) whose primary key is key if it exists, and #f otherwise.

make-getter-1 is a new operation. The relational-database module works with older base-table implementations by using make-getter.

Operation: base-table make-getter-1 key-dimension types index
Returns a procedure which takes arguments handle and key. This procedure returns the value of the indexth field (in the base table opened in handle) whose primary key is key if it exists, and #f otherwise.

index must be larger than key-dimension.

Operation: base-table make-putter key-dimension types
Returns a procedure which takes arguments handle and key and value-list. This procedure associates the primary key key with the values in value-list (in the base table opened in handle) and returns an unspecified value.

Operation: base-table delete handle key
Removes the row associated with key from the table opened in handle. An unspecified value is returned.



6.2.1.6 Match Keys

A match-keys argument is a list of length equal to the number of primary keys. The match-keys restrict the actions of the table command to those records whose primary keys all satisfy the corresponding element of the match-keys list. The elements and their actions are:

#f
The false value matches any key in the corresponding position.
an object of type procedure
This procedure must take a single argument, the key in the corresponding position. Any key for which the procedure returns a non-false value is a match; any key for which the procedure returns #f is not.
other values
Any other value matches only those keys equal? to it.
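
The three matching rules can be expressed as a short predicate. The following is a minimal sketch in portable Scheme; match-element? and match-keys? are hypothetical helper names for illustration, not procedures provided by SLIB, and the key is assumed to have already been decoded into a list of its components.

```scheme
;; Hypothetical helpers illustrating match-keys semantics; not SLIB code.

;; Does one key component satisfy one match-keys element?
(define (match-element? matcher key)
  (cond ((not matcher) #t)                             ; #f matches any key
        ((procedure? matcher) (and (matcher key) #t))  ; predicate decides
        (else (equal? matcher key))))                  ; compare with equal?

;; A record matches when every primary-key component satisfies the
;; corresponding element.  Both lists are assumed to have the same length.
(define (match-keys? match-keys key-list)
  (or (null? match-keys)
      (and (match-element? (car match-keys) (car key-list))
           (match-keys? (cdr match-keys) (cdr key-list)))))

(match-keys? (list #f 'amiga) '(m68000 amiga))         ; => #t
(match-keys? (list symbol? 'atari) '(m68000 amiga))    ; => #f
(match-keys? (list #f #f) '(i386 ms-dos))              ; => #t
```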



6.2.1.7 Aggregate Base Operations

The key-dimension and column-types arguments are needed to decode the composite-keys for matching with match-keys.

Operation: base-table delete* handle key-dimension column-types match-keys
Removes all rows which satisfy match-keys from the table opened in handle. An unspecified value is returned.

Operation: base-table for-each-key handle procedure key-dimension column-types match-keys
Calls procedure once with each key in the table opened in handle which satisfies match-keys, in an unspecified order. An unspecified value is returned.

Operation: base-table map-key handle procedure key-dimension column-types match-keys
Returns a list of the values returned by calling procedure once with each key in the table opened in handle which satisfies match-keys, in an unspecified order.



6.2.1.8 Base ISAM Operations

These operations are optional for a Base-Table implementation.

Operation: base-table ordered-for-each-key handle procedure key-dimension column-types match-keys
Calls procedure once with each key in the table opened in handle which satisfies match-keys, in the natural order for the types of the primary key fields of that table. An unspecified value is returned.

Operation: base-table make-nexter handle key-dimension column-types index
Returns a procedure of arguments key1 key2 ... which returns the key-list identifying the lowest record higher than key1 key2 ... which is stored in the base-table and which differs in column index or a lower indexed key; or false if no higher record is present.

Operation: base-table make-prever handle key-dimension column-types index
Returns a procedure of arguments key1 key2 ... which returns the key-list identifying the highest record less than key1 key2 ... which is stored in the base-table and which differs in column index or a lower indexed key; or false if no lower record is present.
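
Because these operations are optional, and a base table returns #f for any operation it does not support (see 6.2.1.1 The Base), a caller can test for them before use. The sketch below shows one way to fall back from ordered to unordered traversal. mock-base-table and key-traverser are hypothetical names; the mock stands in for a real base-table procedure, treats its handle as a plain list of keys, and ignores match-keys.

```scheme
;; MOCK-BASE-TABLE is a hypothetical stand-in for a base-table procedure
;; that implements for-each-key but none of the optional ISAM operations.
(define (mock-base-table operation)
  (case operation
    ((for-each-key)
     (lambda (handle procedure key-dimension column-types match-keys)
       (for-each procedure handle)))   ; the mock "handle" is a list of keys
    (else #f)))

;; Prefer ordered traversal when available, else fall back to unordered.
(define (key-traverser base-table)
  (or (base-table 'ordered-for-each-key)
      (base-table 'for-each-key)))

(define visited '())
((key-traverser mock-base-table)
 '(b a c)
 (lambda (key) (set! visited (cons key visited)))
 1 '(symbol) '(#f))
visited                                 ; => (c a b)
```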



6.2.2 Catalog Representation

Each database (in an implementation) has a system catalog which describes all the user accessible tables in that database (including itself).

The system catalog base table has the following fields. PRI indicates a primary key for that table.

 
PRI table-name
    column-limit            the highest column number
    coltab-name             descriptor table name
    bastab-id               data base table identifier
    user-integrity-rule
    view-procedure          A scheme thunk which, when called,
                            produces a handle for the view.  coltab
                            and bastab are specified if and only if
                            view-procedure is not.

Descriptors for base tables (not views) are tables (pointed to by system catalog). Descriptor (base) tables have the fields:

 
PRI column-number           sequential integers from 1
    primary-key?            boolean TRUE for primary key components
    column-name
    column-integrity-rule
    domain-name

A primary key is any column marked as primary-key? in the corresponding descriptor table. All the primary-key? columns must have lower column numbers than any non-primary-key? columns. Every table must have at least one primary key. Primary keys must be sufficient to distinguish all rows from each other in the table. All of the system defined tables have a single primary key.
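
Two of these rules (at least one primary key, and primary-key columns numbered before all others) can be checked mechanically. The following sketch assumes a hypothetical in-memory representation in which each descriptor row is a list (column-number primary-key? column-name), sorted by column number; valid-key-layout? is not an SLIB procedure.

```scheme
;; Each row is (column-number primary-key? column-name); the remaining
;; descriptor fields are omitted.  This layout is an assumption of the sketch.
(define (valid-key-layout? rows)
  (let loop ((rows rows) (seen-primary? #f) (seen-other? #f))
    (cond ((null? rows) seen-primary?)            ; need at least one primary key
          ((cadr (car rows))                      ; a primary-key? column
           (and (not seen-other?)                 ; must precede all others
                (loop (cdr rows) #t #f)))
          (else (loop (cdr rows) seen-primary? #t)))))

(valid-key-layout? '((1 #t name) (2 #f processor) (3 #f os)))   ; => #t
(valid-key-layout? '((1 #f os) (2 #t name)))                    ; => #f
(valid-key-layout? '((1 #f os)))                                ; => #f
```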

A domain is a category describing the allowable values to occur in a column. It is described by a (base) table with the fields:

 
PRI domain-name
    foreign-table
    domain-integrity-rule
    type-id
    type-param

The type-id field value is a symbol. This symbol may be used by the underlying base table implementation in storing that field.

If the foreign-table field is non-#f then that field names a table from the catalog. The values for that domain must match a primary key of the table referenced by the type-param (or #f, if allowed). This package currently does not support composite foreign-keys.

The types for which support is planned are:

 
    atom
    symbol
    string                  [<length>]
    number                  [<base>]
    money                   <currency>
    date-time
    boolean

    foreign-key             <table-name>
    expression
    virtual                 <expression>



6.2.3 Relational Database Objects

This object-oriented interface is deprecated for typical database applications; 6.1.1 Using Databases provides an application programmer interface which is easier to understand and use.

Function: make-relational-system base-table-implementation

Returns a procedure implementing a relational database using the base-table-implementation.

All of the operations of a base table implementation are accessed through a procedure defined by requiring that implementation. Similarly, all of the operations of the relational database implementation are accessed through the procedure returned by make-relational-system. For instance, a new relational database could be created from the procedure returned by make-relational-system by:

 
(require 'alist-table)
(define relational-alist-system
        (make-relational-system alist-table))
(define create-alist-database
        (relational-alist-system 'create-database))
(define my-database
        (create-alist-database "mydata.db"))

What follows are the descriptions of the methods available from the relational system returned by a call to make-relational-system.

Operation: relational-system create-database filename

Returns an open, nearly empty relational database associated with filename. The only tables defined are the system catalog and domain table. Calling the close-database method on this database and possibly other operations will cause filename to be written to. If filename is #f, a temporary, non-disk-based database will be created if such can be supported by the underlying base table implementation. If the database cannot be created as specified, #f is returned. For the fields and layout of descriptor tables, see section 6.2.2 Catalog Representation.

Operation: relational-system open-database filename mutable?

Returns an open relational database associated with filename. If mutable? is #t, this database will have methods capable of effecting change to the database. If mutable? is #f, only methods for inquiring the database will be available. Calling the close-database (and possibly other) method on a mutable database will cause filename to be written to. If the database cannot be opened as specified, #f is returned.



6.2.4 Database Operations

This object-oriented interface is deprecated for typical database applications; 6.1.1 Using Databases provides an application programmer interface which is easier to understand and use.

These are the descriptions of the methods available from an open relational database. A method is retrieved from a database by calling the database with the symbol name of the operation. For example:

 
(define my-database
        (create-alist-database "mydata.db"))
(define telephone-table-desc
        ((my-database 'create-table) 'telephone-table-desc))

Operation: relational-database close-database
Causes the relational database to be written to its associated file (if any). If the write is successful, subsequent operations to this database will signal an error. If the operations completed successfully, #t is returned. Otherwise, #f is returned.

Operation: relational-database write-database filename
Causes the relational database to be written to filename. If the write is successful, also causes the database to henceforth be associated with filename. Calling the close-database (and possibly other) method on this database will cause filename to be written to. If filename is #f, this database will be changed to a temporary, non-disk-based database if such can be supported by the underlying base table implementation. If the operations completed successfully, #t is returned. Otherwise, #f is returned.

Operation: relational-database sync-database
Causes any pending updates to the database file to be written out. If the operations completed successfully, #t is returned. Otherwise, #f is returned.

Operation: relational-database solidify-database
Causes any pending updates to the database file to be written out. If the writes completed successfully, then the database is changed to be immutable and #t is returned. Otherwise, #f is returned.

Operation: relational-database table-exists? table-name
Returns #t if table-name exists in the system catalog, otherwise returns #f.

Operation: relational-database open-table table-name mutable?
Returns a methods procedure for an existing relational table in this database if it exists and can be opened in the mode indicated by mutable?, otherwise returns #f.

These methods will be present only in mutable databases.

Operation: relational-database delete-table table-name
Removes and returns the table-name row from the system catalog if the table or view associated with table-name gets removed from the database, and #f otherwise.

Operation: relational-database create-table table-desc-name
Returns a methods procedure for a new (open) relational table for describing the columns of a new base table in this database, otherwise returns #f. For the fields and layout of descriptor tables, see section 6.2.2 Catalog Representation.

Operation: relational-database create-table table-name table-desc-name
Returns a methods procedure for a new (open) relational table with columns as described by table-desc-name, otherwise returns #f.

Operation: relational-database create-view ??
Operation: relational-database project-table ??
Operation: relational-database restrict-table ??
Operation: relational-database cart-prod-tables ??
Not yet implemented.



6.3 Weight-Balanced Trees

(require 'wt-tree)

Balanced binary trees are a useful data structure for maintaining large sets of ordered objects or sets of associations whose keys are ordered. MIT Scheme has a comprehensive implementation of weight-balanced binary trees which has several advantages over the other data structures for large aggregates:

These features make weight-balanced trees suitable for a wide range of applications, especially those that require large numbers of sets or discrete maps. Applications that have a few global databases and/or concentrate on element-level operations like insertion and lookup are probably better off using hash-tables or red-black trees.

The size of a tree is the number of associations that it contains. Weight balanced binary trees are balanced to keep the sizes of the subtrees of each node within a constant factor of each other. This ensures logarithmic times for single-path operations (like lookup and insertion). A weight balanced tree takes space that is proportional to the number of associations in the tree. For the current implementation, the constant of proportionality is six words per association.

Weight balanced trees can be used as an implementation for either discrete sets or discrete maps (associations). Sets are implemented by ignoring the datum that is associated with the key. Under this scheme, if an association exists in the tree, this indicates that the key of the association is a member of the set. Typically a value such as (), #t or #f is associated with the key.

Many operations can be viewed as computing a result that, depending on whether the tree arguments are thought of as sets or maps, is known by two different names. An example is wt-tree/member?, which, when regarding the tree argument as a set, computes the set membership operation, but, when regarding the tree as a discrete map, wt-tree/member? is the predicate testing if the map is defined at an element in its domain. Most names in this package have been chosen based on interpreting the trees as sets, hence the name wt-tree/member? rather than wt-tree/defined-at?.

The weight balanced tree implementation is a run-time-loadable option. To use weight balanced trees, execute

 
(load-option 'wt-tree)

once before calling any of the procedures defined here.

6.3.1 Construction of Weight-Balanced Trees  
6.3.2 Basic Operations on Weight-Balanced Trees  
6.3.3 Advanced Operations on Weight-Balanced Trees  
6.3.4 Indexing Operations on Weight-Balanced Trees  



6.3.1 Construction of Weight-Balanced Trees

Binary trees require there to be a total order on the keys used to arrange the elements in the tree. Weight balanced trees are organized by types, where the type is an object encapsulating the ordering relation. Creating a tree is a two-stage process. First a tree type must be created from the predicate which gives the ordering. The tree type is then used for making trees, either empty or singleton trees or trees from other aggregate structures like association lists. Once created, a tree `knows' its type and the type is used to test compatibility between trees in operations taking two trees. Usually a small number of tree types are created at the beginning of a program and used many times throughout the program's execution.

procedure+: make-wt-tree-type key<?
This procedure creates and returns a new tree type based on the ordering predicate key<?. Key<? must be a total ordering, having the property that for all key values a, b and c:

 
(key<? a a)                         => #f
(and (key<? a b) (key<? b a))       => #f
(if (and (key<? a b) (key<? b c))
    (key<? a c)
    #t)                             => #t

Two key values are assumed to be equal if neither is less than the other by key<?.

Each call to make-wt-tree-type returns a distinct value, and trees are only compatible if their tree types are eq?. A consequence is that trees that are intended to be used in binary tree operations must all be created with a tree type originating from the same call to make-wt-tree-type.
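
The ordering requirements on key<? can be spot-checked before a predicate is passed to make-wt-tree-type. The sketch below, in portable Scheme, tests the three properties exhaustively over a finite list of sample keys. The helper names every? and plausible-ordering? are hypothetical and not part of the wt-tree package, and passing the check on a sample does not prove the predicate is a valid ordering for all keys.

```scheme
;; Hypothetical helpers, not part of the wt-tree package.
(define (every? pred lst)
  (or (null? lst)
      (and (pred (car lst)) (every? pred (cdr lst)))))

;; Check the three ordering properties over all triples drawn from SAMPLES.
(define (plausible-ordering? key<? samples)
  (every?
   (lambda (a)
     (and (not (key<? a a))                               ; irreflexive
          (every?
           (lambda (b)
             (and (not (and (key<? a b) (key<? b a)))     ; asymmetric
                  (every?
                   (lambda (c)
                     (if (and (key<? a b) (key<? b c))    ; transitive
                         (key<? a c)
                         #t))
                   samples)))
           samples)))
   samples))

(plausible-ordering? < '(1 2 3 5 8))              ; => #t
(plausible-ordering? <= '(1 2 3))                 ; => #f
(plausible-ordering? string<? '("a" "b" "c"))     ; => #t
```

The second call fails because (<= a a) is true, so <= violates the first property and is unsuitable as key<?.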

variable+: number-wt-type
A standard tree type for trees with numeric keys. Number-wt-type could have been defined by

 
(define number-wt-type (make-wt-tree-type  <))

variable+: string-wt-type
A standard tree type for trees with string keys. String-wt-type could have been defined by

 
(define string-wt-type (make-wt-tree-type  string<?))

procedure+: make-wt-tree wt-tree-type
This procedure creates and returns a newly allocated weight balanced tree. The tree is empty, i.e. it contains no associations. Wt-tree-type is a weight balanced tree type obtained by calling make-wt-tree-type; the returned tree has this type.

procedure+: singleton-wt-tree wt-tree-type key datum
This procedure creates and returns a newly allocated weight balanced tree. The tree contains a single association, that of datum with key. Wt-tree-type is a weight balanced tree type obtained by calling make-wt-tree-type; the returned tree has this type.

procedure+: alist->wt-tree tree-type alist
Returns a newly allocated weight-balanced tree that contains the same associations as alist. This procedure is equivalent to:

 
(lambda (type alist)
  (let ((tree (make-wt-tree type)))
    (for-each (lambda (association)
                (wt-tree/add! tree
                              (car association)
                              (cdr association)))
              alist)
    tree))



6.3.2 Basic Operations on Weight-Balanced Trees

This section describes the basic tree operations on weight balanced trees. These operations are the usual tree operations for insertion, deletion and lookup, some predicates and a procedure for determining the number of associations in a tree.

procedure+: wt-tree/empty? wt-tree
Returns #t if wt-tree contains no associations, otherwise returns #f.

procedure+: wt-tree/size wt-tree
Returns the number of associations in wt-tree, an exact non-negative integer. This operation takes constant time.

procedure+: wt-tree/add wt-tree key datum
Returns a new tree containing all the associations in wt-tree and the association of datum with key. If wt-tree already had an association for key, the new association overrides the old. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/add! wt-tree key datum
Associates datum with key in wt-tree and returns an unspecified value. If wt-tree already has an association for key, that association is replaced. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/member? key wt-tree
Returns #t if wt-tree contains an association for key, otherwise returns #f. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/lookup wt-tree key default
Returns the datum associated with key in wt-tree. If wt-tree doesn't contain an association for key, default is returned. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/delete wt-tree key
Returns a new tree containing all the associations in wt-tree, except that if wt-tree contains an association for key, it is removed from the result. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/delete! wt-tree key
If wt-tree contains an association for key the association is removed. Returns an unspecified value. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.
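
The difference between the functional procedures and their destructive (!) counterparts can be sketched as follows. This is an illustration under the assumption of numeric keys ordered by <; the names number-type, t0, t1, t2 and t3 are hypothetical.

```scheme
(require 'wt-tree)

(define number-type (make-wt-tree-type <))
(define t0 (make-wt-tree number-type))

;; wt-tree/add returns a new tree; t0 itself stays empty.
(define t1 (wt-tree/add t0 1 'one))
(define t2 (wt-tree/add t1 2 'two))

(wt-tree/empty? t0)              ; => #t
(wt-tree/size t2)                ; => 2
(wt-tree/member? 1 t2)           ; => #t  (the key is the first argument)
(wt-tree/lookup t2 3 'missing)   ; => missing

;; wt-tree/delete is also non-destructive: t2 keeps key 1.
(define t3 (wt-tree/delete t2 1))
(wt-tree/member? 1 t3)           ; => #f
(wt-tree/member? 1 t2)           ; => #t

;; The ! variants modify their argument in place.
(wt-tree/add! t0 5 'five)
(wt-tree/size t0)                ; => 1
(wt-tree/delete! t0 5)
(wt-tree/empty? t0)              ; => #t
```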



6.3.3 Advanced Operations on Weight-Balanced Trees

In the following the size of a tree is the number of associations that the tree contains, and a smaller tree contains fewer associations.

procedure+: wt-tree/split< wt-tree bound
Returns a new tree containing all and only the associations in wt-tree which have a key that is less than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of the size of wt-tree.

procedure+: wt-tree/split> wt-tree bound
Returns a new tree containing all and only the associations in wt-tree which have a key that is greater than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of the size of wt-tree.
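
The two split procedures can be combined to select a range of keys. The following is a sketch only, assuming numeric keys ordered by <; keys-between is a hypothetical helper, not a procedure of the package, and both of its bounds are exclusive.

```scheme
(require 'wt-tree)

(define number-type (make-wt-tree-type <))
(define letters
  (alist->wt-tree number-type
                  '((1 . a) (2 . b) (3 . c) (4 . d) (5 . e))))

;; Associations whose keys lie strictly between lo and hi.
(define (keys-between tree lo hi)
  (wt-tree/split< (wt-tree/split> tree lo) hi))

(define middle (keys-between letters 1 5))
(wt-tree/size middle)          ; => 3
(wt-tree/member? 1 middle)     ; => #f
(wt-tree/member? 3 middle)     ; => #t
(wt-tree/size letters)         ; => 5  (the original tree is unchanged)
```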

procedure+: wt-tree/union wt-tree-1 wt-tree-2
Returns a new tree containing all the associations from both trees. This operation is asymmetric: when both trees have an association for the same key, the returned tree associates the datum from wt-tree-2 with the key. Thus if the trees are viewed as discrete maps then wt-tree/union computes the map override of wt-tree-1 by wt-tree-2. If the trees are viewed as sets the result is the set union of the arguments. The worst-case time required by this operation is proportional to the sum of the sizes of both trees. If the minimum key of one tree is greater than the maximum key of the other tree then the time required is at worst proportional to the logarithm of the size of the larger tree.

procedure+: wt-tree/intersection wt-tree-1 wt-tree-2
Returns a new tree containing all and only those associations from wt-tree-1 which have keys appearing as the key of an association in wt-tree-2. Thus the associated data in the result are those from wt-tree-1. If the trees are being used as sets the result is the set intersection of the arguments. As a discrete map operation, wt-tree/intersection computes the domain restriction of wt-tree-1 to (the domain of) wt-tree-2. The time required by this operation is never worse than proportional to the sum of the sizes of the trees.

procedure+: wt-tree/difference wt-tree-1 wt-tree-2
Returns a new tree containing all and only those associations from wt-tree-1 which have keys that do not appear as the key of an association in wt-tree-2. If the trees are viewed as sets the result is the asymmetric set difference of the arguments. As a discrete map operation, it computes the domain restriction of wt-tree-1 to the complement of (the domain of) wt-tree-2. The time required by this operation is never worse than proportional to the sum of the sizes of the trees.
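
Viewed as set operations, union, intersection and difference can be illustrated with the sketch below. It assumes numeric keys ordered by < and represents a set as a tree whose data are all #t; keys->set and set->keys are hypothetical helpers, and set->keys relies on wt-tree/fold, which is described later in this section.

```scheme
(require 'wt-tree)

(define number-type (make-wt-tree-type <))

;; Represent a set of numbers as a tree mapping each key to #t.
(define (keys->set keys)
  (alist->wt-tree number-type
                  (map (lambda (k) (cons k #t)) keys)))

;; List the keys of a set in increasing order.
(define (set->keys set)
  (wt-tree/fold (lambda (key datum acc) (cons key acc)) '() set))

(define a (keys->set '(1 2 3)))
(define b (keys->set '(3 4)))

(set->keys (wt-tree/union a b))          ; => (1 2 3 4)
(set->keys (wt-tree/intersection a b))   ; => (3)
(set->keys (wt-tree/difference a b))     ; => (1 2)
```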

procedure+: wt-tree/subset? wt-tree-1 wt-tree-2
Returns #t iff the key of each association in wt-tree-1 is the key of some association in wt-tree-2, otherwise returns #f. Viewed as a set operation, wt-tree/subset? is the improper subset predicate. A proper subset predicate can be constructed:

 
(define (proper-subset? s1 s2)
  (and (wt-tree/subset? s1 s2)
       (< (wt-tree/size s1) (wt-tree/size s2))))

As a discrete map operation, wt-tree/subset? is the subset test on the domains of the maps. In the worst case, the time required by this operation is proportional to the size of wt-tree-1.

procedure+: wt-tree/set-equal? wt-tree-1 wt-tree-2
Returns #t iff for every association in wt-tree-1 there is an association in wt-tree-2 that has the same key, and vice versa.

Viewing the arguments as sets wt-tree/set-equal? is the set equality predicate. As a map operation it determines if two maps are defined on the same domain.

This procedure is equivalent to

 
(lambda (wt-tree-1 wt-tree-2)
  (and (wt-tree/subset? wt-tree-1 wt-tree-2)
       (wt-tree/subset? wt-tree-2 wt-tree-1)))

In the worst case, the time required by this operation is proportional to the size of the smaller tree.

procedure+: wt-tree/fold combiner initial wt-tree
This procedure reduces wt-tree by combining all the associations, using a reverse in-order traversal, so the associations are visited in decreasing order of their keys. Combiner is a procedure of three arguments: a key, a datum and the accumulated result so far. Provided combiner takes time bounded by a constant, wt-tree/fold takes time proportional to the size of wt-tree.

A sorted association list can be derived simply:

 
(wt-tree/fold  (lambda (key datum list)
                 (cons (cons key datum) list))
               '()
               wt-tree)

The data in the associations can be summed like this:

 
(wt-tree/fold  (lambda (key datum sum) (+ sum datum))
               0
               wt-tree)
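
As a concrete illustration of the traversal order, the sketch below collects the keys of a small tree. It assumes numeric keys ordered by <; the names number-type and sample are illustrative only. Because the associations are visited from the greatest key down and each key is consed onto the front of the accumulator, the resulting list is in increasing order.

```scheme
(require 'wt-tree)

(define number-type (make-wt-tree-type <))
(define sample
  (alist->wt-tree number-type '((2 . 20) (3 . 30) (1 . 10))))

;; Keys are visited as 3, 2, 1; consing yields an increasing list.
(wt-tree/fold (lambda (key datum acc) (cons key acc))
              '()
              sample)                         ; => (1 2 3)

;; Sum of the data.
(wt-tree/fold (lambda (key datum sum) (+ sum datum))
              0
              sample)                         ; => 60
```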

procedure+: wt-tree/for-each action wt-tree
This procedure traverses the tree in-order, applying action to each association. The associations are processed in increasing order of their keys. Action is a procedure of two arguments which takes the key and datum respectively of the association. Provided action takes time bounded by a constant, wt-tree/for-each takes time proportional to the size of wt-tree. The example prints the tree:

 
(wt-tree/for-each (lambda (key value)
                    (display (list key value)))
                  wt-tree)



6.3.4 Indexing Operations on Weight-Balanced Trees

Weight-balanced trees support operations that view the tree as a sorted sequence of associations. Elements of the sequence can be accessed by position, and the position of an element in the sequence can be determined, both in logarithmic time.

procedure+: wt-tree/index wt-tree index
procedure+: wt-tree/index-datum wt-tree index
procedure+: wt-tree/index-pair wt-tree index
Returns the 0-based indexth association of wt-tree in the sorted sequence under the tree's ordering relation on the keys. wt-tree/index returns the indexth key, wt-tree/index-datum returns the datum associated with the indexth key and wt-tree/index-pair returns a new pair (key . datum) which is the cons of the indexth key and its datum. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.

These operations signal an error if the tree is empty, if index is less than 0, or if index is greater than or equal to the number of associations in the tree.

Indexing can be used to find the median and maximum keys in the tree as follows:

 
median:  (wt-tree/index wt-tree (quotient (wt-tree/size wt-tree) 2))

maximum: (wt-tree/index wt-tree (-1+ (wt-tree/size wt-tree)))

procedure+: wt-tree/rank wt-tree key
Determines the 0-based position of key in the sorted sequence of the keys under the tree's ordering relation, or #f if the tree has no association for key. This procedure returns either an exact non-negative integer or #f. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.
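
The relationship between indexing and rank can be sketched as follows, assuming numeric keys ordered by <. The names number-type and scores are illustrative only.

```scheme
(require 'wt-tree)

(define number-type (make-wt-tree-type <))
(define scores
  (alist->wt-tree number-type '((30 . c) (10 . a) (20 . b))))

(wt-tree/index scores 0)         ; => 10
(wt-tree/index-datum scores 1)   ; => b
(wt-tree/index-pair scores 2)    ; => (30 . c)

;; wt-tree/rank is the inverse of wt-tree/index on keys.
(wt-tree/rank scores 20)         ; => 1
(wt-tree/rank scores 25)         ; => #f  (no association for 25)

;; Median key.
(wt-tree/index scores (quotient (wt-tree/size scores) 2))   ; => 20
```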

procedure+: wt-tree/min wt-tree
procedure+: wt-tree/min-datum wt-tree
procedure+: wt-tree/min-pair wt-tree
Returns the association of wt-tree that has the least key under the tree's ordering relation. wt-tree/min returns the least key, wt-tree/min-datum returns the datum associated with the least key and wt-tree/min-pair returns a new pair (key . datum) which is the cons of the minimum key and its datum. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.

These operations signal an error if the tree is empty. They could be written

 
(define (wt-tree/min tree)        (wt-tree/index tree 0))
(define (wt-tree/min-datum tree)  (wt-tree/index-datum tree 0))
(define (wt-tree/min-pair tree)   (wt-tree/index-pair tree 0))

procedure+: wt-tree/delete-min wt-tree
Returns a new tree containing all of the associations in wt-tree except the association with the least key under the wt-tree's ordering relation. An error is signalled if the tree is empty. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree. This operation is equivalent to

 
(wt-tree/delete wt-tree (wt-tree/min wt-tree))

procedure+: wt-tree/delete-min! wt-tree
Removes the association with the least key under the wt-tree's ordering relation. An error is signalled if the tree is empty. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree. This operation is equivalent to

 
(wt-tree/delete! wt-tree (wt-tree/min wt-tree))
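
Together, wt-tree/min and wt-tree/delete-min! allow a tree to be used as a simple priority queue. The following is a sketch under the assumption of numeric keys ordered by <; drain-keys! is a hypothetical helper, not part of the package, and it empties the tree it is given.

```scheme
(require 'wt-tree)

(define number-type (make-wt-tree-type <))

;; Repeatedly remove the least key, returning the keys in the
;; order in which they were removed.  The tree is left empty.
(define (drain-keys! tree)
  (let loop ((acc '()))
    (if (wt-tree/empty? tree)
        (reverse acc)
        (let ((least (wt-tree/min tree)))
          (wt-tree/delete-min! tree)
          (loop (cons least acc))))))

(define queue
  (alist->wt-tree number-type '((3 . c) (1 . a) (2 . b))))

(define drained (drain-keys! queue))   ; => (1 2 3)
(wt-tree/empty? queue)                 ; => #t
```

Checking wt-tree/empty? before each step avoids the error that wt-tree/min and wt-tree/delete-min! signal on an empty tree.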



This document was generated by Steve Langasek on January, 10 2005 using texi2html