- Author: Christian, Eliot J.
- Tape: 1 of 2
- 0:00 Opening slides
- 1:00 Tim Gauslin: overview of 4 hours on WAIS
- 2:10 Jim Fulton: Document delivery system, Public Domain
Release with UNIX server and several clients
- 4:00 Relevance feedback, text searcher, clients
- 4:45 Changes to original release at various places: spatial data
server, Postgres, Ingres, access to 1-2 Tb databases
- 6:30 Clients: Simple WAIS, for Microsoft Windows,
HyperWAIS, Motif, OpenLook, Mosaic
- 9:10 Managing a Server: content (break into smallest unit
feasible: rationale, cohesive, useful to users), format (to allow indexing), role
of info provider (understands data), proper strategy (trial and error
component, since results follow the law of averages with many users, lots of data)
- 13:00 Server: integrated with text search engine; can access
other servers (better text search, spatial data, relational DB), setting up a
server under UNIX
- 14:50 Options in running WAIS server: -p = port (def. 210), -d
= directory holding index files, -e = logging file, -l = logging level (none if
zero, all if 10), -s = use stdin, as expected by inetd, -u = user to run as, to
handle security concerns
- 22:10 inetd starts up a version of server for each searcher
connected, /etc/services describes connections through port
- 24:30 Clients: regular, toolkits so can create own interface
(HyperCard, Visual Basic, ToolBook), ex. phone book client
- 27:00 Data types supported by public domain text searching
engine -- tell how to extract key parts of info: headline, para, refer
bibliographic entry, fields separated by dashed lines, etc. (END OF
CLASS SHOWING)
- 30:15 16 Mbytes default memory for indexing, tradeoff of
memory and speed, disk space
- 33:00 Other flags and settings, catalog (none, else gets too
large), content (not if image, so just do file name - or will index from top in
each file until it finds non-text), types, viewers
- 37:10 Primary and secondary types, multiple versions (text,
PostScript), format (gif, mail, netnews, one-line, ...)
- 40:00 Running indexer, syn file, flushing intermediate
files
- 42:15 Search engine: stemming, synonyms, phrases, weighting,
length normalization
- 44:10 Boolean support, discussion of user training and users'
need to know the assumed format, discussion regarding handling high recall
searches; future support of SGML (e.g., SGML viewer)
- 49:15 WAIS Inc. provides server and services; Mark's company
is also providing services/integration
- 51:00 CNIDR and its other roles/responsibilities; FreeWAIS,
full Z39.50 support
- 51:45 Hardware required: depends on data size, number of
users; memory is important if lots of simultaneous searches (since each runs a
copy)
- 54:30 Confidentiality: log files, how data is organized -- protect
clients!
- 55:15 Questions: stemming
- 1:02:00 Tim Gauslin: list of pre-defined types for WAIS
indexer
- 1:07:10 Sections for customizing (long detailed
discussion)
- much later Jim Fulton: settings for indexing ...
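
Note on the 14:50 entry: the server options listed there can be collected into a single command line. The following is a minimal Python sketch, not taken from the tape. The flag letters and the default port 210 come from the log above; the binary name `waisserver`, the function name `build_server_args`, and the example paths are assumptions for illustration only. Leaving out -p when -s is given is also an assumption, based on inetd owning the listening port.

```python
def build_server_args(index_dir, port=210, log_file=None, log_level=None,
                      use_stdin=False, run_as=None, binary="waisserver"):
    """Build an argument list from the options named in the 14:50 entry.

    The binary name is an assumption; only the flag letters and the
    default port are taken from the tape log.
    """
    if log_level is not None and not 0 <= log_level <= 10:
        # The log describes the level as running from 0 (none) to 10 (all).
        raise ValueError("log_level must be between 0 and 10")

    args = [binary]
    if use_stdin:
        # -s: talk over stdin, as expected when inetd starts the server.
        # Assumption: no -p here, since inetd listens on the port itself.
        args.append("-s")
    else:
        # -p: port to listen on (default 210 per the log).
        args += ["-p", str(port)]

    # -d: directory holding the index files.
    args += ["-d", index_dir]
    if log_file is not None:
        # -e: logging file.
        args += ["-e", log_file]
    if log_level is not None:
        # -l: logging level.
        args += ["-l", str(log_level)]
    if run_as is not None:
        # -u: user to run as, for security reasons.
        args += ["-u", run_as]
    return args


if __name__ == "__main__":
    # Hypothetical paths and user name, for illustration only.
    print(" ".join(build_server_args("/data/wais-index",
                                     log_file="/var/log/wais.log",
                                     log_level=10,
                                     run_as="wais")))
```

The function only assembles the argument list and does not start anything, so it can be checked against the local server's own documentation before use.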