• Author: Christian, Eliot J.
  • Tape: 1 of 2
  • 0:00 Opening slides
  • 1:00 Tim Gauslin: overview of 4 hours on WAIS
  • 2:10 Jim Fulton: Document delivery system, Public Domain Release with UNIX server and several clients
  • 4:00 Relevance feedback, text searcher, clients
  • 4:45 Changes to original release at various places: spatial data server, Postgres, Ingres, access to 1-2 Tb databases
  • 6:30 Clients: Simple WAIS, for Microsoft Windows, HyperWAIS, Motif, OpenLook, Mosaic
  • 9:10 Managing a server: content (break into smallest unit feasible; rationale: cohesive, useful to users), format (to allow indexing), role of info provider (understands the data), proper strategy (has a trial-and-error component, since with many users and lots of data results follow the law of averages)
  • 13:00 Server: integrated with text search engine; can access other servers (better text search, spatial data, relational DB), setting up a server under UNIX
  • 14:50 Options in running WAIS server: -p = port (default 210), -d = directory holding index files, -e = logging file, -l = logging level (0 = none, 10 = all), -s = use stdin as expected by inetd, -u = user to run as, to handle security concerns
  • 22:10 inetd starts up a version of server for each searcher connected, /etc/services describes connections through port
  • 24:30 Clients: regular, toolkits so can create own interface (HyperCard, Visual Basic, ToolBook), ex. phone book client
  • 27:00 Data types supported by public domain text searching engine -- tell how to extract key parts of info: headline, para, refer bibliographic entry, fields separated by dashed lines, etc. (END OF CLASS SHOWING)
  • 30:15 16 Mbytes default memory for indexing; tradeoff of memory and speed, disk space
  • 33:00 Other flags and settings: catalog (none, or else it gets too large), content (not for images, so just index the file name; otherwise it indexes from the top of each file until it finds non-text), types, viewers
  • 37:10 Primary and secondary types, multiple versions (text, PostScript), format (gif, mail, netnews, one-line, ...)
  • 40:00 Running indexer, syn file, flushing intermediate files
  • 42:15 Search engine: stemming, synonyms, phrases, weighting, length normalization
  • 44:10 Boolean support, discussion of user training and their need to know format assumed, discussion regarding handling high recall searches; future support of SGML (e.g., SGML viewer)
  • 49:15 WAIS Inc. provides server and services; Mark's company is also providing services/integration
  • 51:00 CNIDR and its other roles/responsibilities; FreeWAIS, full Z39.50 support
  • 51:45 Hardware required: depends on data size, number of users; memory is important if lots of simultaneous searches (since each runs a copy)
  • 54:30 Confidentiality: log files, how data is organized -- protect clients!
  • 55:15 Questions: stemming
  • 1:02:00 Tim Gauslin: list of pre-defined types for WAIS indexer
  • 1:07:10 Sections for customizing (long detailed discussion)
  • much later Jim Fulton: settings for indexing ...
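
The server options listed at 14:50 can be modelled as a small command-line parser. This is only an illustrative sketch in Python, not the WAIS server itself: the flag letters and the default port of 210 come from the log above, while the program name, the attribute names, and the defaults for directory and logging level are assumptions made for the example.

```python
import argparse


def build_parser():
    """Model the server flags from the 14:50 entry (hypothetical names)."""
    parser = argparse.ArgumentParser(prog="wais-server-sketch")
    # -p: port to listen on; 210 is the default given in the log.
    parser.add_argument("-p", dest="port", type=int, default=210)
    # -d: directory holding index files (default "." is an assumption).
    parser.add_argument("-d", dest="index_dir", default=".")
    # -e: logging file (no default is stated in the log).
    parser.add_argument("-e", dest="log_file", default=None)
    # -l: logging level, 0 = none, 10 = all (default 0 is an assumption).
    parser.add_argument("-l", dest="log_level", type=int, default=0,
                        choices=range(0, 11))
    # -s: read from stdin, as expected when started by inetd.
    parser.add_argument("-s", dest="use_stdin", action="store_true")
    # -u: user to run as, to handle security concerns.
    parser.add_argument("-u", dest="run_as", default=None)
    return parser


if __name__ == "__main__":
    # Example argument list; the values are made up for illustration.
    opts = build_parser().parse_args(
        ["-p", "8000", "-d", "/tmp/wais-index", "-l", "10", "-s"]
    )
    print(opts)
```

Passing an explicit argument list keeps the sketch independent of the real command line; a -l value outside 0 to 10 is rejected by the parser.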