Edward A. Fox
Department of Computer Science
Virginia Tech, Blacksburg VA 24061-0106
On July 20-21, 1992, the National Science Foundation sponsored a workshop on digital libraries, prompted by an earlier proposal prepared by Lesk, Fox, and McGill that called for a National Electronic Library for Science, Engineering, and Technology. NSF will fund a good deal of R&D in this area in the 1990s, helping bring to fruition the dreams of such visionaries as Vannevar Bush and J.C.R. Licklider.
At the workshop, David Hartzband of DEC recounted experiences of a major multinational corporation involved in office and factory automation. They found that technology alone is not enough, that social and anthropological knowledge is also needed to effect change. Thus, in this unit, which introduces the course and our theme of digital libraries, there will be readings and discussion about the legal issues and essential characteristics of digital media and digital libraries.
To give concreteness to the idea of digital libraries, you will use computer networks to access netlib, an electronic archive for numerical and mathematical software and related information. This will also illustrate how semi-interactive querying can be carried out using electronic mail (or xnetlib).
Since during this course we will be making use of the rapidly expanding digital library that is being developed in Project Envision, it is important to understand the background to that effort. The article on ACM Press Database and Electronic Products describes earlier work toward an ACM digital library, relates it to products and services for ACM members and other users, and discusses some of the financial and pragmatic aspects of such an archive.
This unit paves the way for discussions of the technology, methodology, theory, and commercial and research systems for IS&R. It sets the stage for detailed discussions and introduces the key theme of digital libraries, which was part of legislation introduced by Senator Gore in 1992 and re-introduced in Congress in 1993.
A key point from the Course Objectives is to prepare students to discuss and explain the main issues relating to developing digital libraries and related services. Toward that end, this unit will deal with digital library efforts by ACM and by experts in numerical software, and will explore the legal and conceptual issues relating to the fundamental properties of digital media.
Other, specific objectives include being able to:
There are three main types of effort required. First, the readings (see section 5) should be carefully studied, keeping unit objectives in mind (see section 3). Second, students should work together in groups on the exercise and the digital libraries tutorial (along with the long quiz on it). Third, students should independently take the final quiz for the unit (a separate and shorter one than that on the tutorial) - this must be done separately by each student according to the honor code.
The class will break into groups. Please send the instructor the list of students in each group, so all can get credit together. The groups will discuss two WWW resources related to digital libraries.
The first is a tutorial on digital libraries available in Adobe's Portable Document Format (PDF). For the sake of the class, the short version of that will suffice. However, those interested in in-depth reading are encouraged to look at the long version. Once this has been studied, groups can then work together to answer the long multi-part self-study quiz available from the QUIZIT system.
The second is more fun, and should lead to a good discussion. Please read Raj Reddy's proposal for a Universal Library and prepare a group response, sending it to the instructor, who may send it along to Dr. Reddy if deemed of interest (e.g., making good suggestions, asking good questions).
In this exercise you will send email to netlib, following the instructions in the article [2]. A newer version of the instructions involves sending mail to the netlib@ornl.gov system. However, it is much better to click here with Mosaic or Netscape, now that WWW browsers can access information directly.
Please send a copy of all results you receive to the instructor for review.
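For those who prefer to script the request, the following Python fragment is a minimal sketch of how such a message might be composed. It only builds and prints the message; it does not send anything. The address is the one given above and may have changed since. The request text "send index" is an assumption based on the usual netlib mail convention, the sender address is a placeholder, and the function name is purely illustrative; the instructions in the article [2] remain the authority on what the server accepts.

```python
from email.message import EmailMessage

# Address taken from the exercise text; it may no longer be current.
NETLIB_ADDRESS = "netlib@ornl.gov"


def build_netlib_request(sender, commands):
    """Compose (but do not send) a mail request for the netlib server.

    `commands` is a list of request lines placed one per line in the body.
    The command syntax itself is an assumption; check the article [2].
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = NETLIB_ADDRESS
    msg["Subject"] = "netlib request"
    msg.set_content("\n".join(commands) + "\n")
    return msg


# Hypothetical sender address; "send index" is the assumed request line.
request = build_netlib_request("student@example.edu", ["send index"])
print(request)
```

Keeping the composition step separate from delivery makes it easy to inspect the message, and to copy it for the instructor, before handing it to whatever mail program is available locally.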
Here are some hints:
Note: For information on all articles for the course, click here with Mosaic or Netscape.
Note: An online report on digital libraries and related research opportunities may be of interest. This includes definitions and use scenarios. Look for it in the long tutorial or use a WWW browser to access Interoperability, Scaling, and the Digital Libraries Research Agenda.
This article describes the early work on ACM Press Database and Electronic Products, and plans for the future. The research aspects of this program have been carried forward into Project Envision, and ACM Headquarters and the Publications Board, along with the Electronic Publishing Volunteer Advisory Committee, are coordinating work on a plan that will include an electronic archive and electronic submissions.
Paragraph 1 gives a snapshot of the status, which is amplified at the end of the Introduction. The earlier part of the Introduction describes in general terms what technological and related advances have made digital libraries possible.
The last 3 paragraphs of Vision are important, dealing with collection building, standards, and the main classes of services. The Challenges section calls for more vision and ideas (such as those given in the first paragraph of the Opportunities section), then for focused R&D, and finally for work on economic, social and legal issues. Funding will be needed from ACM SIGs (paragraph 2) and from partnerships (paragraph 3).
The Organization and Acknowledgments sections are not relevant. However, the Proposals section gives important guidance regarding what to look for in developing information products, services, or even multimedia information packages.
This article gives valuable insight into copyright and legal matters, but is especially helpful in pointing out important characteristics of digital media. The six characteristics should be carefully studied and pondered. The last one, on nonlinearity, should be re-read after completion of Unit 8 on Hypertext. Note that Ms. Samuelson's husband is Robert Glushko, who has been very active in R&D relating to hypertext.
The second section, on Replication, explains an influential court case and its implications, and discusses several clever schemes for generating revenue in connection with electronic publishing.
Key to our course are the remaining sections. Transmission and Multiple Use are essential parts of digital libraries, but there are serious dangers of piracy. Plasticity is one of the key added values of digital media, but protecting authors' rights will demand careful balancing of this benefit with the need for extending copyright protection to changes. Equivalence issues relate to multimedia, but are not clearly explained here. Compactness is not really the theme of the next section - rather it is about storage cost-effectiveness and storage hierarchies, an idea which dates back to such discussions as [3]. Issues of nonlinearity will be dealt with later in the course, but are previewed in an interesting way here.
The netlib system was one of the first to provide electronic mail-based access to archives, and serves the scientific computing community. This short article describes how the system works and can be used (though details and addresses have changed!), summarizes the contents of the archive, briefly explains the server, gives advantages and disadvantages, and closes with a list of needed future enhancements and opportunities.