Edward A. Fox
Department of Computer Science,
Virginia Tech, Blacksburg VA 24061-0106
Information can be represented in various media. Multimedia information systems may make use of audio, video, images, graphics, animations, and other types of media in addition to numbers and text. While searching of multimedia databases is often based on searching of text descriptions, or involves vectors of features of the various representations, there are many new issues that must be faced when moving from text to a multiplicity of media. First, there is the problem of representing each media type. Each type has special characteristics, and often requires large-scale data compression. Second, there are issues related to coordinating and synchronizing the multiple media types. These relate to real-time scheduling, network issues, and operating system support. Third, there are special computer systems (e.g., standalone systems like CD-TV or CD-I), new software/hardware technologies (e.g., DVI), and numerous standards (e.g., JPEG and MPEG for coding, HyTime for document architecture). These must be understood and related to given user requirements. Finally, there is a need for methodologies for developing multimedia applications and for managing such projects. Here object-oriented methods are crucial, human-computer interaction guidelines must be followed, and new, useful metaphors and systems must be developed and refined.
This Unit has four articles that should be studied. In addition, the videotape on Interactive Digital Video can be viewed in class or in McB 110. In the lecture a general overview will be given using KMS or Acrobat. The demonstrations and videotapes help round out the picture.
The area of multimedia is rapidly emerging, in part because it allows computers to communicate with humans in more convenient and effective ways, and in part because of rapid improvements in technology. These improvements have enabled media such as audio and video first to be presented by computers, then to be captured in small amounts, then to be compressed in larger amounts, and finally to be more easily managed by computers and networks.
As is discussed in the first reading, many areas of computer science can and must be applied. A whole new set of jargon has emerged, relating to storage units, compression techniques, networking approaches, and special computer systems.
Of crucial importance is compression. When images are compressed, say by using the new JPEG standard (see the second article), a space saving of approximately 20:1 is realized. This allows images to be handled on current computers, and large image collections to be managed using CD-ROMs or network servers. JPEG makes use of the discrete cosine transform (DCT), Huffman or arithmetic coding, and some other special tricks that altogether can be handled by special chips or fast host processors.
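As a rough illustration of the DCT step named above, the following Python sketch computes the two-dimensional DCT of a single 8x8 block using only the standard library. It is a minimal sketch, not the JPEG codec: the level shift, quantization, and entropy coding are omitted, and the names `dct_2d`, `N`, `flat`, and `coeffs` are chosen here for illustration.

```python
import math

N = 8  # JPEG applies the DCT to 8x8 blocks of samples

def dct_2d(block):
    """Return the 2-D DCT-II of an N x N block (orthonormal scaling)."""
    def c(k):
        # Normalization factor: the zero-frequency term is scaled differently.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * total
    return out

# A flat (constant) block has no detail, so all of its energy should end up
# in the single zero-frequency ("DC") coefficient at position [0][0].
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))
```

For smooth image regions most of the 64 coefficients come out at or near zero, which is what makes the later coding steps effective. The direct quadruple loop is written for clarity; practical implementations use fast factorizations of the transform.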
Standards for video (e.g., MPEG and px64) build upon the JPEG techniques, adding in other methods to remove temporal redundancy (i.e., repetition from one frame of data to the next).
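The idea of temporal redundancy can be sketched with a toy example. The Python code below stores a second frame as only the pixels that differ from the first. This is a deliberate simplification under stated assumptions: frames are flat lists of pixel values, and the helper names `frame_delta` and `apply_delta` are hypothetical. MPEG itself predicts blocks of pixels with motion compensation rather than differencing individual pixels.

```python
def frame_delta(prev, curr):
    """List (index, new_value) pairs for pixels that changed between frames."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]

def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev)
    for i, value in delta:
        frame[i] = value
    return frame

# Two made-up 8-pixel "frames": a bright two-pixel object moves one step right.
frame1 = [10, 10, 10, 50, 50, 10, 10, 10]
frame2 = [10, 10, 10, 10, 50, 50, 10, 10]

delta = frame_delta(frame1, frame2)
print(delta)                                  # only two pixels changed
print(apply_delta(frame1, delta) == frame2)   # the decoder recovers frame2
```

Only two of the eight pixels need to be transmitted for the second frame; the less a scene changes between frames, the larger the saving.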
For multimedia to catch on, the next real barrier is software. Without real-time operating systems and fast software, multimedia information cannot be presented and manipulated interactively by users. Without good software, developing multimedia applications is prohibitively expensive and time consuming. Without modern object-oriented development methods, both of these kinds of effort are doomed to early demise because they fail to keep up with rapid change in hardware, extension to myriad operating systems, and cross-platform portability requirements. The third (and fourth, optional) readings deal with many of these issues.
Though exposed only briefly to these concepts, students will find this area quite exciting, and will be equipped with some of the key knowledge and concepts needed to comprehend and work in this emerging industry.
An important goal from the Course Objectives is: read and understand research contributions ... You will gain experience by reading the three CACM articles, through the in-class discussion of those articles, and through the videotape presentations and demonstrations of research systems.
This Unit has the following objectives, for students to be able to:
There are two main types of effort required, besides the usual exercises (see below) and quiz. First, the readings (see the next section) should be carefully studied, keeping unit objectives in mind (see the previous section). Second, if possible, the videotapes and lectures, and the demonstrations, should supplement the readings.
First, you should examine a number of images compressed to varying degrees according to the JPEG standard. The easiest way is to study the WWW version. Alternatively, if you want some hands-on experience, use a computer that can handle an X display, connect to video.cs.vt.edu, and run the program
Based on the file sizes and your assessment of quality, which of the versions of this image would you recommend putting out on WWW? Please explain briefly.
Students should run the program mpeg_play on the
Send to the instructor the name of the X terminal used, and the average frames per second for each of 5 files. (Hint: you may want to pick the smallest files, since they will take less time.) If you are not running X but are working with the WWW, see if you can determine how close to normal speed (30 fps) these are. Please explain why these values are not 30.
Also, view a movie about Hawaii and tell the instructor briefly what that MPEG movie is about.
Please be sure to look at the WWW course notes for this unit and if your computer supports it, experiment with some of the audio and video files. Tell the instructor the 3 WWW pages you visited that you found most interesting, and explain briefly why.
You should be able to answer each of the following questions.
The four articles are relatively diverse, covering various aspects of the field. Respectively, they provide an overview, a detailed discussion of compression methods and standards, one approach to systems software support, and finally an approach to authoring software.
This article provides an overview. It should be read carefully, and students should study and be able to refer to the definitions of acronyms, terms, and phrases. Note particularly the various standards and the approaches to compression.
This article should be read carefully up through page 35. The rest can be skimmed. It is important to understand the goals of JPEG and how DCT and Huffman coding work with quantization and zig-zag encoding to yield a (variably reduced) compressed bitstream. If the images on page 42 cannot be clearly seen, you may wish to look at the original journal issue, on reserve in the library.
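To make the roles of quantization and zig-zag ordering more concrete, here is a minimal Python sketch under stated assumptions. The quantization table and the DCT coefficients are made up for illustration (the table is not one from the JPEG standard), and the helper names `quantize` and `zigzag_order` are hypothetical. Real encoders go on to run-length and Huffman-code the scanned values, which this sketch does not do.

```python
N = 8  # JPEG block size

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer.

    This is the lossy step: small high-frequency values collapse to zero.
    """
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]

def zigzag_order(n=N):
    """Return (row, col) pairs ordered from low to high spatial frequency."""
    def key(p):
        s = p[0] + p[1]                      # which anti-diagonal we are on
        return (s, p[0] if s % 2 else p[1])  # direction alternates per diagonal
    return sorted(((u, v) for u in range(n) for v in range(n)), key=key)

# Hypothetical quantization table: coarser steps at higher frequencies.
table = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

# Made-up DCT coefficients for one block: energy is concentrated in the
# low-frequency (top-left) corner, as is typical for smooth image regions.
coeffs = [[0.0] * N for _ in range(N)]
coeffs[0][0], coeffs[0][1], coeffs[1][0] = 800.0, -60.0, 35.0
coeffs[1][1], coeffs[2][0] = 10.0, 5.0

quantized = quantize(coeffs, table)
scan = [quantized[u][v] for u, v in zigzag_order()]

# Dropping the trailing zeros stands in for an end-of-block marker.
last = max((i for i, v in enumerate(scan) if v != 0), default=-1)
print(scan[:last + 1])
```

With these inputs the script prints `[100, -5, 3, 0, 1]`: the zig-zag scan pushes the zeros produced by quantization to the end of the sequence, so 64 values shrink to 5 before any entropy coding is applied. A coarser table would zero out more coefficients, which is the source of the variable reduction mentioned above.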
Intel's software for DVI is discussed in this article. It is quite interesting, tying in with work on object-oriented systems, software development, co-processor architectures, and general operating system issues. Please study the glossary carefully. Indeed, the whole article should be read closely. It is interesting to see the failings of the original DVI software, how a new, better conceptual model was developed, and how it was implemented using an object-oriented approach. Unfortunately, Intel has since discontinued DVI.
MediaView is an interesting system. It builds upon the metaphor of a long scrolling article that can be enhanced with annotations involving various media types. Though the article is not required, it is of interest in that it shows how a simple approach to authoring can go a long way, and how important object-oriented development tools are when building such a complex piece of software.