CS4624 Text - Ch 4
Multimedia I/O Technologies
Skim: MIDI circuitry, device details, image processing, speech recog.
Study Carefully: resolution, quality, architectures
Terminology: note re devices (pens, printers, speech recog.)
Timeline: see Table 4-2 re displays
Key Concepts
- Quality (from input to multiple outputs)
- Digital vs. analog
- Encoding
- Resolution, bandwidth
- Pixels/inch: 100, 200, 300, 600; 1200-1800
- Lines: 300, 500, >1000
- Resolution: 320x240, 1280x1024
- Frames/second, interleaving
- Bit depth: 16
- Audio samples: 8, 16 bit
- Audio sampling rate
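Worked example: resolution, bit depth, and frames/second combine into a raw data rate, which is what drives the bandwidth concern above. A minimal sketch, assuming uncompressed video; the function name and the 30 frames/second figure are illustrative assumptions, not values from the text.

```python
def raw_video_rate(width, height, bit_depth, fps):
    """Return the uncompressed video data rate in bytes per second."""
    bits_per_frame = width * height * bit_depth
    return bits_per_frame * fps // 8


# The two resolutions listed above, at 16-bit depth and an assumed 30 frames/second.
low = raw_video_rate(320, 240, 16, 30)
high = raw_video_rate(1280, 1024, 16, 30)
print(low, high)
```

The larger frame needs roughly 17 times the bandwidth of the smaller one, which is why resolution and compression are studied together.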
MPC
- Version 1.0 vs. 2.0
- RAM, Disk
- MIDI
- CD-ROM
Pen
- Benefits, applications
- Operation
- Electronic pen and digitizer
- Pen driver, display driver
- Recognition context mgr, recognizer
- Dictionary
- Statistics
- Inking: 120 coordinates/sec
- Points: 200 dpi (1000 for digitizers)
Scanner
- Size: A, B, large form factor
- Flatbed, rotary drum, handheld
- CCDs: linearity
- 3 color: no. sources, no. filters
- Features: resolution, area, contrast, threshold, compression, autofeed
Digital camera
- Components
- CCDs to ADCs
- Memory
- I/O interface
- Benefits
- Applications
Video input architecture
- Multiplexer: decompressor, image file, TV
- ADC: sampling rate, resolution, linearity, speed
- Input lookup table per pixel
- Frame buffer: input, output
- Codec
- SVGA interface
- Analog output mixer
- Audio and video
- Bus bandwidth
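Worked example: the input lookup table maps each digitized pixel value to an output value before it reaches the frame buffer. A minimal sketch, assuming 8-bit pixels; the inverting table and the function names are hypothetical choices for illustration only.

```python
def build_inverting_lut():
    """Build a 256-entry table that maps each 8-bit value to its inverse."""
    return [255 - value for value in range(256)]


def apply_lut(pixels, lut):
    """Replace every pixel by its table entry (one lookup per pixel)."""
    return [lut[p] for p in pixels]


lut = build_inverting_lut()
print(apply_lut([0, 100, 255], lut))
```

Because the table is indexed directly by the pixel value, the per-pixel cost is one memory read regardless of how complex the mapping is.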
Images
- Half-tones
- Dithering
- Enhancement
- Brightness
- Skew
- Contrast
- Sharpening
- Emphasis
- Manipulation
- Scaling
- Cropping
- Rotation
- Image processing
- Pixel point to point: contrast, brightness
- Histogram sliding
- Histogram stretching, shrinking
- Pixel threshold: white, black
- Interframe
- Scaling, rotation, translation
- Scale to gray
- Transform (frequency)
- Image averaging (cancel noise), subtraction (detect motion)
- Logical operations: overlay, mask
- Spatial filter: frequency, noise, sharpness
- Convolution, Laplacian
- Low pass, high pass
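Worked example: the point operations (histogram sliding, stretching, thresholding) and the spatial filter items above can be sketched on small lists of 8-bit gray values. This is an illustrative sketch, not the textbook's implementation; all function names are assumptions, and the 4-neighbor Laplacian kernel is one common choice among several.

```python
def slide(pixels, offset):
    """Histogram sliding: add a constant to every pixel, clamped to 0..255."""
    return [max(0, min(255, p + offset)) for p in pixels]


def stretch(pixels, lo=0, hi=255):
    """Histogram stretching: map the input range linearly onto lo..hi."""
    pmin, pmax = min(pixels), max(pixels)
    span = pmax - pmin
    if span == 0:
        return [lo for _ in pixels]
    # Integer arithmetic with rounding to the nearest level.
    return [lo + ((p - pmin) * (hi - lo) + span // 2) // span for p in pixels]


def threshold(pixels, t):
    """Pixel threshold: values at or above t become white, others black."""
    return [255 if p >= t else 0 for p in pixels]


# 4-neighbor Laplacian kernel (a high-pass filter that responds to edges).
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def convolve3x3(image, kernel):
    """Apply a 3x3 kernel to interior pixels; border pixels are left at 0.

    The kernel is not flipped, which makes no difference for a
    symmetric kernel such as LAPLACIAN.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[j][i] * image[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out
```

A flat region gives a Laplacian response of zero, while an isolated bright pixel gives a strong response, which is the sense in which the filter emphasizes high spatial frequencies.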
Animation
- Toggling between frames
- Rotating in frame loop
- Delta frame animation
- Palette animation
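Worked example: delta frame animation stores only the pixels that changed since the previous frame, and palette animation changes the color table rather than the pixels. A minimal sketch, assuming frames are flat lists of pixel values; the function names are hypothetical.

```python
def encode_delta(prev, curr):
    """Return (index, new_value) pairs for the pixels that changed."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def apply_delta(frame, delta):
    """Rebuild the next frame from the previous frame plus a delta."""
    out = list(frame)
    for index, value in delta:
        out[index] = value
    return out


def rotate_palette(palette, steps=1):
    """Palette animation: cycle the color table, leaving pixel data untouched."""
    steps %= len(palette)
    return palette[steps:] + palette[:steps]


prev = [0, 0, 0, 0]
curr = [0, 5, 0, 7]
delta = encode_delta(prev, curr)
print(delta, apply_delta(prev, delta))
```

When little moves between frames the delta is much smaller than a full frame, which is the point of the technique.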
Displays
- Terminology
- Pixel, triad, convergence
- Pincushioning, barrel distortion
- Drift, jitter, swim; roping
- Glare, emissions
- Masks: shadow, slot
- Resolution, bandwidth
- Dot pitch, size of screen
- Refresh rates, flicker, interlacing
- Video board
- RAM
- Mixing, scaling, buffering
- CRT
- History, evolution
- MDA, CGA, EGA, VGA, 8514/A, XGA
- Flat panel
- mono, color
- passive matrix vs. active-matrix
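Worked example: video board RAM follows from resolution and bit depth, and dot pitch limits how many pixels a screen can resolve. A rough sketch under stated assumptions: the function names are hypothetical, the 280 mm screen width is an invented figure, and dot pitch is treated as plain horizontal spacing, which is a simplification.

```python
def frame_buffer_bytes(width, height, bit_depth):
    """RAM needed to hold one full frame at the given bit depth."""
    return width * height * bit_depth // 8


def triads_across(screen_width_um, dot_pitch_um):
    """Number of phosphor triads that fit across the screen (integer micrometres)."""
    return screen_width_um // dot_pitch_um


print(frame_buffer_bytes(1280, 1024, 16))   # bytes for one 1280x1024, 16-bit frame
print(triads_across(280_000, 280))          # 280 mm wide screen, 0.28 mm pitch
```

If the number of triads across the screen is lower than the horizontal resolution, the extra pixels cannot be displayed distinctly.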
Printers
- Quality: resolution, speed, feed
- CRT vs. printer
- RGB (additive) vs. CMYK (subtractive)
- Calibration
- Dithering, Moire pattern
- Laser
- Components: paper feed, paper guide, laser, corona, fuser, toner
- Dye sublimation
- Heat - 256 levels, continuous tone
- Dithering
- High quality: cyan, magenta, yellow, black
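Worked example: moving from additive RGB (CRT) to subtractive CMYK (printer) can be sketched with the naive textbook-style formula below. This is an assumption-laden illustration: real printers rely on calibration, and the function name is hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion: 8-bit RGB in, CMYK fractions (0.0..1.0) out."""
    if (r, g, b) == (0, 0, 0):
        return (0.0, 0.0, 0.0, 1.0)          # pure black: black ink only
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    k = min(c, m, y)                          # shared darkness moves to black ink
    return tuple((v - k) / (1 - k) for v in (c, m, y)) + (k,)


print(rgb_to_cmyk(255, 0, 0))   # red is printed as magenta plus yellow
```

The formula ignores ink and paper characteristics, which is why the calibration item above matters in practice.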
Audio
- ADC: conversion speed
- Sampling: bits (8, 16, 32), number/second
- Nyquist -> 5, 11, 22, 44 kHz
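Worked example: by the Nyquist criterion a sampling rate can represent frequencies only up to half that rate, and the sampling parameters above fix the storage rate. A minimal sketch; the function names are hypothetical, and reading "44 kHz" as the 44,100 Hz CD rate is an assumption.

```python
def max_frequency_hz(sampling_rate_hz):
    """Highest frequency representable at this sampling rate (Nyquist limit)."""
    return sampling_rate_hz / 2


def audio_bytes_per_second(sampling_rate_hz, bits_per_sample, channels=1):
    """Uncompressed audio data rate."""
    return sampling_rate_hz * bits_per_sample * channels // 8


print(max_frequency_hz(44100))                 # upper frequency limit in Hz
print(audio_bytes_per_second(44100, 16, 2))    # 16-bit stereo
print(audio_bytes_per_second(11025, 8))        # 8-bit mono
```

Halving the sampling rate halves both the storage cost and the highest frequency that survives, which is the trade-off behind the ladder of rates listed above.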
Voice
- Human speech issues
- Generation: process, anatomy: vocal cords, throat, tongue, teeth, lips
- Phonemes: smallest distinguishable sound
- Applications
- Command and control
- Voice mail
- Database input, query
- Speech characteristics
- Separation between words
- Speaker dependent / independent
- Training
- Storage
- Template vs. pattern
- Use of phonemes
- Word reference pattern
- Time warping
- Vocabulary size
- Types of recognition
- Isolated-word
- Parametric analysis
- Word as cluster from samples
- Connected-word
- Word spotting
- Find beginning, end of words
- Sliding pattern against input
- Continuous speech
- Digitization, amplitude normalization
- Time normalization, parametric rep
- Segmentation, labeling into phonemes
- 10 ms snapshots
- Phones, phonemes, demi-syllables, syllables, words
- Levels: phonetics, lexicon, syntax, semantics, pragmatics
- Speech to word segments
- Recognition performance
- Voice: accuracy, errors (substitution, insertion, no response)
- System: vocab size, speaker independence, response time, UI, throughput
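Worked example: the "time warping" and "word reference pattern" items above can be illustrated with dynamic time warping, which compares an utterance with stored templates while tolerating differences in speaking speed. A minimal sketch, assuming each utterance is reduced to a list of single numeric features per time slice; real recognizers use feature vectors, and the function names and template values are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = cost of the best alignment of a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # a advances, b repeats
                                 d[i][j - 1],      # b advances, a repeats
                                 d[i - 1][j - 1])  # both advance
    return d[n][m]


def recognize(sample, templates):
    """Isolated-word recognition: pick the template closest to the sample."""
    return min(templates, key=lambda word: dtw_distance(sample, templates[word]))


templates = {"yes": [1, 3, 1], "no": [4, 4, 4]}
print(recognize([1, 3, 3, 1], templates))
```

The sample is a slower version of the "yes" template, so warping aligns it at zero cost even though the lengths differ.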
MIDI
- History: chaos, MIDI, General MIDI (GM)
- Quality: FM synthesis -> samples
- Hardware: in, out, thru; 31.25 kbps
- keyboard synthesizer
- digital effects processor
- drum machine
- PC with sound card
- Sound module, stereo amp, speakers
- Sound card
- Synthesizer
- Audio mixer: MIDI, CD-DA, decompressor
- Codec, CD-ROM interface
- Protocol
- Channel messages
- Voice: select instrument, switch notes on/off, pressure, effects, pitch
- Mode: mono, poly
- System messages
- Common: select song, request tune
- Real-time: timing clock, start/stop
- System exclusive: manufacturer-specific data
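Worked example: channel voice messages are a status byte (message type plus channel) followed by data bytes, and the 31.25 kbps serial link bounds how fast they can be sent. A minimal sketch using the standard MIDI 1.0 status values; the function names are hypothetical, and the 10 bits per byte figure assumes the usual start and stop bits on the serial line.

```python
MIDI_BITS_PER_SECOND = 31250
BITS_PER_SERIAL_BYTE = 10  # 1 start bit + 8 data bits + 1 stop bit


def _check(channel, *data):
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if any(not 0 <= d <= 127 for d in data):
        raise ValueError("data bytes must be 0..127")


def note_on(channel, note, velocity):
    """Channel voice message: switch a note on."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note, velocity=0):
    """Channel voice message: switch a note off."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


def program_change(channel, program):
    """Channel voice message: select an instrument."""
    _check(channel, program)
    return bytes([0xC0 | channel, program])


def transmit_time_ms(message):
    """Time to send a message over the MIDI serial link, in milliseconds."""
    return len(message) * BITS_PER_SERIAL_BYTE * 1000 / MIDI_BITS_PER_SECOND


msg = note_on(0, 60, 100)   # middle C on the first channel
print(msg.hex(), transmit_time_ms(msg))
```

A three-byte message takes just under one millisecond on the wire, so dense chords on many channels can approach the capacity of a single cable.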
Copyright 1996, 1997
Edward A. Fox