Form-invoked Scripts
Script Basics
Definition
"A Web script is a program that can be executed by the Web
server in response to Web requests." [Yeager and McGrath]
Web scripts
- can be written in any language the server can run (e.g., C, Perl, or a UNIX shell)
- take their input from the Web (e.g., from a form) and write as
output an HTTP response that includes a document (e.g., HTML or
plain text)
- may call other programs or scripts on the Web server or even on
another machine
How Web Servers Execute Scripts
A Web server must
- distinguish scripts from static documents based on URL,
- locate code to execute (e.g., the script) on its file system,
- check whether the script has execute permission,
- start the script and pass the form data (i.e., the fields after "?" in
the URL of a GET, or the body of a POST) to the script,
- route output from the script back to the Web browser, and
- send an error message to the browser if the script cannot be completed,
and then close the network connection.
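The steps above can be sketched in a few lines. This is only an illustration in Python, not how NCSA httpd is implemented: the function name `run_script`, the numeric status codes, and the 10-second timeout are assumptions made for the example. The environment variable names (REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH) are the standard CGI ones for UNIX.

```python
import os
import subprocess

def run_script(script_path, method="GET", query_string="", body=b""):
    """Hypothetical helper: run one script and return (status, output bytes)."""
    # Locate the code to execute on the file system.
    if not os.path.isfile(script_path):
        return 404, b"Script not found\n"
    # Check whether the script has execute permission.
    if not os.access(script_path, os.X_OK):
        return 403, b"Script is not executable\n"
    # Pass form data: GET data via the environment, POST data via stdin.
    env = dict(os.environ)
    env["REQUEST_METHOD"] = method
    env["QUERY_STRING"] = query_string
    env["CONTENT_LENGTH"] = str(len(body))
    try:
        result = subprocess.run([script_path], input=body, env=env,
                                capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return 500, b"Script could not be completed\n"
    if result.returncode != 0:
        return 500, b"Script could not be completed\n"
    # Route the script's standard output back toward the browser.
    return 200, result.stdout
```

A real server would also write the status line and close the connection; those parts are left out to keep the sketch short.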
Distinguishing Scripts from Static Documents
Each Web server specifies which directories or files can contain
scripts. Hence scripts can be identified from the URL alone.
- NCSA httpd usually uses the cgi-bin directory for scripts. Our earlier example used
http://ei.cs.vt.edu/cgi-bin/wwwbtb/SampleForm.pl.
- NCSA httpd can also treat all *.cgi files as scripts.
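As a rough illustration of the two rules above, the check can be written as a single test on the URL path. The function name `is_script` and its default arguments are assumptions for this sketch; a real server reads the directory and suffix rules from its configuration.

```python
def is_script(url_path, script_dirs=("/cgi-bin/",), script_suffixes=(".cgi",)):
    """Hypothetical check: does this URL path name a script?"""
    # Ignore any query string after "?" before testing the path.
    path = url_path.split("?", 1)[0]
    # str.startswith and str.endswith both accept a tuple of alternatives.
    return path.startswith(script_dirs) or path.endswith(script_suffixes)
```

With these defaults, `/cgi-bin/wwwbtb/SampleForm.pl` is treated as a script and `/docs/index.html` as a static document.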
Common Gateway Interface (CGI)
CGI specifies rules for passing data between the Web server
(httpd) and a script.
CGI rules are platform dependent:
| Platform | Script runs as | Input is | Output is |
| UNIX | process | stdin and environment variables | standard out |
| Macintosh | Mac script | Apple events | Apple events |
| Windows NT | application | temp file | temp file |
How Scripts Work
Code for Script
The Overview document illustrated a
form that executed a script to print the users logged onto
ei.cs.vt.edu. The original page links to the script's source.
Some things to note:
- Script is in perl, a popular Web scripting language. However, it
could be in another language.
- Script requires no input.
- Script must actually write HTTP header!
- Minimal HTTP header is to provide Content-type header field.
- Note that two newlines follow the first print statement, because a
blank line terminates the HTTP headers.
- Script writes an HTML file after the HTTP header.
- Script invokes another program to do real work: UNIX finger.
- Exit status of zero is returned.
- Anything the script writes to stderr goes to the Web server, not to the
browser window. So the user will never see those error messages!
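The points above can be seen in a minimal sketch. The script described in these notes is written in Perl; what follows is a Python illustration of the same structure, not a translation of the actual script. The names `build_response` and `main`, the page title, and the timeout are assumptions.

```python
import html
import subprocess
import sys

def build_response(command=("finger",)):
    """Hypothetical helper: return header, blank line, then an HTML page."""
    try:
        # Another program does the real work (finger by default).
        listing = subprocess.run(list(command), capture_output=True,
                                 text=True, timeout=10).stdout
    except (OSError, subprocess.TimeoutExpired):
        listing = "(could not run the command)"
    return (
        "Content-type: text/html\n"   # minimal HTTP header
        "\n"                          # blank line terminates the headers
        "<html><head><title>Users</title></head><body>\n"
        "<pre>" + html.escape(listing) + "</pre>\n"
        "</body></html>\n"
    )

def main():
    # A script would write the response to stdout and exit with status zero.
    sys.stdout.write(build_response())
    return 0
```

A script file would finish with `sys.exit(main())`. The command output is HTML-escaped so that characters such as `<` display literally in the browser.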
How Scripts are Executed
See diagrams on pp. 65-69 of Yeager & McGrath.
Cost of Using Scripts
Scripts can really slow down a Web server, compared to serving
static documents!
Some costs of script execution are
- Multiple processes are spawned -- httpd, perl, and finger
- Some scripts don't write their own protocol header lines, so httpd
must parse the script output to look for headers and supply the
missing ones. (This parsing can be turned off for "no parse header" scripts
that have been carefully checked for HTTP compliance.)
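To make the header-parsing cost concrete, here is a simplified sketch of the extra work a server does on each script response. It is not NCSA httpd's algorithm: the function name `complete_response`, the fixed "200 OK" status line, and the `text/plain` default are assumptions. It also uses bare newlines where real HTTP uses CRLF.

```python
def complete_response(script_output, default_type="text/plain"):
    """Hypothetical helper: add a status line and any missing Content-type."""
    # Headers end at the first blank line.
    head, sep, body = script_output.partition("\n\n")
    if not sep:
        # No blank line found: treat the whole output as the body.
        head, body = "", script_output
    headers = [line for line in head.split("\n") if line]
    # Collect the header names the script supplied, case-insensitively.
    names = {line.split(":", 1)[0].strip().lower() for line in headers}
    if "content-type" not in names:
        headers.append("Content-type: " + default_type)
    return "HTTP/1.0 200 OK\n" + "\n".join(headers) + "\n\n" + body
```

A script that writes the complete response itself lets the server skip this step.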
Note that with HTTP/1.1, scripts that write their own protocol
headers will be more difficult to write.
Script Input
Scripts must do the following to retrieve the input form data
(called the query string):
- Determine if script was called via GET or POST.
- Retrieve query string.
- Parse query string.
- Convert pluses in the query string to spaces, and %XX hexadecimal
codes to the characters they represent.
- In UNIX, detect "tainted" strings: those containing shell control
characters, such as the semicolon.
Fortunately, libraries exist to do these things. In perl, use the
get_query subroutine from cgi-utils.pl (see
pp. 375-378 in the Stein book). Also use perl's associative arrays to
easily access query string values. See the example in Stein, p. 378.
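For comparison, the same steps can be sketched with Python's standard library. This is not the cgi-utils.pl code from the Stein book. The function names and the particular set of shell control characters are assumptions made for the example, and a Python dict plays the role of perl's associative array.

```python
import io
import os
import sys
from urllib.parse import parse_qs

# Assumption: one plausible set of shell control characters to reject.
SHELL_CONTROL_CHARS = set(";&|`$<>(){}!\\\n'\"*?")

def read_query_string(environ=os.environ, stdin=None):
    """Hypothetical helper: fetch the raw query string for GET or POST."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST data arrives on stdin; CONTENT_LENGTH says how much to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        stream = stdin if stdin is not None else sys.stdin
        return stream.read(length)
    # GET data arrives in the QUERY_STRING environment variable.
    return environ.get("QUERY_STRING", "")

def parse_query(query_string):
    """Split into fields, turn "+" into spaces, and decode %XX codes."""
    fields = parse_qs(query_string, keep_blank_values=True)
    # Keep the last value when a field name is repeated.
    return {name: values[-1] for name, values in fields.items()}

def is_tainted(value):
    """True if the value contains any shell control character."""
    return any(ch in SHELL_CONTROL_CHARS for ch in value)
```

A script would typically call `parse_query(read_query_string())` once and refuse to pass any value for which `is_tainted` is true to a shell command.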
Last modified on 23 October 1996.
Send comments to abrams@vt.edu.
[This is http://ei.cs.vt.edu/~wwwbtb/fall.96/ClassNotes/FormsScripts/scripts.html.]