HTTP 1.1

References:

Goals of HTTP/1.1

Support wide diversity of client/server/network configurations:

Terms (from [1])

  • Message
  • Resource
  • Entity
  • Content negotiation
  • User agent
  • Origin server
  • Proxy

Intermediaries

HTTP/1.0 was devised for the simple case in which a user agent (client) requests a document from an origin server:
          request chain ------------------------>
       UA -------------------v------------------- O
          <----------------------- response chain
HTTP/1.1 recognizes that one or more intermediaries (e.g., a proxy) may exist:
          request chain -------------------------------------->
       UA -----v----- A -----v----- B -----v----- C -----v----- O
          <------------------------------------- response chain
If a cache on an intermediary holds a usable copy of the response, the request/response chain is shortened. Here B answers from its cache, so C and O are not contacted:
          request chain ---------->
       UA -----v----- A -----v----- B - - - - - - C - - - - - - O
          <--------- response chain

Some Compatibility Problems

HTTP clients, applications, and servers have some subtle incompatibilities that the HTTP/1.1 spec lists:


Tags

Product Tokens

Examples:
       User-Agent: CERN-LineMode/2.15 libwww/2.17b3
       Server: Apache/0.8.4

Language Tags

Language tags specify a natural language, not a computer language, of messages. Example tags include en, en-US, en-cockney, i-cherokee, x-pig-latin.

Entity Tags

HTTP/1.0 was designed without thought of caching, so caches have relied on date/time stamps to detect when a cached entity is inconsistent with the document on the origin server.

In contrast, HTTP/1.1 uses entity tags to compare two or more entities from the same requested resource.

HTTP/1.1 also adds new header fields: ETag, If-Match, If-None-Match, and If-Range.

Other Tags

Other tags include quality values (how much degradation in quality is tolerated).
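In headers such as Accept-Language, a quality value appears as a q parameter between 0 and 1, with 1 assumed when it is omitted. The following is a minimal sketch of reading such a header; the function name parse_quality_values is hypothetical, and the parsing is deliberately simplified (it ignores parameters other than q).

```python
def parse_quality_values(header_value):
    """Parse e.g. 'da, en-gb;q=0.8, en;q=0.7' into (item, q) pairs,
    ordered from most to least preferred."""
    items = []
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [field.strip() for field in part.split(";")]
        name = fields[0]
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                q = float(value)
        items.append((name, q))
    # The sort is stable, so items with equal q keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(parse_quality_values("da, en-gb;q=0.8, en;q=0.7"))
```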


New Request Types

HTTP/1.0 uses GET, HEAD, and POST.

HTTP/1.1 adds several new ones:

PUT x
Requests that x be stored under the supplied URI - thus allowing a client to write a file to a server!
DELETE x
Remove x. So now a client can delete a file on a server!
TRACE
OPTIONS
A request for information about the communication options available on the request/response chain identified by the Request-URI.
There is also a provision to extend this list.

A server answers 405 (Method Not Allowed) when it knows the method but does not allow it for the requested resource, and 501 (Not Implemented) when it does not recognize or implement the method. GET and HEAD must be supported by all servers; the rest are optional.
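How a server might choose between 405 and 501 can be sketched as follows. This is an illustration only: check_method and KNOWN_METHODS are hypothetical names, and a real server would consult its own configuration. A 405 response carries an Allow header listing the methods the resource does accept.

```python
# Methods this hypothetical server implements at all.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "OPTIONS"}


def check_method(method, allowed_for_resource):
    """Return (status_code, extra_headers) for a request method.

    allowed_for_resource is the set of methods permitted on the
    requested resource.
    """
    if method not in KNOWN_METHODS:
        return 501, {}  # Not Implemented: method unknown to the server
    if method not in allowed_for_resource:
        # Method Not Allowed: known, but not permitted on this resource.
        return 405, {"Allow": ", ".join(sorted(allowed_for_resource))}
    return 200, {}


if __name__ == "__main__":
    print(check_method("DELETE", {"GET", "HEAD"}))
```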


Access Authentication

Provides simple challenge-response authentication mechanism.

Basic Scheme

Upon receipt of unauthorized request for URI, server MAY respond with challenge:
       WWW-Authenticate: Basic realm="WallyWorld"
where "WallyWorld" is string assigned by server to identify protection space of the Request-URI.

Client sends userid and password, separated by ":", using base64 encoding.
Example: User agent sends userid "Aladdin", password "open sesame":

       Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
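The Authorization value above can be reproduced with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and Latin-1 is assumed as the character encoding of the credentials.

```python
import base64


def basic_authorization(userid, password):
    """Build the Authorization header value for the Basic scheme."""
    credentials = f"{userid}:{password}".encode("latin-1")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


def parse_basic_authorization(header_value):
    """Recover (userid, password) from a Basic Authorization value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("latin-1")
    # Split on the first ":" only, so the password may contain colons.
    userid, _, password = decoded.partition(":")
    return userid, password


if __name__ == "__main__":
    print(basic_authorization("Aladdin", "open sesame"))
```

Because base64 is an encoding rather than encryption, the second function recovers the userid and password without needing any secret.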

Caching

Terms

Explicit expiration time

Heuristic expiration time

Age


Overview

"The goal of caching in HTTP/1.1 is to eliminate the need to send requests in many cases, and to eliminate the need to send full responses in many other cases." See Slides from the 96SIGCOMM paper by Williams, Abrams, Standridge, Abdulla, Fox slides for more information on caching.

Semantic Transparency

HTTP/1.1 defines "semantic transparency":
  • Caching only improves performance.
  • Client receives exactly the same response from cache as from origin server (except for hop-by-hop headers).
    Unlike HTTP/1.0:

    Freshness

    In HTTP/1.1 clients, servers, and caches specify the document freshness they require, to permit relaxing transparency.

    Cache responds to user request in one of four ways:

    1. Returned doc was checked for equivalence with origin server's copy.
    2. Returned doc is "fresh enough": meets least restrictive freshness requirement of client, server, and cache.
    3. Warning that freshness demand is being violated (using the Warning response header and maybe an icon of a rotting fish in the Web browser).
    4. Not Modified, Proxy Redirect, or error response returned.
    For example, if cache cannot communicate with origin server, then cache must follow cases 2, 3, or 4.


    Expiration Model

    Recall that expiration is used to avoid sending requests.

    Expiration by client:

    Expiration by server:

    Heuristic Expiration

    Expiration Calculation

    A response in a cache is fresh if
    freshness_lifetime > current_age
    and has expired otherwise, where
    freshness_lifetime is either
  • the max-age directive from a response,
  • or if not present, expires_value - date_value (the values of the response's Expires and Date headers),
  • or if not present, a heuristic.
    current_age
    is a formula that uses the Age header value, which is the sender's estimate of the amount of time since the response was generated or last revalidated by the origin server. Age is corrected to take into account propagation delay between multiple levels of caches.
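The calculation can be sketched in Python as below. A response counts as fresh while freshness_lifetime exceeds current_age. All times are plain numbers of seconds, the function and parameter names are illustrative assumptions, and the age correction follows the algorithm given in the HTTP/1.1 specification.

```python
def freshness_lifetime(max_age=None, expires_value=None, date_value=None,
                       heuristic=0):
    """Pick the lifetime: max-age first, then Expires minus Date,
    then a heuristic value chosen by the cache."""
    if max_age is not None:
        return max_age
    if expires_value is not None and date_value is not None:
        return expires_value - date_value
    return heuristic


def current_age(age_value, date_value, request_time, response_time, now):
    """Estimate the age of a cached response, in seconds."""
    apparent_age = max(0, response_time - date_value)
    corrected_received_age = max(apparent_age, age_value)
    response_delay = response_time - request_time  # network round trip
    corrected_initial_age = corrected_received_age + response_delay
    resident_time = now - response_time  # time spent in this cache
    return corrected_initial_age + resident_time


def is_fresh(lifetime, age):
    """A response is fresh while its lifetime exceeds its current age."""
    return lifetime > age


if __name__ == "__main__":
    age = current_age(age_value=10, date_value=100, request_time=100,
                      response_time=102, now=112)
    print(age, is_fresh(freshness_lifetime(max_age=60), age))
```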

    Validators

    Recall that HTTP/1.1 uses two forms of validator to decide whether an entity in a cache is equivalent to the entity on the origin server: strong and weak.

    One can think of a strong validator as one that changes whenever the bits of an entity change, while a weak validator changes whenever the meaning of an entity changes.

    Weak Validator
    Example: Last-Modified header field
    Strong Validator
    Example: an integer that is incremented in stable storage every time an entity is changed

     
    A strong validator is specified by the ETag entity-header field.
    It is used when
  • servers don't store modification times,
  • HTTP's one-second resolution date value does not distinguish document versions, or
  • servers want to avoid "certain paradoxes that may arise from the use of modification dates."

    To perform conditional document retrieval from the origin server, a cache specifies boolean validating conditions in its request (e.g., perform the GET if and only if a validator matches, or perform the GET if and only if no validators match).
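The boolean conditions on a conditional GET can be sketched as a small server-side check. This is a simplified illustration, not the full comparison rules: the function names are hypothetical, entity tags are compared as exact strings (no weak comparison), and tags are assumed not to contain commas.

```python
def _matches(current_etag, header_value):
    """True if the header lists the current entity tag, or is '*'."""
    candidates = [tag.strip() for tag in header_value.split(",")]
    return "*" in candidates or current_etag in candidates


def evaluate_conditional_get(current_etag, if_match=None, if_none_match=None):
    """Return the status code a server might send for a conditional GET."""
    if if_match is not None and not _matches(current_etag, if_match):
        return 412  # Precondition Failed: required validator did not match
    if if_none_match is not None and _matches(current_etag, if_none_match):
        return 304  # Not Modified: the cache's copy is still valid
    return 200  # Perform the GET and send the full entity


if __name__ == "__main__":
    print(evaluate_conditional_get('"v2"', if_none_match='"v1", "v2"'))
```

A cache revalidating its entry would send If-None-Match with the tags it holds; a 304 answer lets it reuse the stored body without transferring it again.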

    HTTP Methods that Write to URLs

    HTTP/1.1 specifies write-through semantics for any HTTP methods that update a resource on an origin server (i.e., any method but GET and HEAD). Thus the origin server is contacted on every write operation.

    "The alternative (known as "write-back" or "copy-back" caching) is not allowed in HTTP/1.1, due to the difficulty of providing consistent updates and the problems arising from server, cache, or network failure prior to write-back."

    History Lists (the "Back" Button)

    History lists or Back-buttons are meant to show exactly what the user saw at a past time.

    Thus the "back" button or history list in a Web browser should simply redisplay a previous page, and should not try to show a semantically transparent view of the current state of a resource.

    (So when using a search engine, you'd never get a message to "repost" when you go back to the page of items found by the search.)


    What Can be Cached

    The Cache-Control response directives allow an origin server to override the default cachability of a response. A shared cache might be a proxy, and a private cache a Web browser.
    public
    Response is cachable
    private
    All or part of the response message is intended for a single user and must not be cached by a shared cache.
    no-cache
    All or part of the response must not be cached anywhere.
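How a cache might act on these three directives can be sketched as follows. The sketch takes the descriptions above at face value and treats each directive as applying to the whole response; it ignores the form in which private or no-cache name specific header fields, and it does not model default cachability. The function names are hypothetical.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header_value.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"') or None
    return directives


def directives_permit_caching(header_value, shared_cache):
    """Decide whether the response directives let this cache keep a copy.

    shared_cache is True for a proxy, False for a private cache such
    as a Web browser.
    """
    directives = parse_cache_control(header_value)
    if "no-cache" in directives:
        return False  # must not be cached anywhere
    if "private" in directives and shared_cache:
        return False  # intended for a single user
    return True


if __name__ == "__main__":
    print(directives_permit_caching("private", shared_cache=True))
```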


    No-Store Directive

    No-Store directive prohibits cache from storing any part of request or response in non-volatile storage.

    Thus there's no way that sensitive information in request or response could be inadvertently released (e.g., on backup tapes).

    "The purpose of this directive is to meet the stated requirements of certain users and service authors who are concerned about accidental releases of information via unanticipated accesses to cache data structures. While the use of this directive may improve privacy in some cases, we caution that it is NOT in any way a reliable or sufficient mechanism for ensuring privacy. "


    User-agent Forced Cache Reloads

    A user agent can force a cache reload using the Cache-Control request directive. There are two forms:
    end-to-end reload
    The "no-cache" Cache-Control directive forces a reload from the origin server.
    end-to-end revalidation
    The "max-age=0" Cache-Control directive forces each cache along the path to the origin server to revalidate its own entry, if any, with the next cache or server.


    Other Aspects of Caching

    Caching in HTTP/1.1 is complex. The protocol spec discusses many more aspects of caching.


    Persistent Connections

    Originally in the Web, a separate TCP connection was used to fetch each URL. Analysis of HTTP performance [2] led to the use of persistent HTTP connections. The advantages (from [1]):

    In HTTP/1.1, unlike earlier versions, persistent connections are the default behavior.

    One complication of persistent connections is that the client, the server, or a proxy might time out the connection if no data is sent for a while. So implementations of HTTP must be able to reopen connections whenever needed. This is another case where software becomes much more complex in exchange for a performance improvement.


    Miscellaneous HTTP/1.1 Features

    Chunks
    A message can be transferred as a series of chunks, each with its own size indicator, followed by a footer. "This allows dynamically-produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message." In comparison, script-generated responses in HTTP/1.0 normally carry no length information, so the user agent cannot verify whether it got the whole message!

    Chunks are also used in persistent connections, which require self-defined message lengths (rather than defining length upon closure of a connection).
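The chunk format can be sketched as follows: each chunk is its size in hexadecimal, CRLF, the data, and CRLF, and a zero-size chunk ends the body. This is a minimal sketch with hypothetical function names; it always writes an empty footer, and the decoder skips any footer fields and chunk extensions.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # a zero-length chunk would end the body early
            out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk followed by an empty footer
    return bytes(out)


def decode_chunked(data):
    """Decode a chunked body; raise ValueError if it is incomplete."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)  # ValueError if size line is cut off
        size_field = data[pos:line_end].split(b";", 1)[0]  # drop extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)  # footer, if any, is ignored in this sketch
        body += data[pos:pos + size]
        if data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk is truncated or malformed")
        pos += size + 2


if __name__ == "__main__":
    wire = encode_chunked([b"Hello, ", b"world"])
    print(wire, decode_chunked(wire))
```

Because every chunk announces its own size, the decoder can tell a complete message from one cut short, which is what lets a persistent connection stay open after the body ends.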


    New Header Fields

    For Either Request or Response

    Cache-Control
    Connection
    A persistent TCP connection is used unless this field is set to "close".
    Upgrade
    Allows clients and servers to negotiate an upgrade to the application protocol used on the connection. Used with 101 response code.
    Via
    Records the intermediaries that a message passes through; useful with a TRACE request.

    For Requests Only

    Host
    Used to supplement a Request-URI that is not absolute:
           GET /pub/WWW/TheProject.html HTTP/1.1
           Host: www.w3.org
    If-Match, If-None-Match, If-Range, If-Unmodified-Since
    Proxy-Authorization
    Accept, Accept-Charset, Accept-Encoding, Accept-Language
    Range
    Max-Forwards

    For Responses Only

    Response Status Codes

           Status-Code    = "100"   ; Continue
                          | "101"   ; Switching Protocols
    
                          | "200"   ; OK
                          | "201"   ; Created
                          | "202"   ; Accepted
                          | "203"   ; Non-Authoritative Information
                          | "204"   ; No Content
                          | "205"   ; Reset Content
                          | "206"   ; Partial Content
    
                          | "300"   ; Multiple Choices
                          | "301"   ; Moved Permanently
                          | "302"   ; Moved Temporarily
                          | "303"   ; See Other [browser changes URI after POST]
                          | "304"   ; Not Modified
                          | "305"   ; Use Proxy [client must use proxy to get URI]
    
                          | "400"   ; Bad Request
                          | "401"   ; Unauthorized
                          | "402"   ; Payment Required
                          | "403"   ; Forbidden
                          | "404"   ; Not Found
                          | "405"   ; Method Not Allowed
                          | "406"   ; Not Acceptable
                          | "407"   ; Proxy Authentication Required
                                      [client must authenticate itself with proxy]
                          | "408"   ; Request Time-out
                          | "409"   ; Conflict
                          | "410"   ; Gone
                          | "411"   ; Length Required
                          | "412"   ; Precondition Failed
                          | "413"   ; Request Entity Too Large
                          | "414"   ; Request-URI Too Large
                          | "415"   ; Unsupported Media Type
    
                          | "500"   ; Internal Server Error
                          | "501"   ; Not Implemented
                          | "502"   ; Bad Gateway
                          | "503"   ; Service Unavailable 
                                      [server is overloaded or under maintenance]
                          | "504"   ; Gateway Time-out
                          | "505"   ; HTTP Version not supported
    
                          | extension-code

    Last modified on 5 February 1998.

    Send comments to abrams@vt.edu.
    [This is http://ei.cs.vt.edu/~wwwbtbNotes/Protocols/http1_1.html.]