Common Mistakes
- No goals
- Starting the cache study with no clear idea of what we want to learn
- Biased goals
- We think up our own wonderful replacement policy, then want to
show that OUR solution is better than OTHERS
- Unsystematic approach
- Arbitrarily choosing parameters (such as the 2-fold factor in LRU-THOLD)
- Analysis without understanding the problem
- We had to revise the simulation model several times and redefine
our experiments several times as our understanding grew through the
summer.
- Incorrect measures
- We originally planned to use "URL get response time" as a measure.
While easily computed, it represented different things in Netscape vs.
Mosaic.
- Unrepresentative workload
- Our study used CS workloads -- what would happen with courses in
the English department?
- Wrong evaluation technique
- We had this problem in Spring 95 when we first tried to study the
problem using experiments with a real system, rather than with
simulation
- Overlooking important parameters
-
- Ignoring significant factors
-
- Inappropriate experiment design
-
- Inappropriate level of detail
-
- No analysis
- It was important to draw clear conclusions from the data: "(1)
that with our workloads a proxy has a 30-50% maximum possible hit rate
no matter how it is designed; (2) that when the cache is full and a
document is replaced, least recently used (LRU) is a poor policy, but
simple variations can dramatically improve hit rate and reduce cache
size; (3) that a proxy server really functions as a second level
cache, and its hit rate may tend to decline with time after initial
loading given a more or less constant set of users; and (4) that
certain tuning configuration parameters for a cache may have little
benefit."
- Erroneous analysis
-
- No sensitivity analysis
-
- Ignoring errors in input
-
- Improper treatment of outliers
-
- Assuming no change in the future
- Will our results hold as WWW use grows rapidly, and the uses of
the WWW change in the future?
- Ignoring variability
- The hit rate varies dramatically with each day of each workload.
How can we give the best overall picture?
- Too complex analysis
- This danger arose in Experiment 4 of our cache study, when we had
three factors to examine, but no clear picture from the measured data
- Improper presentation of results
-
- Ignoring social aspects
-
- Omitting assumptions and limitations
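
Several of the items above (the maximum possible hit rate, LRU as a replacement policy, and day-to-day variability in hit rate) can be made concrete with a small trace-driven simulation. The following is only a minimal sketch under stated assumptions, not the simulator used in the cache study: every document is treated as the same size, documents never change, and the function names (lru_hit_rate, max_hit_rate, summarize) and the toy trace are hypothetical, made up for illustration.

```python
from collections import OrderedDict
import statistics


def lru_hit_rate(trace, capacity):
    """Fraction of requests served by an LRU cache of `capacity` documents.

    Assumption: all documents have unit size and are never modified.
    """
    if not trace:
        return 0.0
    cache = OrderedDict()  # keys kept in order from least to most recently used
    hits = 0
    for url in trace:
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
        else:
            cache[url] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(trace)


def max_hit_rate(trace):
    """Hit rate of an infinite cache: only the first request for a URL misses."""
    if not trace:
        return 0.0
    return (len(trace) - len(set(trace))) / len(trace)


def summarize(daily_rates):
    """Report the spread of per-day hit rates, not just a single average."""
    return {
        "mean": statistics.mean(daily_rates),
        "stdev": statistics.stdev(daily_rates),  # sample standard deviation
        "min": min(daily_rates),
        "max": max(daily_rates),
    }


if __name__ == "__main__":
    toy_trace = ["a", "b", "a", "c", "b", "a"]  # invented example, not real data
    print("LRU, capacity 2:", lru_hit_rate(toy_trace, 2))
    print("Infinite cache: ", max_hit_rate(toy_trace))
    print("Daily summary:  ", summarize([0.2, 0.4]))
```

Comparing lru_hit_rate against max_hit_rate on the same trace shows how far a given policy falls short of the best any cache could do, and reporting the minimum, maximum, and standard deviation of daily hit rates alongside the mean is one simple way to avoid hiding variability behind a single number.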