The Art of Interpreting Measurements

Given the same data, two analysts may interpret it differently. In fact, they may even disagree on how the experimental study should have been done in the first place!

Example:

Which cache replacement policy is best to use: LRU, LRU-MIN, or LRU-THOLD?

Table 4: Comparison of document replacement policies LRU, LRU-MIN, and LRU-THOLD with different sizes of disk area for caching (i.e., 50% and 90% of MaxNeeded) using two performance measures (hit rate and cache size), on two workloads (U/MIN and U/MAX).

                 Hit Rate (%)   Cache Size (Mbytes)
                U/MIN   U/MAX       U/MIN    U/MAX
___________________________________________________
50% MaxNeeded
  LRU            31.2    30.3        13.8     80.0
  LRU-MIN        32.9    31.0        14.0     76.5
  LRU-THOLD      32.9    32.1        10.0     80.0
90% MaxNeeded
  LRU            32.9    32.1        24.8    144.0
  LRU-MIN        33.0    32.1        25.0    144.0
  LRU-THOLD      32.9    32.1        10.0     85.9
___________________________________________________

One workload (U/MAX) gives the same hit rate (32.1%) with all three policies at 90% of MaxNeeded. What does that mean -- is the policy irrelevant, or is the workload not representative?

At 50% of MaxNeeded, LRU-THOLD uses the smallest cache on one workload (U/MIN, 10.0 Mbytes), but LRU-MIN uses the smallest cache on the other (U/MAX, 76.5 Mbytes). Which policy do we recommend?
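
One way to make such disagreements explicit is to list, for each disk-area setting and workload, which policies tie for the best value of each measure. The Python sketch below does this for the values in Table 4. The names TABLE4, winners, and summarize are illustrative only, and the sketch assumes that a higher hit rate and a smaller cache size are both preferable; an analyst who weighs the measures differently would reach different conclusions.

```python
# Values transcribed from Table 4.
# Each entry maps policy -> (hit rate in %, cache size in Mbytes).
TABLE4 = {
    ("50%", "U/MIN"): {"LRU": (31.2, 13.8), "LRU-MIN": (32.9, 14.0), "LRU-THOLD": (32.9, 10.0)},
    ("50%", "U/MAX"): {"LRU": (30.3, 80.0), "LRU-MIN": (31.0, 76.5), "LRU-THOLD": (32.1, 80.0)},
    ("90%", "U/MIN"): {"LRU": (32.9, 24.8), "LRU-MIN": (33.0, 25.0), "LRU-THOLD": (32.9, 10.0)},
    ("90%", "U/MAX"): {"LRU": (32.1, 144.0), "LRU-MIN": (32.1, 144.0), "LRU-THOLD": (32.1, 85.9)},
}


def winners(rows, index, prefer_high):
    """Return every policy tied for the best value of one measure.

    index 0 selects hit rate, index 1 selects cache size.
    """
    pick = max if prefer_high else min
    best = pick(values[index] for values in rows.values())
    return sorted(p for p, values in rows.items() if values[index] == best)


def summarize(table=TABLE4):
    """Map (disk area, workload) -> best policies per measure."""
    return {
        key: {
            # Assumption: higher hit rate is better, smaller cache is better.
            "hit_rate": winners(rows, 0, prefer_high=True),
            "cache_size": winners(rows, 1, prefer_high=False),
        }
        for key, rows in table.items()
    }


if __name__ == "__main__":
    for (area, workload), best in summarize().items():
        print(area, workload, best)
```

Under these assumptions, the summary reports a three-way tie on hit rate for U/MAX at 90%, and it names LRU-THOLD as the smallest cache for U/MIN at 50% but LRU-MIN for U/MAX at 50%. The code only restates the table; deciding which measure and which workload should carry more weight remains a matter of interpretation.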