CS5014: Homework 3
Due Friday, 15 September
Please turn in a hardcopy of the solution to this homework, written
using LaTeX. You may not receive full credit if you do not staple
together the pages of your homework.
- Consider Table 4 of the Abrams, Standridge, Abdulla, Williams,
and Fox paper from HW2 (html version;
postscript). Compute all indices of dispersion from section 12.8
for the 6 values representing lifetime of a document in the cache for
the classroom workload and the LRU replacement policy. Which index
would you choose, and why? (Note: We did not discuss indices of
dispersion in class, but I trust that you can read and apply sections
12.8 and 12.9 in Jain on your own. Also refer to problems 12.13 and
12.14 in Jain if you are unsure of your answer -- the answers are in
the back of Jain!)
- Repeat the previous problem, but for the indices of central tendency.
(Refer to problems 12.8, 12.10, and 12.11 in Jain if you are unsure of
your answer.)
- Do problem 12.15 in Jain. Use gnuplot to generate an
Encapsulated PostScript (EPS) file of the graph, and include the graph in
your LaTeX document.
- Consider a data set with n samples, denoted
XS(1),...,XS(n). For Q-Q plots, why is it inconvenient to have
an empirical distribution function F_S(x) such that
F_S(XS(n)) = 1?
- In class, we defined a Q-Q plot to test whether a continuous
theoretical distribution fits empirical data. What difficulty arises
when you try to define a Q-Q plot to test whether a discrete theoretical
distribution fits empirical data?
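
For checking hand calculations in the first problem, a short script along the following lines may help. It is only a sketch: the function names and the six sample values are placeholders (they are not the values from Table 4), the variance uses the n-1 divisor, and the quantile rule (take the ordered observation at position (n-1)*alpha + 1, rounded to the nearest integer) is an assumed convention that should be checked against section 12.8 of Jain before relying on it.

```python
import math
import statistics


def quantile(sorted_xs, alpha):
    """Return the alpha-quantile of an already sorted sample.

    Assumed convention: pick the ordered observation at 1-based position
    (n - 1) * alpha + 1, rounded to the nearest integer (halves round up).
    """
    n = len(sorted_xs)
    pos = int(math.floor((n - 1) * alpha + 1 + 0.5))
    return sorted_xs[pos - 1]


def dispersion_indices(xs):
    """Compute several common indices of dispersion for a sample."""
    s = sorted(xs)
    mean = statistics.mean(s)
    var = statistics.variance(s)  # sample variance, n - 1 divisor
    std = math.sqrt(var)
    return {
        "range": s[-1] - s[0],
        "variance": var,
        "std_dev": std,
        "cov": std / mean,  # coefficient of variation; assumes mean != 0
        "p10": quantile(s, 0.10),
        "p90": quantile(s, 0.90),
        "siqr": (quantile(s, 0.75) - quantile(s, 0.25)) / 2,
        "mean_abs_dev": sum(abs(x - mean) for x in s) / len(s),
    }


# Placeholder data only; substitute the six lifetimes read from Table 4.
placeholder = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
result = dispersion_indices(placeholder)
for name, value in result.items():
    print(f"{name}: {value}")
```

The script reports every index side by side, so the choice of which one best suits the data (and why) is still left to you.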