CS5014: Homework 5
Due Friday, 29 September
Please turn in a hard copy of your solution to this homework, typeset
using LaTeX. You may not receive full credit if the pages of your
homework are not stapled together.
Construct a simple linear regression model of the time required
to run the "latex" command on a computer of your choice as a function
of input file size.
To conduct your experiment, type "time latex" followed by the name of
the input file, and use the total elapsed time that "time" reports as
the response variable. Use the file size in characters as reported by
the "ls -l" command as the predictor variable. Use several LaTeX files
that you have written earlier this semester, such as homeworks for
class.
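As one way to keep the run order, file size, and response variable
together, the measurements could also be scripted. The sketch below is
an illustration only, not a required method: it times the command from
Python rather than with "time", so its numbers include a small
process-launch overhead and need not match what "time" reports. The
names latex_cmd, time_command, and measure are made up for this
example, and the "-interaction=nonstopmode" option is an assumption,
used so that latex does not stop to prompt on an error.

```python
import os
import subprocess
import time


def latex_cmd(path):
    """Hypothetical helper: the command line used to process one file."""
    return ["latex", "-interaction=nonstopmode", path]


def time_command(cmd):
    """Run cmd (a list of strings) and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a LaTeX error should not abort the experiment
    )
    return time.perf_counter() - start


def measure(paths, make_cmd=latex_cmd):
    """Return one (run number, path, size in bytes, seconds) row per file,
    in the exact order the runs were made."""
    rows = []
    for run, path in enumerate(paths, start=1):
        size = os.path.getsize(path)  # same byte count that "ls -l" shows
        elapsed = time_command(make_cmd(path))
        rows.append((run, path, size, elapsed))
    return rows
```

Calling measure with a list of .tex file names returns rows that can be
copied directly into the table of runs asked for below.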
Include the following items with the solution that you turn in:
- State what machine and LaTeX version you used to make your
observations.
- Summarize the procedure you used to make the measurements.
(Give a detailed answer -- we will use this in a later class. For
example, give a table with the exact order of runs that you make,
including the file size used and response variable observed.)
- Visually verify the assumptions for regression using the graphs
discussed in section 14.7 of Jain. Include the graphs in the solution
you turn in. Comment on the quality of the model.
- Compute MSE.
- Compute R^2 and comment on the quality of your model.
- For what file size is your model most accurate? For
what file sizes is your model least accurate?
- What do you believe is responsible for the variation in the
response variable that your model attributes to errors?
- For each of the common mistakes listed in [Jain, 15.6], state if
the mistake is potentially relevant to your solution. For all
potentially relevant mistakes, evaluate if your solution suffers to
any degree from the mistake.
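The numerical items above (MSE, R^2, and the residuals needed for the
graphs) can be computed with a few lines of code. The following is a
minimal sketch, assuming the usual least-squares definitions with
MSE = SSE / (n - 2) and R^2 = 1 - SSE / SST; check these against the
definitions in Jain before relying on them. The function names are
made up for this example, and the normal quantiles come from Python's
statistics.NormalDist, which is an assumption of this sketch rather
than the method given in the text.

```python
from statistics import NormalDist


def fit_simple_linear(xs, ys):
    """Least-squares fit of y = b0 + b1 * x, with MSE, R^2, and residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in residuals)      # sum of squared errors
    sst = sum((y - y_bar) ** 2 for y in ys)  # total sum of squares
    return {
        "b0": b0,
        "b1": b1,
        "mse": sse / (n - 2),  # two parameters estimated from the data
        "r2": 1 - sse / sst,
        "residuals": residuals,
    }


def normal_quantile_points(residuals):
    """Pair each sorted residual with a standard normal quantile,
    giving the (x, y) points of a normal quantile-quantile plot."""
    n = len(residuals)
    std_normal = NormalDist()
    return [
        (std_normal.inv_cdf((i - 0.5) / n), e)
        for i, e in enumerate(sorted(residuals), start=1)
    ]
```

Here xs would hold the file sizes and ys the elapsed times from your
table of runs. At least three observations with distinct file sizes
are needed, since MSE divides by n - 2.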