Every measured value is a random variable.
So compare variation due to a factor with variation due to errors!
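As a rough illustration of that comparison, the sketch below splits the total variation of replicated measurements into a part explained by the factor and a part left to experimental error. The function name `variation_split` and the measurement values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def variation_split(groups):
    """Split total variation into factor and error shares.

    groups maps each factor level to a list of replicated measurements.
    Returns (factor_share, error_share), which sum to 1.
    """
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)

    # Variation between levels: how far each level mean is from the grand mean.
    ss_factor = sum(len(ys) * (mean(ys) - grand_mean) ** 2
                    for ys in groups.values())
    # Variation within levels: scatter of replicates around their level mean.
    ss_error = sum((y - mean(ys)) ** 2
                   for ys in groups.values() for y in ys)

    ss_total = ss_factor + ss_error
    return ss_factor / ss_total, ss_error / ss_total


# Hypothetical response times for one factor with two levels, three replicates each.
measurements = {
    "cache_off": [10.0, 12.0, 11.0],
    "cache_on": [7.0, 8.0, 6.0],
}

factor_share, error_share = variation_split(measurements)
print(f"factor: {factor_share:.2f}, error: {error_share:.2f}")
```

If the error share is comparable to the factor share, the apparent effect of the factor may be nothing more than noise.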
So identify all factors, then select a subset to vary!
Accounting for the effect of users is particularly important.
If several factors change at once, you may not be able to tell which one caused a change in the response variable.
So try not to vary several factors simultaneously.
Could you use another experiment design to obtain narrower confidence intervals with the same number of experiments?
One-factor-at-a-time designs cannot estimate interactions between factors. For example, the effect on performance of adding a 1 Kbyte cache may depend on the program size.
So don't use a one-factor-at-a-time design!
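To make the interaction point concrete, here is a minimal sketch of a two-factor, two-level full factorial design, assuming factor A is the cache (absent or present) and factor B is the program size (small or large). The response values and the function name `effects_2x2` are hypothetical. A one-factor-at-a-time design would run only three of these four combinations, so the interaction term could not be computed.

```python
def effects_2x2(y):
    """Estimate effects from a 2x2 full factorial design.

    y maps (a, b), with a and b each -1 or +1, to the measured response.
    Returns (q0, qA, qB, qAB): mean, main effects of A and B, interaction.
    """
    runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
    q0 = sum(y[r] for r in runs) / 4
    qA = sum(a * y[(a, b)] for a, b in runs) / 4
    qB = sum(b * y[(a, b)] for a, b in runs) / 4
    # The interaction uses the product of the two sign columns.
    qAB = sum(a * b * y[(a, b)] for a, b in runs) / 4
    return q0, qA, qB, qAB


# Hypothetical performance numbers (higher is better).
# A: -1 = no cache, +1 = cache added; B: -1 = small program, +1 = large program.
responses = {
    (-1, -1): 15.0,
    (1, -1): 45.0,
    (-1, 1): 25.0,
    (1, 1): 75.0,
}

q0, qA, qB, qAB = effects_2x2(responses)
print(q0, qA, qB, qAB)  # prints 40.0 20.0 10.0 5.0
```

A nonzero `qAB` means the benefit of the cache depends on the program size, which is exactly what a one-factor-at-a-time design cannot reveal.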
So use a sequence of smaller experiment designs that isolate critical factors, rather than one experiment with a zillion factors!