One-Factor Experiment Design

[Jain, Ch. 20]

Last topic:

 

Today's topic:

 

Comparison of $2^k r$ and One-Factor

Notation

j
Factor level

i
Replication

a
Number of levels (``a'' stands for factor A)

r
Number of replicas (``r'' stands for ``replica'')

Example

Number of bytes required by five programmers to code workload on three processors (R, V, and Z):

<Insert code-size table here!>

Here, r=5 and a=3.

Columns correspond to levels; rows correspond to replicas.

Note:

Terms $i$ and $j$ have the opposite definition in the $2^k r$ design! That's because $i$ is a row, and $j$ is a column.

More Notation

$y_{ij}$
Response observed for replica $i$ of factor level $j$

$\bar{y}_{.j}$
Mean of observed responses for all replicas of factor level $j$:

$$\bar{y}_{.j} = \frac{1}{r} \sum_{i=1}^{r} y_{ij}$$

$\hat{y}_{j}$
Predicted (estimated) response when factor has level $j$

$\alpha_j$
Effect of factor level $j$

$e_{ij}$
Error term, representing how much the observed response for replica $i$ of factor level $j$ differed from the sum of the mean response (i.e., $\mu$) and the effect of level $j$ (i.e., $\alpha_j$)

Model

$$y_{ij} = \mu + \alpha_j + e_{ij}$$

where $\mu$ is the grand mean, $\alpha_j$ is the effect of factor level $j$ (with $\sum_j \alpha_j = 0$), and $e_{ij}$ is the error term.

Equation used for prediction:

$$\hat{y}_{j} = \mu + \alpha_j$$

Computation of effects

Recall our model:

$$y_{ij} = \mu + \alpha_j + e_{ij}$$

Parameter values are:

$$\mu = \frac{1}{ar} \sum_{j=1}^{a} \sum_{i=1}^{r} y_{ij} = \bar{y}_{..}$$

$$\alpha_j = \bar{y}_{.j} - \mu$$

Thus the effect due to level $j$ (i.e., $\alpha_j$) is the difference between the mean of the observations at level $j$ and the grand mean.

Table Method of Computing Effects

Form column sums to derive the means $\bar{y}_{.j}$ and effects $\alpha_j$:

<Insert table of column sums, means, and effects here!>
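The table method above can be sketched in a few lines of Python. The code-size numbers below are made up for illustration (the original data table is not reproduced here); only the structure ($a = 3$ levels, $r = 5$ replicas) matches the example.

```python
# Table method for computing effects, using hypothetical code-size data.
# Rows are replicas (i = 1..r); columns are factor levels (j = R, V, Z).
y = {
    "R": [10, 12, 14, 16, 18],
    "V": [20, 22, 24, 26, 28],
    "Z": [12, 14, 16, 18, 20],
}
a = len(y)            # number of levels
r = len(y["R"])       # number of replicas

# Column means: ybar_.j = (1/r) * sum_i y_ij
col_mean = {j: sum(obs) / r for j, obs in y.items()}

# Grand mean: mu = (1/(a*r)) * sum over all observations
mu = sum(sum(obs) for obs in y.values()) / (a * r)

# Effects: alpha_j = ybar_.j - mu; they sum to zero by construction
alpha = {j: col_mean[j] - mu for j in y}

print(mu)      # 18.0
print(alpha)   # {'R': -4.0, 'V': 6.0, 'Z': -2.0}
```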

Model Interpretation

From table:

Recall our equation for predicted response:

$$\hat{y}_{j} = \mu + \alpha_j$$

Therefore:

$$\hat{y}_{R} = \mu + \alpha_R, \qquad \hat{y}_{V} = \mu + \alpha_V, \qquad \hat{y}_{Z} = \mu + \alpha_Z$$

Estimating Experimental Errors

Recall the definition of $e_{ij}$:

Difference between (1) the observed response for replica $i$ of factor level $j$ and (2) the sum of the mean response and the effect of level $j$

Formally,

$$e_{ij} = y_{ij} - \hat{y}_{j} = y_{ij} - \mu - \alpha_j$$

Example:

 

Recall that $\mu + \alpha_j = \bar{y}_{.j}$.

Thus:

$$e_{ij} = y_{ij} - \bar{y}_{.j}$$

Allocation of Variation

Recall from linear regression our analysis to explain how much variation is due to

explained variation (i.e., the factor)

versus

unexplained variation (i.e., the experimental error).

The two quantities were SSR/SST and SSE/SST, respectively.

For one-factor design, we compute SSA/SST and SSE/SST.

So what are SSA, SSE, and SST?

Calculating SSA, SSE, and SST

Definitions

$$SSY = \sum_{j=1}^{a} \sum_{i=1}^{r} y_{ij}^2$$

$$SS0 = a r \mu^2$$

$$SSA = r \sum_{j=1}^{a} \alpha_j^2$$

$$SSE = \sum_{j=1}^{a} \sum_{i=1}^{r} e_{ij}^2$$

$$SST = SSY - SS0$$

Derivation of SSY, SS0, and SSA

Recall our model:

$$y_{ij} = \mu + \alpha_j + e_{ij}$$

Squaring both sides, and adding equations for all $i$ and $j$ (the cross-product terms sum to zero), yields:

$$\sum_{j=1}^{a} \sum_{i=1}^{r} y_{ij}^2 = a r \mu^2 + r \sum_{j=1}^{a} \alpha_j^2 + \sum_{j=1}^{a} \sum_{i=1}^{r} e_{ij}^2$$

Thus:

$$SSY = SS0 + SSA + SSE$$

Simplifying:

$$SST = SSY - SS0$$

$$SST = SSA + SSE$$

Computing SSE

First note the following relationship:

$$SSE = \sum_{j=1}^{a} \sum_{i=1}^{r} e_{ij}^2 = \sum_{j=1}^{a} \sum_{i=1}^{r} \left( y_{ij} - \bar{y}_{.j} \right)^2$$

But by definition SST = SSA + SSE. Therefore:

$$SSE = SST - SSA$$
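These sums of squares can be sketched directly from the definitions. The data below are hypothetical (the original table is not reproduced here), with $a = 3$ levels and $r = 5$ replicas; the check at the end confirms that SSE computed as SST − SSA equals the sum of squared residuals.

```python
# Computing SSY, SS0, SSA, SST, and SSE for a one-factor design
# (hypothetical data; a = 3 levels, r = 5 replicas).
y = {
    "R": [10, 12, 14, 16, 18],
    "V": [20, 22, 24, 26, 28],
    "Z": [12, 14, 16, 18, 20],
}
a, r = len(y), len(y["R"])
mu = sum(sum(obs) for obs in y.values()) / (a * r)
alpha = {j: sum(obs) / r - mu for j, obs in y.items()}

SSY = sum(v * v for obs in y.values() for v in obs)  # sum of squared observations
SS0 = a * r * mu ** 2                                # sum of squares of grand mean
SSA = r * sum(aj ** 2 for aj in alpha.values())      # variation due to the factor
SST = SSY - SS0                                      # total variation
SSE = SST - SSA                                      # variation due to error

# Cross-check: SSE is also the sum of squared residuals e_ij
SSE_direct = sum((v - (mu + alpha[j])) ** 2 for j, obs in y.items() for v in obs)
assert abs(SSE - SSE_direct) < 1e-9
print(SSY, SS0, SSA, SST, SSE)   # 5260 4860.0 280.0 400.0 120.0
```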

Example of Allocating Variation

Recall that the observations $y_{ij}$ are:

<Insert code-size table here!>

and that the formulas are:

and that the formulas are:

$$SST = SSY - SS0 = SSA + SSE$$

where

$$SSY = \sum_{j=1}^{a} \sum_{i=1}^{r} y_{ij}^2, \qquad SS0 = a r \mu^2, \qquad SSA = r \sum_{j=1}^{a} \alpha_j^2$$

Thus:

<Insert numeric values of SSY, SS0, SSA, SSE, and SST here!>

The percent of variation explained by the processors is

$$\frac{SSA}{SST} \times 100\% = 10.4\%$$

The remaining 89.6% is due to experimental error!

Analysis of variance (ANOVA)

Allocation of variance shows that 89.6% of variation is due to error.

Why is this so high?

How do we identify which case is true?

Resolving Point 2: Mean Square Statistics

We need a test to compare SSA and SSE.

How about using their ratio:

$$\frac{SSA}{SSE}$$

This ratio is not quite right - if we have many more replicas than factor levels, SSE will be larger! So use ``SSE per sample'' and ``SSA per sample''. Let $\nu_A$ and $\nu_e$ denote, respectively, the degrees of freedom of SSA and SSE:

$$\nu_A = a - 1$$

$$\nu_e = a(r-1)$$

Mean Square of A:
Ratio of SSA to its degrees of freedom:

$$MSA = \frac{SSA}{\nu_A} = \frac{SSA}{a-1}$$

Mean Square of Error:
Ratio of SSE to its degrees of freedom:

$$MSE = \frac{SSE}{\nu_e} = \frac{SSE}{a(r-1)}$$
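The mean squares and the resulting F statistic follow mechanically from the degrees of freedom. The sums of squares below (SSA = 280, SSE = 120) are hypothetical values for $a = 3$, $r = 5$, not the figures from the real example.

```python
# Mean squares and F statistic (hypothetical SSA and SSE; a = 3, r = 5).
a, r = 3, 5
SSA, SSE = 280.0, 120.0

nu_A = a - 1           # degrees of freedom of SSA
nu_e = a * (r - 1)     # degrees of freedom of SSE

MSA = SSA / nu_A       # "SSA per degree of freedom"
MSE = SSE / nu_e       # "SSE per degree of freedom"
F = MSA / MSE

print(MSA, MSE, F)     # 140.0 10.0 14.0
```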

Resolving Point 3: F Test

How do we factor in the sample ``quality'' (size)?

We need a statistical test to compare SSA and SSE.

$\chi^2$ Distribution:

F Distribution:

F-Test:

Tests following null hypothesis:

Response variable does not depend on any effect $\alpha_j$.

Acceptance criteria:

Ratio does not exceed the $(1-\alpha)$-quantile of the F distribution.

Note: As the degrees of freedom $\to \infty$, the $(1-\alpha)$-quantile $\to 1$. So all F values in the tables exceed one. This is intuitive:

So the question

Is factor statistically significant?

is equivalent to rejecting null hypothesis, or asking:

Does F statistic computed from our data exceed theoretical F?

The test gives a ``yes/no'' answer, for a chosen significance level, to the question of whether the contribution of the factor to the variation is statistically significant.

Explanation of ANOVA

ANOVA is a statistical procedure to compare the contribution of the percentages of variation attributed to the factor and the error.

Computed F statistic is $MSA/MSE$:

If

$$\frac{MSA}{MSE} > F_{[1-\alpha;\; a-1,\; a(r-1)]}$$

then the factor's contribution to the variation is statistically significant.

The theoretical F is $F_{[1-\alpha;\; a-1,\; a(r-1)]}$.

Example

In the code size comparison:

$$MSA = \frac{SSA}{a-1}, \qquad MSE = \frac{SSE}{a(r-1)}$$

Thus

$$F = \frac{MSA}{MSE} = 0.7$$

The value of $F_{[.90;\; 2, 12]} = 2.8$. Because $0.7 < 2.8$, the test yields ``no significance.''
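The decision rule can be sketched with the numbers quoted in the example: the computed F statistic is 0.7, and the tabulated quantile $F_{[.90;\; 2, 12]} = 2.8$ is taken from an F table rather than computed.

```python
# F-test decision using the example's quoted numbers.
F_computed = 0.7
F_table = 2.8     # F[1-alpha; a-1, a(r-1)] = F[0.90; 2, 12], from a table

significant = F_computed > F_table
print("significant" if significant else "no significance")  # no significance
```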

Convenient table for ANOVA

<Insert Tables 20.3 and 20.4 here!>

Visual diagnostic tests

The one-factor analysis requires the same assumptions as were used earlier for the $2^k r$ design. Consider three of these:

  1. Errors are normally distributed.
  2. Errors are independent of levels or experiment number.
  3. Variance of errors is independent of factor levels.

We can visually test (1) by a quantile-quantile plot:

<Insert Fig. 14.9!>

Visually test (2) by plotting residuals versus predicted response:

<Insert Figs. 14.7, 14.8!>

Visually test (3) by last plot, but look for increasing spread:

<Insert Fig. 14.10!>

Applied to our example

<Insert Fig. 18.2!>

<Insert Fig. 18.1!>

Confidence Intervals for effects

The estimated response is:

$$\hat{y}_{j} = \mu + \alpha_j$$

All three quantities are random variables because they are based on one sample (set of experiments).

As discussed earlier for $2^k r$ designs, the confidence interval for a term $x$ (for $x = \mu$ or $x = \alpha_j$) in the above equation is the mean plus or minus the product of a t-distribution quantile and the standard deviation of $x$:

$$x \pm t_{[1-\alpha/2;\; a(r-1)]} \, s_x$$

Note: The degrees of freedom for constructing CIs always equals the DF for errors (tabulated in the ANOVA table). It's $a(r-1)$ because the errors for all replicas at a given factor level sum to zero, so only $r-1$ values are independent for each of the $a$ factor levels. (Jain's formula on page 335 is a misprint.)

The values of $s_x$ are:

<Insert table of standard deviations here!>

where the standard deviation of errors is

$$s_e = \sqrt{\frac{SSE}{a(r-1)}}$$
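A sketch of the CI computation, assuming the standard formulas $s_\mu = s_e/\sqrt{ar}$ and $s_{\alpha_j} = s_e\sqrt{(a-1)/(ar)}$, and hypothetical values (SSE = 120, $\mu = 18$, effects from made-up data; $a = 3$, $r = 5$). The t quantile $t_{[0.95;\,12]} = 1.782$ is taken from a t table, not computed.

```python
import math

# 90% confidence intervals for mu and the effects alpha_j
# (hypothetical SSE, mu, and effects; a = 3, r = 5).
a, r = 3, 5
SSE = 120.0
mu = 18.0
alpha = {"R": -4.0, "V": 6.0, "Z": -2.0}

s_e = math.sqrt(SSE / (a * (r - 1)))           # std. dev. of errors
s_mu = s_e / math.sqrt(a * r)                  # std. dev. of mu
s_alpha = s_e * math.sqrt((a - 1) / (a * r))   # std. dev. of each alpha_j
t = 1.782                                      # t[1 - 0.10/2; a(r-1) = 12]

ci_mu = (mu - t * s_mu, mu + t * s_mu)
ci_alpha = {j: (aj - t * s_alpha, aj + t * s_alpha) for j, aj in alpha.items()}

# An effect is significant at this level iff its CI excludes zero
for j, (lo, hi) in ci_alpha.items():
    print(j, (lo, hi), "excludes zero" if lo > 0 or hi < 0 else "includes zero")
```

With these made-up numbers two of the three effect intervals happen to exclude zero; the real example in the notes reaches the opposite conclusion.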

Example

Example: Consider again the code size comparison:

<Insert computed values of $\mu$ and the effects $\alpha_j$ here!>

For 90% confidence...

<Insert the 90% confidence intervals here!>

Each of the processor effect ($\alpha_j$) confidence intervals contains zero.

Thus we cannot say with 90% confidence that the processors have a significant effect on code size.

How to compare two alternative factor levels

Question: Is the code size required for processor R significantly larger or smaller than the code size for V?

Answer: Does the confidence interval for the contrast $\alpha_R - \alpha_V$ exclude zero?

We will find that 90% confidence interval is:

<Insert the 90% confidence interval here!>

Interval includes zero; therefore, we cannot state confidently that R requires more or less storage than V.

Note: Jain contains error on p. 337. He incorrectly lists interval as (-88.7, 111.1).

Details of Confidence Interval Calculation

To compute the confidence interval for $\alpha_R - \alpha_V$, we use a contrast formula. For a contrast $u = \sum_j h_j \alpha_j$ with $\sum_j h_j = 0$, the standard deviation is

$$s_u = s_e \sqrt{\frac{\sum_j h_j^2}{r}}$$

where $h_R = 1$ and $h_V = -1$ and $h_Z = 0$.

From Table 20.3 in Jain:

<Insert Table 20.3 here!>

Thus the standard deviation of $\alpha_R - \alpha_V$ is $s_e \sqrt{2/r}$ (from the table above, not ``56.1'' as Jain writes on p. 337).

Also need mean difference in effects:

$$\alpha_R - \alpha_V = \bar{y}_{.R} - \bar{y}_{.V}$$

Thus 90% confidence interval is

$$(\alpha_R - \alpha_V) \pm t_{[0.95;\; a(r-1)]} \, s_u$$
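The contrast CI can be sketched the same way, assuming the contrast standard deviation $s_u = s_e\sqrt{\sum_j h_j^2 / r}$ and hypothetical values (SSE = 120, made-up effects; $a = 3$, $r = 5$); $t_{[0.95;\,12]} = 1.782$ is from a t table. With these made-up numbers the interval excludes zero, unlike in the real example.

```python
import math

# 90% confidence interval for the contrast alpha_R - alpha_V
# (hypothetical SSE and effects; a = 3, r = 5).
a, r = 3, 5
SSE = 120.0
alpha = {"R": -4.0, "V": 6.0, "Z": -2.0}
h = {"R": 1, "V": -1, "Z": 0}          # contrast coefficients, sum to zero

s_e = math.sqrt(SSE / (a * (r - 1)))   # std. dev. of errors
s_u = s_e * math.sqrt(sum(hj * hj for hj in h.values()) / r)

u = sum(h[j] * alpha[j] for j in alpha)  # the contrast alpha_R - alpha_V
t = 1.782                                # t[0.95; a(r-1) = 12], from a table
ci = (u - t * s_u, u + t * s_u)

includes_zero = ci[0] <= 0 <= ci[1]
print(ci, includes_zero)
```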



cs5014@ei.cs.vt.edu