Sum of squares decomposition

Ashley Denies
Dec 8, 2020

Consider the orthogonal decomposition

$$y^\top y = y^\top M_X y + y^\top H_X y,$$

along with the model

$$y = \beta_0 1_n + X_1 \beta_1 + X_2 \beta_2 + \varepsilon.$$
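As a quick numerical illustration (a minimal sketch in NumPy with simulated data; the sample size and design below are arbitrary choices, not values from the post), the hat matrix $H_X$ and the annihilator $M_X = I - H_X$ split $y^\top y$ exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # arbitrary full-rank design
y = rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)   # projection (hat) matrix H_X
M = np.eye(n) - H                       # annihilator M_X = I - H_X

# y'y = y'M_X y + y'H_X y, up to floating-point error
print(np.allclose(y @ y, y @ M @ y + y @ H @ y))  # True
```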

We consider four candidate models, for $X_1$ an $n \times p_1$ matrix, $X_2$ an $n \times p_2$ matrix, and $X_a = (1_n, X_1, X_2)$ an $n \times p = n \times (1 + p_1 + p_2)$ full-rank matrix:

  1. the full model with both predictors, $M_a: y = \beta_0 1_n + X_1 \beta_1 + X_2 \beta_2 + \varepsilon$;
  2. the restricted model $M_b$ with $\beta_2 = 0$ and only the first predictor, of the form $y = \beta_0 1_n + X_1 \beta_1 + \varepsilon$;
  3. the restricted model $M_c$ with $\beta_1 = 0$ and only the second predictor, of the form $y = \beta_0 1_n + X_2 \beta_2 + \varepsilon$;
  4. the intercept-only model $M_d: y = \beta_0 1_n + \varepsilon$.

Let $X_a$, $X_b$, $X_c$, and $X_d$ be the corresponding design matrices.
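To make the four models concrete, here is one way to build the design matrices (a sketch continuing in NumPy; the dimensions $p_1 = 2$, $p_2 = 3$ and the simulated coefficients are placeholders, not values from the post):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p1, p2 = 50, 2, 3
one = np.ones((n, 1))
X1 = rng.normal(size=(n, p1))
X2 = rng.normal(size=(n, p2))

Xa = np.hstack([one, X1, X2])  # full model M_a
Xb = np.hstack([one, X1])      # M_b: beta_2 = 0
Xc = np.hstack([one, X2])      # M_c: beta_1 = 0
Xd = one                       # M_d: intercept only

# simulate a response from the full model
beta = rng.normal(size=1 + p1 + p2)
y = Xa @ beta + rng.normal(size=n)
```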

R uses an orthogonal decomposition of $H_{X_a}$, the projection matrix onto the span of $X_a$, into two parts:

$$H_{X_a} = H_{X_b} + H_{M_{X_b} X_2}.$$

The last term is the contribution of $X_2$ to the model fit when $1_n, X_1$ are already part of the model. We can form the sum of squares of the regression using this decomposition. We use the notation $\mathrm{SSR}(H) = y^\top H y$ to denote the sum of squares obtained by projecting $y$ onto the span of $H$.
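Continuing the sketch above, the projection-matrix decomposition can be checked directly; `hat` and `SSR` below are small helper functions introduced here for illustration, not R or NumPy functions:

```python
def hat(X):
    """Projection matrix onto the column span of X."""
    return X @ np.linalg.solve(X.T @ X, X.T)

def SSR(H, y):
    """Sum of squares from projecting y onto the span of H."""
    return float(y @ H @ y)

H_Xa = hat(Xa)
H_Xb = hat(Xb)
M_Xb = np.eye(n) - H_Xb

# project X2 off the columns already in the model, then project onto what remains
H_extra = hat(M_Xb @ X2)

print(np.allclose(H_Xa, H_Xb + H_extra))  # True
```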

We have

$$\mathrm{SSR}(H_{M_{X_b} X_2}) = \mathrm{SSR}(H_{X_a}) - \mathrm{SSR}(H_{X_b}),$$

that is, the difference between the regression sum of squares from model $M_a$ and that from model $M_b$. This is the part of the regression sum of squares due to the addition of $X_2$ to a model that already contains $(1_n, X_1)$ as regressors.
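In the same running example, the identity can be verified numerically:

```python
ssr_extra = SSR(H_extra, y)
print(np.isclose(ssr_extra, SSR(H_Xa, y) - SSR(H_Xb, y)))  # True
```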

The usual $F$-test statistic for the null hypothesis $H_0: \beta_2 = 0$ can be written as

$$F = \frac{\mathrm{SSR}(H_{M_{X_b} X_2})/p_2}{\mathrm{RSS}_a/(n-p)} = \frac{(\mathrm{RSS}_b - \mathrm{RSS}_a)/p_2}{\mathrm{RSS}_a/(n-p)} \sim F(p_2,\, n-p).$$

The last equality follows by noting that $\mathrm{SSR}(H_{X_b}) + \mathrm{RSS}_b = y^\top y = \mathrm{SSR}(H_{X_a}) + \mathrm{RSS}_a$.
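Still in the running sketch, the $F$ statistic can be computed either from the extra sum of squares or from the difference in residual sums of squares; the two agree, and the p-value uses the upper tail of the $F(p_2, n-p)$ distribution via SciPy. With real data this is the same quantity reported by R's sequential ANOVA table for adding $X_2$ after $1_n, X_1$, though the numbers here come from simulated data:

```python
from scipy import stats

p = 1 + p1 + p2
RSS_a = float(y @ (np.eye(n) - H_Xa) @ y)
RSS_b = float(y @ (np.eye(n) - H_Xb) @ y)

F_from_ssr = (SSR(H_extra, y) / p2) / (RSS_a / (n - p))
F_from_rss = ((RSS_b - RSS_a) / p2) / (RSS_a / (n - p))
print(np.isclose(F_from_ssr, F_from_rss))  # True

p_value = stats.f.sf(F_from_rss, p2, n - p)  # P(F(p2, n-p) > observed F)
print(F_from_rss, p_value)
```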

This completes the proof of the sum of squares decomposition.
