Panel Data

Table of Contents

The Model

Model Estimation

Extensions

Example: Cost Function


The Model

For each cross section (individual) $i = 1, 2, \ldots, N$ and each time period $t = 1, 2, \ldots, T$,

$$Y_{it} = X_{it}\beta_{it} + \epsilon_{it}$$

Let $\beta_{it} = \beta$ and assume $\epsilon_{it} = u_i + v_t + e_{it}$, where $u_i$ represents the individual (cross-section) difference in intercept and $v_t$ is the time difference in intercept. A two-way analysis includes both time and individual effects. For simplicity, we further assume $v_t = 0$, that is, there is no time effect. Only the one-way individual effects are analyzed in what follows.

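The component $e_{it}$ is a classical error term with zero mean and homogeneous variance, no serial correlation, and no contemporaneous correlation. Also, $e_{it}$ is uncorrelated with the regressors $X_{it}$. That is,

$$E(e_{it}) = 0, \quad E(e_{it}^2) = \sigma_e^2$$
$$E(e_{it}e_{jt}) = 0 \text{ for } i \ne j, \quad E(e_{it}e_{is}) = 0 \text{ for } t \ne s$$
$$E(X_{it}'e_{it}) = 0$$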

Fixed Effects Model

Assume that the error component $u_i$, the individual difference, is fixed or nonstochastic (but varies across individuals). Thus, the model error is simply $\epsilon_{it} = e_{it}$. The model is expressed as:

$$Y_{it} = (X_{it}\beta + u_i) + e_{it}$$

where $u_i$ is interpreted as the individual-specific change in the intercept. The individual effect is therefore defined as $u_i$ plus the intercept.

Random Effects Model

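Assume that the error component $u_i$, the individual difference, is random and satisfies the following assumptions:

$$E(u_i) = 0, \quad E(u_i^2) = \sigma_u^2$$
$$E(u_iu_j) = 0 \text{ for } i \ne j, \quad E(u_ie_{jt}) = 0 \text{ for all } i, j, t$$
$$E(X_{it}'u_i) = 0$$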

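Then the model error is $\epsilon_{it} = u_i + e_{it}$, with the following structure:

$$E(\epsilon_{it}) = 0, \quad E(\epsilon_{it}^2) = \sigma_e^2 + \sigma_u^2$$
$$E(\epsilon_{it}\epsilon_{is}) = \sigma_u^2 \text{ for } t \ne s, \quad E(\epsilon_{it}\epsilon_{js}) = 0 \text{ for } i \ne j$$

In other words, for each cross section $i$, the variance-covariance matrix of the model error $\epsilon_i = [\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{iT}]'$ is the following $T \times T$ matrix: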

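$$\Sigma = \begin{bmatrix}
\sigma_e^2 + \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_u^2 \\
\sigma_u^2 & \sigma_e^2 + \sigma_u^2 & \cdots & \sigma_u^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_u^2 & \sigma_u^2 & \cdots & \sigma_e^2 + \sigma_u^2
\end{bmatrix} = \sigma_e^2 I_T + \sigma_u^2 \iota\iota'$$

where $I_T$ is the $T \times T$ identity matrix and $\iota$ is a $T \times 1$ vector of ones.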

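Let $\epsilon$ be the $NT$-element vector of the stacked errors, $\epsilon = [\epsilon_1', \epsilon_2', \ldots, \epsilon_N']'$. Then $E(\epsilon) = 0$ and $E(\epsilon\epsilon') = I_N \otimes \Sigma$, where $I_N$ is an $N \times N$ identity matrix and $\Sigma$ is the $T \times T$ variance-covariance matrix defined above. With the errors stacked individual by individual, $E(\epsilon\epsilon')$ is block diagonal with one copy of $\Sigma$ per individual.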
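The error structure above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `re_error_covariance` and the numerical values are illustrative assumptions. It builds the $T \times T$ block and then the full $NT \times NT$ covariance matrix, placing one copy of the block per individual on the diagonal, which matches stacking the errors individual by individual.

```python
import numpy as np

def re_error_covariance(sigma2_e, sigma2_u, T, N):
    """Return (Sigma, V) for the one-way random effects error.

    Sigma is the T x T covariance of one individual's errors;
    V is the NT x NT covariance of the errors stacked by individual.
    (Hypothetical helper, written for illustration only.)
    """
    iota = np.ones((T, 1))                      # T x 1 vector of ones
    Sigma = sigma2_e * np.eye(T) + sigma2_u * (iota @ iota.T)
    V = np.kron(np.eye(N), Sigma)               # block diagonal, N copies of Sigma
    return Sigma, V

# Illustrative values (assumed, not taken from any data set).
Sigma, V = re_error_covariance(sigma2_e=1.0, sigma2_u=0.5, T=3, N=2)
print(Sigma)
print(V.shape)
```

Errors of the same individual share the covariance $\sigma_u^2$ across periods, while errors of different individuals fall in the zero off-diagonal blocks of `V`.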


Model Estimation

Fixed Effects Model

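Consider the model:

$$Y_{it} = (X_{it}\beta + u_i) + e_{it} \quad (i = 1, 2, \ldots, N;\ t = 1, 2, \ldots, T)$$

Let $Y_i = [Y_{i1}, Y_{i2}, \ldots, Y_{iT}]'$, $X_i = [X_{i1}, X_{i2}, \ldots, X_{iT}]'$, and $e_i = [e_{i1}, e_{i2}, \ldots, e_{iT}]'$. The pooled (stacked) model is

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} =
\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix} \beta +
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_N \end{bmatrix}$$

or $Y = X\beta + e$, where the individual intercepts $u_i$ are understood to be carried in $X\beta$, for example as the coefficients on a set of individual dummy variables.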
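As an illustration of how the fixed effects model can be estimated, the sketch below compares the two approaches these notes refer to elsewhere, the dummy variable approach and the deviation approach. It is a hedged example on simulated data: the function name `within_estimator`, the array layout (`Y` as N x T, `X` as N x T x K), and all numbers are assumptions made here. The model has no common intercept, so the estimated effects are the individual intercepts.

```python
import numpy as np

def within_estimator(Y, X):
    """Deviation (within) estimator for a balanced panel.

    Y has shape (N, T); X has shape (N, T, K).
    Returns the slope estimates and the individual intercepts.
    """
    N, T, K = X.shape
    Yd = Y - Y.mean(axis=1, keepdims=True)      # deviations from individual means
    Xd = X - X.mean(axis=1, keepdims=True)
    beta, *_ = np.linalg.lstsq(Xd.reshape(N * T, K), Yd.reshape(N * T), rcond=None)
    u = Y.mean(axis=1) - X.mean(axis=1) @ beta  # recover individual intercepts
    return beta, u

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(0)
N, T, K = 5, 8, 2
X = rng.normal(size=(N, T, K))
u_true = rng.normal(size=N)
beta_true = np.array([1.5, -0.7])
Y = X @ beta_true + u_true[:, None] + 0.1 * rng.normal(size=(N, T))

beta_w, u_w = within_estimator(Y, X)

# Dummy variable approach: one dummy column per individual, no common intercept.
D = np.kron(np.eye(N), np.ones((T, 1)))         # NT x N matrix of dummies
Z = np.hstack([X.reshape(N * T, K), D])
coef, *_ = np.linalg.lstsq(Z, Y.reshape(N * T), rcond=None)
beta_lsdv, u_lsdv = coef[:K], coef[K:]

print(beta_w, beta_lsdv)
```

Both routes give the same slope estimates. The deviation approach avoids building the $NT \times N$ dummy matrix, which matters when $N$ is large.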

Random Effects Model

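Recall the pooled model for estimation

$$Y = X\beta + \epsilon$$

where $\epsilon = [\epsilon_1', \epsilon_2', \ldots, \epsilon_N']'$, $\epsilon_i = [\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{iT}]'$, and the random error components are $\epsilon_{it} = u_i + e_{it}$. By assumption, $E(\epsilon) = 0$ and $E(\epsilon\epsilon') = I_N \otimes \Sigma$, the block-diagonal matrix with one copy of the $T \times T$ matrix $\Sigma$ per individual. The generalized least squares (GLS) estimator of $\beta$ is

$$\hat\beta = [X'(I_N \otimes \Sigma^{-1})X]^{-1}X'(I_N \otimes \Sigma^{-1})Y$$

Since $\Sigma^{-1}$ can be derived from the estimated variance components $\sigma_e^2$ and $\sigma_u^2$, in practice the model is estimated using a partial deviation approach, in which a fraction of each individual's time mean is subtracted from every variable before least squares is applied.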
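The following sketch illustrates the partial deviation idea on simulated data. It relies on a standard result that is not spelled out in these notes: with a balanced panel, GLS is equivalent to least squares on variables from which the fraction $\theta = 1 - \sqrt{\sigma_e^2/(\sigma_e^2 + T\sigma_u^2)}$ of the individual mean has been subtracted. The function names, the array layout, and the treatment of the variance components as known are all assumptions made for the example; in practice the variance components would be estimated first.

```python
import numpy as np

def re_gls_partial_deviation(Y, X, sigma2_e, sigma2_u):
    """Random effects GLS via partial deviations (balanced panel).

    Y has shape (N, T); X has shape (N, T, K) and may include a constant.
    The variance components are taken as given (an assumption of this sketch).
    """
    N, T, K = X.shape
    theta = 1.0 - np.sqrt(sigma2_e / (sigma2_e + T * sigma2_u))
    Ys = Y - theta * Y.mean(axis=1, keepdims=True)   # partial deviation of Y
    Xs = X - theta * X.mean(axis=1, keepdims=True)   # partial deviation of X
    beta, *_ = np.linalg.lstsq(Xs.reshape(N * T, K), Ys.reshape(N * T), rcond=None)
    return beta, theta

def re_gls_direct(Y, X, sigma2_e, sigma2_u):
    """Same estimator computed directly from the block-diagonal covariance."""
    N, T, K = X.shape
    Sigma = sigma2_e * np.eye(T) + sigma2_u * np.ones((T, T))
    W = np.kron(np.eye(N), np.linalg.inv(Sigma))     # inverse of the NT x NT covariance
    Xm, Ym = X.reshape(N * T, K), Y.reshape(N * T)
    return np.linalg.solve(Xm.T @ W @ Xm, Xm.T @ W @ Ym)

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(1)
N, T = 6, 5
X = np.concatenate([np.ones((N, T, 1)), rng.normal(size=(N, T, 2))], axis=2)
u = np.sqrt(0.5) * rng.normal(size=(N, 1))
Y = X @ np.array([2.0, 1.0, -1.0]) + u + rng.normal(size=(N, T))

beta_pd, theta = re_gls_partial_deviation(Y, X, sigma2_e=1.0, sigma2_u=0.5)
beta_gls = re_gls_direct(Y, X, sigma2_e=1.0, sigma2_u=0.5)
print(theta, beta_pd, beta_gls)
```

When $\theta = 0$ the transformation leaves the data unchanged (pooled OLS), and as $\theta$ approaches 1 it approaches the full deviation used for fixed effects.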

Hausman's Test for Fixed or Random Effects

Let $b_{fixed}$ be the estimated slope parameters of the fixed effects model (using the dummy variable approach), and $b_{random}$ be the estimated slope parameters of the random effects model. $Var(b_{fixed})$ and $Var(b_{random})$ are the corresponding estimated variance-covariance matrices. Hausman's test for no difference between these two sets of parameters is a chi-square test whose degrees of freedom equal the number of slope parameters. The test statistic is defined as follows:

$$H = (b_{random} - b_{fixed})'[Var(b_{fixed}) - Var(b_{random})]^{-1}(b_{random} - b_{fixed})$$

The variance difference is taken as fixed minus random because the random effects estimator is the more efficient of the two under the null hypothesis, which keeps $H$ nonnegative.
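A small sketch of the computation is given here. The function name `hausman` and the numbers are hypothetical, chosen so the arithmetic is easy to follow. The middle matrix is formed as the fixed effects variance minus the random effects variance, the ordering under which the difference is positive semidefinite when the random effects estimator is efficient; that ordering is an explicit choice of this sketch.

```python
import numpy as np

def hausman(b_fixed, b_random, V_fixed, V_random):
    """Hausman statistic and its degrees of freedom.

    b_fixed, b_random: slope estimates (length-K vectors).
    V_fixed, V_random: their estimated K x K covariance matrices.
    (Hypothetical helper, written for illustration only.)
    """
    d = np.asarray(b_random, dtype=float) - np.asarray(b_fixed, dtype=float)
    V_diff = np.asarray(V_fixed, dtype=float) - np.asarray(V_random, dtype=float)
    H = float(d @ np.linalg.solve(V_diff, d))   # d' V_diff^{-1} d
    return H, d.size

# Made-up estimates (not from any real data set).
H, dof = hausman(
    b_fixed=[1.0, 2.0],
    b_random=[0.9, 2.2],
    V_fixed=np.diag([0.04, 0.09]),
    V_random=np.diag([0.03, 0.05]),
)
print(H, dof)
```

The value of `H` is then compared with a chi-square critical value with `dof` degrees of freedom. In finite samples the estimated variance difference need not be positive definite, in which case the statistic is not reliable.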

Extensions

Unbalanced Panel Data

Panels in which the group sizes (numbers of time periods) differ across groups (individuals) are not unusual in empirical panel data analysis. These panels are called unbalanced panels. The estimation methods for the fixed effects and random effects models discussed above must be modified to reflect the structure of unbalanced panels. Modifying the dummy variable or deviation approach to estimate fixed effects with unbalanced panel data is straightforward. For the random effects model, however, unequal group sizes introduce the problem of groupwise heteroscedasticity.
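To show why the deviation approach carries over easily to unbalanced panels, the sketch below demeans each observation by its own group mean, whatever the group size. It is an illustration under assumptions made here: the data are stored in long format with a group identifier array `ids`, and the function name `within_unbalanced` is hypothetical.

```python
import numpy as np

def within_unbalanced(y, X, ids):
    """Deviation (within) estimator for an unbalanced panel in long format.

    y: (n,) responses; X: (n, K) regressors; ids: (n,) group labels.
    Group sizes may differ across individuals.
    """
    _, g = np.unique(ids, return_inverse=True)   # g maps each row to a group index
    counts = np.bincount(g)                      # observations per group
    y_mean = np.bincount(g, weights=y) / counts
    X_mean = np.column_stack(
        [np.bincount(g, weights=X[:, k]) / counts for k in range(X.shape[1])]
    )
    yd = y - y_mean[g]                           # subtract each row's own group mean
    Xd = X - X_mean[g]
    beta, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    return beta

# Three individuals observed for 3, 2, and 4 periods (made-up data, no noise).
ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
rng = np.random.default_rng(2)
X = rng.normal(size=(9, 2))
effects = np.array([5.0, -3.0, 1.0])
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + effects[ids]

print(within_unbalanced(y, X, ids))
```

Because the simulated data contain no noise, the slopes are recovered up to rounding error, regardless of the unequal group sizes.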

Random Coefficients Model

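For each cross section $i = 1, 2, \ldots, N$, the model is written as:

$$Y_i = X_i\beta_i + e_i$$
$$\beta_i = \beta + u_i$$

where $Y_i = [Y_{i1}, Y_{i2}, \ldots, Y_{iT}]'$, $X_i = [X_{i1}, X_{i2}, \ldots, X_{iT}]'$, and $e_i = [e_{i1}, e_{i2}, \ldots, e_{iT}]'$. Note that not only the intercept but also the slope parameters are random across individuals. The assumptions of the model are:

$$E(e_i) = 0, \quad E(e_ie_i') = \sigma_i^2 I, \quad E(e_ie_j') = 0 \text{ for } i \ne j$$

and

$$E(u_i) = 0, \quad E(u_iu_i') = \Gamma, \quad E(u_iu_j') = 0 \text{ for } i \ne j, \quad E(u_ie_j') = 0$$

The model for estimation is

$$Y_i = X_i\beta + (X_iu_i + e_i)$$

or $Y_i = X_i\beta + w_i$ where $w_i = X_iu_i + e_i$, and

$$E(w_i) = 0, \quad E(w_iw_i') = \Omega_i = \sigma_i^2 I + X_i\Gamma X_i'$$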

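The stacked (pooled) model is

$$Y = X\beta + w$$

where $w = [w_1', \ldots, w_N']'$, and

$$E(w) = 0_{NT \times 1}$$

$$Var(w) = E(ww') = V = \begin{bmatrix}
\Omega_1 & 0 & \cdots & 0 \\
0 & \Omega_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \Omega_N
\end{bmatrix}$$

with $\Omega_i = Var(w_i)$. GLS is used to estimate the model. That is,

$$\beta^* = (X'V^{-1}X)^{-1}X'V^{-1}Y$$
$$Var(\beta^*) = (X'V^{-1}X)^{-1}$$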

The computation is based on the following steps (Swamy, 1971):

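  1. For each regression equation $i$, $Y_i = X_i\beta_i + e_i$, obtain the OLS estimator of $\beta_i$:
    $b_i = (X_i'X_i)^{-1}X_i'Y_i$
    $Var(b_i) = (X_i'X_i)^{-1}(X_i'\Omega_iX_i)(X_i'X_i)^{-1} = \sigma_i^2(X_i'X_i)^{-1} + \Gamma = V_i + \Gamma$
    (taking account of heteroscedasticity, where $V_i = \sigma_i^2(X_i'X_i)^{-1}$).
    Note that $\sigma_i^2$ is estimated by $s_i^2 = \hat{e}_i'\hat{e}_i/(T-K)$, where $\hat{e}_i = Y_i - X_ib_i$ and $K$ is the number of regressors.
    Then $V_i$ is estimated by $s_i^2(X_i'X_i)^{-1}$.

  2. For the random coefficients equation $\beta_i = \beta + u_i$, the variance of $b_i$ (the estimator of $\beta_i$) is estimated by $\sum_{i=1}^{N}(b_i - \bar{b})(b_i - \bar{b})'/(N-1) = \left(\sum_{i=1}^{N}b_ib_i' - N\bar{b}\bar{b}'\right)/(N-1)$, where $\bar{b} = \sum_{i=1}^{N}b_i/N$.
    Therefore, $\Gamma = \left(\sum_{i=1}^{N}b_ib_i' - N\bar{b}\bar{b}'\right)/(N-1) - \sum_{i=1}^{N}V_i/N$.
    To avoid the possibility that this estimate of $\Gamma$ is not positive definite, we use
    $\Gamma = \left(\sum_{i=1}^{N}b_ib_i' - N\bar{b}\bar{b}'\right)/(N-1)$.

  3. Write the GLS estimator of $\beta$ as:
    $\beta^* = (X'V^{-1}X)^{-1}X'V^{-1}Y$
    $= \left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}X_i\right]^{-1}\left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}Y_i\right]$
    $= \left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}X_i\right]^{-1}\left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}X_ib_i\right]$
    $= \left[\sum_{i=1}^{N}(\Gamma+V_i)^{-1}\right]^{-1}\left[\sum_{i=1}^{N}(\Gamma+V_i)^{-1}b_i\right]$
    $= \sum_{i=1}^{N}W_ib_i$, where $W_i = \left[\sum_{j=1}^{N}(\Gamma+V_j)^{-1}\right]^{-1}(\Gamma+V_i)^{-1}$.
    Similarly,
    $Var(\beta^*) = (X'V^{-1}X)^{-1} = \left[\sum_{i=1}^{N}(\Gamma+V_i)^{-1}\right]^{-1}$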
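The three steps above can be sketched as follows. This is a hedged illustration on simulated data rather than the original course code: the function name `swamy_rcm`, the array layout (`Y` as N x T, `X` as N x T x K), and the rule of falling back to the unadjusted dispersion matrix only when the adjusted estimate of $\Gamma$ is not positive definite are all assumptions of the sketch.

```python
import numpy as np

def swamy_rcm(Y, X):
    """Random coefficients estimator following the three steps in the text.

    Y has shape (N, T); X has shape (N, T, K).
    Returns (beta, var_beta, Gamma, W) with W[i] the weight on b_i.
    """
    N, T, K = X.shape
    b = np.empty((N, K))
    V = np.empty((N, K, K))
    # Step 1: equation-by-equation OLS and V_i = s_i^2 (X_i'X_i)^{-1}.
    for i in range(N):
        XtX_inv = np.linalg.inv(X[i].T @ X[i])
        b[i] = XtX_inv @ X[i].T @ Y[i]
        resid = Y[i] - X[i] @ b[i]
        V[i] = (resid @ resid / (T - K)) * XtX_inv
    # Step 2: dispersion of the b_i around their mean, adjusted by the mean V_i.
    dev = b - b.mean(axis=0)
    S_b = dev.T @ dev / (N - 1)
    Gamma = S_b - V.mean(axis=0)
    if np.any(np.linalg.eigvalsh(Gamma) <= 0):
        Gamma = S_b          # fallback chosen for this sketch when not positive definite
    # Step 3: matrix-weighted average of the b_i.
    P = np.linalg.inv(Gamma + V)                 # (Gamma + V_i)^{-1} for every i
    var_beta = np.linalg.inv(P.sum(axis=0))
    W = var_beta @ P                             # weights W_i, shape (N, K, K)
    beta = np.einsum("ikl,il->k", W, b)          # sum over i of W_i b_i
    return beta, var_beta, Gamma, W

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(3)
N, T, K = 30, 20, 2
X = np.concatenate([np.ones((N, T, 1)), rng.normal(size=(N, T, 1))], axis=2)
beta_true = np.array([1.0, 2.0])
beta_i = beta_true + 0.3 * rng.normal(size=(N, K))
Y = np.einsum("itk,ik->it", X, beta_i) + 0.5 * rng.normal(size=(N, T))

beta, var_beta, Gamma, W = swamy_rcm(Y, X)
print(beta)
```

The weights `W[i]` sum to the identity matrix, so the estimate is a matrix-weighted average of the individual OLS estimates, with more weight on individuals whose $\Gamma + V_i$ is smaller.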

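The individual parameter vectors may be predicted as follows:

$$\beta_i^* = (\Gamma^{-1} + V_i^{-1})^{-1}[\Gamma^{-1}\beta^* + V_i^{-1}b_i] = A_i\beta^* + (I - A_i)b_i$$

where $A_i = (\Gamma^{-1} + V_i^{-1})^{-1}\Gamma^{-1}$.

$$Var(\beta_i^*) = \begin{bmatrix} A_i & I - A_i \end{bmatrix}
\begin{bmatrix}
\sum_{j=1}^{N}W_j(\Gamma+V_j)W_j' & W_i(\Gamma+V_i) \\
(\Gamma+V_i)W_i' & \Gamma+V_i
\end{bmatrix}
\begin{bmatrix} A_i' \\ (I - A_i)' \end{bmatrix}$$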

Seemingly Unrelated System Model

Consider a more general specification of the model:

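$$Y_{it} = X_{it}\beta_i + \epsilon_{it} \quad (i = 1, 2, \ldots, N;\ t = 1, 2, \ldots, T)$$

Let $Y_i = [Y_{i1}, Y_{i2}, \ldots, Y_{iT}]'$, $X_i = [X_{i1}, X_{i2}, \ldots, X_{iT}]'$, and $\epsilon_i = [\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{iT}]'$. The stacked system of $N$ equations ($T$ observations each) is $Y = X\beta + \epsilon$, or

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} =
\begin{bmatrix}
X_1 & 0 & \cdots & 0 \\
0 & X_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & X_N
\end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{bmatrix} +
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_N \end{bmatrix}$$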

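Notice that not only the intercepts but also the slope parameters differ across individuals. The error structure of the model is summarized as follows:

$$E(\epsilon) = 0, \quad E(\epsilon\epsilon') = \Sigma \otimes I$$

where $\Sigma = [\sigma_{ij}]$ is the $N \times N$ matrix of contemporaneous covariances, $E(\epsilon_{it}\epsilon_{jt}) = \sigma_{ij}$, and $I$ is the $T \times T$ identity matrix.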

The model is estimated using techniques for systems of regression equations, such as 3SLS and FIML. In the current context this is called Seemingly Unrelated Regression Estimation (SURE). Denote by $\hat\beta$ and $\hat\Sigma$ the estimates of $\beta$ and $\Sigma$, respectively. Then,

$$\hat\beta = [X'(\hat\Sigma^{-1} \otimes I)X]^{-1}X'(\hat\Sigma^{-1} \otimes I)Y$$
$$Var(\hat\beta) = [X'(\hat\Sigma^{-1} \otimes I)X]^{-1}$$

and the elements of $\hat\Sigma$ are $\hat\sigma_{ij} = \hat\epsilon_i'\hat\epsilon_j/T$, where $\hat\epsilon_i = Y_i - X_i\hat\beta_i$ is the estimated error of equation $i$.
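The estimator above can be illustrated with a two-step feasible GLS sketch. This is a simplification chosen for the example, not the 3SLS or FIML routines mentioned in the text: the covariance matrix is estimated once from equation-by-equation OLS residuals and is not iterated. The function name `sur_fgls` and the simulated data are assumptions.

```python
import numpy as np

def sur_fgls(Y_list, X_list):
    """Two-step feasible GLS for a system of N equations with T observations each.

    Y_list: list of (T,) arrays; X_list: list of (T, K_i) arrays.
    Returns (beta, var_beta, Sigma_hat), with beta stacked equation by equation.
    """
    N, T = len(Y_list), Y_list[0].shape[0]
    # Step 1: OLS residuals per equation, collected as columns of a T x N matrix.
    E = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        for y, X in zip(Y_list, X_list)
    ])
    Sigma_hat = E.T @ E / T                      # sigma_ij = e_i'e_j / T
    # Step 2: GLS on the stacked system with a block-diagonal regressor matrix.
    Ks = [X.shape[1] for X in X_list]
    Xb = np.zeros((N * T, sum(Ks)))
    col = 0
    for i, X in enumerate(X_list):
        Xb[i * T:(i + 1) * T, col:col + Ks[i]] = X
        col += Ks[i]
    Yb = np.concatenate(Y_list)
    W = np.kron(np.linalg.inv(Sigma_hat), np.eye(T))
    var_beta = np.linalg.inv(Xb.T @ W @ Xb)
    beta = var_beta @ (Xb.T @ W @ Yb)
    return beta, var_beta, Sigma_hat

# Two equations with correlated errors (made-up data, for illustration only).
rng = np.random.default_rng(4)
T = 60
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
errors = rng.normal(size=(T, 2)) @ np.array([[1.0, 0.0], [0.8, 0.6]]).T
Y1 = X1 @ np.array([1.0, 2.0]) + errors[:, 0]
Y2 = X2 @ np.array([-1.0, 0.5]) + errors[:, 1]

beta, var_beta, Sigma_hat = sur_fgls([Y1, Y2], [X1, X2])
print(beta)
print(Sigma_hat)
```

When every equation has the same regressors, this estimator reduces to equation-by-equation OLS, so the gain from system estimation comes from regressors that differ across equations together with correlated errors.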

More Examples: Cost of Production for Airline Services

Lesson 16.2
Lesson 16.2a
Lesson 16.2b


Copyright © Kuan-Pin Lin
Last updated: May 26, 2005