2. The Simple Regression Model#
from wooldridge import dataWoo
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
import numpy as np
dataWoo()
J.M. Wooldridge (2016) Introductory Econometrics: A Modern Approach,
Cengage Learning, 6th edition.
401k 401ksubs admnrev affairs airfare
alcohol apple approval athlet1 athlet2
attend audit barium beauty benefits
beveridge big9salary bwght bwght2 campus
card catholic cement census2000 ceosal1
ceosal2 charity consump corn countymurders
cps78_85 cps91 crime1 crime2 crime3
crime4 discrim driving earns econmath
elem94_95 engin expendshares ezanders ezunem
fair fertil1 fertil2 fertil3 fish
fringe gpa1 gpa2 gpa3 happiness
hprice1 hprice2 hprice3 hseinv htv
infmrt injury intdef intqrt inven
jtrain jtrain2 jtrain3 kielmc lawsch85
loanapp lowbrth mathpnl meap00_01 meap01
meap93 meapsingle minwage mlb1 mroz
murder nbasal nyse okun openness
pension phillips pntsprd prison prminwge
rdchem rdtelec recid rental return
saving sleep75 slp75_81 smoke traffic1
traffic2 twoyear volat vote1 vote2
voucher wage1 wage2 wagepan wageprc
wine
Example 2.3 CEO Salary & Return on Equity#
df = dataWoo('ceosal1')
dataWoo('ceosal1', description=True)
name of dataset: ceosal1
no of variables: 12
no of observations: 209
+----------+-------------------------------+
| variable | label |
+----------+-------------------------------+
| salary | 1990 salary, thousands $ |
| pcsalary | % change salary, 89-90 |
| sales | 1990 firm sales, millions $ |
| roe | return on equity, 88-90 avg |
| pcroe | % change roe, 88-90 |
| ros | return on firm's stock, 88-90 |
| indus | =1 if industrial firm |
| finance | =1 if financial firm |
| consprod | =1 if consumer product firm |
| utility | =1 if transport. or utilties |
| lsalary | natural log of salary |
| lsales | natural log of sales |
+----------+-------------------------------+
I took a random sample of data reported in the May 6, 1991 issue of
Businessweek.
df.head()
|   | salary | pcsalary | sales | roe | pcroe | ros | indus | finance | consprod | utility | lsalary | lsales |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1095 | 20 | 27595.000000 | 14.1 | 106.400002 | 191 | 1 | 0 | 0 | 0 | 6.998509 | 10.225389 |
| 1 | 1001 | 32 | 9958.000000 | 10.9 | -30.600000 | 13 | 1 | 0 | 0 | 0 | 6.908755 | 9.206132 |
| 2 | 1122 | 9 | 6125.899902 | 23.5 | -16.299999 | 14 | 1 | 0 | 0 | 0 | 7.022868 | 8.720281 |
| 3 | 578 | -9 | 16246.000000 | 5.9 | -25.700001 | -21 | 1 | 0 | 0 | 0 | 6.359574 | 9.695602 |
| 4 | 1368 | 7 | 21783.199219 | 13.8 | -3.000000 | 56 | 1 | 0 | 0 | 0 | 7.221105 | 9.988894 |
model = smf.ols(formula='salary ~ 1 + roe', data=df).fit()
print(model.summary())
OLS Regression Results
==============================================================================
Dep. Variable: salary R-squared: 0.013
Model: OLS Adj. R-squared: 0.008
Method: Least Squares F-statistic: 2.767
Date: Tue, 09 Jul 2024 Prob (F-statistic): 0.0978
Time: 21:56:56 Log-Likelihood: -1804.5
No. Observations: 209 AIC: 3613.
Df Residuals: 207 BIC: 3620.
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 963.1913 213.240 4.517 0.000 542.790 1383.592
roe 18.5012 11.123 1.663 0.098 -3.428 40.431
==============================================================================
Omnibus: 311.096 Durbin-Watson: 2.105
Prob(Omnibus): 0.000 Jarque-Bera (JB): 31120.902
Skew: 6.915 Prob(JB): 0.00
Kurtosis: 61.158 Cond. No. 43.3
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
model.params
Intercept 963.191336
roe 18.501186
dtype: float64
model.nobs
209.0
If the return on equity increases by one percentage point, salary is predicted to increase by about 18.5, that is, $18,501, since salary is measured in thousands of dollars.
predicted_salary = model.predict(pd.DataFrame({'roe': [30]}))
predicted_salary
0 1518.226927
dtype: float64
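The coefficients statsmodels reports can be reproduced from the textbook moment formulas, $\hat{\beta}_1 = \widehat{\mathrm{Cov}}(x, y)/\widehat{\mathrm{Var}}(x)$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$. A minimal sketch on synthetic data (the numbers below are illustrative, not the ceosal1 sample):

```python
import numpy as np

# Reproduce simple-regression OLS from the moment formulas:
#   b1 = sample cov(x, y) / sample var(x),  b0 = ybar - b1 * xbar
# Synthetic data loosely shaped like the CEO example -- illustrative only.
rng = np.random.default_rng(0)
x = rng.normal(10.0, 3.0, 200)
y = 963.0 + 18.5 * x + rng.normal(0.0, 50.0, 200)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()

# Cross-check against numpy's own least-squares fit
b1_ref, b0_ref = np.polyfit(x, y, 1)
assert np.isclose(b0, b0_ref) and np.isclose(b1, b1_ref)
```

The `ddof=1` arguments make both the covariance and the variance sample (n − 1) versions; the (n − 1) factors cancel in the ratio, so the slope is the same either way.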
Example 2.4 Wage Equation#
df2 = dataWoo('wage1')
dataWoo('wage1', description=True)
name of dataset: wage1
no of variables: 24
no of observations: 526
+----------+---------------------------------+
| variable | label |
+----------+---------------------------------+
| wage | average hourly earnings |
| educ | years of education |
| exper | years potential experience |
| tenure | years with current employer |
| nonwhite | =1 if nonwhite |
| female | =1 if female |
| married | =1 if married |
| numdep | number of dependents |
| smsa | =1 if live in SMSA |
| northcen | =1 if live in north central U.S |
| south | =1 if live in southern region |
| west | =1 if live in western region |
| construc | =1 if work in construc. indus. |
| ndurman | =1 if in nondur. manuf. indus. |
| trcommpu | =1 if in trans, commun, pub ut |
| trade | =1 if in wholesale or retail |
| services | =1 if in services indus. |
| profserv | =1 if in prof. serv. indus. |
| profocc | =1 if in profess. occupation |
| clerocc | =1 if in clerical occupation |
| servocc | =1 if in service occupation |
| lwage | log(wage) |
| expersq | exper^2 |
| tenursq | tenure^2 |
+----------+---------------------------------+
These are data from the 1976 Current Population Survey, collected by
Henry Farber when he and I were colleagues at MIT in 1988.
df2.head()
|   | wage | educ | exper | tenure | nonwhite | female | married | numdep | smsa | northcen | ... | trcommpu | trade | services | profserv | profocc | clerocc | servocc | lwage | expersq | tenursq |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3.10 | 11 | 2 | 0 | 0 | 1 | 0 | 2 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.131402 | 4 | 0 |
| 1 | 3.24 | 12 | 22 | 2 | 0 | 1 | 1 | 3 | 1 | 0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1.175573 | 484 | 4 |
| 2 | 3.00 | 11 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1.098612 | 4 | 0 |
| 3 | 6.00 | 8 | 44 | 28 | 0 | 0 | 1 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1.791759 | 1936 | 784 |
| 4 | 5.30 | 12 | 7 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.667707 | 49 | 4 |
5 rows × 24 columns
model2 = smf.ols(formula='wage ~ educ', data=df2).fit()
print(model2.summary())
OLS Regression Results
==============================================================================
Dep. Variable: wage R-squared: 0.165
Model: OLS Adj. R-squared: 0.163
Method: Least Squares F-statistic: 103.4
Date: Tue, 09 Jul 2024 Prob (F-statistic): 2.78e-22
Time: 21:56:56 Log-Likelihood: -1385.7
No. Observations: 526 AIC: 2775.
Df Residuals: 524 BIC: 2784.
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept -0.9049 0.685 -1.321 0.187 -2.250 0.441
educ 0.5414 0.053 10.167 0.000 0.437 0.646
==============================================================================
Omnibus: 212.554 Durbin-Watson: 1.824
Prob(Omnibus): 0.000 Jarque-Bera (JB): 807.843
Skew: 1.861 Prob(JB): 3.79e-176
Kurtosis: 7.797 Cond. No. 60.2
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
The intercept of −0.90 literally means that a person with no education has a predicted hourly wage of −90¢ an hour, a reminder that the fitted line need not make sense far outside the observed data.
Each additional year of education is predicted to increase the hourly wage by about 54¢.
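Plugging a few education levels into the fitted line makes both points concrete (coefficients copied from the summary output above, so predictions are approximate):

```python
# Fitted wages at several education levels, using the reported coefficients
b0, b1 = -0.9049, 0.5414   # intercept and educ slope from the summary above
for educ in (0, 8, 12, 16):
    print(f"educ={educ:2d}  predicted wage={b0 + b1 * educ:6.2f}")
```

Only the educ = 0 prediction is negative; for the range of schooling actually observed in the sample, the fitted wages are sensible.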
Example 2.6 Table 2.2#
df['salary_hat'] = model.fittedvalues
df['uhat'] = model.resid
df[['roe','salary','salary_hat','uhat']].head(16)
|   | roe | salary | salary_hat | uhat |
|---|---|---|---|---|
| 0 | 14.100000 | 1095 | 1224.058071 | -129.058071 |
| 1 | 10.900000 | 1001 | 1164.854261 | -163.854261 |
| 2 | 23.500000 | 1122 | 1397.969216 | -275.969216 |
| 3 | 5.900000 | 578 | 1072.348338 | -494.348338 |
| 4 | 13.800000 | 1368 | 1218.507712 | 149.492288 |
| 5 | 20.000000 | 1145 | 1333.215063 | -188.215063 |
| 6 | 16.400000 | 1078 | 1266.610785 | -188.610785 |
| 7 | 16.299999 | 1094 | 1264.760660 | -170.760660 |
| 8 | 10.500000 | 1237 | 1157.453793 | 79.546207 |
| 9 | 26.299999 | 833 | 1449.772523 | -616.772523 |
| 10 | 25.900000 | 567 | 1442.372056 | -875.372056 |
| 11 | 26.799999 | 933 | 1459.023116 | -526.023116 |
| 12 | 14.800000 | 1339 | 1237.008898 | 101.991102 |
| 13 | 22.299999 | 937 | 1375.767778 | -438.767778 |
| 14 | 56.299999 | 2011 | 2004.808114 | 6.191886 |
| 15 | 12.600000 | 1585 | 1196.306291 | 388.693709 |
The first four CEOs have lower salaries than what we predicted from the OLS regression line (2.26); in other words, given only the firm’s roe, these CEOs make less than what we predicted. As can be seen from the positive uhat, the fifth CEO makes more than predicted from the OLS regression line.
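Whatever the sign pattern in any particular sample, OLS residuals obey two exact algebraic properties: with an intercept in the model they sum to zero, and they are uncorrelated with the regressor. A quick check on synthetic data (illustrative values, not the ceosal1 sample):

```python
import numpy as np

# Two algebraic properties of OLS residuals (exact in every sample):
# (1) residuals sum to zero when an intercept is included;
# (2) residuals are uncorrelated with the regressor.
rng = np.random.default_rng(2)
x = rng.normal(5.0, 2.0, 209)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, 209)

b1, b0 = np.polyfit(x, y, 1)    # slope first, then intercept
uhat = y - (b0 + b1 * x)        # residuals

assert np.isclose(uhat.sum(), 0.0, atol=1e-8)
assert np.isclose((x * uhat).sum(), 0.0, atol=1e-8)
```

These are consequences of the first-order conditions of least squares, so they hold by construction, not because the model is "right".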
Example 2.7 Wage & education#
df2['wage'].mean()
np.float64(5.896102674787035)
df2['educ'].mean()
np.float64(12.562737642585551)
print(model2.summary())
OLS Regression Results
==============================================================================
Dep. Variable: wage R-squared: 0.165
Model: OLS Adj. R-squared: 0.163
Method: Least Squares F-statistic: 103.4
Date: Tue, 09 Jul 2024 Prob (F-statistic): 2.78e-22
Time: 21:56:56 Log-Likelihood: -1385.7
No. Observations: 526 AIC: 2775.
Df Residuals: 524 BIC: 2784.
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept -0.9049 0.685 -1.321 0.187 -2.250 0.441
educ 0.5414 0.053 10.167 0.000 0.437 0.646
==============================================================================
Omnibus: 212.554 Durbin-Watson: 1.824
Prob(Omnibus): 0.000 Jarque-Bera (JB): 807.843
Skew: 1.861 Prob(JB): 3.79e-176
Kurtosis: 7.797 Cond. No. 60.2
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
model2.predict(pd.DataFrame({'educ': [12.56]}))
0 5.894621
dtype: float64
$\bar{x}$ and $\bar{y}$ fall on the OLS regression line
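This is exact by construction of the OLS intercept, $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$. A short numerical check on synthetic data (the parameter values are only loosely inspired by the wage example):

```python
import numpy as np

# The OLS line passes exactly through the point of means (xbar, ybar),
# because b0 = ybar - b1 * xbar by construction.
rng = np.random.default_rng(3)
x = rng.normal(12.6, 2.8, 526)                    # educ-like regressor
y = -0.9 + 0.54 * x + rng.normal(0.0, 3.0, 526)   # wage-like response

b1, b0 = np.polyfit(x, y, 1)
assert np.isclose(b0 + b1 * x.mean(), y.mean())
```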
Example 2.8 CEO Salary - R-squared#
print(model.summary())
OLS Regression Results
==============================================================================
Dep. Variable: salary R-squared: 0.013
Model: OLS Adj. R-squared: 0.008
Method: Least Squares F-statistic: 2.767
Date: Tue, 09 Jul 2024 Prob (F-statistic): 0.0978
Time: 21:56:56 Log-Likelihood: -1804.5
No. Observations: 209 AIC: 3613.
Df Residuals: 207 BIC: 3620.
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 963.1913 213.240 4.517 0.000 542.790 1383.592
roe 18.5012 11.123 1.663 0.098 -3.428 40.431
==============================================================================
Omnibus: 311.096 Durbin-Watson: 2.105
Prob(Omnibus): 0.000 Jarque-Bera (JB): 31120.902
Skew: 6.915 Prob(JB): 0.00
Kurtosis: 61.158 Cond. No. 43.3
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
model.rsquared
np.float64(0.01318862408103405)
The firm’s return on equity explains only about 1.3% of the variation in salaries for this sample of 209 CEOs.
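`model.rsquared` comes from the sum-of-squares decomposition SST = SSE + SSR, so R-squared can be computed either as the explained share SSE/SST or as 1 − SSR/SST. A sketch verifying the identity on synthetic data (illustrative values only):

```python
import numpy as np

# R-squared two equivalent ways: explained share SSE/SST and 1 - SSR/SST.
rng = np.random.default_rng(4)
x = rng.normal(17.0, 8.0, 209)
y = 1.0 + 0.02 * x + rng.normal(0.0, 1.0, 209)

b1, b0 = np.polyfit(x, y, 1)
yhat = b0 + b1 * x
sst = ((y - y.mean()) ** 2).sum()     # total sum of squares
sse = ((yhat - y.mean()) ** 2).sum()  # explained sum of squares
ssr = ((y - yhat) ** 2).sum()         # residual sum of squares

assert np.isclose(sst, sse + ssr)
assert np.isclose(sse / sst, 1 - ssr / sst)
```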
Example 2.9 Voting Outcome - R-squared#
model3 = smf.ols(formula='voteA ~ shareA', data=dataWoo('vote1')).fit()  # model3 was not defined earlier; fit it from the vote1 data
print(model3.summary())
OLS Regression Results
==============================================================================
Dep. Variable: voteA R-squared: 0.856
Model: OLS Adj. R-squared: 0.855
Method: Least Squares F-statistic: 1018.
Date: Tue, 09 Jul 2024 Prob (F-statistic): 6.63e-74
Time: 21:56:56 Log-Likelihood: -565.20
No. Observations: 173 AIC: 1134.
Df Residuals: 171 BIC: 1141.
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 26.8122 0.887 30.221 0.000 25.061 28.564
shareA 0.4638 0.015 31.901 0.000 0.435 0.493
==============================================================================
Omnibus: 20.747 Durbin-Watson: 1.826
Prob(Omnibus): 0.000 Jarque-Bera (JB): 44.613
Skew: 0.525 Prob(JB): 2.05e-10
Kurtosis: 5.255 Cond. No. 112.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
model3.rsquared
np.float64(0.8561408655827665)
The share of campaign expenditures explains over 85% of the variation in the election outcomes for this sample.
Exercises#
C1#
The data in 401K are a subset of data analyzed by Papke (1995) to study the relationship between participation in a 401(k) pension plan and the generosity of the plan. The variable prate is the percentage of eligible workers with an active account; this is the variable we would like to explain. The measure of generosity is the plan match rate, mrate. This variable gives the average amount the firm contributes to each worker's plan for each $1 contribution by the worker. For example, if mrate = 0.50, then a $1 contribution by the worker is matched by a 50¢ contribution by the firm.
df = dataWoo('401K')
df.head()
|   | prate | mrate | totpart | totelg | age | totemp | sole | ltotemp |
|---|---|---|---|---|---|---|---|---|
| 0 | 26.100000 | 0.21 | 1653.0 | 6322.0 | 8 | 8709.0 | 0 | 9.072112 |
| 1 | 100.000000 | 1.42 | 262.0 | 262.0 | 6 | 315.0 | 1 | 5.752573 |
| 2 | 97.599998 | 0.91 | 166.0 | 170.0 | 10 | 275.0 | 1 | 5.616771 |
| 3 | 100.000000 | 0.42 | 257.0 | 257.0 | 7 | 500.0 | 0 | 6.214608 |
| 4 | 82.500000 | 0.53 | 591.0 | 716.0 | 28 | 933.0 | 1 | 6.838405 |
dataWoo('401K', description=True)
name of dataset: 401k
no of variables: 8
no of observations: 1534
+----------+---------------------------------+
| variable | label |
+----------+---------------------------------+
| prate | participation rate, percent |
| mrate | 401k plan match rate |
| totpart | total 401k participants |
| totelg | total eligible for 401k plan |
| age | age of 401k plan |
| totemp | total number of firm employees |
| sole | = 1 if 401k is firm's sole plan |
| ltotemp | log of totemp |
+----------+---------------------------------+
L.E. Papke (1995), “Participation in and Contributions to 401(k)
Pension Plans:Evidence from Plan Data,” Journal of Human Resources 30,
311-325. Professor Papke kindly provided these data. She gathered them
from the Internal Revenue Service’s Form 5500 tapes.
(i) Find the average participation rate and the average match rate in the sample of plans.
print("Average participation rate:", round(df['prate'].mean(), 2))
print("Average match rate:", round(df['mrate'].mean(), 2))
Average participation rate: 87.36
Average match rate: 0.73
(ii) Now, estimate the simple regression equation $\widehat{prate} = \hat{\beta}_0 + \hat{\beta}_1 mrate$
and report the results along with the sample size and R-squared
prate_hat = smf.ols("prate ~ 1 + mrate", data=df).fit()
print("results:", prate_hat.params)
print("R squared:", round(prate_hat.rsquared, 3))
print("Sample size:", prate_hat.nobs)
results: Intercept 83.075455
mrate 5.861079
dtype: float64
R squared: 0.075
Sample size: 1534.0
(iii) Interpret the intercept in your equation. Interpret the coefficient on mrate.
print('intercept:', round(prate_hat.params.iloc[0], 2))
intercept: 83.08
Even with a zero match rate, the predicted participation rate is about 83.1%. The coefficient on mrate implies that a one-dollar increase in the match rate is associated with a predicted participation rate about 5.86 percentage points higher.
(iv) Find the predicted prate when mrate = 3.5. Is this a reasonable prediction? Explain what is happening here.
round(prate_hat.predict({'mrate': 3.5}), 2)
0 103.59
dtype: float64
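The prediction is not reasonable: prate is a percentage, yet the fitted line is unbounded, so a large enough mrate pushes the prediction past the logical maximum of 100. The arithmetic, using the coefficients reported above:

```python
# The fitted line has no upper bound, so extrapolating to mrate = 3.5
# pushes the predicted participation rate past 100 percent.
b0, b1 = 83.075455, 5.861079   # intercept and mrate slope reported above
pred = b0 + b1 * 3.5
print(round(pred, 2))  # 103.59, an impossible participation rate
```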
(v) How much of the variation in prate is explained by mrate? Is this a lot in your opinion?
print("Percentage explained:", round(prate_hat.rsquared * 100, 1))
Percentage explained: 7.5
C2#
The data set in CEOSAL2 contains information on chief executive officers for U.S. corporations. The variable salary is annual compensation, in thousands of dollars, and ceoten is prior number of years as company CEO.
df2 = dataWoo("CEOSAL2")
df2.head()
|   | salary | age | college | grad | comten | ceoten | sales | profits | mktval | lsalary | lsales | lmktval | comtensq | ceotensq | profmarg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1161 | 49 | 1 | 1 | 9 | 2 | 6200.0 | 966 | 23200.0 | 7.057037 | 8.732305 | 10.051908 | 81 | 4 | 15.580646 |
| 1 | 600 | 43 | 1 | 1 | 10 | 10 | 283.0 | 48 | 1100.0 | 6.396930 | 5.645447 | 7.003066 | 100 | 100 | 16.961130 |
| 2 | 379 | 51 | 1 | 1 | 9 | 3 | 169.0 | 40 | 1100.0 | 5.937536 | 5.129899 | 7.003066 | 81 | 9 | 23.668638 |
| 3 | 651 | 55 | 1 | 0 | 22 | 22 | 1100.0 | -54 | 1000.0 | 6.478509 | 7.003066 | 6.907755 | 484 | 484 | -4.909091 |
| 4 | 497 | 44 | 1 | 1 | 8 | 6 | 351.0 | 28 | 387.0 | 6.208590 | 5.860786 | 5.958425 | 64 | 36 | 7.977208 |
dataWoo("CEOSAL2", description=True)
name of dataset: ceosal2
no of variables: 15
no of observations: 177
+----------+--------------------------------+
| variable | label |
+----------+--------------------------------+
| salary | 1990 compensation, $1000s |
| age | in years |
| college | =1 if attended college |
| grad | =1 if attended graduate school |
| comten | years with company |
| ceoten | years as ceo with company |
| sales | 1990 firm sales, millions |
| profits | 1990 profits, millions |
| mktval | market value, end 1990, mills. |
| lsalary | log(salary) |
| lsales | log(sales) |
| lmktval | log(mktval) |
| comtensq | comten^2 |
| ceotensq | ceoten^2 |
| profmarg | profits as % of sales |
+----------+--------------------------------+
See CEOSAL1.RAW
(i) Find the average salary and the average tenure in the sample.
print("Average Salary:", round(df2['salary'].mean(), 3))
print("Average ceoten", round(df2["ceoten"].mean(), 2))
Average Salary: 865.864
Average ceoten 7.95
(ii) How many CEOs are in their first year as CEO (that is, ceoten = 0)? What is the longest tenure as a CEO?
print("Number of first year CEO:", (df2['ceoten'] == 0).sum())
print("Longest Tenure:", df2["ceoten"].max())
Number of first year CEO: 5
Longest Tenure: 37
(iii) Estimate the simple regression model $\log(salary) = \beta_0 + \beta_1 ceoten + u$, and report your results in the usual form. What is the (approximate) predicted percentage increase in salary given one more year as a CEO?
log_salary_hat = smf.ols("np.log(salary) ~ 1 + ceoten", data=df2).fit()
print("Parameters:\n", log_salary_hat.params, sep='')
print("Percentage increase:", round(log_salary_hat.params.iloc[1] * 100, 2))
Parameters:
Intercept 6.505498
ceoten 0.009724
dtype: float64
Percentage increase: 0.97
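Because this is a log-level model, 100·$\hat{\beta}_1$ is only the approximate percentage change; the exact predicted change for one more year of tenure is 100·(exp($\hat{\beta}_1$) − 1). With a coefficient this small, the two barely differ:

```python
import numpy as np

# Log-level model: 100*b1 approximates the percent change in salary per
# extra year of tenure; the exact figure is 100*(exp(b1) - 1).
b1 = 0.009724                    # ceoten coefficient from the fit above
approx = 100 * b1
exact = 100 * (np.exp(b1) - 1)
print(round(approx, 3), round(exact, 3))
```

The gap grows roughly with the square of the coefficient, so for large coefficients the exact formula matters.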
C3#
Use the data in SLEEP75 from Biddle and Hamermesh (1990) to study whether there is a tradeoff between the time spent sleeping per week and the time spent in paid work. We could use either variable as the dependent variable. For concreteness, estimate the model $sleep = \beta_0 + \beta_1 totwrk + u$, where sleep is minutes spent sleeping at night per week and totwrk is total minutes worked during the week.
df3 = dataWoo("sleep75")
df3.head()
|   | age | black | case | clerical | construc | educ | earns74 | gdhlth | inlf | leis1 | ... | spwrk75 | totwrk | union | worknrm | workscnd | exper | yngkid | yrsmarr | hrwage | agesq |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32 | 0 | 1 | 0.0 | 0.0 | 12 | 0.0 | 0 | 1 | 3529 | ... | 0 | 3438 | 0 | 3438 | 0 | 14 | 0 | 13 | 7.070004 | 1024 |
| 1 | 31 | 0 | 2 | 0.0 | 0.0 | 14 | 9500.0 | 1 | 1 | 2140 | ... | 0 | 5020 | 0 | 5020 | 0 | 11 | 0 | 0 | 1.429999 | 961 |
| 2 | 44 | 0 | 3 | 0.0 | 0.0 | 17 | 42500.0 | 1 | 1 | 4595 | ... | 1 | 2815 | 0 | 2815 | 0 | 21 | 0 | 0 | 20.529997 | 1936 |
| 3 | 30 | 0 | 4 | 0.0 | 0.0 | 12 | 42500.0 | 1 | 1 | 3211 | ... | 1 | 3786 | 0 | 3786 | 0 | 12 | 0 | 12 | 9.619998 | 900 |
| 4 | 64 | 0 | 5 | 0.0 | 0.0 | 14 | 2500.0 | 1 | 1 | 4052 | ... | 1 | 2580 | 0 | 2580 | 0 | 44 | 0 | 33 | 2.750000 | 4096 |
5 rows × 34 columns
dataWoo("sleep75", description=True)
name of dataset: sleep75
no of variables: 34
no of observations: 706
+----------+--------------------------------+
| variable | label |
+----------+--------------------------------+
| age | in years |
| black | =1 if black |
| case | identifier |
| clerical | =1 if clerical worker |
| construc | =1 if construction worker |
| educ | years of schooling |
| earns74 | total earnings, 1974 |
| gdhlth | =1 if in good or excel. health |
| inlf | =1 if in labor force |
| leis1 | sleep - totwrk |
| leis2 | slpnaps - totwrk |
| leis3 | rlxall - totwrk |
| smsa | =1 if live in smsa |
| lhrwage | log hourly wage |
| lothinc | log othinc, unless othinc < 0 |
| male | =1 if male |
| marr | =1 if married |
| prot | =1 if Protestant |
| rlxall | slpnaps + personal activs |
| selfe | =1 if self employed |
| sleep | mins sleep at night, per wk |
| slpnaps | minutes sleep, inc. naps |
| south | =1 if live in south |
| spsepay | spousal wage income |
| spwrk75 | =1 if spouse works |
| totwrk | mins worked per week |
| union | =1 if belong to union |
| worknrm | mins work main job |
| workscnd | mins work second job |
| exper | age - educ - 6 |
| yngkid | =1 if children < 3 present |
| yrsmarr | years married |
| hrwage | hourly wage |
| agesq | age^2 |
+----------+--------------------------------+
J.E. Biddle and D.S. Hamermesh (1990), “Sleep and the Allocation of
Time,” Journal of Political Economy 98, 922-943. Professor Biddle
kindly provided the data.