I write because I forget. 因為忘,所以寫下來.
The STATA mailing list has a way to identify runs of consecutive observations. With some googling, SAS can do the same thing. Here's how.
Suppose you want to figure out how many observations you have per GVKEY:
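One possible way to get that count in SAS is a PROC SQL step that groups by GVKEY. This is only a sketch and not necessarily the approach the Stata list discussion maps to; the dataset name `comp_sample`, the variable `fyear`, and the output names `gvkey_counts` and `n_obs` are made up for illustration.

```sas
/* Hypothetical toy data: one row per firm-year */
data comp_sample;
  input gvkey $ fyear;
  datalines;
001004 2001
001004 2002
001004 2003
012141 2001
012141 2002
;
run;

/* Count the number of observations for each GVKEY */
proc sql;
  create table gvkey_counts as
  select gvkey, count(*) as n_obs
  from comp_sample
  group by gvkey;
quit;
```

With the toy data above, `gvkey_counts` has one row per GVKEY, with `n_obs` equal to 3 for 001004 and 2 for 012141.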
CIKs are 10-digit character strings, but if you code them by hand from EDGAR, they won't have leading zeros. To pad them back to 10 digits in SAS, use the "z10." format.
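As a minimal sketch of how the format can be applied, assuming the hand-coded CIKs sit in a numeric variable: `put()` with `z10.` writes the number as a 10-character string padded with leading zeros. The dataset and variable names (`cik_sample`, `cik_num`, `cik`) are hypothetical.

```sas
/* Hypothetical hand-coded CIKs, stored as numbers without leading zeros */
data cik_sample;
  input cik_num;
  length cik $10;
  /* z10. pads with leading zeros to a width of 10, e.g. 320193 -> "0000320193" */
  cik = put(cik_num, z10.);
  datalines;
320193
789019
;
run;
```

The resulting character variable `cik` can then be matched against 10-digit CIKs stored as strings elsewhere.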
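PROC MEANS DATA=[input dataset] NOLABELS;
    CLASS [year];                        /* grouping variable */
    VAR [variable];                      /* variable to summarize */
    OUTPUT OUT=[output dataset] MEAN= ;  /* save the mean for each group */
RUN;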
MEAN can be replaced by MEDIAN.
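For a concrete picture of the MEDIAN variant, here is a small sketch. The dataset `firm_years`, the variables `fyear` and `at`, and the output names are invented for illustration. The NWAY option is an addition of mine: without it, the output dataset also contains an overall row (with `_TYPE_` = 0) alongside the per-year rows.

```sas
/* Hypothetical toy data: total assets by fiscal year */
data firm_years;
  input fyear at;
  datalines;
2001 10
2001 30
2001 50
2002 20
2002 40
;
run;

/* Median of AT within each fiscal year; NOPRINT skips the printed table */
proc means data=firm_years nway noprint;
  class fyear;
  var at;
  output out=at_by_year median=at_median;
run;
```

With this toy data, `at_by_year` has one row per year, with `at_median` equal to 30 for both 2001 and 2002.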
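Suppose your empirical specification has both unit and time fixed effects. You don't want the table to be cluttered with n or m indicator variables, do you?
Including indicate("Time fixed effects = " "Unit fixed effects = ") in the options after esttab will do the trick: the indicator coefficients are replaced by a single row per label showing whether they were included. Check the STATA regression output for the names of the indicator variables, and put those names after the equals signs.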
Emerald at UNC has STATA 11.2 but is not cooperative when it comes to downloading *.ado files (e.g. estout). So I downloaded estout in STATA 9.2 through Latte at Fuqua and then copied the folder to my Emerald account. Do note that you must use "scp" and not "sftp" due to the latter's restriction on recursive copying (i.e. folders).
The help for adopath suggested that the folder be copied to the "/netscr/[username]" folder. After that, esttab worked just fine.
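GVKEY as-is from COMPUSTAT in WRDS comes as a string variable. To destring it in STATA, type "destring gvkey, replace".
To destring it in SAS, a quick-and-dirty way is to multiply it by 1. This forces an implicit character-to-numeric conversion, and SAS writes a note about it to the log.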
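A short sketch of the SAS side, with the explicit `input()` conversion shown next to the multiply-by-1 shortcut for comparison. The `input()` version is my suggestion and not something from the original note; the dataset and variable names (`comp_sample`, `comp_numeric`, `gvkey_num`, `gvkey_num2`) are hypothetical.

```sas
/* Hypothetical toy data: GVKEY stored as a 6-character string */
data comp_sample;
  input gvkey $6.;
  datalines;
001004
012141
;
run;

data comp_numeric;
  set comp_sample;
  /* Shortcut: arithmetic forces an implicit conversion (note in the log) */
  gvkey_num = gvkey * 1;
  /* Explicit conversion with a numeric informat, no conversion note */
  gvkey_num2 = input(gvkey, 6.);
run;
```

Both new variables hold the same numbers (1004 and 12141 here); the explicit form just keeps the log clean.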
Table 4 of "The power of the pen and executive compensation" (Core, Guay, and Larcker 2008) contains "fixed effects for year and 2-digit SIC codes." But ExecuComp in WRDS gives you the 4-digit SIC codes. How do you get from 4 digits to 2 in SAS?
First, note that the first two digits denote the "major industry group." Second, 4-digit SIC from ExecuComp is unsurprisingly numeric. So substr won't work unless you first convert from numeric to string:
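One way to do the numeric-to-string step, as a sketch: `put()` with the `z4.` format, so that codes below 1000 keep their leading zero. The dataset names and the variables `sic` and `sic_char` are assumptions for illustration; check the actual variable name in your ExecuComp extract.

```sas
/* Hypothetical toy data: 4-digit SIC codes stored as numbers */
data execucomp_sample;
  input sic;
  datalines;
2834
100
7372
;
run;

data sic_as_char;
  set execucomp_sample;
  length sic_char $4;
  /* z4. writes 4 characters with leading zeros, e.g. 100 -> "0100" */
  sic_char = put(sic, z4.);
run;
```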
Then you can use substr. Just be sure to convert it back to numeric:
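Putting the pieces together, the whole round trip might look like the sketch below: numeric to string with `put()`, first two characters with `substr()`, and back to numeric with `input()`. The names `sic2_example`, `sic`, and `sic2` are hypothetical.

```sas
/* Hypothetical toy data: 4-digit SIC codes stored as numbers */
data sic2_example;
  input sic;
  /* put() -> 4-char string, substr() -> first 2 chars, input() -> numeric */
  sic2 = input(substr(put(sic, z4.), 1, 2), 2.);
  datalines;
2834
100
7372
;
run;
```

Here 2834 becomes 28, 100 (that is, "0100") becomes 1, and 7372 becomes 73. If a purely arithmetic route is acceptable, `floor(sic / 100)` gives the same numbers without any string conversion.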
Now run your fixed effects regression.