- At minimum, you need SICH, SALE, and FYEAR from COMPUSTAT.
- SICH is four digits. As an example, if you want to compute HHI by three-digit SIC, do the following:
char_sic=put(sich,4.0);*This converts SICH from numeric to character; three_char_sic=substr(char_sic,1,3);*This takes the first three components of the string; sicthree=input(three_char_sic,best3.);*This converts the character back;
- We first need to compute the total size of the market for a given industry-year:
proc summary data=sic; by sicthree fyear; var sale; output out=hhidenom (drop=_type_ _freq_) sum=; run;
- Now, we can compute the HHI:
*Sort before merge; proc sort data=sic; by sicthree fyear; run; proc sort data=hhidenom; by sicthree fyear; run;
*Merge denominator; proc sql;*http://sbaleone.bus.miami.edu/PERLCOURSE/SASFILES/SQL_EXAMPLES.sas; create table compstatdenom as select a.*, b.sale as denom from sic a left join hhidenom b on (a.sicthree=b.sicthree) and (a.fyear=b.fyear); quit;
*Create market share; data compstatshare; set compstatdenom; share=sale/denom; sharesq=share**2; run;
*Create HHI; *http://www.talkstats.com/showthread.php/10892-Sum-up-values-by-groups; proc summary data=compstatshare; by sicthree fyear; var sharesq; output out=sumhhi (drop=_type_ _freq_) sum=; run;
*Merge HHI; proc sql;*http://sbaleone.bus.miami.edu/PERLCOURSE/SASFILES/SQL_EXAMPLES.sas; create table compstathhi as select a.*, b.sharesq as hhi from compstatshare a left join sumhhi b on (a.sicthree=b.sicthree) and (a.fyear=b.fyear); quit;
Tuesday, February 19, 2013
Steps to compute HHI in SAS
Monday, April 09, 2012
Dates in SAS
Suppose you imported data from a spreadsheet with dates that look like 20120409. How can you convert them to the familiar date9 format in SAS?
If they are already in character format, do the following:
Take care to replace the mmddyy8. depending on how the date is initially written.
If they are in numeric format, then do the following:
That is, you first convert numeric to character and then character to date. Be sure to keep in mind that this process is vastly made easier if you take care to input the dates in a "nice" format to begin with in the spreadsheet.
Monday, January 16, 2012
How do I identify runs of consecutive observations in panel data?
The STATA mailing list has a way to identify runs of consecutive observations. With some googling, SAS can do the same thing. Here's how.
Suppose you want to figure out how many observations you have per GVKEY:
Monday, November 21, 2011
Merging with CIK
CIKs are 10-digit characters, but if you manually code them from EDGAR, they won't have leading zeroes. So use "z10."
Tuesday, October 11, 2011
Averages across years
PROC MEANS nolabel DATA=[dataset] ;
CLASS [year];
VAR [variable];
OUTPUT OUT=[dataset] MEAN= ;
RUN;
MEAN can be replaced by MEDIAN.
Sunday, October 09, 2011
Esttab and multiple dummy indicators
Suppose your empirical specification has both unit and time fixed effects. You don't the table to be cluttered with n or m variables, do you?
Including indicate("Time fixed effects = " "Unit fixed effects = ") after esttab will do the trick. Note the STATA output to determine what to put after the equal signs.
Ado Path
Emerald at UNC has STATA 11.2 but is not cooperative when it comes to downloading *.ado files (e.g. estout). So I downloaded estout in STATA 9.2 through Latte at Fuqua and then copied the folder to my Emerald account. Do note that you must use "scp" and not "sftp" due to the latter's restriction on recursive copying (i.e. folders).
The help for adopath suggested that the folder be copied to the "/netscr/[username]" folder. After that, esttab worked just fine.