From OGJAL@coba.ttu.edu Sun Jun 16 09:04:35 1996
Received: from COBA2.TTU.EDU (coba2.ttu.edu [129.118.49.47])
        by gsb-pound.Stanford.EDU (8.7.5/8.7.1) with SMTP id JAA14873
        for ; Sun, 16 Jun 1996 09:04:34 -0700 (PDT)
Date: Sun, 16 Jun 1996 11:05:34 -0500 (CDT)
From: TERRY JALBERT
To: doncram@sphinx.stanford.edu
Message-Id: <960616110534.20801352@coba.ttu.edu>
Subject: SAS program for beta calculation
Status: RO

title 'this program calculates Scholes-Wms betas from CRSP information';
options ls=80 ps=45;

/*
Terrance Jalbert's beta calculation program, in SAS, which was mentioned
on CRSP-L.  Terrance points out this program may be very specific to his
situation.  It appears to me to draw on CRSP data installed by a FORTRAN
program (as it addresses the CRSP file missing value codes -66, -77, etc.)
rather than by SAS PROC DATASOURCE (which recodes missing values as .P, .T,
etc.; see p. 19 of my GSB Technical Report 80).  It implements a beta
calculation formula from the CRSP manual, a formula which Terrance reports
was listed incorrectly in many copies of the CRSP stock file guide dated
December 31, 1993, at least.

Don Cram

> > With regard to calculating betas: I have done this before using SAS.
> > The code is not difficult; I just used the formula in the CRSP manual.
> > I do want to caution you on one thing, though: the formula for
> > calculating betas is wrong in some versions of the CRSP documentation.
> > So you want to make sure before you do it that you check with CRSP
> > to get the correct formula.  I would be happy to supply you with the
> > SAS code; however, the code that I have is very much specific to the
> > application that I was using it for, and thus I am not sure how
> > useful it would be for you.
> >
> > With regard to missing days, etc.: I believe that they handle this by
> > not calculating the betas for anything that does not have at least
> > 50 or 100 days of information.
> > I do not recall how they handle a specific day where data was missing.
> > On the bid-ask spread question: if I remember correctly, they use the
> > average of the bid-ask spread.  The precise method by which they do it
> > is outlined in the manual.
> >
> > Terrance Jalbert
> > Texas Tech University.
*/

/* Market index: read the daily CRSP index file and build the market terms. */
data one;
  infile 'DISK$5:[CRSP.INDEX94]dsp500';
  input ldate vwretd vwretx ewretd ewretx totval totcnt usdval usdcnt
        spindx sprtrn;
  if vwretd = -99.0 then delete;      /* CRSP missing-value code */
  if Ldate < 810101 then delete;      /* dates are numeric YYMMDD */
  VWR=(1+ VWRETD);
  LLVWR=LOG(VWR);                     /* log market return on the current row */
  /* LAG shifts each record back one trading day: DATE and MRET refer to  */
  /* day t, while MRET3 sums the log returns for days t+1, t and t-1.     */
  MRET=LAG(LLVWR);
  date=lag(ldate);
  l1=lag1(LLVWR);
  l2=lag2(LLVWR);
  mRET3=(LLVWR + l1 + l2);
  PRODM=(MRET*MRET3);
  drop l1 l2 VWR VWRETD VWRETX EWRETD EWRETX TOTVAL TOTCNT USDVAL USDCNT
       SPINDX SPRTRN LDATE LLVWR;
proc sort; by date;

title 'this program will read the daily return info from my file information';
options ls=80 ps=45;

/* Stock returns: CUSIP, date, daily return triples read from fileref MERCDR1. */
data two;
  infile mercdr1;
  input cusip $ date dr @@;
  if dr = -66.0 then delete;          /* CRSP missing-value codes */
  if dr = -77.0 then delete;
  if dr = -88.0 then delete;
  if dr = -99.0 then delete;
  DR1=(1+DR);
  RET=LOG(DR1);                       /* log stock return */
  DROP DR DR1;
proc sort; by date;

/* Attach the market terms to each stock-day and label the calendar year. */
data threeb;
  merge ONE(in=inone) two(in=intwo);
  by date;
  if intwo ne 1 then delete;          /* keep only dates in the stock file */
  /* D1 = 1 for dates before 1982, then one value per year through 1995. */
  IF DATE < 820101 THEN D1 = 1;
  IF DATE > 811231 AND DATE < 830101 THEN D1=2;
  IF DATE > 821231 AND DATE < 840101 THEN D1=3;
  IF DATE > 831231 AND DATE < 850101 THEN D1=4;
  IF DATE > 841231 AND DATE < 860101 THEN D1=5;
  IF DATE > 851231 AND DATE < 870101 THEN D1=6;
  IF DATE > 861231 AND DATE < 880101 THEN D1=7;
  IF DATE > 871231 AND DATE < 890101 THEN D1=8;
  IF DATE > 881231 AND DATE < 900101 THEN D1=9;
  IF DATE > 891231 AND DATE < 910101 THEN D1=10;
  IF DATE > 901231 AND DATE < 920101 THEN D1=11;
  IF DATE > 911231 AND DATE < 930101 THEN D1=12;
  IF DATE > 921231 AND DATE < 940101 THEN D1=13;
  IF DATE > 931231 AND DATE < 950101 THEN D1=14;
  IF DATE > 941231 AND DATE < 960101 THEN D1=15;
  PRODR=(RET*MRET3);

data three;
  set threeb;
PROC SORT; BY CUSIP D1;

/* Per-CUSIP, per-year sums that feed the beta formula. */
PROC UNIVARIATE DATA=THREE NOPRINT;
  var MRET3;
  by cusip D1;
  output out=outMRET3 sum=sum1;
PROC UNIVARIATE DATA=THREE NOPRINT;
  VAR RET;
  BY CUSIP D1;
  OUTPUT OUT=OUTRET N=N1 SUM=SUM2 STD=STD;
PROC UNIVARIATE DATA=THREE NOPRINT;
  VAR MRET;
  BY CUSIP D1;
  OUTPUT OUT=OUTMRET SUM=SUM3;
PROC UNIVARIATE DATA=THREE NOPRINT;
  VAR PRODM;
  BY CUSIP D1;
  OUTPUT OUT=OUTPRODM SUM=SUM4;
PROC UNIVARIATE DATA=THREE NOPRINT;
  VAR PRODR;
  BY CUSIP D1;
  OUTPUT OUT=OUTPRODR SUM=SUM5;

/* SUM1=sum(MRET3)  SUM2=sum(RET)  SUM3=sum(MRET)                     */
/* SUM4=sum(MRET*MRET3)  SUM5=sum(RET*MRET3)  N1=count of RET values  */
DATA FOUR;
  MERGE OUTMRET OUTMRET3 OUTRET OUTPRODM OUTPRODR;
  BY CUSIP D1;
  IF N1 < 50 THEN DELETE;             /* require at least 50 daily returns */
  TOP=((SUM5)-((1/N1)*(SUM2*SUM1)));
  BOTTOM=((SUM4)-((1/N1)*(SUM3*SUM1)));
  BETA=TOP/BOTTOM;
  DROP SUM1 SUM2 SUM3 SUM4 SUM5 TOP BOTTOM;
PROC PRINT U;
run;
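
/* A sketch only, not part of Terrance's original program.  The five
   PROC UNIVARIATE passes above could probably be collapsed into a single
   PROC MEANS call, assuming data set THREE is still sorted BY CUSIP D1.
   The data set names SUMS and FOUR_ALT are made up for illustration.
   The beta is the same expression computed in data step FOUR:

     beta = [ sum(RET*MRET3)  - sum(RET)*sum(MRET3)/N ]
          / [ sum(MRET*MRET3) - sum(MRET)*sum(MRET3)/N ]

   where N is the number of nonmissing RET values in the CUSIP-D1 group.
   Check the formula against a current CRSP guide before relying on it. */
proc means data=three noprint;
  by cusip d1;
  var ret mret mret3 prodm prodr;
  output out=sums(drop=_type_ _freq_)
         n(ret)=n1        std(ret)=std
         sum(mret3)=sum1  sum(ret)=sum2  sum(mret)=sum3
         sum(prodm)=sum4  sum(prodr)=sum5;
run;

data four_alt;
  set sums;
  if n1 < 50 then delete;             /* same screen as data step FOUR */
  beta = (sum5 - sum2*sum1/n1) / (sum4 - sum3*sum1/n1);
  drop sum1-sum5;
run;

proc print data=four_alt;
run;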