/* compseg.sas

   A SAS program to read the Compustat Industry Segment file. This data
   file is not supported by the SAS PROC DATASOURCE command, at least not
   in SAS versions 6.12 and below.

   Haim Mozes at Fordham University kindly shares this program, in
   response to a request on the CRSP-L. This program has been edited only
   by reformatting and the addition of an endsas statement to finish the
   program. Don Cram 6/4/97.

   NOTE: This program as written may or may not do an adequate job of
   replacing the Compustat Universal missing value code of -.001 by the
   SAS missing value code. Haim checked whether Assets and other
   variables are not greater than zero rather than checking whether the
   values equal -.001 directly. So I have commented out his missing value
   replacement code and replaced it. Don Cram 9/8/97.

   From: Haim Mozes
   Date: Wed, 4 Jun 1997
   Subject: the SAS program to read the segment file

   I am sending you the SAS program. A few things you should know:

   1 - The program was written for a VAX system.
   2 - The program reads the file from disk. You have to dump it from
       the tape to the disk first.
   3 - I dropped some variables, because I didn't need them for my
       purposes. To change this, just alter the keep/drop statement.
*/

filename compseg 'dsk10:[gbapgms]new280sic.dat';
libname segments 'dsk10:[gbapgms.util]';

data segments.segment;
  infile compseg lrecl=774;

  input
    dnum        1-4
    cnum     $  5-10
    cic         11-13
    stk         14
    smbl     $  15-22
    file        23-24
    zlist       25-26
    @55
    coname   $  55-82
    ssrce       83-84
    fyr         85-86
    year        87-88
    fundf       89-94
    sucode      95
    sid         96-97
    segn        98-99
    sname    $  100-127
    pname1   $  128-147
    pname2   $  148-167
    pname3   $  168-187
    pname4   $  188-207
    cname1   $  208-223
    cname2   $  224-239
    cname3   $  240-255
    cname4   $  256-271
    cname5   $  272-287
    cname6   $  288-303
    ssic1       304-307
    ssic2       308-311
    psic1       312-315
    psic2       316-319
    psic3       320-323
    psic4       324-327
    @428
    sales       10.3
    opprofit    10.3
    deprec      10.3
    capex       10.3
    identass    10.3
    eqincome    10.3
    eqinvest    10.3
    nworkers    10.3
    backlog     10.3
    rdcust      10.3
    rdcomp      10.3
    dgovrev     10.3
    fgovrev     10.3
    csale1      10.3
    csale2      10.3
    csale3      10.3
    csale4      10.3
    psale1      10.3
    psale2      10.3
    psale3      10.3
    psale4      10.3
    @734
    foot20   $  774
  ;

  /*
  if backlog  not > 0 then backlog  = .;
  if sales    not > 0 then sales    = .;
  if capex    not > 0 then capex    = .;
  if rdcomp   not > 0 then rdcomp   = .;
  if identass not > 0 then identass = .;
  if nworkers not > 0 then nworkers = .;

  if backlog  = -.01 then backlog  = .;
  if sales    = -.01 then sales    = .;
  if capex    = -.01 then capex    = .;
  if rdcomp   = -.01 then rdcomp   = .;
  if identass = -.01 then identass = .;
  if nworkers = -.01 then nworkers = .;
  if opprofit = -.01 then opprofit = .;

  if backlog  = -.001 then backlog  = .;
  if sales    = -.001 then sales    = .;
  if capex    = -.001 then capex    = .;
  if rdcomp   = -.001 then rdcomp   = .;
  if identass = -.001 then identass = .;
  if nworkers = -.001 then nworkers = .;
  if opprofit = -.001 then opprofit = .;
  */

  /* Replace the Compustat Universal missing value code of -.001 by the
     SAS missing value code. Programmers note that for IBM360/370 format
     Compustat data, the missing value code to be replaced is .0001.
     See Compustat Technical Guide and/or CRSP-L FAQ for explanation.
  */
  array numerics {*}
    dnum cic stk file zlist ssrce fyr year fundf sucode sid segn
    ssic1 ssic2 psic1-psic4
    sales opprofit deprec capex identass eqincome eqinvest nworkers
    backlog rdcust rdcomp dgovrev fgovrev
    csale1-csale4 psale1-psale4;

  do i = 1 to dim(numerics);
    if numerics{i} = -.001 then numerics{i} = .;
  end;
  drop i;

  /* Compute the ratios only after the missing value codes have been
     replaced, so that a -.001 code in opprofit, identass, or sales
     yields a missing ratio instead of a spurious number.
  */
  roa = opprofit / identass;
  pm  = opprofit / sales;

  drop pname1 pname2 pname3 pname4
       cname1 cname2 cname3 cname4 cname5 cname6
       foot20 ssic2 psic1 psic2 psic3 psic4
       eqinvest dgovrev fgovrev fundf sucode
       file zlist ssrce eqincome rdcust rdcomp smbl;
run;

endsas;