  /*------------------------------------------------------------------*
   | MACRO NAME  : surv
   | SHORT DESC  : General survival statistics, K-M estimates,
   |               log-rank test
   *------------------------------------------------------------------*
   | CREATED BY  : Offord, Jan                  (04/13/2004  9:34)
   |             : Harrell, Frank
   |             : Helms, Mike
   *------------------------------------------------------------------*
   | PURPOSE
   |
   | This macro will calculate the general survival statistics (p(t),
   | standard error, confidence limits, and median survival time). It
   | will also do the k-sample logrank and plotting upon request.
   |
   | For "expected" or "normal" survival calculations refer
   | to %SURVEXP.
   |
   *------------------------------------------------------------------*
   | MODIFIED BY : Christianson, Teresa         (10/17/2005 13:43)
   |
   | Clarify documentation regarding Confidence Intervals.
   *------------------------------------------------------------------*
   | MODIFIED BY : Christianson, Teresa         (10/18/2005 17:03)
   |
   | In version 9, will fail if all titles are null due to %Eval of .
   | Minor modification to allow for no titles.
   *------------------------------------------------------------------*
   | MODIFIED BY : Christianson, Teresa         (02/09/2006 13:14)
   |
   | Missing log rank p-values (due to only 1 class level or no events
   | in both classes) currently show up as PVALUE < .001 on the plots.
   | Changing so they are PVALUE = NA . NOTE: macro variables that are
   | missing are stored differently in various versions of SAS. Some
   | are stored as . and others as (a blank).
   *------------------------------------------------------------------*
   | OPERATING SYSTEM COMPATIBILITY
   |
   | UNIX SAS v8   :   YES
   | UNIX SAS v9   :
   | MVS SAS v8    :   YES
   | MVS SAS v9    :
   | PC SAS v8     :
   | PC SAS v9     :
   *------------------------------------------------------------------*
   | MACRO CALL
   |
   | %surv (
   |         time= ,
   |         event= ,
   |         cen_vl=0,
   |         class= ,
   |         by= ,
   |         out=_survout,
   |         outsum=_survsum,
   |         data=_LAST_,
   |         printop=1,
   |         points= ,
   |         cl=3,
   |         alpha=0.05,
   |         logrank=1,
   |         medtype=1,
   |         plottype=1,
   |         plotop=1,
   |         scale=1,
   |         maxtime= ,
   |         xdivisor= ,
   |         laserprt= ,
   |         pvals=Y
   |       );
   *------------------------------------------------------------------*
   | REQUIRED PARAMETERS
   |
   | Name     : time
   | Default  :
   | Type     : Variable Name (Single)
   | Purpose  : Variable containing time to event or last follow-up
   |            in any units
   |
   | Name     : event
   | Default  :
   | Type     : Variable Name (Single)
   | Purpose  : Event variable as a numeric two-valued variable, (0,1),
   |            (1,2) etc. The event value must be 1 larger than
   |            the censoring value.
   |
   *------------------------------------------------------------------*
   | OPTIONAL PARAMETERS
   |
   | Name     : cen_vl
   | Default  : 0
   | Type     : Number (Single)
   | Purpose  : Censoring value for the EVENT variable as 0,1 etc.
   |            Event = Cen_vl + 1. (Default is 0)
   |
   | Name     : class
   | Default  :
   | Type     : Variable Name (List)
   | Purpose  : List of classification variables. They may be either
   |            character or numeric. Note - Any observations in the
   |            input dataset with missing CLASS data are not included
   |            in the results. Using multiple class variables will
   |            cause a problem when plotting.
   |
   | Name     : by
   | Default  :
   | Type     : Variable Name (List)
   | Purpose  : List of "by" variables. They may be either character
   |            or numeric.
   |
   | Name     : out
   | Default  : _survout
   | Type     : Dataset Name
   | Purpose  : Output dataset name
   |
   | Name     : outsum
   | Default  : _survsum
   | Type     : Dataset Name
   | Purpose  : Output summary dataset name
   |
   | Name     : data
   | Default  : _LAST_
   | Type     : Dataset Name
   | Purpose  : Input dataset name (Default is the last dataset created)
   |
   | Name     : printop
   | Default  : 1
   | Type     : Number (Single)
   | Purpose  : printing options (Default is 1):
   |            0 = print nothing
   |            1 = print summary table only
   |            2 = print one line per event
   |            3 = print one line per event and/or censor
   |            4 = print one line for each of a series of time points,
   |                as months, half-years, or years (see POINTS).
   |            5 = print one line per event and/or defined time points.
   |            6 = print one line per event and/or censor and/or time
   |                point.
   |
   | Name     : points
   | Default  :
   | Type     : Text
   | Purpose  : Specific time points at which survival statistics are
   |            needed, as months, half-years, years. These points are
   |            specified by dividing time into intervals as:
   |            '0 to 36500 by 365'.
   |            The endpoint of each interval will be the time point to
   |            be reported.
   |            If you have commas within your statement,
   |            enclose the entire parameter in quotes, as:
   |            '0 to 360 by 30, 0 to 3650 by 182.5'
   |            You may also specify specific points as well as
   |            groups of points as:
   |            '1,5,10,15,0 to 36500 by 182.5'
   |
   | Name     : cl
   | Default  : 3
   | Type     : Number (Single)
   | Purpose  : Type of confidence limits (Default is 3; 1 and 2 are
   |            not recommended)
   |            1 = Greenwood (simple)
   |            2 = Greenwood with modified lower limit
   |            3 = log-e transformation (log)
   |            4 = log-e transformation with modified lower limit
   |            5 = log(-log-e) transformation (log(-log))
   |            6 = log(-log-e) transformation with modified lower limit
   |            7 = logit transformation (logit)
   |            8 = logit transformation with modified lower limit
   |
   | Name     : alpha
   | Default  : 0.05
   | Type     : Number (Single)
   | Purpose  : Type I error rate for confidence limits
   |
   | Name     : logrank
   | Default  : 1
   | Type     : Number (Single)
   | Purpose  : Option to compute the logrank k-sample test statistics
   |            for the groups defined by the variable CLASS. Separate
   |            tests will be done for BY variable groupings.
   |            (Default is 1)
   |            1 = do not calculate
   |            2 = calculate
   |
   | Name     : medtype
   | Default  : 1
   | Type     : Number (Single)
   | Purpose  : Type of median if there are several time points
   |            having probability=.5 (Default is 1)
   |            1 = use the midpoint between the times as the median
   |            2 = use the first time value as the median
   |
   | Name     : plottype
   | Default  : 1
   | Type     : Number (Single)
   | Purpose  : Where to plot the graph(s) (Default is 1)
   |            1 = no plot
   |            2 = greenbar printer plot
   |            3 = graphics plot on unix laser printer
   |            4 = plot goes to graphics window
   |                (interactive processing)
   |
   | Name     : plotop
   | Default  : 1
   | Type     : Number (Single)
   | Purpose  : What to plot on the y-axis (Default is 1)
   |            1 = plot pt
   |            2 = plot 1-pt or pe
   |
   | Name     : scale
   | Default  : 1
   | Type     : Number (Single)
   | Purpose  : plotting scale (Default is 1)
   |            1 = arithmetic
   |            2 = 1-cycle log
   |            3 = 2-cycle log
   |
   | Name     : maxtime
   | Default  :
   | Type     : Number (Single)
   | Purpose  : the maximum time allowed for the x-axis (Default is the
   |            max time for all graphs per page). Specify MAXTIME in
   |            the same units as TIME, even if XDIVISOR is used.
   |
   | Name     : xdivisor
   | Default  :
   | Type     : Number (Single)
   | Purpose  : the divisor used if you want the plotted x-axis in
   |            other units than TIME is in.
   |            Example: XDIVISOR=365 would plot
   |            the x-axis as TIME/365.
   |
   | Name     : laserprt
   | Default  :
   | Type     : Text
   | Purpose  : the name of the HSR printer you want your plot to go to
   |            if different from your standard printer.
   |
   | Name     : pvals
   | Default  : Y
   | Type     : Text
   | Purpose  : print p values on plots (Default is Y)
   |            N = No p values
   |            Y = Print p values
   |
   *------------------------------------------------------------------*
   | RETURNED INFORMATION
   |
   | Output Dataset OUT:
   |
   | Output Dataset (OUT) contains one observation for each event,
   | censor, and extra time point specified by POINTS. The variables
   | in the output dataset are:
   |
   |   &by      = the by-variable(s) (if defined).
   |
   |   &class   = the class variable(s) (if defined).
   |
   |   &time    = the time variable.
   |
   |   NRISK    = the number at risk at &time.
   |
   |   NEVENT   = the number of events from &time to the next time
   |
   |   NCENSOR  = the number of censors from &time to the next time
   |
   |   CUM_EV   = the cumulative number of events up to and including
   |              &time
   |
   |   CUM_CEN  = the cumulative number of censors up to and including
   |              &time
   |
   |   PT       = the probability of no event up to and including &time.
   |
   |   PE       = 1-PT, or the probability of an event occurring.
   |
   |   UPPER_CL = the upper confidence limit (based on the input
   |              parameters ALPHA and CL).
   |
   |   LOWER_CL = the lower confidence limit (based on the input
   |              parameters ALPHA and CL).
   |
   |   SE       = the Greenwood Standard Error.
   |
   |   POINTFLG = the flag indicating points added to the output dataset
   |              because of the POINTS option. (1=point added,
   |              missing otherwise).
   |
   | Output Dataset OUTSUM:
   |
   | Output Summary Dataset (OUTSUM) contains one observation for
   | each group processed. That is, the total group, or each BY
   | and/or CLASS value. The variables in the output dataset are:
   |
   |   &by      = the by-variable(s) (if defined).
   |
   |   &class   = the class variable(s) (if defined).
   |
   |   TOTAL    = the total number of observations in this group.
   |
   |   CUM_EV   = the total number of events in this group.
   |
   |   CUM_CEN  = the total number of censors in this group.
   |
   |   TL_MISS  = the total number of observations not included because
   |              of missing values.
   |
   |   MEDIAN   = the median survival time (based on the input parameter
   |              MEDTYPE).
   |
   | The following variables will be added if the LOGRANK test is
   | specified:
   |
   |   OBSERVED = the calculated number of observed events.
   |
   |   EXPECTED = the calculated number of expected events.
   |
   |   RR       = the Relative Risk (this group's observed/expected /
   |              group 1's observed/expected).
   |
   |   CHISQ    = chi-square value.
   |
   |   DF       = degrees of freedom.
   |
   |   PVALUE   = pvalue (probability of a greater chi-square value).
   |
   *------------------------------------------------------------------*
   | ADDITIONAL NOTES
   |
   |   1.
   |      If you are getting a message about VPOS not being large
   |      enough, try cutting down on the number of title lines you
   |      are using. SAS does its calculations for size based on
   |      1 title, so having 3 or 4 titles MAY cause a problem with
   |      the vertical spacing.
   |
   |   2. If you are plotting the output dataset yourself, remember
   |      that you need a symbol statement as follows to get the steps
   |      correct:
   |
   |         symbol1 i=stepjl v=none l=1;
   |
   *------------------------------------------------------------------*
   | EXAMPLES
   |
   |   %surv(time=fu_time,event=fu_stat,cen_vl=1);
   |
   |   %surv(time=fu_time,event=fu_stat,cen_vl=1,class=arm,
   |         out=two,data=one,printop=4,logrank=1,cl=6,
   |         points='0 to 36500 by 182.5');
   |
   |   %surv(time=fu_time,event=fu_stat,cen_vl=1,class=arm,by=course,
   |         out=two,data=one,printop=6,logrank=2,plottype=2,xdivisor=365,
   |         points='0 to 360 by 30, 361 to 36500 by 365');
   |
   *------------------------------------------------------------------*
   | Copyright 2006 Mayo Clinic College of Medicine.
   |
   | This program is free software; you can redistribute it and/or
   | modify it under the terms of the GNU General Public License as
   | published by the Free Software Foundation; either version 2 of
   | the License, or (at your option) any later version.
   |
   | This program is distributed in the hope that it will be useful,
   | but WITHOUT ANY WARRANTY; without even the implied warranty of
   | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
   | General Public License for more details.
   *------------------------------------------------------------------*/

  /*
   SAS MACRO SURV

   This macro will calculate the general survival statistics (p(t),
   standard error, confidence limits, and median survival time). It
   will also do the k-sample logrank and plotting upon request.

   Note: For "expected" or "normal" survival calculations refer
         to %SURVEXP.
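
   Worked sketch (illustrative only; this is not the macro's own code):
   the product-limit (K-M) estimate and the Greenwood standard error,
   reported in the OUT dataset as PT and SE, can be checked by hand from
   the number at risk and the number of events at each event time. The
   dataset name _KMDEMO, its two data rows, and the work variables D and
   GW are hypothetical and exist only for this illustration.

      data _kmdemo;
         * hypothetical demo data: time, number at risk, events;
         retain pt 1 gw 0;
         input time nrisk d;
         * product-limit step;
         pt = pt * (1 - d / nrisk);
         * running Greenwood sum, skipped when everyone at risk fails;
         if nrisk > d then gw = gw + d / (nrisk * (nrisk - d));
         se = pt * sqrt(gw);
         datalines;
      2 5 1
      5 3 1
      ;
      run;

   At time 2 this gives PT = 0.8, since 4 of the 5 subjects at risk
   remain event-free. The confidence limits selected by CL are not
   reproduced in this sketch.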

   The Input Parameters are:

      TIME     = Variable containing time to event or last follow-up
                 in any units (Required)

      EVENT    = Event variable as a numeric two-valued variable, (0,1),
                 (1,2) etc. The event value must be 1 larger than
                 the censoring value. (Required)

      CEN_VL   = Censoring value for the EVENT variable as 0,1 etc.
                 Event = Cen_vl + 1. (Default is 0)

      CLASS    = List of classification variables. They may be either
                 character or numeric. Note - Any observations in the
                 input dataset with missing CLASS data, are not included
                 in the results. Using multiple class variables will
                 cause a problem when plotting.

      BY       = List of "by" variables. They may be either character
                 or numeric.

      OUT      = Output dataset name (Default is _SURVOUT)

      OUTSUM   = Output summary dataset name (Default is _SURVSUM)

      DATA     = Input dataset name (Default is the last dataset created)

      PRINTOP  = printing options (Default is 1):
                 0 = print nothing
                 1 = print summary table only
                 2 = print one line per event
                 3 = print one line per event and/or censor
                 4 = print one line for each of a series of time points,
                     as months, half-years, or years (see POINTS).
                 5 = print one line per event and/or defined time points.
                 6 = print one line per event and/or censor and/or time
                     point.

      POINTS   = Specific time points at which survival statistics are
                 needed, as months, half-years, years. These points are
                 specified by dividing time into intervals as:
                 '0 to 36500 by 365'.
                 The endpoint of each interval will be the time point to
                 be reported.
                 If you have commas within your statement, enclose the
                 entire parameter in quotes, as:
                 '0 to 360 by 30, 0 to 3650 by 182.5'
                 You may also specify specific points as well as
                 groups of points as:
                 '1,5,10,15,0 to 36500 by 182.5'

      CL       = Type of confidence limits (Default is 3; 1 and 2 are
                 not recommended)
                 1 = Greenwood (simple)
                 2 = Greenwood with modified lower limit
                 3 = log-e transformation (log)
                 4 = log-e transformation with modified lower limit
                 5 = log(-log-e) transformation (log(-log))
                 6 = log(-log-e) transformation with modified lower limit
                 7 = logit transformation (logit)
                 8 = logit transformation with modified lower limit

      ALPHA    = Type I error rate for confidence limits
                 (Default is .05)

      LOGRANK  = Option to compute the logrank k-sample test statistics
                 for the groups defined by the variable CLASS. Separate
                 tests will be done for BY variable groupings.
                 (Default is 1)
                 1 = do not calculate
                 2 = calculate

      MEDTYPE  = Type of median if there are several time points
                 having probability=.5 (Default is 1)
                 1 = use the midpoint between the times as the median
                 2 = use the first time value as the median

      PLOTTYPE = Where to plot the graph(s) (Default is 1)
                 1 = no plot
                 2 = greenbar printer plot
                 3 = graphics plot on unix laser printer
                 4 = plot goes to graphics window
                     (interactive processing)

      PLOTOP   = What to plot on the y-axis (Default is 1)
                 1 = plot pt
                 2 = plot 1-pt or pe

      SCALE    = plotting scale (Default is 1)
                 1 = arithmetic
                 2 = 1-cycle log
                 3 = 2-cycle log

      MAXTIME  = the maximum time allowed for the x-axis (Default is the
                 max time for all graphs per page). Specify MAXTIME in
                 the same units as TIME, even if XDIVISOR is used.

      XDIVISOR = the divisor used if you want the plotted x-axis in
                 other units than TIME is in.
                 Example: XDIVISOR=365 would plot the x-axis as TIME/365.

      LASERPRT = the name of the HSR printer you want your plot to go to
                 if different from your standard printer.

      PVALS    = print p values on plots (Default is Y)
                 N = No p values
                 Y = Print p values

   Output Dataset OUT:

      Output Dataset (OUT) contains one observation for each event,
      censor, and extra time point specified by POINTS. The variables
      in the output dataset are:

         &by      = the by-variable(s) (if defined).
         &class   = the class variable(s) (if defined).
         &time    = the time variable.
         NRISK    = the number at risk at &time.
         NEVENT   = the number of events from &time to the next time
         NCENSOR  = the number of censors from &time to the next time
         CUM_EV   = the cumulative number of events up to and including
                    &time
         CUM_CEN  = the cumulative number of censors up to and including
                    &time
         PT       = the probability of no event up to and including
                    &time.
         PE       = 1-PT, or the probability of an event occurring.
         UPPER_CL = the upper confidence limit (based on the input
                    parameters ALPHA and CL).
         LOWER_CL = the lower confidence limit (based on the input
                    parameters ALPHA and CL).
         SE       = the Greenwood Standard Error.
         POINTFLG = the flag indicating points added to the output
                    dataset because of the POINTS option.
                    (1=point added, missing otherwise).

   Output Dataset OUTSUM:

      Output Summary Dataset (OUTSUM) contains one observation for
      each group processed. That is, the total group, or each BY
      and/or CLASS value. The variables in the output dataset are:

         &by      = the by-variable(s) (if defined).
         &class   = the class variable(s) (if defined).
         TOTAL    = the total number of observations in this group.
         CUM_EV   = the total number of events in this group.
         CUM_CEN  = the total number of censors in this group.
         TL_MISS  = the total number of observations not included
                    because of missing values.
         MEDIAN   = the median survival time (based on the input
                    parameter MEDTYPE).

      The following variables will be added if the LOGRANK test is
      specified:

         OBSERVED = the calculated number of observed events.
         EXPECTED = the calculated number of expected events.
         RR       = the Relative Risk (this group's observed/expected /
                    group 1's observed/expected).
         CHISQ    = chi-square value.
         DF       = degrees of freedom.
         PVALUE   = pvalue (probability of a greater chi-square value).

   Notes:

      1. If you are getting a message about VPOS not being large
         enough, try cutting down on the number of title lines you
         are using. SAS does its calculations for size based on
         1 title, so having 3 or 4 titles MAY cause a problem with
         the vertical spacing.

      2. If you are plotting the output dataset yourself, remember
         that you need a symbol statement as follows to get the steps
         correct:

            symbol1 i=stepjl v=none l=1;

   Examples:

      %surv(time=fu_time,event=fu_stat,cen_vl=1);

      %surv(time=fu_time,event=fu_stat,cen_vl=1,class=arm,
            out=two,data=one,printop=4,logrank=1,cl=6,
            points='0 to 36500 by 182.5');

      %surv(time=fu_time,event=fu_stat,cen_vl=1,class=arm,by=course,
            out=two,data=one,printop=6,logrank=2,plottype=2,xdivisor=365,
            points='0 to 360 by 30, 361 to 36500 by 365');

   Programmer: Jan Offord (based on macro KMPL by Frank Harrell and
               Mike Helms)
   Date: April, 1993
  */

%MACRO SURV (TIME= ,EVENT= ,CEN_VL=0, PRINTOP=1,CLASS= ,BY= ,
             DATA=_LAST_, OUT=_SURVOUT, POINTS= ,CL=3,
             ALPHA=.05,PLOTTYPE=1,PLOTOP=1,SCALE=1,MAXTIME= ,
             XDIVISOR= ,LASERPRT= ,PVALS=Y ,LOGRANk=1,MEDTYPE=1,
             OUTSUM=_SURVSUM);

   RUN;

   /* Find the highest title number currently in use. */
   proc sql;
      reset noprint;
      select max(number) into :t from dictionary.titles;
   quit;

   /* With no titles defined T is missing (.), which %EVAL cannot     */
   /* handle, so blank it first. Then offset by 2 and cap at 10, the  */
   /* largest title number SAS supports.                              */
   %if &t=. %then %let t= ;
   %let t=%eval(&t+2);
   %if &t > 10 %then %let t=10;

   %local byword byclword lastby lastbycl dev errorflg j a b x cgrp
          cl_name indata;

   %LET errorflg = 0;
   %LET byword = ;
   %LET byclword = ;
   %LET lastby = ;
   %LET lastbycl = ;
   %let p=.;
   %LET dev = &SYSDEVIC;

   /* If POINTS was passed in single quotes, strip the quotes. */
   %let a = %index(&points,%str(%'));
   %if &a > 0 %then %do;
      %let b = %eval(%length(&points)-1);
      %let points = %substr(&points,%eval(&a+1),%eval(&b-1));
   %end;

   %if &time= %then %do;
      %put ERROR - Variable