Tuesday, September 30, 2014

R - confidence intervals for risk ratio, odds ratio, attributable risk (resp. absolute risk reduction), cond. MLE odds ratio

1 R - confidence intervals for risk ratio, odds ratio, attributable risk (resp. absolute risk reduction), cond. MLE odds ratio

  • generate the data, and table them

obese <- sample(c(T,F),size=1000,replace=T)
mi <- factor((obese + sample(c(-1,0,1),prob = c(0.3,0.35,0.35),size=1000,replace=T)) > 0,labels=c("myocardinfarct","non"))
obese <- factor(obese,labels=c("obese","non-obese"))

table(obese,mi)

mi
obese       myocardinfarct non
  obese                326 187
  non-obese            144 343

  • use the twoby2 function to get the ratios incl. the confidence intervals
  • the risk difference (resp. attributable risk (AR), absolute risk reduction (ARR)) is also calculated

require(Epi)
twoby2(obese,mi)

2 by 2 table analysis: 
------------------------------------------------------ 
Outcome   : myocardinfarct 
Comparing : obese vs. non-obese 

          myocardinfarct non    P(myocardinfarct) 95% conf. interval
obese                326 187               0.6355    0.5929   0.6760
non-obese            144 343               0.2957    0.2568   0.3378

                                   95% conf. interval
             Relative Risk: 2.1491    1.8462   2.5018
         Sample Odds Ratio: 4.1525    3.1859   5.4122
Conditional MLE Odds Ratio: 4.1462    3.1584   5.4617
    Probability difference: 0.3398    0.2800   0.3959

             Exact P-value: 0 
        Asymptotic P-value: 0 
------------------------------------------------------

  • so we got a risk of myocard. infarction of 0.6507 for obese and 0.3006 for non-obese
  • dividing the first by the second (0.6507/0.3006) gives the risk ratio
  • the odds are not given for the groups, calculated by hand it would be

(326/187)/(144/343)

[1] 4.152481

  • which you see in the output of twoby2() (in addition you find the conditional MLE odds)
  • the probability difference (as stated above also known as attributable risk (AR) and absolute risk reduction (ARR)) is simply the difference between the two risks
confidence-intervals-for-odds-ratio paired case

Saturday, September 27, 2014

Flow Charts with R and the Gmisc package

1 flow charts

1.1 with the Gmisc package

  • first create a transition matrix tm, i.e. the rows represent start state the columns represent end state, so tm[1,1] is the number of subjects state in state 1 and end in state 1, so tm[1,3] is the number of subjects state in state 1 and end in state 3


tm <- matrix(NA,nrow=3,ncol=3)
tm[1,] <- 200*c(.5,.25,.25)
tm[2,] <- 800*c(.75,.1,.15)
tm[3,] <- 600*c(0,.2,.8)

tm

     [,1] [,2] [,3]
[1,]  100   50   50
[2,]  600   80  120
[3,]    0  120  480

  • a more convenient way is to tabulate start state and end state


xx <- data.frame(id=1:1000,
                 start.state=sample(1:4,size=1000,replace=T,prob=c(0.2,0.5,0.1,0.2)),
                 end.state=sample(1:4,size=1000,replace=T,prob=c(0.1,0.3,0.2,0.4)))

tm2 <- table(xx$start.state,xx$end.state)
tm2

 
    1   2   3   4
1  24  67  37  89
2  45 143  87 196
3  13  30  24  44
4  25  59  39  78

  • than a simple call to transitionPlot() does the magic
  • box_txt provides the states' names

plot.new() ## call plot.new because transitionPlot does not
transitionPlot(tm,box_txt = 1:3)






plot.new() ## call plot.new because transitionPlot does not
transitionPlot(tm2,box_txt = paste("state",1:4))




  • of course you can change the appearance of the plot, e.g. color of the boxes, appearance of the arrows
  • in a similar way text color can be changed


plot.new() ## call plot.new because transitionPlot does not

transitionPlot(tm2,box_txt = c(paste("state",1:4)),
               type_of_arrow="gradient",
               fill_start_box=c("midnightblue","darkred","darkgreen","maroon4"),
               fill_end_box=c("midnightblue","darkred","darkgreen","maroon4"))



R - create an Excel file from data frame and color the cells given by some index

1.1 create an Excel file from data frame and color the cells given by some index

  • data the following function takes an data.frame containing the data as argument
  • the second argument is also a data frame containing the indices of the cells which should be coloured, row and column are assumed the be the names of these columns (can be changed through the resp arguments)
  • the filename is set through filename (default: date+excelfile.xlsx) and
  • the fill color through color (default: red - for a list of colors type XLC after the XLConnect package is loaded)


require(XLConnect)
create.excel <- function(data,index,color=53,filename=NULL,row="row",column="column"){
    names(index)[names(index)==row] <- "row"
    names(index)[names(index)==column] <- "column"
    if(is.null(filename))filename <- paste0(gsub("-","",as.character(Sys.Date())),"excelfile.xlsx")
    wb <- loadWorkbook(filename,create=T)
    createSheet(wb,"conspvalues")
    writeWorksheet(wb,data,"conspvalues")
    style <- createCellStyle(wb,name="style")
    setFillPattern(style,fill=XLC$"FILL.SOLID_FOREGROUND")
    setFillForegroundColor(style,color=10)
    setCellStyle(wb,sheet = "conspvalues",
                         row=index$row + 1,
                         col=index$column,
                         cellstyle = style)
    return(saveWorkbook(wb))
}

## example
cells <- data.frame(x=sample(nrow(mtcars)),y=sample(ncol(mtcars),size=nrow(mtcars),replace = T))
create.excel(mtcars,cells,row="x",column="y")