Saturday, August 20, 2011

R - R2OpenBUGS

download the Linux source
R CMD INSTALL BRugs_0.7.1.tar.gz
download the Windows binaries, then install them as a local package (via the menu)


more information here

R - par() change the position of axes labels and tick marks

  • create an example plot using the trees data set
head(trees)
  Girth Height Volume
1   8.3     70   10.3
2   8.6     65   10.3
3   8.8     63   10.2
4  10.5     72   16.4
5  10.7     81   18.8
6  10.8     83   19.7
  • now we create the plot (girth on the x-axis, height on the y-axis, size and color of the circles indicating the volume of the trees):
my.col <- rainbow(100, start=0.3,end=0.9) ## provide a vector of colors
with(trees, symbols(Girth,Height, circles=Volume/30, bg=my.col[Volume], inches=0.3)) ## draw the example plot (symbols plot)

  • save the current par() settings so that we can restore them later with par(my.old.par)
my.old.par <- par(no.readonly=TRUE) ## keep only the settings that can be set again
  • mgp controls the position of the axis labels, the tick labels and the axis lines (measured in margin lines)
  • the default value is c(3,1,0)
  • the first entry controls the position of the axis labels, the second the position of the tick labels
par(mgp=c(-2,-1,0))
with(trees, symbols(Height, Girth, circles=Volume/30, bg=my.col[Volume], inches=0.3, main="Trees") )

  • so if we change the first two values of mgp to negative numbers, the axis labels and the tick labels move inside the plot region
  • the absolute value defines the distance, so we try this second example:
par(mgp=c(-2,-3,0))
with(trees, symbols(Height, Girth, circles=Volume/30, bg=my.col[Volume], inches=0.3, main="Trees") )

  • the third element of mgp controls the axis lines and works the same way: any value other than zero moves the axes
par(mgp=c(-2,-3,-1))
with(trees, symbols(Height, Girth, circles=Volume/30, bg=my.col[Volume], inches=0.3, main="Trees") )

my.col <- rainbow(100, start=0.3,end=0.9) ## provide a vector of colors
with(trees, symbols(Height, Girth, circles=Volume/30, bg=my.col[Volume], inches=0.3)) ## draw the example plot (symbols plot)
  • if you just want to change the direction of the tick marks, use tcl
  • tcl defines the length of the tick marks as a fraction of the height of a line of text
  • the default is -0.5
  • with a positive value the tick marks point into the plot region instead of away from it
par(mgp=c(3,1,0),tcl=0.5) ### set mgp back to default, change tcl
with(trees, symbols(Height, Girth, circles=Volume/30, bg=my.col[Volume], inches=0.3, main="Trees") )
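
Changing and restoring the settings can be sketched as follows. This is only an illustration: the values c(2, 0.5, 0) are arbitrary, and pdf(NULL) is used here (my assumption, not part of the post) to get a throw-away device so the lines also run without a screen.

```r
## open a throw-away graphics device that writes no file
pdf(NULL)

## par() invisibly returns the previous values of the parameters it changes
old.par <- par(mgp = c(2, 0.5, 0), tcl = 0.5)

with(trees, plot(Girth, Height, main = "Trees"))

## values in effect while the plot was drawn
current.mgp <- par("mgp")
current.tcl <- par("tcl")

## restore only the parameters that were changed
par(old.par)
restored.mgp <- par("mgp")
restored.tcl <- par("tcl")

invisible(dev.off())
```

Because only the changed parameters are stored in old.par, restoring them does not touch any read-only settings.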

Saturday, August 13, 2011

R - extract year, month etc.... from date

extract year, month, day… from date

  • create vector of dates in standard format
my.dates <- as.Date(format(ISOdatetime(2000:2009,1:10,1:5,0,0,0),"%Y-%m-%d"))
my.dates
 [1] "2000-01-01" "2001-02-02" "2002-03-03" "2003-04-04" "2004-05-05"
 [6] "2005-06-01" "2006-07-02" "2007-08-03" "2008-09-04" "2009-10-05"
(make sure your object is of class Date; check it with str(your.object))
  • extract years using format:
my.years <- format(my.dates,"%Y") # %y without century
my.years
 [1] "2000" "2001" "2002" "2003" "2004" "2005" "2006" "2007" "2008" "2009"
  • or months
my.months <- format(my.dates,"%m")
my.months
 [1] "01" "02" "03" "04" "05" "06" "07" "08" "09" "10"
  • or month names (in the current locale, here German)
my.months <- format(my.dates,"%b") # %B for long form
my.months
 [1] "Jan" "Feb" "Mär" "Apr" "Mai" "Jun" "Jul" "Aug" "Sep" "Okt"
  • or days (of month)
my.days <- format(my.dates,"%d")
my.days
 [1] "01" "02" "03" "04" "05" "01" "02" "03" "04" "05"
  • or weekday (in the current locale)
my.days <- format(my.dates,"%a") # %A for long form, %w (0-6) or %u (1-7) for number
my.days
 [1] "Sa" "Fr" "So" "Fr" "Mi" "Mi" "So" "Fr" "Do" "Mo"
  • or day of year
my.days <- format(my.dates,"%j")
my.days
 [1] "001" "033" "062" "094" "126" "152" "183" "215" "248" "278"
  • or week of year (Sunday as first day of the week)
my.weeks <- format(my.dates,"%U") # %W for Monday as first day of the week
my.weeks
 [1] "00" "04" "09" "13" "18" "22" "27" "30" "35" "40"
  • date in the format of the current locale
my.days <- format(my.dates,"%x")
my.days
 [1] "01.01.2000" "02.02.2001" "03.03.2002" "04.04.2003" "05.05.2004"
 [6] "01.06.2005" "02.07.2006" "03.08.2007" "04.09.2008" "05.10.2009"
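
If the parts are needed as numbers rather than strings, something along these lines may help. It is a sketch with three of the dates from above; the names date.parts and lt are made up for the illustration.

```r
my.dates <- as.Date(c("2000-01-01", "2004-05-05", "2009-10-05"))

## numeric format codes do not depend on the locale
date.parts <- data.frame(
  year  = as.integer(format(my.dates, "%Y")),
  month = as.integer(format(my.dates, "%m")),
  day   = as.integer(format(my.dates, "%d")),
  yday  = as.integer(format(my.dates, "%j")),
  wday  = as.integer(format(my.dates, "%u"))  # 1 = Monday ... 7 = Sunday
)
date.parts

## the same components via POSIXlt (year counts from 1900, mon from 0)
lt <- as.POSIXlt(my.dates)
lt$year + 1900
lt$mon + 1
```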

R - subscripting 3

subsetting with logical indices

  • elements corresponding to a TRUE in the subscript vector are selected, those corresponding to a FALSE are excluded, NAs in the logical subscript vector produce NAs
my.vector <- letters[1:10] # vector of a...j
my.vector
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
my.index <- rep(c(TRUE,FALSE),5) # the subscript vector (of length 10)
my.index
 [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
my.vector[my.index]
[1] "a" "c" "e" "g" "i"
  • if the subscript vector has fewer elements than the subscripted vector, it is recycled as many times as necessary (without a warning, even if the vector length is not a multiple of the length of the subscript vector)
short.index <- c(T,F) # T is the same as TRUE, F as FALSE
short.index
[1]  TRUE FALSE
my.vector[short.index] # has the same result as above
[1] "a" "c" "e" "g" "i"
  • if the subscript vector is longer than the subscripted vector, an NA is produced for every TRUE or NA entry beyond its end
short.vector <- letters[1:3]
short.vector
[1] "a" "b" "c"
my.index <- rep(c(T,F,NA),2)
my.index
[1]  TRUE FALSE    NA  TRUE FALSE    NA
short.vector[my.index]
[1] "a" NA  NA  NA
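
A short sketch of recycling and NA handling with logical subscripts. The object names are invented for the example, and which() is one possible way to get rid of the NAs, not the only one.

```r
my.vector <- letters[1:10]

## a length-3 index is recycled over the 10 elements
idx3 <- c(TRUE, FALSE, FALSE)
every.third <- my.vector[idx3]        # positions 1, 4, 7, 10

## an NA in the index yields an NA in the result ...
cond <- c(TRUE, NA, FALSE, TRUE, rep(FALSE, 6))
with.na <- my.vector[cond]

## ... which() keeps only the positions that are TRUE and drops the NA
without.na <- my.vector[which(cond)]
```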

R - subscripting 2

subsetting with character indices - vectors and lists

  • extract elements of named vectors, lists
named.x <- 1:10  # produce a vector containing the numbers 1...10
named.x   #  has no names yet
 [1]  1  2  3  4  5  6  7  8  9 10
  • if the object has no names, NAs are produced
named.x["one"]
[1] NA
  • name the vector:
names(named.x) <- c("one","two","three","four","five","six","seven","eight","nine","ten")
named.x
  one   two three  four  five   six seven eight  nine   ten 
    1     2     3     4     5     6     7     8     9    10
  • now it works
named.x["one"]
one 
  1
  • if names are duplicated only the first match is returned (the value with the lowest index)
names(named.x)<-rep(c("one","two","three","four","five"),rep(2,5))
named.x
  one   one   two   two three three  four  four  five  five 
    1     2     3     4     5     6     7     8     9    10
named.x[c("one","two","three")]
  one   two three 
    1     3     5
  • NA is returned if a name does not match any element
named.x["twenty"]
<NA> 
  NA
  • NA is returned if a subscript is character NA
named.x[c("one",NA,"three")]
  one  <NA> three 
    1    NA     5
  • to find all occurrences one can use %in% and then subset with the resulting logical vector (this can be done in one step)
named.x[names(named.x) %in% c("one","two")]
one one two two 
  1   2   3   4
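
The difference between "first match" and "all matches" can be sketched like this (wanted, first.hits, all.pos and all.hits are names chosen for the illustration):

```r
named.x <- 1:10
names(named.x) <- rep(c("one", "two", "three", "four", "five"), each = 2)

wanted <- c("one", "two")

## match() returns the first position of each name,
## which is what a character subscript picks
first.hits <- named.x[match(wanted, names(named.x))]

## %in% gives a logical vector, which() turns it into positions
all.pos  <- which(names(named.x) %in% wanted)
all.hits <- named.x[all.pos]
```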

Tuesday, August 2, 2011

Monday, August 1, 2011

R - subscripting

subsetting

  • creating a list for examples
myl <- list(one=10,two=20,three=30)
myi <- "one"
  • subsetting is carried out with three different operators:
    • dollar: $
    • square bracket: [
    • double square bracket: [[
  • these are generic functions
  • there are four basic types of subscript indices
    • positive integers
    • negative integers
    • logical vectors
    • character vectors
  • the first (left-most) index moves the fastest, the right-most index is the slowest (so a matrix is filled column by column)
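
The column-by-column layout mentioned in the last point can be checked with a small matrix (a sketch, the object m exists only for this example):

```r
## values are laid out column by column: the row index changes fastest
m <- matrix(1:6, nrow = 2)
m
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

## a matrix is a vector with a dim attribute, so a single subscript
## walks through it in the same column-major order
m[3]       # same element as m[1, 2]
m[2, 3]    # last element, same as m[6]
```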

the square bracket

  • the returned value is of the same type as the object it is applied to (here: a list)
myl[c(1,2)]
10 20
  • evaluates its second argument
myl[myi]
10
  • does not use partial matching when extracting named elements
myl["on"]
  • cannot be used for subscripting environments

the double square bracket

  • extract a single value
myl[[1]]
10
  • evaluates its second argument
myl[[myi]]
10
  • does not use partial matching when extracting named elements (unless exact=FALSE is given)
myl[["on"]]

the dollar

  • extract a single value
myl$two
20
  • does not evaluate its second argument
myl$myi
  • uses partial matching when extracting named elements
myl$on
10
  • cannot be used for subscripting atomic vectors
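
To compare the three operators side by side, a sketch using the example list from above (the names on the left-hand side are invented for the comparison):

```r
myl <- list(one = 10, two = 20, three = 30)
myi <- "one"

sub.single <- myl[myi]      # still a list (of length 1)
sub.double <- myl[[myi]]    # the element itself: 10

exact.miss  <- myl[["on"]]                 # NULL, no partial matching
partial.hit <- myl[["on", exact = FALSE]]  # 10, partial matching requested
dollar.hit  <- myl$on                      # 10, $ matches partially
dollar.miss <- myl$myi                     # NULL, looks for an element named "myi"
```

The last line shows why $ is awkward inside functions: the name after $ is taken literally and the variable myi is never looked up.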

subsetting with positive indices

  • vector of positive integer values to indicate the set to be extracted
myv <- 1:10
myi <- c(2,4,7)
myv[myi]
[1] 2 4 7
myv[c(2,4,7)]
[1] 2 4 7
  • if a subscript is larger than the length of the vector, NAs are produced
myv[c(1,11,3)]
[1]  1 NA  3
  • an NA as subscript also produces an NA
myv[c(1,11,NA)]
[1]  1 NA NA
  • zero as subscript will be ignored
myv[c(1,2,0,3)]
[1] 1 2 3
  • if the subscript vector has the length zero, the result is also of length zero
myi <- numeric()
myv[myi]
integer(0)
  • example: select the elements with even-numbered subscripts
v <- rnorm(20)
myi <- seq(2,length(v),by=2)
myi
 [1]  2  4  6  8 10 12 14 16 18 20
v[myi]
 [1]  0.35380115 -0.02156840 -1.51804278  0.38278037  0.03867578 -1.25803279
 [7]  0.62863255  0.07111270 -0.73416837  0.18966622
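
The same selection with fixed numbers instead of random ones, so the result is reproducible. This is a sketch; the recycled logical subscript is an alternative I added, not something from the example above.

```r
v <- c(5, 10, 15, 20, 25, 30)

## positions 2, 4, 6 built explicitly
even.pos <- seq(2, length(v), by = 2)
by.position <- v[even.pos]

## the same selection with a recycled logical subscript
by.logical <- v[c(FALSE, TRUE)]

## repeating a position repeats the element
repeated <- v[c(1, 1, 3)]
```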

subsetting with negative indices

  • subscript vector consists of negative integer values indicating the indices not to be extracted
  • zero as subscript will be ignored
myv[-c(1,2,0,3)]
[1]  4  5  6  7  8  9 10
  • NAs are not allowed
myv[-c(1,NA,3)]
Error in myv[-c(1, NA, 3)] : 
  only 0's may be mixed with negative subscripts
  • zero length subscript produces zero length result
myv <- 1:10 
myi <- numeric()
myv[-myi]
 integer(0)
  • positive and negative subscripts cannot be mixed
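
The last two points lead to a well-known pitfall when negative subscripts come from which(). A sketch, with tryCatch() used only to make the error visible as a value:

```r
myv <- 1:10

## mixing signs is an error; tryCatch() turns it into a value we can inspect
mixed <- tryCatch(myv[c(-1, 2)], error = function(e) "error")

## pitfall: nothing is larger than 100, so which() returns integer(0);
## negating it is still a zero-length subscript and selects nothing
drop.pos <- which(myv > 100)
all.gone <- myv[-drop.pos]       # integer(0), not the full vector

## a logical subscript avoids the problem
kept <- myv[!(myv > 100)]        # all ten elements
```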

to be continued... (most parts taken from Gentleman, R Programming for Bioinformatics) …