r

1. Syntax
- 1.1. General startup commands
- 1.2. Punctuation in commands
2. Help function
3. Data structures
4. Loops
5. Arithmetic
6. Functions
7. SEQUENCE
8. LOGICAL
9. Graphs
10. Function
- 10.1. Sample size
- 10.2. Randomization

1 Syntax

1.1 General startup commands

setwd("/home/koen")
r <- read.csv(file = "test.csv")

q()	quit
sink("output.lis")	all output goes to this file
source("commands.R")	runs this script
objects()	shows all objects in the workspace
jpeg("naam.jpg")	saves the plots that are created from now on as jpg
dev.off()	closes the graphics device (stops writing to the file)
install.packages("xlsx")
library(xlsx)

1.2 Punctuation in commands

#	comment
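;	new command (same as newline)
:	sequence
' and "	both are allowed for strings
\\	\
\"	"
\n	newline (see ?Quotes)
\t	tab
\b	backspace
c()	creates a vector/list
->	assignment to the right: value -> name
<-	assignment: name <- value (same as =)
a[,,]	stands for the entire array
list[[1]]	returns the first object in list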
NA	missing
NaN	not a number, 0/0
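
The punctuation above can be tried out with a short sketch like the following; the variable names (x, y, s1, s2, z) are illustrative only.

```r
x <- c(1, 2, NA)      # c() combines values into a vector; NA marks a missing value
c(4, 5) -> y          # rightward assignment, same effect as y <- c(4, 5)

s1 <- 'single'; s2 <- "double"   # `;` separates two commands on one line

# Escape sequences inside a string (see ?Quotes): tab, newline, quote, backslash
cat("tab:\there\nquote: \"x\" backslash: \\\n")

z <- 0 / 0            # NaN: not a number
print(z)
```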

2 Help function

help(solve)
?solve
help("[[")
help.start()
??solve
?help
?help.search
example(topic)

3 Data structures

rm	deletes an object
attach	adds something to the search path, e.g. a data frame
attributes(object)
attr(z, "dim")	dimensions of the array
attr(z, "dim") <- c(10,10)	sets the dimensions (z must have 100 elements)
dimension vector	is a vector of non-negative integers
types	numeric, character, integer, logical, complex
winter	same as print(winter)
class(winter)	what kind of object is it?
unclass(winter)
View(winter)
modes	numeric, complex, logical, character and raw
numeric	amalgam of two distinct modes, namely integer and double precision
mode(object)
length(object)
length(alpha) <- 3
nrow
head(x, -1)	drops the last row (or last element)
as.matrix(x[,1])	turns column 1 of a table into a matrix
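
A minimal sketch of the attribute and length functions listed above, assuming a plain integer vector z of 100 elements (the names z and alpha are only examples):

```r
z <- 1:100                      # plain integer vector of length 100
attr(z, "dim") <- c(10, 10)     # same effect as dim(z) <- c(10, 10)
attributes(z)                   # now lists the dim attribute
mode(z)                         # "numeric"
typeof(z)                       # "integer": the storage type behind the numeric mode
length(z)                       # still 100

alpha <- 1:10
length(alpha) <- 3              # truncates alpha to 1 2 3
head(alpha, -1)                 # drops the last element: 1 2
```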

3.0.1 NAMES

names(fruit) <- c("orange", "banana", "apple", "peach")
lunch <- fruit[c("apple","orange")]
rownames(x) <- r[,2] uses column 2 of r as the row names


3.0.2 VECTOR

ordered collection of elements of the same mode
calculations with vectors yield a vector again
all vectors in an expression are extended to the length of the longest one by (possibly fractional) recycling
creating a vector
- empty
  - e <- numeric() makes e an empty vector structure of mode numeric.
  - e = c() ; less clean (c() returns NULL)
- x <- c(10.4, 5.6, 3.1, 6.4, 21.7) creates a vector
- assign("x", c(10.4, 5.6, 3.1, 6.4, 21.7)) does the same
- c(10.4, 5.6, 3.1, 6.4, 21.7) -> x
append: d = c(d, a)
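
The recycling rule and the append idiom can be illustrated with a small sketch; the vectors a, b and d below are made-up examples.

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20)
a + b        # b is recycled to 10 20 10 20, giving 11 22 13 24
a * 2        # the single value 2 is recycled four times: 2 4 6 8
# When the longer length is not a multiple of the shorter one
# (fractional recycling), R still computes the result but issues a warning.

d <- numeric()       # empty numeric vector
d <- c(d, a)         # append a to d
length(d)            # 4
```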

3.0.3 LIST: ordered sequences of objects which individually can be of any mode

list[[1]] returns the first object in list
list components may have names: list$coe returns the component coefficients from list ($ allows partial matching of names)
Lst <- list(name="Fred", wife="Mary", no.children=3, child.ages=c(4,7,9))
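
As a quick sketch of list access, reusing the Lst example (repeated here so the block runs on its own):

```r
Lst <- list(name = "Fred", wife = "Mary", no.children = 3, child.ages = c(4, 7, 9))
Lst[[1]]            # first component: "Fred"
Lst$name            # the same component, selected by name
Lst$child.ages[2]   # 7
Lst$no              # partial matching finds no.children: 3
Lst[1]              # single brackets return a sublist, not the component itself
```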

3.0.4 ARRAY: multiply subscripted collection of data entries

Y,X	index order: row (Y) first, then column (X)
t(x) transposes x
the first subscript moves fastest and the last subscript slowest
if the dimension vector of a is c(3,4,2), then there are 3 x 4 x 2 = 24 entries in a and the data vector holds them in the order:
a[1,1,1], a[2,1,1], …, a[2,4,2], a[3,4,2]

difference from a vector

a vector can be used as an array if it has a dimension vector as its dim attribute, e.g. for a vector z of 1500 elements: dim(z) <- c(3,5,100)
dimension vector: a vector of non-negative integers
an array can be one-dimensional: such arrays are usually treated like vectors (including when printing), but the exceptions can cause confusion
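
The storage order can be checked with a small sketch: filling a 3 x 4 x 2 array with the numbers 1 to 24 makes each entry equal to its position in the data vector (the array a is only an example).

```r
a <- array(1:24, dim = c(3, 4, 2))
a[2, 1, 1]     # 2: the first subscript moves fastest
a[1, 2, 1]     # 4: column 2 starts after the 3 entries of column 1
a[1, 1, 2]     # 13: the second slice starts after 3 * 4 = 12 entries
a[3, 4, 2]     # 24: the last entry
dim(t(a[, , 1]))   # 4 3: transposing the first 3 x 4 slice
```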

3.0.5 matrix: 2-dimensional array

can be created from a vector with dim(), or from several vectors with cbind()/rbind()
- dim(x) <- c(2,3)
- cbind(c(1,2,3),c(4,5,6))
- rbind(c(1,2,3),c(4,5,6))
N = c();
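
A short sketch comparing the three ways of building a matrix listed above; the names x, cb and rb are illustrative.

```r
x <- 1:6
dim(x) <- c(2, 3)                     # 2 rows, 3 columns, filled column by column
x[1, 2]                               # 3

cb <- cbind(c(1, 2, 3), c(4, 5, 6))   # the vectors become columns: 3 x 2
rb <- rbind(c(1, 2, 3), c(4, 5, 6))   # the vectors become rows: 2 x 3
dim(cb)                               # 3 2
dim(rb)                               # 2 3
all(t(cb) == rb)                      # TRUE: one is the transpose of the other
```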

3.0.6 data frame

baskets.df <- as.data.frame(t(baskets.team))
printed output looks the same as for a matrix, but the structure differs:
str(baskets.df)

3.0.7 INDEX

x[1:10]
y <- x[-(1:5)]
y <- x[!is.na(x)]
x[is.na(x)] <- 0
y[y < 0] <- -y[y < 0] has the same effect as y <- abs(y)
if alpha is an object of length 10, then alpha <- alpha[2 * 1:5] makes it an object of length 5 consisting of just the former components with even index
(x+1)[(!is.na(x)) & x>0] -> z index brackets may also be applied to the result of an expression to select part of that result
fruit <- c(5, 10, 1, 20)
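
The indexing forms above, put together in one runnable sketch. The data in x are invented for illustration; the fruit example is repeated from the NAMES section.

```r
x <- c(3, NA, -2, 8, NA, -5)
x[1:3]                 # first three elements
x[-(1:2)]              # everything except the first two
y <- x[!is.na(x)]      # drop missing values: 3 -2 8 -5
y[y < 0] <- -y[y < 0]  # same result as y <- abs(y)
z <- (x + 1)[(!is.na(x)) & x > 0]   # index the result of an expression: 4 9

fruit <- c(5, 10, 1, 20)
names(fruit) <- c("orange", "banana", "apple", "peach")
lunch <- fruit[c("apple", "orange")]   # select by name: apple = 1, orange = 5
```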

3.0.8 STRINGS

paste(c("X","Y"), 1:10, sep="") concatenates strings
grep("deel", string-or-vector) returns the indices of the elements that match the pattern
digits <- as.character(z)
d <- as.integer(digits)
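
A small sketch of the string functions above; labs is a made-up name and z is assumed to be the digit vector 0:9.

```r
labs <- paste(c("X", "Y"), 1:10, sep = "")   # "X1" "Y2" "X3" ... "Y10"
grep("1", labs)                # indices of elements containing "1": 1 10
grep("1", labs, value = TRUE)  # the matching strings: "X1" "Y10"

z <- 0:9
digits <- as.character(z)      # "0" "1" ... "9"
d <- as.integer(digits)        # back to the integers 0..9
```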

4 Loops

for (i in seq_len(lengte)) { j[i,2] <- cheftest(j[i,1]) }
for (val in a) { b <- d(val) }
- a is a vector/matrix

N <- c()                         # collects the required sample sizes
delta <- seq(0.9, 2, by = 0.1)   # effect sizes to evaluate
# required n per group for a two-sample t-test at the given delta
b <- function(d) {
  power.t.test(n = NULL, delta = d, sd = 3.57, sig.level = 0.05, power = 0.8, type = "two.sample", alternative = "two.sided", strict = FALSE, tol = .Machine$double.eps^0.25)$n
}
for (val in delta) { N <- c(N, b(val)) }
plot(delta, N)
text(delta, N, labels = paste(round(N), ">", delta), cex = 0.6, pos = 3)
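
As an alternative sketch, the same sample sizes can be computed without growing a vector inside a loop by using sapply(). The helper name n_for is hypothetical, and the arguments left out rely on the defaults of power.t.test (two-sample, two-sided).

```r
delta <- seq(0.9, 2, by = 0.1)

# Required n per group for a given effect size d (sd and power as in the notes)
n_for <- function(d) {
  power.t.test(delta = d, sd = 3.57, sig.level = 0.05, power = 0.8)$n
}

N <- sapply(delta, n_for)   # one sample size per value of delta
```

sapply() returns a vector of the same length as delta, so it can be passed to plot(delta, N) in the same way.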

5 Arithmetic

+, -, *, / and ^
log, exp, sin, cos, tan, sqrt
max and min select the largest and smallest elements of a vector
pmax and pmin (when several vectors are passed to max and min, they return the single largest and smallest value over all of them; pmax and pmin instead work element by element and return a vector)
range returns a vector of length two, namely c(min(x), max(x))
length(x) is the number of elements in x
sum(x) gives the total of the elements in x
prod(x) their product
mean(x) calculates the sample mean, which is the same as sum(x)/length(x)
var(x) gives sum((x-mean(x))^2)/(length(x)-1)
sort(x) returns x in increasing order
more flexible sorting facilities are available (see order() or sort.list())
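
The difference between max and pmax, and the formula behind var, can be verified with a few lines; u and v are example vectors.

```r
u <- c(1, 7, 3)
v <- c(4, 2, 9)
max(u, v)      # 9: one value over all arguments
pmax(u, v)     # 4 7 9: element-wise maximum
range(u)       # 1 7
all.equal(var(u), sum((u - mean(u))^2) / (length(u) - 1))   # TRUE
order(u)       # 1 3 2: the permutation that sorts u
sort(u)        # 1 3 7
```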

6 Functions

arguments may be given by name, in which case their order does not matter: seq(from=1, to=30)
convenient because arguments that have a default value do not always have to be given
b <- function(c) { return (d); }
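
A minimal sketch of named arguments and defaults. The function scale_vec and its parameters center and digits are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: standardise a vector, optionally without centring
scale_vec <- function(x, center = TRUE, digits = 2) {
  if (center) x <- x - mean(x)
  round(x / sd(x), digits)
}

scale_vec(c(1, 2, 3))                    # defaults used: -1 0 1
scale_vec(digits = 1, x = c(1, 2, 3))    # named arguments in any order
scale_vec(c(1, 2, 3), center = FALSE)    # override one default: 1 2 3
```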

7 SEQUENCE

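the colon operator has high priority within an expression: it binds tighter than the binary operators *, /, + and - (but not ^)

z <- 0:9
1:30 is the vector c(1, 2, …, 29, 30)
seq(2,10) is 2:10
30:1 generates the sequence backwards
2*1:15 is the vector c(2, 4, …, 28, 30).
rep(x, times=5)	repeats the whole of x five times
rep(x, each=5)	repeats each element of x five times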
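
The priority of the colon operator is easiest to see by comparing expressions with and without parentheses; n is an example value.

```r
n <- 10
1:n-1        # 0 1 ... 9, parsed as (1:n) - 1
1:(n-1)      # 1 2 ... 9
2*1:5        # 2 4 6 8 10, parsed as 2 * (1:5)
seq(2, 10, by = 2)        # 2 4 6 8 10
rep(c(1, 2), times = 3)   # 1 2 1 2 1 2
rep(c(1, 2), each = 3)    # 1 1 1 2 2 2
```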

8 LOGICAL

TRUE, FALSE, and NA
<, <=, >, >=, == for exact equality and != for inequality
c1 & c2 is the intersection ("and"), c1 | c2 is the union ("or"), and !c1 is the negation of c1
is.na(xx) is TRUE both for NA and NaN values
to differentiate these, is.nan(xx) is only TRUE for NaNs
x == NA is a vector of the same length as x all of whose values are NA
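
A short sketch of how NA and NaN behave in logical expressions; xx is an example vector.

```r
xx <- c(1, NA, 0/0, 4)
xx == NA                  # NA NA NA NA: a comparison with NA is never TRUE or FALSE
is.na(xx)                 # FALSE TRUE TRUE FALSE
is.nan(xx)                # FALSE FALSE TRUE FALSE
xx[!is.na(xx)] > 2        # FALSE TRUE
(xx > 0) & !is.na(xx)     # TRUE FALSE FALSE TRUE
```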

9 Graphs

barplot

install.packages("xlsx")
library(xlsx)
list.files("Desktop/tmp/", "CALORI*")
dat <- read.xlsx("~/Desktop/tmp/CALORI_1.xlsx", sheetName="Sheet2")
dat$Form.of.cancer = as.character(dat$Form.of.cancer)
SAME AS: dat[,2] = as.character(dat[,2])
dat2 <- read.xlsx("~/Desktop/tmp/CALORI_2.xlsx", sheetName="Sheet2")
dat[2,] = dat2[1,]
dat[,2] = as.factor(dat[,2])

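Fibonacci <- function(n) {
  x <- c(0, 1)   # assumption: the sequence starts 0, 1 and the first n numbers are returned
  while (length(x) < n) x <- c(x, sum(tail(x, 2)))
  return(x[seq_len(n)])
}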

wel = read.csv("ttoetsenvoorRienGroepAspergillus.csv")
geen = read.csv("ttoetsenvoorRienGroepGeenAspergillus.csv")
hist(c(wel$X, geen$X))
T-TESTS, AGE and FEV1
t.test(wel$X, geen$X)

mydata = (cbind(c(wel$X.1, geen$X.1), c(rep(TRUE,7),rep(FALSE,20))))
table(mydata[,2], mydata[,1])
chisq.test(table(mydata[,2], mydata[,1]))

CV.R

library(Hmisc)
setwd("C:\\Users\\Koen\\Desktop\\Nieuwe werkmap\\Werkmap\\projecten\\eNose\\analyses nav stage Maayke\\CF-analyse")
source("CV.R")

10 Function

10.1 Sample size

tn <- function(s, d) {
power.t.test(n = NULL, delta = d, sd = s, sig.level = 0.05, power = 0.8, type = "two.sample", alternative = "two.sided", strict = FALSE, tol = .Machine$double.eps^0.25)$n
}
td <- function(s, n1) {
power.t.test(n = n1, delta = NULL, sd = s, sig.level = 0.05, power = 0.8, type = "two.sample", alternative = "two.sided", strict = FALSE, tol = .Machine$double.eps^0.25)$delta
}

power.t.test(n = NULL, delta = 2, sd = 3.57, sig.level = 0.05, power = 0.8, type = "two.sample", alternative = "two.sided", strict = FALSE, tol = .Machine$double.eps^0.25)

library(pwr)
pwr.t.test(n = , d = , sig.level = , power = , type = c("two.sample", "one.sample", "paired"))
pwr.2p.test	two proportions (equal n)
pwr.2p2n.test	two proportions (unequal n)
pwr.anova.test	balanced one way ANOVA
pwr.chisq.test	chi-square test
pwr.f2.test	general linear model
pwr.p.test	proportion (one sample)
pwr.r.test	correlation
pwr.t.test	t-tests (one sample, 2 sample, paired)
pwr.t2n.test	t-test (two samples with unequal n)

10.2 Randomization

install.packages("randomizeR")
library(randomizeR)
params = crPar(7)
rs=genSeq(params)
s = rs$seed
rs$groups[rs$M+1]
params = crPar(8)
rs=genSeq(params,1,s)
rs$groups[rs$M^2+1]
