Send your solutions by email to the lecturer (firstname.lastname@helsinki.fi) by Mon 7 Feb at 10.00 at the latest, using R1 as the subject of your message. Also include your name and student number in the message.

In most cases, your solutions should be a few lines of R code. Please send the solutions as plain text (do not format your code with a word processing program).

Either

- start R in the microcomputer room C128 (instructions)
- or install R on your own computer (instructions).

Create a directory (folder) for the files that you need during the course (instructions). You do not need to document your solution to this exercise, even though this is the most important R exercise in the course. You will automatically get credit for this exercise if you send solutions to any of the other R exercises.

R has a function called `pnorm`.
What are the names of its formal arguments?
Which values are bound to the formal arguments in the following call?

    z <- 3
    pnorm(1, lower = (1 < 2), 2, mean = z)

Suggested solution:

To find out the names of the formal arguments, give the command `args(pnorm)` or read the help text by giving the command `?pnorm`.
The formal arguments and their default values are as follows.

    > args(pnorm)
    function (q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
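If the names are needed as data rather than read off the printed signature, base R's `formals()` can be used. The following is a minimal sketch; it relies on `pnorm` being an ordinary R closure, and the name `arg_names` is only for illustration.

```r
# Names of the formal arguments of pnorm, as a character vector
arg_names <- names(formals(pnorm))
print(arg_names)

# Default value of a single formal argument
print(formals(pnorm)$sd)
```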

In the given function call, we have two named actual arguments and two unnamed actual arguments. The named arguments are bound first; then the unnamed arguments are bound, left to right, to the remaining unbound formal arguments. If a formal argument remains unbound and it has a default value, it receives the default value. The name of the formal argument `lower.tail` has been abbreviated to `lower` in the function call, which R accepts because the abbreviation matches only one formal argument. Hence the arguments are bound in the following order to the following values.

Formal argument | Value | Explanation |
---|---|---|
lower.tail | TRUE | named argument; value of (1 < 2) is TRUE |
mean | 3 | named argument; value of variable z is 3 |
q | 1 | first unbound formal |
sd | 2 | second unbound formal |
log.p | FALSE | default value |

This function call calculates the probability P(X <= 1) when the random variable X is normally distributed with mean 3 and standard deviation 2 (that is, variance 4).
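The binding can also be checked in R itself. The sketch below uses base R's `match.call()` to expand the call into full argument names, and compares the original call with a fully named one; the variable names `matched`, `p1` and `p2` are made up for the illustration.

```r
z <- 3

# Expand the call: the abbreviated name is completed, and the unnamed
# arguments are given the names of the formals they are bound to
matched <- match.call(pnorm, quote(pnorm(1, lower = (1 < 2), 2, mean = z)))
print(matched)

# The original call and a fully named call give the same probability
p1 <- pnorm(1, lower = (1 < 2), 2, mean = z)
p2 <- pnorm(q = 1, mean = 3, sd = 2, lower.tail = TRUE, log.p = FALSE)
print(c(p1, p2))
```

Since (1 - 3) / 2 = -1, both values equal the standard normal probability `pnorm(-1)`, which is roughly 0.159.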

First create a vector `x` which contains 100 random values drawn from the Poisson distribution with mean 1.1.

x <- rpois(100, 1.1)

Formulate your answers to the following questions so that they work not only for your particular sample but for any random sample drawn as above.

- How do you extract a vector which contains the entries of `x` at the positions 2, 3 and 20?
- How do you create a logical vector `b` whose i'th entry is TRUE if and only if the i'th entry of `x` is zero?
- How do you find out how many entries of `x` have the value zero?
- How do you select the non-zero entries of `x`?

Suggested solutions: (it is possible to solve this exercise in many ways).

    # 1:
    x[c(2, 3, 20)]
    c(x[2], x[3], x[20])  # a longer way

    # 2:
    b <- x == 0

    # 3:
    sum(b)
    sum(x == 0)  # alternative

    # 4:
    # a)
    b <- x == 0
    x[!b]
    # b)
    x[x != 0]
    # c)
    x[!(x == 0)]
    # and so on ...
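To see that the alternatives for the last question agree, they can be compared on a freshly drawn sample. This is only a sanity-check sketch; the seed is arbitrary and the names `nonzero_a`, `nonzero_b` and `nonzero_c` are made up for the illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x <- rpois(100, 1.1)

b <- x == 0

# Three ways of selecting the non-zero entries
nonzero_a <- x[!b]
nonzero_b <- x[x != 0]
nonzero_c <- x[!(x == 0)]

# All three selections are the same, and the counts add up
print(identical(nonzero_a, nonzero_b) && identical(nonzero_b, nonzero_c))
print(sum(b) + length(nonzero_a) == length(x))
```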

Now we want to find the indices of those entries of vector `x` (generated in the previous exercise) which are greater than or equal to 2.
One way of doing this is the following.

which(x >= 2)

Now you should try to achieve the same result without using the `which` function.
Instead, you should use a suitable logical vector to index the vector `inds`, which you generate as follows.

inds <- 1:length(x)

Suggested solution:

    inds <- 1:length(x)
    inds[x >= 2]
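As a check, the result can be compared with the output of `which()`. The sketch below also uses `seq_along(x)`, a base R alternative to `1:length(x)` that behaves sensibly for an empty vector (where `1:length(x)` would give `c(1, 0)`); this alternative is not part of the exercise, and the seed and the name `res` are arbitrary.

```r
set.seed(2)  # arbitrary seed
x <- rpois(100, 1.1)

inds <- seq_along(x)  # same as 1:length(x) when x is non-empty
res <- inds[x >= 2]

# Same indices as which(), as long as x contains no missing values
print(identical(res, which(x >= 2)))
```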

The following lines create a vector `x` which contains a random number of missing values (NAs).

    n <- 100
    x <- rnorm(n)
    x[rbinom(n, 1, 0.1) == 1] <- NA

- How do you find out how many missing values there are in `x`?
- How do you replace all the missing values with zeros?

Suggested solution:

    # number of missing values:
    sum(is.na(x))

    # replacing missing values with zeros:
    x[is.na(x)] <- 0
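Note that a comparison such as `x == NA` does not find the missing values, since any comparison with `NA` is itself `NA`; this is why `is.na()` is used. A small sketch with a hand-made vector (the values and the name `n_missing` are arbitrary):

```r
x <- c(1.5, NA, -0.3, NA, 2)

# Comparing with NA yields NA in every position
print(x == NA)

# is.na() gives a proper logical vector, which can be summed
n_missing <- sum(is.na(x))
print(n_missing)

# Replace the missing values with zeros
x[is.na(x)] <- 0
print(x)
```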

Suppose that you want to plot data which resembles the data we generate as follows.

    x <- runif(100, -pi, pi)
    y <- sin(x)

Here we first sample 100 values uniformly on the interval (-pi, pi) and then calculate their sines.

Try the command

plot(x, y, type = 'l')

(there is a lower case L inside the quotation marks).
The result is a line plot, where the point `(x[1], y[1])` is connected to the point `(x[2], y[2])`, which is connected to `(x[3], y[3])`, and so on.
Since the x-values are not ordered, the plot looks like a spider's web.

Instead, you want a line plot which resembles the graph of the sine function. The trick is to sort the `x` vector into increasing order, and to apply the same permutation to the `y` vector prior to plotting. How do you do this in practice? Pretend that you do not know the rule for calculating the y's from the x's.

(Hint: `sort()`, `order()`.)

Suggested solution:

We sort the x's and reorder the y's using the permutation that sorts the x vector.

plot(sort(x), y[order(x)], type = 'l')

The same operation could also be done in other ways, for example like this:

    ind <- order(x)
    plot(x[ind], y[ind], type = 'l')

Since we know in this exercise that the y's are just the sines of the x's, we could also sort the x's first and then recalculate the y's.

    plot(sort(x), sin(sort(x)), type = 'l')

However, this solution depends on the fact that we know how the `y` values were calculated in the first place.
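
The relation between `sort()` and `order()` used in this solution can be verified without plotting. A minimal sketch, with an arbitrary seed:

```r
set.seed(3)  # arbitrary seed
x <- runif(100, -pi, pi)
y <- sin(x)

ind <- order(x)  # permutation that sorts x into increasing order

# Indexing x with the permutation gives the sorted vector
print(identical(x[ind], sort(x)))

# The pairs (x[i], y[i]) stay together after reordering
print(all(y[ind] == sin(x[ind])))
```
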
Last updated 2011-02-04 17:55

Petri Koistinen

petri.koistinen 'at' helsinki.fi