Software Tools / R: Homework 1: due Monday, 7 Feb, 2011

Send your solutions by email to the lecturer (firstname.lastname@helsinki.fi) by Mon 7 Feb at 10.00 at the latest, using R1 as the subject of your message. Also include your name and student number in the message.

In most cases, your solutions should be a few lines of R code. Please send the solutions as plain text (do not format your code with a word processing program).


Exercise 1. (Initial preparation)

Either

  1. start R in the microcomputer room C128 (instructions)
  2. or install R on your own computer (instructions).

Create a directory (folder) for the files that you need during the course (instructions). You do not need to document your solution to this exercise, even though this is the most important R exercise in the course. You will automatically get credit for this exercise if you send solutions to any of the other R exercises.


Exercise 2. (Function calls)

R has a function called pnorm. What are the names of its formal arguments? Which values are bound to the formal arguments in the following call?

z <- 3
pnorm(1, lower = (1 < 2), 2, mean = z)
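
As a hint on how R matches the arguments of a call, here is a minimal sketch that uses a hypothetical function f (defined only for this illustration, it is not part of R or of the exercise). R first matches arguments whose names agree exactly with a formal argument, then arguments whose names are unique abbreviations of a formal argument, and finally assigns the remaining unnamed arguments by position.

```r
# f is a hypothetical function defined only for this illustration.
f <- function(first, second = 10, flag = FALSE) {
  list(first = first, second = second, flag = flag)
}

# Names of the formal arguments of f
arg_names <- names(formals(f))
print(arg_names)

# "fl" is a unique abbreviation of "flag", so it is matched by name;
# the unnamed arguments 1 and 5 then fill "first" and "second" by position.
matched <- match.call(f, quote(f(1, fl = TRUE, 5)))
print(matched)
```

The same inspection tools (formals and match.call) can be applied to other functions that are ordinary R closures.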

Exercise 3. (Indexing)

First create a vector x that contains 100 random values drawn from the Poisson distribution with mean 1.1.

x <- rpois(100, 1.1)

Formulate your answers to the following questions so that they work not only for your particular sample but for any random sample drawn as above.

  1. How do you extract a vector that contains the entries of x at positions 2, 3 and 20?
  2. How do you create a logical vector b whose i'th entry is TRUE if and only if the i'th entry of x is zero?
  3. How do you find out how many entries of x have the value zero?
  4. How do you select the non-zero entries of x?
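
The indexing tools needed here can be sketched on a small hypothetical vector v (an assumption made for illustration, with different conditions from those asked above): positional indexing, comparisons that return logical vectors, counting with sum, and logical indexing.

```r
# v is a hypothetical example vector, not the x of the exercise.
v <- c(3, 0, 1, 0, 2)

# Positional indexing: pick the entries at positions 1 and 5
picked <- v[c(1, 5)]

# A comparison returns a logical vector of the same length as v
is_large <- v >= 2

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE entries
n_large <- sum(is_large)

# Logical indexing keeps the entries where the index is TRUE
small <- v[!is_large]
```

Because none of these expressions mentions the length or the actual values of v, they work unchanged for any numeric vector.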

Exercise 4. (More advanced indexing)

Now we want to find the indices of those entries of vector x (generated in the previous exercise) which are greater than or equal to 2. One way of doing this is the following.

which(x >= 2)

Now try to achieve the same result without using the which function. Instead, index the vector inds, which you generate as follows, with a suitable logical vector.

inds <- 1:length(x)
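
The idea can be sketched on a hypothetical vector v with a different condition (both are assumptions for illustration): indexing a vector of positions with a logical vector returns the positions at which the condition holds. The sketch also shows seq_along, which behaves like 1:length(v) for non-empty vectors but also handles the empty case correctly.

```r
# v is a hypothetical example vector, not the x of the exercise.
v <- c(3, 0, 1, 0, 2)

# Vector of positions 1, 2, ..., length(v)
pos <- seq_along(v)

# Positions of the even entries, obtained by logical indexing
even_pos <- pos[v %% 2 == 0]

# The result agrees with which()
same <- identical(even_pos, which(v %% 2 == 0))

# For an empty vector, 1:length() counts down from 1 to 0,
# whereas seq_along() returns an empty integer vector.
empty <- numeric(0)
bad <- 1:length(empty)
good <- seq_along(empty)
```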

Exercise 5. (Coping with missing values)

The following lines create a vector x that contains a random number of missing values (NAs).

n <- 100
x <- rnorm(n)
x[rbinom(n, 1, 0.1) == 1] <- NA

  1. How do you find out how many missing values there are in x?
  2. How do you replace all the missing values with zeros?
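
A common pitfall is to test for missing values with ==. The following minimal sketch, on a hypothetical vector v, shows why that fails and how is.na is used instead; it drops the missing values rather than replacing them.

```r
# v is a hypothetical example vector, not the x of the exercise.
v <- c(1.5, NA, -0.3, NA)

# Comparing with NA yields NA for every entry, so this cannot find NAs
wrong <- v == NA

# is.na() returns TRUE exactly at the missing entries
missing <- is.na(v)

# anyNA() tells whether there is at least one missing value
any_missing <- anyNA(v)

# Logical indexing with the negated vector keeps the observed values
observed <- v[!missing]
```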

Exercise 6. (Ordering data according to the values of one variable)

Suppose that you want to plot data which resembles the data we generate as follows.

x <- runif(100, -pi, pi)
y <- sin(x)

Here we first sample 100 values uniformly on the interval (-pi, pi) and then calculate the sine function at each of them.

Try the command

plot(x, y, type = 'l')

(there is a lower case L inside the quotation marks). The result is a line plot, where the point (x[1], y[1]) is connected to the points (x[2], y[2]), (x[3], y[3]) and so on. Since the x-values are not ordered, the plot looks like a spider's web.

Instead, you want a line plot which resembles the graph of the sine function. The trick is to sort the x vector into increasing order and to apply the same permutation to the y vector before plotting. How do you do this in practice? Pretend that you do not know the rule for calculating the y's from the x's.

(Hint: sort(), order().)
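
The difference between the two hinted functions can be sketched on hypothetical data (the vectors height and name are assumptions for illustration): sort returns the sorted values, whereas order returns the permutation of indices that sorts them, and that permutation can be applied to a second vector as well.

```r
# Hypothetical paired data: name[i] belongs to height[i].
height <- c(172, 160, 181)
name <- c("Aino", "Ben", "Cai")

# sort() returns the values in increasing order
sorted_height <- sort(height)

# order() returns the indices that put height into increasing order
ord <- order(height)

# Indexing both vectors with ord keeps the pairs together
height_by_height <- height[ord]
name_by_height <- name[ord]
```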


Last updated 2011-02-01 14:35
Petri Koistinen
petri.koistinen 'at' helsinki.fi