1  Introduction to the course

1.1 Introduction to R

R is a computing environment that combines numerical analysis tools for linear algebra; functions for classical and modern statistical analysis; and functions for graphics and data visualization. It is based on the programming language S, developed by John Chambers in the 1970s.

There are two ways to start R:

  1. Run R on the command line. On a Mac, you would do this in the Terminal application. On a Windows machine, you would do this using, e.g., Ubuntu. We will cover the command line next week.
  2. Click the R icon on your desktop, assuming the software has already been installed.

We will use the graphical user interface RStudio throughout this course. Although the GUI makes certain things easier, it is not necessary to use it when running an R script. For example, running the following code in R returns the sum of the numbers 1 and 2.

1+2
[1] 3

Running the code with a # at the beginning of the line results in the line being read as a comment. This means that the calculation which is specified in the line is not processed and the output not returned. Comments are a useful way to keep track of what line(s) of code do, multiple versions of the same code, etc.

1.2 RStudio and the R Notebook

RStudio includes the R console, but also many other convenient functionalities, which makes it easier to get started and to work with R. When you launch RStudio, you will see four panels. Starting at the top left and going clockwise, these panels are:

  • The text editor panel. This is where we can write scripts, i.e. putting several commands of code together and saving them as a text document so that they are accessible for later and so that we can execute them all at once by running the script instead of typing them in one by one.
  • The environment panel, which shows us all the files and objects we currently loaded into R.
  • The files-plots-help panel. This panel shows the files in the current directory (the folder we are working out of), any plots we make later, and also documentation for various packages and functions. Here, the documentation is formatted in a way that is easier to read and also provides links to the related sections.
  • The console is another space we can input code, only now the code is executed immediately and doesn’t get saved at the end.

To change the appearance of your RStudio, navigate to Tools > Global Options > Appearance. You can change the font and size, and the editor theme. The default is “Textmate”, but if you like dark mode, a good option is “Tomorrow Night Bright”. You can also change how your panels are organized.

In the RStudio interface, we will be writing code in a format called the R Notebook. As the name entails, this interface works like a notebook for code, as it allows us to save notes about what the code is doing, the code itself, and any output we get, such as plots and tables, all together in the same document.

When we are in the Notebook, the text we write is normal plain text, just as if we would be writing it in a text document. If we want to execute some R code, we need to insert a code chunk.

You insert a code chunk by either clicking the “Insert” button or pressing Ctrl/Command + Alt + i simultaneously. You could also type out the surrounding backticks, but this would take longer. To run a code chunk, you press the green arrow, or Ctrl/Command + Shift + Enter.

1+2
[1] 3

As you can see, the output appears right under the code block.

This is a great way to perform explore your data, since you can do your analysis and write comments and conclusions right under it all in the same document. A powerful feature of this workflow is that there is no extra time needed for code documentation and note-taking, since you’re doing your analyses and taking notes at the same time. This makes it great for both taking notes at lectures and to have as a reference when you return to your code in the future.

1.3 R Markdown

The text format we are using in the R Notebook is called R Markdown. This format allows us to combine R code with the Markdown text format, which enables the use of certain characters to specify headings, bullet points, quotations and even citations. A simple example of how to write in Markdown is to use a single asterisk or underscore to emphasize text (*emphasis*) and two asterisks or underscores to strongly emphasize text (**strong emphasis**). When we convert our R Markdown text to other file formats, these will show up as italics and bold typeface, respectively. If you have used WhatsApp, you might already be familiar with this style of writing. In case you haven’t seen it before, you have just learned something about WhatsApp in your quantitative methods class…

To learn more about R Markdown, check out this reference.

1.3.1 Saving data and generating reports

To save our notes, code, and graphs, all we have to do is to save the R Notebook file, and the we can open it in RStudio next time again. However, if we want someone else to look at this, we can’t always just send them the R Notebook file, because they might not have RStudio installed. Another great feature of R Notebooks is that it is really easy to export them to HTML, Microsoft Word, or PDF documents with figures and professional typesetting. There are actually many academic papers that are written entirely in this format and it is great for assignments and reports. (You might even use it to communicate with your collaborators!) Since R Notebook files convert to HTML, it is also easy to publish simple and good-looking websites in it, in which code chunks are embedded nicely within the text.

Let’s try to create a document in R.

First, let’s set up the YAML block. This is found at the top of your document, and it is where you specify the title of your document, what kind of output you want, etc.

---
title: "Your title here"
author: "Your name here"
date: "Insert date"
output:
  pdf_document: default
---

Next, let’s type code to perform the calculation we did above:

1+2
[1] 3

To create the output document, we say that we “knit” our R Markdown file into, e.g., a PDF. Simply press the Knit button here and the new document will be created.

As you can see in the knitted document, the title showed up as we would expect, and lines with pound sign(s) in front of them were converted into headers. Most importantly, we can see both the code and its output! Plots are generated directly in the report without us having to cut and paste images! If we change something in the code, we don’t have to find the new images and paste it in again, the correct one will appear right in your code.

When you quit, R will ask you if you want to save the workspace (that is, all of the variables you have defined in this session); in general, you should say “no” to avoid clutter and unintentional confusion of results from different sessions. Note: When you say “yes” to saving your workspace, it is saved in a hidden file named .RData. By default, when you open a new R session in the same directory, this workspace is loaded and a message informing you so is printed: [Previously saved workspace restored]. It is often best practice to turn this feature off completely.

1.4 Practice knitting!

For your first assignment, you will need to write a short description of your interests in EEB and submitted the knitted document on Quercus. This assignment is due September 12. If you are having trouble knitting, which you will have to do throughout the course, come find us and we will help you troubleshoot.