Overview

In this lesson we learn how to use RMarkdown and Latex to produce documents combining code, text, and math.

Objectives

  1. Create an RMarkdown document
  2. Incorporate R code and Latex math
  3. Knit your RMarkdown to a PDF document

Reading

Lander, Ch 27, 28

1 Introduction

All submissions for this course should be written using RMarkdown in RStudio, and submitted as a PDF document. RMarkdown is a plain-text format that allows you to create rich documents in a variety of common formats, such as PDF, HTML, or Microsoft Word. It is a powerful tool for producing beautiful documents that combine R code, the output from that code, well-formatted math, and text.

The basic idea is that you write a plain-text document with some plain-text formatting notations, and software on your machine (such as RStudio) turns that into a pretty PDF, HTML, or Word document. The power of RMarkdown is that it allows you to include R code and equations in a way that is easy to write, but produces attractive and legible output.

To create a new RMarkdown document in RStudio, go to File -> New File -> R Markdown. This creates a new plain-text file with a .Rmd extension. You write your homework answers in that document, including any R code or equations you want. When you are done (or want to check how the output looks so far), you “knit” the .Rmd file into the final PDF, Word, or HTML document, which is what you will submit as your finished product. You will always have two documents for any assignment: the plain-text .Rmd file, and the prettier PDF, Word, or HTML file you “knit” the .Rmd file into.

To “knit” the pretty document, you click on “Knit PDF” in the Source pane in RStudio, which will create the PDF in the same directory as the .Rmd file, and also open it in RStudio to preview it. For this class, we want the homework submissions to be in PDF format, because that works well with Canvas. Assignment file names should be in the following format: HW_01_lastname_firstinitial.pdf, where of course 01 gets replaced with whatever the appropriate homework number is.
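The naming convention above can be sketched in R. The helper below, hw_filename(), is hypothetical (it is not part of RStudio or the rmarkdown package) and simply assembles the file name. The commented-out line shows how it might be passed to rmarkdown::render(), which knits a file from the R console, assuming your homework lives in a file called homework.Rmd.

```r
# Hypothetical helper (not part of any package): builds the required
# submission file name from the homework number and your name.
hw_filename <- function(hw_number, last_name, first_initial) {
  sprintf("HW_%02d_%s_%s.pdf", as.integer(hw_number), last_name, first_initial)
}

# Homework 1 for a student named Jane Doe (a made-up name)
hw_filename(1, "doe", "j")

# Knitting from the console instead of the Knit button; "homework.Rmd" is an
# assumed file name, so this line is left commented out.
# rmarkdown::render("homework.Rmd", output_file = hw_filename(1, "doe", "j"))
```

The %02d in the format string pads single-digit homework numbers with a leading zero, so homework 1 becomes 01.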

2 Generating PDFs with Latex

When you create a new RMarkdown document, RStudio asks you what format you want your final output to be in: PDF, Word, or HTML. Unlike HTML or Word formats, generating PDFs with RMarkdown requires one additional step. You have a few different options for getting PDFs to work in RStudio:

  1. To generate PDFs directly with RStudio, you first need to install another program called “Latex.” The easiest way to install Latex is via an R package, TinyTex: in the R console, run install.packages('tinytex') and then tinytex::install_tinytex(). This only needs to be done once, and if it completes without error messages, you should be good to go.

  2. TinyTex should work for almost all users; if it doesn’t, try to seek help, because there is often a simple fix. If it still doesn’t work, or you want to go beyond the Latex basics, you can also install the full (not “tiny”) version of Latex: on a Mac, install MacTex from https://tug.org/mactex/; on a Windows machine, install Miktex from http://miktex.org/download.

  3. If you are having trouble installing Latex, as a temporary solution you can knit the .Rmd as an HTML document, and then open the HTML in your browser and print/save the page to PDF. Note that this should be a temporary stop-gap in case you can’t get methods 1 or 2 to work immediately; please try to get Latex working as soon as possible.

If methods 1 and 2 both fail, please seek help and don’t waste too much time trying to debug, since Latex installations can get messy very quickly once you diverge from the easiest method, installing TinyTex.
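Method 1 amounts to two lines typed into the R console. The sketch below adds a final check with tinytex::is_tinytex(), which is not mentioned above; it should return TRUE once TinyTex is the Latex installation R is using (you may need to restart RStudio first).

```r
# Run these once in the R console (not inside your .Rmd file).
install.packages("tinytex")   # installs the R package
tinytex::install_tinytex()    # downloads and installs the Latex distribution

# Optional check: TRUE means R is using the TinyTex installation.
tinytex::is_tinytex()
```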

3 The RMarkdown syntax

RMarkdown comes from an earlier language called Markdown, which was originally designed for writing simple HTML. RMarkdown is just Markdown with the additional ability to include R code and equations. For the purposes of the homeworks, you don’t need anything fancy beyond the R code and equations, but if you want to do any fancier formatting, here is a quick reference sheet: http://en.support.wordpress.com/markdown-quick-reference/ . As you can see there, Markdown is just a way to generate nice looking formatting using plain text. In fact, if you know HTML, you can actually just insert HTML directly into your RMarkdown code and it will mostly work fine.
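As a small illustration of what “formatting using plain text” means, here is a sketch of some common Markdown notations. The headings and sentences are placeholder text made up for this example.

```markdown
# A top-level heading

## A second-level heading

Plain text with *italics*, **bold**, and `inline code`.

- a bullet point
- another bullet point

1. a numbered item
2. another numbered item

[A link to the RMarkdown site](http://rmarkdown.rstudio.com)
```

When knit, each notation is replaced by the corresponding formatting, so the asterisks, pound signs, and brackets do not appear in the final document.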

When you create a new RMarkdown file, RStudio fills the new .Rmd document with some example text. If you knit this, you can see a nice example of how to format R code to get the right output. You can also use the example text at the end of this lesson as a template for your assignments.

If you want a quick review of how RMarkdown works in RStudio, you can go here: http://rmarkdown.rstudio.com/RMarkdownCheatSheet.pdf.

4 R Code

The lectures for this course were produced with RMarkdown, and as you can see from them, they do a nice job showing both the raw R code, and the output from that code, in pretty boxes in the final document. To include R code in your .Rmd file that produces nice output, you put your R code between two lines of triple backticks, with {r} after the opening backticks, like so:

    ```{r}
    summary(cars)
    ```

The default text RStudio includes every time you create a new .Rmd document also has some examples in it. When we knit the .Rmd file with this R code in it, it doesn’t just print the R code – it actually runs the R code, and prints both the code itself, and the output of running that code – in this case, the summary information about the built-in cars dataset.

summary(cars)
     speed           dist       
 Min.   : 4.0   Min.   :  2.00  
 1st Qu.:12.0   1st Qu.: 26.00  
 Median :15.0   Median : 36.00  
 Mean   :15.4   Mean   : 42.98  
 3rd Qu.:19.0   3rd Qu.: 56.00  
 Max.   :25.0   Max.   :120.00  
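A chunk can hold more than a single call. As a small sketch using the same built-in cars dataset, the following lines could go inside one chunk. The particular calls are chosen here only for illustration, and each result would be printed below its code in the knitted document.

```r
# Number of rows and columns in the built-in cars dataset
dim(cars)

# Mean speed; this matches the "Mean" entry in the summary above
mean(cars$speed)

# Store a result in a variable, then print it
max_dist <- max(cars$dist)
max_dist
```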

5 Plots

You can also embed R plots in your file; this happens automatically when you include the R code that generates them:

    ```{r}
    plot(cars)
    ```
    

And when you knit it, the above code produces:

plot(cars)

[Figure: scatterplot of stopping distance (dist) against speed for the cars dataset]

You can put any R code inside the triple backticks, and RStudio will quietly run it in the background and output the PDF, Word, or HTML document with the output included. Again, see the sample code when you create a new RMarkdown file, or you can even look directly at the .Rmd file that generated these Homework Guidelines, which I have included in the Course Resources.

Note that if you are using any packages that need to be loaded, such as ggplot2, those packages need to be loaded in your .Rmd file just as they would be if you were running an R script. As in an R script, the package only needs to be loaded once per file. So if we wanted a fancier plot using ggplot2 in our homework file, we would run something like:

    ```{r fig.width=5, fig.height=2.5, comment=NA}
    library(ggplot2)
    ggplot(data=airquality,aes(x=Temp)) + geom_histogram()
    ```

This produces the following when we knit it:

library(ggplot2)
ggplot(data=airquality,aes(x=Temp)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

[Figure: histogram of Temp from the airquality dataset]

Note that we can also change the figure width and height with the fig.width and fig.height chunk options, which gives a bit more control over looks. Again, more information on these tweaks can be found here: http://rmarkdown.rstudio.com/RMarkdownCheatSheet.pdf.
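The stat_bin() message in the output above is ggplot2 suggesting that you choose the histogram bins yourself. A minimal sketch, assuming a bin width of 5 degrees is reasonable for the Temp variable (that value is an arbitrary choice for illustration):

```r
library(ggplot2)

# Same histogram as above, but with an explicit bin width so that
# ggplot2 no longer prints the stat_bin() message.
# binwidth = 5 is an arbitrary illustrative value.
p <- ggplot(data = airquality, aes(x = Temp)) +
  geom_histogram(binwidth = 5)
p
```

Storing the plot in p and then printing it behaves the same inside a chunk as calling ggplot() directly; the figure appears in the knitted document either way.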

6 Math and equations in Latex

Last but not least, RMarkdown also lets you typeset equations. To include mathematical equations, you just put your math in between $ signs: $3x+4$ outputs \(3x+4\).

Here are a few basic examples of the syntax for getting pretty equations. On the left is the raw syntax, and on the right is the pretty output you get when you knit your .Rmd document.

In the long run, it is much easier to learn these basics than to write your math by hand, scan it, and send it in.

Basics: $3x + 4y - 6.0z / 12 * 43.8$: \(3x + 4y - 6.0z / 12 * 43.8\)

Exponents: $3^{2x}$ : \(3^{2x}\)

Subscripts: $Y_{i}$ : \(Y_{i}\)

Summation: $\sum_{i=1}^{10} x_i$: \(\sum_{i=1}^{10} x_i\)

Integral: $\int_{1}^{10} x dx$: \(\int_{1}^{10} x dx\)

Fractions: $\frac{3x-9}{2}$ : \(\frac{3x-9}{2}\)

Hat: $\hat{x}$: \(\hat{x}\)

Bar: $\bar{x}$: \(\bar{x}\)

Square root: $\sqrt{b^2-4ac}$: \(\sqrt{b^2-4ac}\)

Some greek: $\alpha$: \(\alpha\), $\beta$: \(\beta\), $\chi$: \(\chi\), $\delta$: \(\delta\), $\epsilon$: \(\epsilon\), $\lambda$: \(\lambda\), $\mu$: \(\mu\), $\pi$: \(\pi\), $\rho$: \(\rho\), $\sigma$: \(\sigma\), $\theta$: \(\theta\), $\infty$: \(\infty\)

To put an equation on its own line and centered, surround it with double dollar signs ($$) instead of single ones:

$$p(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} e^{\frac{-(x-\mu)^2}{2 \sigma^2}}$$

\[p(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} e^{\frac{-(x-\mu)^2}{2 \sigma^2}}\]
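The pieces above combine freely. As a further sketch (these formulas are standard statistics, not taken from this lesson), the sample mean and sample variance use the bar, fraction, summation, subscript, and exponent syntax together:

```latex
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
```

Each equation is wrapped in its own pair of double dollar signs, so each one is centered on its own line when knit.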

For more math code, see here: http://nickbeauchamp.com/comp_stats_NB/homework_guidelines/ShortSymbInd.pdf, though you probably won’t need much more than is included here.

Give it a try! Open a new RMarkdown file, tinker with it a bit, add a few equations, and then hit “Knit PDF” if you’ve installed Latex (or “Knit HTML” if you haven’t yet). You’ll see that it’s super-easy and produces nice-looking output instantly.

And again, remember you can use the homework_solution_example.Rmd in the Course Resources folder as your homework template.

7 Example

Here is the entirety of an Rmd file that should serve as a template for assignments and brief reports. Feel free to customize it as much as you like!

---
title: "Homework 1 Solution"
author: "Nick Beauchamp"
date: "01/10/15"
output: pdf_document
---

This is an example of a homework assignment solution.  It is written in RMarkdown, and was knit to PDF.  It is most useful to look at the .Rmd file that produced the PDF.  You may use that .Rmd file as a template for your own homework assignments.

Note that these solutions are incorrect!  Do not just copy these answers as solutions for a problem set.  

1. 

a. Create two vectors named v1 and v2, where v1 is the sequence of integers from 2 to 6, and v2 is the sequence of integers from 5 to 9.  

```{r}
v1 <- 1:3
v2 <- 10:12
v1
v2
```


2. 

a. Create a 5 by 5 matrix with the numbers 1 to 25 as its elements, and call it m1.

```{r}
mat1 <- matrix(1:9,3,3)
mat1
```

b. What is m1 times v1?

```{r}
mat1 %*% v1
```


3. 

```{r fig.width=4, fig.height=3}
plot(cars)
```


4. 

a. Using latex equation notation in your .Rmd file, write out the quadratic formula, so that in your knitted output it looks pretty and like the version we all learned in high school. (E.g., see the box in the top right of this Wikipedia page: <http://en.wikipedia.org/wiki/Quadratic_equation>.)

$$\frac{\sqrt{10 \pm x^7}}{2 - \alpha}$$