writing in Rmd with inline code
in Rmd writing reproducibility
January 14, 2022
One of the best things about RMarkdown is that you can use inline code to report summary and inferential statistics in your text. This means you never retype a number by hand, which removes copy-paste and transcription errors, and if your data/values change, the text automatically updates when you re-knit.
Here I play with some penguin data and report summary stats using inline code.
load packages/data
library(tidyverse)
library(janitor)
library(palmerpenguins)
library(gt)
options(digits=2)
penguins <- penguins
count the penguins
Let's make a table that counts how many penguins there are in each species.
Here I'm using the tabyl() function from the janitor package to count how many penguins there are in each species and adorn a totals row, then printing the table using gt().
count_penguins <- penguins %>%
  tabyl(species) %>%
  adorn_totals()
gt(count_penguins)
species | n | percent |
---|---|---|
Adelie | 152 | 0.44 |
Chinstrap | 68 | 0.20 |
Gentoo | 124 | 0.36 |
Total | 344 | 1.00 |
Now I can use inline code to refer to values in the count_penguins dataframe. The syntax goes like this…
`r dataframe$column[rownumber]`
For example, the following text in my Rmd file…
… knits into the text below.
In the Palmer penguins dataset, there are body measurements from a total of 344 penguins. There are 3 species represented (N = 124 Gentoo, N = 68 Chinstrap and N = 152 Adelie).
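The Rmd source for that sentence is not reproduced above. As a rough sketch, assuming the row order shown in the table (Adelie, Chinstrap, Gentoo, Total), the inline expressions might look like the ones below. To keep the example self-contained, the table is rebuilt by hand from the printed counts rather than with tabyl(); the helper names (total_n, gentoo_n and so on) are only for illustration.

```r
# Rebuild the counts table by hand from the values printed above
# (an assumption for illustration; the post builds it with tabyl()).
count_penguins <- data.frame(
  species = c("Adelie", "Chinstrap", "Gentoo", "Total"),
  n = c(152, 68, 124, 344)
)

# Each right-hand side is what would go inside an inline `r ...` chunk.
total_n     <- count_penguins$n[4]  # row 4 = Total
gentoo_n    <- count_penguins$n[3]  # row 3 = Gentoo
chinstrap_n <- count_penguins$n[2]  # row 2 = Chinstrap
adelie_n    <- count_penguins$n[1]  # row 1 = Adelie

# Number of species = number of rows minus the Total row
n_species <- nrow(count_penguins) - 1

# In the Rmd file the sentence could then be written as, for example:
#   ... from a total of `r count_penguins$n[4]` penguins. There are
#   `r nrow(count_penguins) - 1` species represented
#   (N = `r count_penguins$n[3]` Gentoo, ...).
```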
Let's get some summary statistics.
body_mass <- penguins %>%
  group_by(species) %>%
  summarise(mean = mean(body_mass_g, na.rm = TRUE))
gt(body_mass)
species | mean |
---|---|
Adelie | 3701 |
Chinstrap | 3733 |
Gentoo | 5076 |
Now this text in my Rmd file…
… knits into the text below.
On average, Gentoo penguins are the heaviest (M = 5076.02 g); Chinstrap (M = 3733.09 g) and Adelie (M = 3700.66 g) penguins are smaller.
Reproducibility risks with inline code
Writing reports with Rmd can save you tons of time because once you have the code, you can reuse it with different data. But there are also risks… what if, in the next penguin experiment, the mean body mass that ended up in the 3rd row of this table wasn't for the Gentoo penguins, but rather some other species? Inline code that points at row 3 would knit without complaint and report the wrong number.
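To make the risk concrete, here is a small sketch using the means from the knitted text above, typed in by hand. The reordered table next_study is hypothetical: it simply stands in for a future dataset where the species come out in a different order.

```r
# Means copied from the knitted text above (hand-typed for illustration).
body_mass <- data.frame(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  mean = c(3700.66, 3733.09, 5076.02)
)

# Hypothetical future table: same values, different row order
# (Gentoo, Adelie, Chinstrap).
next_study <- body_mass[c(3, 1, 2), ]
rownames(next_study) <- NULL

# The same positional index now points at a different species.
row3_species_now  <- body_mass$species[3]
row3_species_next <- next_study$species[3]

# So inline code using mean[3] would report a different penguin's mean.
row3_mean_now  <- body_mass$mean[3]
row3_mean_next <- next_study$mean[3]
```

Nothing here throws an error or a warning, which is exactly the problem: the sentence would still say "Gentoo" while the number belongs to another species.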
You can refer to rows by name in inline code using the column_to_rownames() function from tibble.
body_mass <- penguins %>%
  group_by(species) %>%
  summarise(mean = mean(body_mass_g, na.rm = TRUE)) %>%
  column_to_rownames(var = "species") # this replaces rownames that are numbers with species values
# print the rownames
rownames(body_mass)
## [1] "Adelie" "Chinstrap" "Gentoo"
Once the species values are rownames, you can refer to a particular row and column by name within square brackets. The names need quotes because they are character strings; without quotes, R would go looking for an object called Gentoo.
On average, Gentoo penguins are the heaviest (M = 5076.02 g); Chinstrap (M = 3733.09 g) and Adelie (M = 3700.66 g) penguins are smaller.
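The Rmd source behind that sentence is not shown. A minimal sketch of the by-name lookups might look like this, again with hand-typed means, and with the row.names argument of base R's data.frame() standing in for column_to_rownames() so the example runs without extra packages. The shuffled table is hypothetical and only there to check that row order no longer matters.

```r
# Hand-typed means from the knitted text; row.names stands in for
# column_to_rownames(var = "species") used in the post.
body_mass <- data.frame(
  mean = c(3700.66, 3733.09, 5076.02),
  row.names = c("Adelie", "Chinstrap", "Gentoo")
)

# Look up a single cell as dataframe["row name", "column name"].
gentoo_mean    <- body_mass["Gentoo", "mean"]
chinstrap_mean <- body_mass["Chinstrap", "mean"]
adelie_mean    <- body_mass["Adelie", "mean"]

# Hypothetical reordering: the by-name lookup still finds the Gentoo row.
shuffled <- body_mass[c("Gentoo", "Adelie", "Chinstrap"), , drop = FALSE]
same_after_shuffle <- shuffled["Gentoo", "mean"] == gentoo_mean

# In the Rmd text this could be written as, for example:
#   (M = `r body_mass["Gentoo", "mean"]` g)
```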