data wrangling

cleaning penguins with the janitor package

The janitor package by Sam Firke contains probably my FAVOURITE R function: clean_names(). By default, when I am reading data into R, I pipe clean_names() onto the end of my read_csv(), so I never have to look at inconsistently formatted variable names. But the janitor package also includes lots of other useful functions that make it easier to deal with dirty data and count stuff.

new_df <- read_csv(here("data", "df.csv")) %>% clean_names()

Exploring package functions

Are you keen to dig into the little known functions of a package that you use all the time?
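To make the clean_names() step concrete, here is a minimal sketch. The data frame `messy` and its column names are made up for illustration (they are not from the post); the point is only to show clean_names() turning inconsistent names into snake_case, plus tabyl() as one of janitor's counting helpers.

```r
library(janitor)

# Hypothetical data frame with messy column names (illustration only)
messy <- data.frame(
  `Bill Length (mm)` = c(39.1, 39.5, 40.3),
  `Species ID` = c("a", "b", "a"),
  check.names = FALSE  # keep the messy names exactly as written
)

# clean_names() converts the column names to snake_case by default
clean <- clean_names(messy)
names(clean)

# tabyl() counts the values of a variable
species_counts <- tabyl(clean, species_id)
species_counts
```

After cleaning, the columns are called bill_length_mm and species_id, so they can be typed without backticks.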

group_by and summarise

Some students have been asking me how they can calculate means and standard errors by condition. Here is a quick example using the Palmer penguins data. Details of the Palmer penguins data, with art by Allison Horst, can be found here.

load packages

library(palmerpenguins)
library(tidyverse)

read in data

penguins <- penguins
glimpse(penguins)
## Rows: 344
## Columns: 8
## $ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
## $ island <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
## $ bill_length_mm <dbl> 39.
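The by-condition summary this post is about could look something like the sketch below. It is only one way to do it and makes a few assumptions that are not in the excerpt above: species is used as the condition, body_mass_g as the outcome, and the standard error is taken as the standard deviation divided by the square root of the number of non-missing values.

```r
library(palmerpenguins)
library(dplyr)

# Assumption: species is the "condition" and body_mass_g is the outcome
penguin_summary <- penguins %>%
  group_by(species) %>%
  summarise(
    n = sum(!is.na(body_mass_g)),                 # non-missing observations
    mean_mass = mean(body_mass_g, na.rm = TRUE),  # mean per species
    sd_mass = sd(body_mass_g, na.rm = TRUE),      # standard deviation per species
    se_mass = sd_mass / sqrt(n)                   # standard error of the mean
  )

penguin_summary
```

Counting only non-missing values for n matters here because the data contain a few NAs; using n() instead would make the standard errors slightly too small.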