Just Three Things

in datawrangling

February 19, 2019

I love me a good #rstats screencast. David Robinson has been screencasting his #TidyTuesday efforts for the past few months, and while it is GREAT to watch a master at work, I just don’t have time to watch someone code for an hour in order to extract a handful of tips.

So when I saw Nick Tierney tweet about posting short videos that contain Just Three Things, I thought “that is a GREAT idea.”

The question is: which three things will I start with?

Here is the video

blogdown::shortcode("youtube", "vWvtFSQXTLE")

1. janitor::clean_names()

The janitor package is full of helpful tricks, and clean_names() is one of the things that sits at the top of every one of my analyses by default. It automatically converts all of your variable names to lower case and puts underscores in the gaps. Awesome.

Generally, I load the tidyverse, use read_csv() to get the data in, and then check the variable names with names().

First load packages

library(tidyverse)
library(janitor)

Then read the data and check names

beaches <- read_csv("sydneybeaches.csv")
## Rows: 3690 Columns: 8

## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Region, Council, Site, Date
## dbl (4): BeachId, Longitude, Latitude, Enterococci (cfu/100ml)

## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

The variable names in the beaches data are not horrible, but the capital letters are a bit of a pain, and the enterococci variable, well, that one is bad news.

Then clean_names() and use names() to check them

cleanbeaches <- clean_names(beaches)

names(cleanbeaches)
## [1] "beach_id"              "region"                "council"              
## [4] "site"                  "longitude"             "latitude"             
## [7] "date"                  "enterococci_cfu_100ml"

The enterococci_cfu_100ml variable is still not something you would want to type too often but it is definitely better!
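If the cleaned name is still too long to type, one option is to follow clean_names() with a rename. This is only a sketch: the toy data frame and the shorter name enterococci are made up for illustration, and only the two column names are borrowed from the beaches data.

```r
library(tidyverse)
library(janitor)

# A tiny made-up data frame that borrows two column names from the beaches data
toy <- tibble(
  BeachId = c(1, 2),
  `Enterococci (cfu/100ml)` = c(10, 25)
)

# Clean the names first, then give the long one a shorter (hypothetical) name
tidy_toy <- toy %>%
  clean_names() %>%
  rename(enterococci = enterococci_cfu_100ml)

names(tidy_toy)
## [1] "beach_id"    "enterococci"
```

Because rename() runs after clean_names(), it refers to the already cleaned name, so there are no backticks or brackets to wrestle with.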

2. Eat your pancakes

I am a markdown girl and love how the stack of “pancakes” in the top right of your markdown document will give you an outline of your document by levels of heading. It makes it super easy to navigate your way around a long analysis document.

Did you know that you can do the same thing in an R script?

When you are writing notes to yourself in a script, you use a hash to mark pieces of text that are comments rather than code. Simply put 4+ dashes at the end of your comment and it will appear in your pancake list.

# this is a heading I want to appear in my pancake list, 4+ dashes ----
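Here is a rough sketch of what a short script might look like with a few of these headings in it. The section names are just examples, and it uses the built-in iris data so it runs without any files. Each comment that ends in four or more dashes should show up as an entry in the RStudio outline.

```r
# load packages ----
library(janitor)

# get the data ----
# iris ships with R, so there is nothing to read in
flowers <- iris

# clean names ----
clean_flowers <- clean_names(flowers)

# check names ----
names(clean_flowers)
```

Comments without the trailing dashes stay ordinary comments and are left out of the list.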

3. Hotkeys

As you get better at R, there are occasions where it would be good for your fingers to keep up with your brain. Hotkeys help you write code faster.

Ones I use a lot (note: these are Mac versions):

The pipe = Shift-Command-M %>%

A new markdown chunk = Option-Command-I

Run the line or selection you are on = Command-Enter (Shift-Command-Enter runs the whole chunk you are in)

There are heaps of hotkeys built into RStudio, but you can customise them or make your own.
