Spending in COVID

By Jen Richmond

May 5, 2020

There is no doubt that COVID19 has massively disrupted how we spend our time and money. In this post, I update code inspired by this R-bloggers blog to look at how much less money we have been spending on eating in the last month, compared to April 2019.

Download a copy of your bank transactions in csv format.

Load packages

library(tidyverse)
library(janitor)
library(lubridate)

Read in data

Clean and rename, select just date, description, debit amount, add new column for class with default value = “Other”

items <- read_csv("data_new.csv") %>%
  clean_names() %>%
  rename(desc = narrative, 
         debit = debit_amount, 
         credit = credit_amount) %>%
  select(date, desc, debit) %>%
  mutate(class = "Other")

Build regular expression strings for categories

Mostly just interested in food changes between April 2019 and April 2020. Scan transaction desc for phrases that relate to food and transport.

# Build simple regexp strings

coffeelunch <- "MOORE|T2|Grams|CAKE|LAKSA|SUBWAY|STOCK|BURGER|XCEL|Penny|KETTLE|POP|Ground|GOMEZ|FUNG|VILLAGE|Campos|Estate|ANTONIO|MCDONALDS|Gelatissimo|COFFEE|CAFE|PICCOLO|MADENTZOGLOU|YACOCO|LAYERED|LITTLE ME|Marionette|Bohaus|BAYLEAF|SOCIETY|GRAMS|HEY|Courtyard|CHARLIE'S|TROPICAL|Lounge|GUSTO|ATOMIC|Coffee|Campus|Navitas|X74|LENTIL|Espresso|Poke|Bakery|SHED|CANS|PICKLE|Hungry|FORMAGGI|BAKERY|KURTOSH|Fratelli|Buffalo|Cow|Burgers|ANDERSEN'S|Chargrill|ICHI-BAN|COCO|ARC|Cafe|Boulangerie|UNIVERSITY|Erciyes|WOOD|AMBLE|Messina|Catering|Bake"

dinnerout <- "SUSHI|MAMAK|KTCHN|BLUEBOTTLES|Little L|Bavarian|Homestead|RESTAURANT|Ringer|THAI|Mamak|Gomez|BIERHAUS|CREPERIE |MESSINA|DOMINOS|GELATISSIMO|Osteria|TwoFatIndian|Dining|WOK|Rooftop|Pizza|Beresford|JOES|STINKING|COOKHOUSE|CLUB|Club|PASTA|FISH|FOUR|TOKYO|BISTRO|CLOVE|Arthur|HANGAR|FUSION|LUMI|ROCKPOOL|EATERY|Paddington|FRANK|BAT|Brauhaus|SHINABURO|JADE|NAKAJIMA|ITALIAN|INDIAN|TAVERN|Foodora|WPIZZA|SALUD|BETTYS|COFFS|Milky|BREW|PIZZA"


groceries <- "WW|COLES|WOOLWORTHS|Harris|MALONEYS|FRESH|CATERING|DELI|MARYDIAS|SEAFOOD|CAKE|BAKERS|IGA|SWEET|HOA|FARM|COUNTDOWN|WORLD|BUTCHRY|MARKET|FOODWORKS|ALDI|FOOD|HFM|PLAZA|Fine Foods|BALDICO|HILLVIEWFIN|KINGSMORE"

Use grepl to assign class value to each transaction

grepl is a weird thing. I think this checks if the value in items$desc matches something in the regexp defined for groceries (for example) and if it does puts Groceries in the items$class column

#assign values to class column based on regexp
   
items$class <-  
  ifelse(grepl(coffeelunch, items$desc),"CoffeeLunch",
      ifelse(grepl(dinnerout, items$desc),"Dinnerout",
                   ifelse(grepl(groceries, items$desc),"Groceries",
                                   "Other")))

Remove NAs and fix dates

list_items <- na.omit(items) 

list_items$date <- dmy(list_items$date)

list_items <- list_items %>%
  mutate(year = year(date), month = month(date), day = day(date))

Filter just April

Filter data for just April 2019 and 2020, make class and year factor.

april_list_items <- list_items %>%
  filter(month == 4)


april_list_items$class <- as.factor(april_list_items$class)
april_list_items$year <- as.factor(april_list_items$year)

Plot food spending

april_list_items %>%
  filter(class %in% c("Groceries", "Dinnerout", "CoffeeLunch")) %>%
  group_by(year, class) %>%
  summarise(monthlytotal = sum(debit)) %>%
  ggplot(aes(x = year, y = monthlytotal, fill = year)) +
  geom_col() +
  scale_fill_manual(values = c("#0072B2","#CC79A7")) +
  facet_wrap(~class) +
  labs(title = "April eating") +
  ylab("Monthly Spend") +
  theme(
  axis.text.y = element_blank(),
  axis.ticks = element_blank()) # removes y axis tick labels
## `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.

ggsave("april.png")
## Saving 7 x 5 in image

We spend WAY too much on lunch/coffee at work every day!

Posted on:
May 5, 2020
Length:
2 minute read, 358 words
See Also: