/ #Data Wrangling #Data Exploration 

Loading, Merging and Joining Several Datasets - Excel EDT

This is the minimal coding necessary to assemble various data feeds and sort out the likes of variables naming & new features creation plus some general housekeeping tasks. It includes general housekeeping tasks like shortening variables names to ease visualisations, creating essential new features and sorting out variables order

The Dataset

library(tidyverse)
library(lubridate)
library(readr)

The dataset I’m using here accompanies a Redbooks publication called Building 360-Degree Information Applications which is available as a free PDF download. Also available as a free download are the excel files in the Additional Material section to follow along with the exercises in the book.

The data covers 3 & 12 years worth of sales orders for the Sample Outdoors Company, a fictitious B2B outdoor equipment retailer enterprise. The data comes with details about the products they sell as well as their customers (which are retailers in their case). The data is a subset of the GSDB database, a realistic and feature-rich database created by IBM® to support demos for their products.

Load, Merge and Join

This data pipeline assumes that all files are stored on the hard drive of your computer, in a folder named 00_data that sits on the root of your R project. I’m hosting all files on my GitHub account, on a repository that mirrors the file structure you may want to replicate on your PC for this code to run smoothly.

I will start with sales transactions as they are distributed across almost 900 separate excel files.

First, I load and merge them at once from a single folder order-details and bind them by row into an orders file.

orders <- list.files(path = "../00_data/order-details",
           pattern = "*.csv", 
           full.names = T) %>% 
   map_df(~readr::read_csv(.))

NOTE that all files have the same format with each variable sitting on the same column in each file

orders %>% glimpse()

## Observations: 446,023
## Variables: 45
## $ order_date            <dttm> 2004-01-12, 2004-01-12, 2004-01-12, 200...
## $ sales_staff_code      <dbl> 10798, 10798, 10406, 10406, 10406, 10406...
## $ order_method_it       <chr> "Reparto vendite", "Reparto vendite", "F...
## $ order_method_en       <chr> "Sales visit", "Sales visit", "Fax", "Fa...
## $ order_method_nl       <chr> "Vertegenwoordiger", "Vertegenwoordiger"...
## $ order_detail_code     <dbl> 1000001, 1000002, 1000013, 1000014, 1000...
## $ return_count          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ order_method_no       <chr> "Salgsbesøk", "Salgsbesøk", "Faks", "Fak...
## $ order_method_el       <chr> "<U+0395>p<U+03AF>s<U+03BA>e<U+03C8><U+03B7> p<U+03C9><U+03BB><U+03B7>t<U+03AE>", "<U+0395>p<U+03AF>s<U+03BA>e<U+03C8><U+03B7> p<U+03C9><U+03BB><U+03B7>t<U+03AE>", "F...
## $ unit_cost             <dbl> 15.62, 49.69, 80.00, 23.53, 176.47, 39.0...
## $ order_method_tc       <chr> "<U+696D><U+52D9><U+62DC><U+8A2A>", "<U+696D><U+52D9><U+62DC><U+8A2A>", "<U+50B3><U+771F>", "<U+50B3><U+771F>", "<U+50B3><U+771F>", "<U+50B3><U+771F>", ...
## $ order_detail_code_2   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ retailer_name         <chr> "Kavanagh Sports", "Kavanagh Sports", "A...
## $ retailer_name_mb      <chr> "Kavanagh Sports", "Kavanagh Sports", "A...
## $ order_method_cs       <chr> "Návšteva prodejce", "Návšteva prodejce"...
## $ order_method_hu       <chr> "Kiszállásos eladás", "Kiszállásos eladá...
## $ order_method_ms       <chr> "Lawatan jualan", "Lawatan jualan", "Fak...
## $ order_method_th       <chr> "<U+0E40><U+0E22><U+0E35><U+0E48><U+0E22><U+0E21><U+0E25><U+0E39><U+0E01><U+0E04><U+0E49><U+0E32>", "<U+0E40><U+0E22><U+0E35><U+0E48><U+0E22><U+0E21><U+0E25><U+0E39><U+0E01><U+0E04><U+0E49><U+0E32>", "<U+0E41><U+0E1F><U+0E01><U+0E0B><U+0E4C>",...
## $ quantity              <dbl> 256, 92, 162, 172, 74, 90, 422, 3252, 11...
## $ order_number_2        <dbl> 100001, 100001, 100002, 100002, 100002, ...
## $ sales_branch_code     <dbl> 21, 21, 30, 30, 30, 30, 30, 30, 30, 30, ...
## $ promotion_code        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ order_method_id       <chr> "Kunjungan penjualan", "Kunjungan penjua...
## $ order_number          <dbl> 100001, 100001, 100002, 100002, 100002, ...
## $ order_method_sc       <chr> "<U+9500><U+552E><U+9762><U+8BBF>", "<U+9500><U+552E><U+9762><U+8BBF>", "<U+4F20><U+771F>", "<U+4F20><U+771F>", "<U+4F20><U+771F>", "<U+4F20><U+771F>", ...
## $ order_method_fi       <chr> "Myyntikäynti", "Myyntikäynti", "Faksi",...
## $ order_method_ru       <chr> "<U+0422><U+043E><U+0440><U+0433>. <U+0430><U+0433><U+0435><U+043D><U+0442>", "<U+0422><U+043E><U+0440><U+0433>. <U+0430><U+0433><U+0435><U+043D><U+0442>", "<U+0424><U+0430><U+043A><U+0441>", "<U+0424>...
## $ retailer_site_code    <dbl> 20530, 20530, 20895, 20895, 20895, 20895...
## $ unit_sale_price       <dbl> 33.69, 102.30, 111.31, 38.90, 334.43, 75...
## $ order_method_da       <chr> "Sælgerbesøg", "Sælgerbesøg", "Fax", "Fa...
## $ order_method_pl       <chr> "Wizyta", "Wizyta", "Faks", "Faks", "Fak...
## $ order_method_de       <chr> "Verkaufsbesuch", "Verkaufsbesuch", "Fax...
## $ order_method_ko       <chr> "<U+BC29><U+BB38><U+D310><U+B9E4>", "<U+BC29><U+BB38><U+D310><U+B9E4>", "<U+D329><U+C2A4>", "<U+D329><U+C2A4>", "<U+D329><U+C2A4>", "<U+D329><U+C2A4>", ...
## $ ship_date             <dttm> 2004-01-19, 2004-02-17, 2004-01-19, 200...
## $ order_method_sv       <chr> "Säljbesök", "Säljbesök", "Fax", "Fax", ...
## $ order_method_fr       <chr> "Visite d'un représentant", "Visite d'un...
## $ order_method_code_2   <dbl> 6, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ unit_price            <dbl> 35.09, 110.00, 119.69, 40.52, 359.60, 81...
## $ order_method_pt       <chr> "Visita vendas", "Visita vendas", "Fax",...
## $ order_method_es       <chr> "Visita de ventas", "Visita de ventas", ...
## $ order_close_date      <dttm> 2004-02-17, 2004-02-17, 2004-01-19, 200...
## $ retailer_contact_code <dbl> 3549, 3549, 3915, 3915, 3915, 3915, 3915...
## $ order_method_code     <dbl> 6, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ order_method_ja       <chr> "<U+5E97><U+982D>", "<U+5E97><U+982D>", "<U+30D5><U+30A1><U+30C3><U+30AF><U+30B9>", "<U+30D5><U+30A1><U+30C3><U+30AF><U+30B9>", "<U+30D5><U+30A1><U+30C3><U+30AF><U+30B9>", "...
## $ product_number        <dbl> 41110, 69110, 75110, 76110, 85110, 65110...

A bit of a clean up is required as there are 45 orders variables but most of them are not necessary. For instance, order_method is available in many languages but I only need the one in English.

I start by removing all order_method other than English.

orders <- 
   orders %>% 
   rename(
      order_method = order_method_en,
      retailer = retailer_name
      )  %>% 
   select(
      -contains('order_method_')
      ) 
orders %>% glimpse()

## Observations: 446,023
## Variables: 21
## $ order_date            <dttm> 2004-01-12, 2004-01-12, 2004-01-12, 200...
## $ sales_staff_code      <dbl> 10798, 10798, 10406, 10406, 10406, 10406...
## $ order_method          <chr> "Sales visit", "Sales visit", "Fax", "Fa...
## $ order_detail_code     <dbl> 1000001, 1000002, 1000013, 1000014, 1000...
## $ return_count          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ unit_cost             <dbl> 15.62, 49.69, 80.00, 23.53, 176.47, 39.0...
## $ order_detail_code_2   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ retailer              <chr> "Kavanagh Sports", "Kavanagh Sports", "A...
## $ retailer_name_mb      <chr> "Kavanagh Sports", "Kavanagh Sports", "A...
## $ quantity              <dbl> 256, 92, 162, 172, 74, 90, 422, 3252, 11...
## $ order_number_2        <dbl> 100001, 100001, 100002, 100002, 100002, ...
## $ sales_branch_code     <dbl> 21, 21, 30, 30, 30, 30, 30, 30, 30, 30, ...
## $ promotion_code        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ order_number          <dbl> 100001, 100001, 100002, 100002, 100002, ...
## $ retailer_site_code    <dbl> 20530, 20530, 20895, 20895, 20895, 20895...
## $ unit_sale_price       <dbl> 33.69, 102.30, 111.31, 38.90, 334.43, 75...
## $ ship_date             <dttm> 2004-01-19, 2004-02-17, 2004-01-19, 200...
## $ unit_price            <dbl> 35.09, 110.00, 119.69, 40.52, 359.60, 81...
## $ order_close_date      <dttm> 2004-02-17, 2004-02-17, 2004-01-19, 200...
## $ retailer_contact_code <dbl> 3549, 3549, 3915, 3915, 3915, 3915, 3915...
## $ product_number        <dbl> 41110, 69110, 75110, 76110, 85110, 65110...

Then I load the products file, which, as it says on the tin, contains the products information

# load products file
products <- 
   read_csv('products.csv')
products %>% glimpse()

## Observations: 274
## Variables: 20
## $ product_description <chr> "The Venue has a fun and fashionable overs...
## $ base_product_number <dbl> 125, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12...
## $ product_line        <chr> "Personal Accessories", "Camping Equipment...
## $ production_cost     <dbl> 0.00, 4.00, 9.22, 15.93, 5.00, 34.97, 85.1...
## $ product_image       <chr> "P65PA3WT10.jpg", "P01CE1CG1.jpg", "P02CE1...
## $ product_type        <chr> "Watches", "Cooking Gear", "Cooking Gear",...
## $ color               <chr> "Red", "Clear", "Brown", "Unspecified", "S...
## $ gross_margin        <dbl> 0.00, 0.33, 0.23, 0.28, 0.28, 0.30, 0.28, ...
## $ product_line_code   <dbl> 993, 991, 991, 991, 991, 991, 991, 991, 99...
## $ product_name        <chr> "Venue", "TrailChef Water Bag", "TrailChef...
## $ introduction_date   <dttm> 2005-04-01, 1995-02-15, 1995-02-15, 1995-...
## $ title               <chr> "Venue", "TrailChef Water Bag", "TrailChef...
## $ product_type_code   <dbl> 960, 951, 951, 951, 951, 951, 951, 951, 95...
## $ product_size        <chr> "One-size", "10 liters", "2 liters", "15-p...
## $ brand               <chr> "Relax", "TrailChef", "TrailChef", "TrailC...
## $ discontinued_date   <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
## $ product_brand_code  <dbl> 756, 701, 701, 701, 701, 701, 701, 701, 70...
## $ product_size_code   <dbl> 853, 808, 807, 825, 804, 823, 824, 845, 84...
## $ product_color_code  <dbl> 922, 908, 906, 924, 923, 923, 923, 923, 91...
## $ product_number      <dbl> 125150, 1110, 2110, 3110, 4110, 5110, 6110...

Finally, the retailers file

# load retailer info
retailers <- 
   read_csv('retailers.csv')
retailers %>% glimpse()

## Observations: 847
## Variables: 13
## $ REGION_CODE        <dbl> 710, 710, 710, 710, 710, 710, 710, 710, 710...
## $ REGION_EN          <chr> "Americas", "Americas", "Americas", "Americ...
## $ COUNTRY_CODE       <dbl> 1004, 1004, 1004, 1004, 1004, 1004, 1004, 1...
## $ COUNTRY_KEY        <dbl> 90002, 90002, 90002, 90002, 90002, 90002, 9...
## $ COUNTRY_EN         <chr> "Canada", "Canada", "Canada", "Canada", "Ca...
## $ RETAILER_KEY       <dbl> 6789, 6789, 6789, 6790, 6790, 6790, 6790, 6...
## $ RETAILER_CODE      <dbl> 1189, 1189, 1189, 1190, 1190, 1190, 1190, 1...
## $ RETAILER_NAME      <chr> "Rock Steady", "Rock Steady", "Rock Steady"...
## $ RETAILER_SITE_KEY  <dbl> 5001, 5002, 5003, 5004, 5005, 5006, 5007, 5...
## $ RETAILER_SITE_CODE <dbl> 20001, 20002, 20003, 20004, 20005, 20006, 2...
## $ RTL_CITY           <chr> "Winnipeg", "Montr\xe9al", "Fredericton", "...
## $ RETAILER_TYPE_CODE <dbl> 6, 6, 6, 8, 8, 8, 8, 3, 2, 2, 2, 2, 2, 1, 7...
## $ RETAILER_TYPE_EN   <chr> "Outdoors Shop", "Outdoors Shop", "Outdoors...

As R is case sensitive, I’m setting retailers column names to lower-case

retailers <- 
   retailers %>% 
   rename_all(tolower) 

Now I can joining orders, products and retailers info into one data frame

df <- 
   orders %>% 
   left_join(products, by = ('product_number')) %>% 
   left_join(retailers, by = ('retailer_site_code')) 

Creating essential features & some housekeeping tasks

One long piece of code to sort all in one go.

I’ve added comments to each block to explain what it’s doing.

orders_tbl   <- 
   
    # create revenue, total product cost and gross profit
   df %>%
      mutate(
         revenue       = quantity * unit_price,
         tot_prod_cost = quantity * unit_cost,
         gross_profit  = revenue - tot_prod_cost
      ) %>% 
      
      # replacing NAs in the return_count variable
   replace_na(list(return_count = 0)) %>% 
      
    # Shorten product line names for readablility and ease of use in charts and code
    
   mutate(
      prod_line = case_when(
         product_line == 'Camping Equipment' ~ 'Camping Eqpt',
         product_line == 'Golf Equipment' ~ 'Golf Eqpt',
         product_line == 'Mountaineering Equipment' ~ 'Mountain Eqpt',
         product_line == 'Personal Accessories' ~ 'Personal Acces',
         product_line == 'Outdoor Protection' ~ 'Outdoor Prot',
         TRUE ~ product_line
      ),
      
      prod_line_2 = case_when(
         product_line == 'Camping Equipment' ~ 'Camping Eqpt',
         product_line == 'Golf Equipment' ~ 'Golf Eqpt',
         product_line == 'Mountaineering Equipment' ~ 'Mountain Eqpt',
         product_line == 'Personal Accessories' ~ 'Personal Acces',
         product_line == 'Outdoor Protection' ~ 'Personal Acces',
         TRUE ~ product_line
      ),
      
      # create alternative regional group
      region2 = case_when(
         country_en %in% c('United Kingdom', 'France', 'Spain',  
                        'Netherlands','Belgium','Switzerland') ~ 'West Europe',
         country_en %in% c('Germany', 'Italy', 'Finland',  
                        'Austria','Sweden','Denmark') ~ 'East Europe',
         TRUE ~ region_en
      )
   ) %>% 
   
   # create financial years
   mutate(
      ord_date = ymd(order_date),
      fin_year = case_when(
         between(ord_date, ymd("2004-07-01"), ymd('2005-06-30')) ~ 'FY_04_05',
         between(ord_date, ymd("2005-07-01"), ymd('2006-06-30')) ~ 'FY_05_06',
         between(ord_date, ymd("2006-07-01"), ymd('2007-06-30')) ~ 'FY_06_07',
         TRUE ~ 'other'
      ),
      
      # create all quarters
      quarter_all = case_when(
         between(ord_date, ymd("2004-01-01"), ymd('2004-03-31')) ~ '04_Q1',
         between(ord_date, ymd("2004-04-01"), ymd('2004-06-30')) ~ '04_Q2',
         between(ord_date, ymd("2004-07-01"), ymd('2004-09-30')) ~ '04_Q3',
         between(ord_date, ymd("2004-10-01"), ymd('2004-12-31')) ~ '04_Q4',
         between(ord_date, ymd("2005-01-01"), ymd('2005-03-31')) ~ '05_Q1',
         between(ord_date, ymd("2005-04-01"), ymd('2005-06-30')) ~ '05_Q2',
         between(ord_date, ymd("2005-07-01"), ymd('2005-09-30')) ~ '05_Q3',
         between(ord_date, ymd("2005-10-01"), ymd('2005-12-31')) ~ '05_Q4',
         between(ord_date, ymd("2006-01-01"), ymd('2006-03-31')) ~ '06_Q1',
         between(ord_date, ymd("2006-04-01"), ymd('2006-06-30')) ~ '06_Q2',
         between(ord_date, ymd("2006-07-01"), ymd('2006-09-30')) ~ '06_Q3',
         between(ord_date, ymd("2006-10-01"), ymd('2006-12-31')) ~ '06_Q4',
         between(ord_date, ymd("2007-01-01"), ymd('2007-03-31')) ~ '07_Q1',
         between(ord_date, ymd("2007-04-01"), ymd('2007-06-30')) ~ '07_Q2',
         between(ord_date, ymd("2007-07-01"), ymd('2007-09-30')) ~ '07_Q3',
         TRUE ~ 'other'
      ),
      
      # create selected quarters
      quarter_sel = case_when(
         between(ord_date, ymd("2004-07-01"), ymd('2004-09-30')) ~ '04_Q3',
         between(ord_date, ymd("2004-10-01"), ymd('2004-12-31')) ~ '04_Q4',
         between(ord_date, ymd("2005-01-01"), ymd('2005-03-31')) ~ '05_Q1',
         between(ord_date, ymd("2005-04-01"), ymd('2005-06-30')) ~ '05_Q2',
         between(ord_date, ymd("2005-07-01"), ymd('2005-09-30')) ~ '05_Q3',
         between(ord_date, ymd("2005-10-01"), ymd('2005-12-31')) ~ '05_Q4',
         between(ord_date, ymd("2006-01-01"), ymd('2006-03-31')) ~ '06_Q1',
         between(ord_date, ymd("2006-04-01"), ymd('2006-06-30')) ~ '06_Q2',
         between(ord_date, ymd("2006-07-01"), ymd('2006-09-30')) ~ '06_Q3',
         between(ord_date, ymd("2006-10-01"), ymd('2006-12-31')) ~ '06_Q4',
         between(ord_date, ymd("2007-01-01"), ymd('2007-03-31')) ~ '07_Q1',
         between(ord_date, ymd("2007-04-01"), ymd('2007-06-30')) ~ '07_Q2',
         TRUE ~ 'other'
      )
      
      
   ) %>% 
   
   # reorder columns and rename a few
      select(
      order_number,
      order_date,
      order_close_date,
      order_ship_date = ship_date,
      fin_year,
      quarter_all,
      quarter_sel,
      order_method,
      retailer,
      retailer_code,
      retailer_type   = retailer_type_en,
      region          = region_en,
      region2,
      country         = country_en,
      city            = rtl_city,
      promotion_code,
      return          = return_count,
      quantity,
      unit_price,
      unit_sale_price,
      unit_cost,
      unit_prod_cost  = production_cost,
      unit_gross_marg = gross_margin,
      revenue,
      tot_prod_cost,
      gross_profit,
      prod_numb       = product_number,
      prod_line,
      prod_line_2,
      prod_type       = product_type,
      prod_name       = product_name,
      brand,
      color,
      size            = product_size,
      intro_date      = introduction_date,
      halt_date       = discontinued_date
   ) 
orders_tbl %>% glimpse()

## Observations: 446,023
## Variables: 36
## $ order_number     <dbl> 100001, 100001, 100002, 100002, 100002, 10000...
## $ order_date       <dttm> 2004-01-12, 2004-01-12, 2004-01-12, 2004-01-...
## $ order_close_date <dttm> 2004-02-17, 2004-02-17, 2004-01-19, 2004-01-...
## $ order_ship_date  <dttm> 2004-01-19, 2004-02-17, 2004-01-19, 2004-01-...
## $ fin_year         <chr> "other", "other", "other", "other", "other", ...
## $ quarter_all      <chr> "04_Q1", "04_Q1", "04_Q1", "04_Q1", "04_Q1", ...
## $ quarter_sel      <chr> "other", "other", "other", "other", "other", ...
## $ order_method     <chr> "Sales visit", "Sales visit", "Fax", "Fax", "...
## $ retailer         <chr> "Kavanagh Sports", "Kavanagh Sports", "Ar fre...
## $ retailer_code    <dbl> 1398, 1398, 1760, 1760, 1760, 1760, 1760, 176...
## $ retailer_type    <chr> "Sports Store", "Sports Store", "Outdoors Sho...
## $ region           <chr> "Americas", "Americas", "Americas", "Americas...
## $ region2          <chr> "Americas", "Americas", "Americas", "Americas...
## $ country          <chr> "United States", "United States", "Brazil", "...
## $ city             <chr> "San Clara", "San Clara", "Sao Paulo", "Sao P...
## $ promotion_code   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ return           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, ...
## $ quantity         <dbl> 256, 92, 162, 172, 74, 90, 422, 3252, 1107, 8...
## $ unit_price       <dbl> 35.09, 110.00, 119.69, 40.52, 359.60, 81.55, ...
## $ unit_sale_price  <dbl> 33.69, 102.30, 111.31, 38.90, 334.43, 75.84, ...
## $ unit_cost        <dbl> 15.62, 49.69, 80.00, 23.53, 176.47, 39.00, 2....
## $ unit_prod_cost   <dbl> 24.62, 73.33, 80.00, 23.53, 176.47, 39.00, 3....
## $ unit_gross_marg  <dbl> 0.23, 0.33, 0.28, 0.40, 0.40, 0.50, 0.50, 0.6...
## $ revenue          <dbl> 8983.04, 10120.00, 19389.78, 6969.44, 26610.4...
## $ tot_prod_cost    <dbl> 3998.72, 4571.48, 12960.00, 4047.16, 13058.78...
## $ gross_profit     <dbl> 4984.32, 5548.52, 6429.78, 2922.28, 13551.62,...
## $ prod_numb        <dbl> 41110, 69110, 75110, 76110, 85110, 65110, 100...
## $ prod_line        <chr> "Camping Eqpt", "Personal Acces", "Personal A...
## $ prod_line_2      <chr> "Camping Eqpt", "Personal Acces", "Personal A...
## $ prod_type        <chr> "Lanterns", "Eyewear", "Knives", "Knives", "N...
## $ prod_name        <chr> "Flicker Lantern", "Polar Ice", "Edge Extreme...
## $ brand            <chr> "Firefly", "Polar", "Extreme", "Edge", "Extre...
## $ color            <chr> "Orange", "Black", "Red", "Black", "Black", "...
## $ size             <chr> "8-hour", "Men's", "14.5 cm", "33 cm", "Unspe...
## $ intro_date       <dttm> 2005-01-10, 1995-02-15, 2000-10-26, 1995-02-...
## $ halt_date        <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...

Remove original files and save

Last but not least, I can drop all original files and save the orders_tbl

# remove original files
rm(df, products, retailers, orders)

# Save orders as RDS
orders_tbl %>%
   write_rds("orders_tbl.rds")

Code Repository

The full R code can be found on my GitHub profile