#Propensity Modelling #Data Exploration #ML Interpretability #Model Selection #Optimisation #Machine Learning

Propensity Modelling - Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Abridged Version

In this day and age, a business that leverages data to understand the drivers of its customers’ behaviour has a true competitive advantage. Organisations can dramatically improve their performance in the market by analysing customer-level data effectively and focusing their efforts on those customers who are more likely to engage. One tried and tested approach to teasing this type of insight out of data is Propensity Modelling, which combines information such as a customer’s demographics (age, race, religion, gender, family size, ethnicity, income, education level), psychographics (social class, lifestyle and personality characteristics), engagement (emails opened, emails clicked, searches on mobile app, webpage dwell time, etc. ...

#Data Wrangling #Data Exploration

Segmenting with Mixed Type Data - Initial data inspection and manipulation

With the new year, I started to look for new employment opportunities and even managed to land a handful of final-stage interviews before it all ground to a halt following the coronavirus pandemic. Invariably, as part of the selection process I was asked to analyse a set of data and compile a number of data-driven recommendations to present in my final meeting. In this post I retrace the steps I took for one of the take-home analyses I was tasked with and revisit clustering, one of my favourite analytic methods. ...

#Data Wrangling #Data Exploration #Propensity Modelling

Propensity Modelling - Using h2o and DALEX to Estimate the Likelihood of Purchasing a Financial Product - Data Preparation and Exploratory Data Analysis

In this day and age, a business that leverages data to understand the drivers of its customers’ behaviour has a true competitive advantage. Organisations can dramatically improve their performance in the market by analysing customer-level data effectively and focusing their efforts on those customers who are more likely to engage. One tried and tested approach to teasing out this type of insight is Propensity Modelling, which combines information such as a customer’s demographics (age, race, religion, gender, family size, ethnicity, income, education level), psychographics (social class, lifestyle and personality characteristics), engagement (emails opened, emails clicked, searches on mobile app, webpage dwell time, etc. ...

#Machine Learning #Time Series #Forecasting #Data Exploration

Time Series Machine Learning Analysis and Demand Forecasting with H2O & TSstudio

Traditional approaches to time series analysis and forecasting, like Linear Regression, Holt-Winters Exponential Smoothing, ARMA/ARIMA/SARIMA and ARCH/GARCH, have been well established for decades and find applications in fields as varied as business and finance (e.g. predicting stock prices and analysing trends in financial markets), the energy sector (e.g. forecasting electricity consumption) and academia (e.g. measuring socio-political phenomena). More recently, the popularisation and wider availability of open source frameworks like Keras, TensorFlow and scikit-learn have helped machine learning approaches like Random Forest, Extreme Gradient Boosting, Time Delay Neural Network and Recurrent Neural Network gain momentum in time series applications. ...

#Data Wrangling #Data Exploration

Loading, Merging and Joining Several Datasets - PostgreSQL EDT

This is the coding necessary to assemble the various data feeds and sort out the likes of variable naming and new feature creation, plus some general housekeeping tasks. To simulate the normal working conditions I would face if the data were stored in a database, I’ve uploaded the Excel files to a local PostgreSQL database that I’ve created on my machine. I am going to go through the steps I followed to set up a connection between RStudio and said database and extract the information I needed. ...

#Data Wrangling #Data Exploration

Loading, Merging and Joining Several Datasets - Excel EDT

This is the minimal coding necessary to assemble various data feeds and sort out the likes of variable naming and new feature creation, plus some general housekeeping tasks. These include shortening variable names to ease visualisation, creating essential new features and sorting out variable order. The Dataset library(tidyverse) library(lubridate) library(readr) The dataset I’m using here accompanies a Redbooks publication called Building 360-Degree Information Applications, which is available as a free PDF download. ...