Over the past several months I’ve been building what’s become a large Google Sheet (which is now at the 2M cell limit) tracking Find&Save Cash Dash offers. Like a lot of data projects this one started out small but over many months grew to a point where it no longer makes sense to cut/paste SQL data into a Google Sheet for pivot tables and charts. I wanted a solution that could pull data from REST APIs, MSSQL and Postgres and present data in an internal dashboard. Unfortunately, Google Sheet’s JDBC support doesn’t include Postgres otherwise Google Apps Script might have been a viable choice for automating data collection notwithstanding size limitations. Finally, another issue with Sheets was that I couldn’t seem to get pivot tables to resize automatically as more data was added. So, 3 strikes and Google Sheets was out.
Part of building mobile web apps is understanding the myriad of mobile analytics and in part visualizing the data to shed light on trends that my otherwise be difficult to see in tabular data or even a colorful cohort table. I’ve been building a dashboard using R, RStudio, Shiny, and Shiny Dashboards aggregating data from MSSQL, Postgres, Google Analytics, and Localytics.
Below is the main function to fetch the Localytics sample data and convert it into a data frame that’s suitable for plotting. Now, admittedly I’m not an R expert so there may well be better ways to slice this JSON response but this is a fairly straight forward approach. Essentially, this fetches the data, converts it from JSON to an R object, extracts the weeks, preallocates a matrix and then iterates over the data filling the matrix to build a data frame.
retentionDF <- function() {
# Example data from: http://docs.localytics.com/dev/query-api.html#query-api-example-users-by-week-and-birth_week
localyticsExampleJSON <- getURL('https://gist.githubusercontent.com/strefethen/180efcc1ecda6a02b1351418e95d0a29/raw/1ad93c22488e48b5e62b017dc5428765c5c3ba0f/localyticsexampledata.json')
cohort <- fromJSON(localyticsExampleJSON)
weeks <- unique(cohort$results$week)
numweeks <- length(weeks)
# Take the JSON response and convert it to a retention matrix (all numeric for easy conversion to a dataframe) like so:
# Weekly.Cohort Users Week.1
# 1 2014-12-29 7187 4558
# 2 2015-01-05 5066 NA
i <- 1
# Create a matrix big enough to hold all of the data
m <- matrix(nrow=numweeks, ncol=numweeks + 1)
for (week in weeks) {
# Get data for all weeks of this cohort
d <- cohort$results[cohort$results$birth_week==week,][,2]
lencohort <- length(d)
for (n in 1:lencohort) {
# Skip the first column using "+ 1" below which will be Weekly.Cohort (date)
m[i,n + 1] <- d[n]
}
i <- i + 1
}
# Convert matrix to a dataframe
df <- as.data.frame(m)
# Set values of the first column to the cohort dates
df$V1 <- weeks
# Set the column names accordingly
colnames(df) <- c("Weekly.Cohort", "Users", paste0("Week.", rep(1:(numweeks-1))))
return(df)
}
To make things easy I put together a gist and if you’re using R you can runGist it yourself. It requires several other packages so be sure to check the sources in case you’re missing any. Fair warning the Localytics API demo has very limited data so the chart, let’s just say simplistic however given many weeks worth of data it will fill out nicely (see example below).