I use the 511.org API, and one request I see come up frequently on the support forum is to increase the number of API requests allowed for an API key. The answer is pretty straightforward: the 511 API is not intended to serve as your application’s backend. If you want to serve the data to a large number of users, you’ll need to use your API key to fetch the data, cache it on your own backend, and serve it from there. You can then use your API key to periodically refresh your cached data.
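A minimal sketch of that pattern (the endpoint, agency, and cache path below are just placeholders; check the 511.org Open Data docs for the operation you actually need) is to refresh a cached copy on a schedule and have your backend serve that file instead of hitting 511 directly:
$ curl -s --compressed "https://api.511.org/transit/VehicleMonitoring?api_key=$API_KEY&agency=SF" -o /var/cache/myapp/vehicle-monitoring.json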
Another question that often comes up is access to historic data, which, as of this writing, the API simply doesn’t provide. I doubt the 511.org team will offer historic data anytime soon (if ever). If you’re looking for historic data, the answer is to build your own backend and archive the data yourself.
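For example (again, the schedule, feed, and paths are placeholders), a cron entry that snapshots a feed to a timestamped file gives you a simple archive you can analyze later:
*/5 * * * * curl -s --compressed "https://api.511.org/transit/VehicleMonitoring?api_key=<your-api-key>&agency=SF" -o /data/511-archive/vehicle-monitoring-$(date +\%Y\%m\%dT\%H\%M\%S).json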
To request an increased rate limit, you can provide the following information to transitdata@511.org:
your API key
the rate limit you need
a description of the use case
data endpoints you intend to use
a brief justification for the requested rate limit
With the imminent demise of Google’s Universal Analytics in July 2023 I’ve been searching for an alternative and found an open source project from Plausible.io. Slight warning: this post is largely notes to myself to remember what I did to get this going, so YMMV.
I host a few sites on Amazon Lightsail, and while I found a few great posts discussing self-hosting Plausible, those setups seemed a bit more complicated than they needed to be. For example, Lightsail has instances that include both a web server and Bitnami’s Let’s Encrypt SSL tooling already installed, so starting from an OS-only instance type requires some unnecessary configuration.
To get started quickly, browse to the Amazon Lightsail console and, under Apps + OS, select the Node.js instance running Debian (as of this writing, Feb 2023). You could use an “OS Only” option, but then you’d be left installing a number of packages yourself vs. using this pre-built instance.
You’ll want to set up a static IP address for this instance for your DNS entry (see below). Next, SSH into your instance and create an “apps” folder under /opt/bitnami:
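Roughly, following the Plausible self-hosting guide (as of early 2023; the repo layout and variable names may have changed since, so treat this as a sketch):
$ sudo mkdir -p /opt/bitnami/apps
$ cd /opt/bitnami/apps
$ sudo git clone https://github.com/plausible/hosting plausible
$ cd plausible
$ openssl rand -base64 64 | tr -d '\n'
Then edit plausible-conf.env along these lines, using the openssl output for SECRET_KEY_BASE:
BASE_URL=https://analytics.<domain>.com
SECRET_KEY_BASE=<output of the openssl command above>
# Leave DISABLE_REGISTRATION unset until you've registered your first user (see below)
DISABLE_REGISTRATION=true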
NOTE: I added DISABLE_REGISTRATION as I’m the only user who needs access.
I use Cloudflare for my CDN/DNS, and I created a DNS-only A record for analytics.<domain>.com pointing to the instance’s static IP, which I use for Plausible.
Next, again following the self-hosting guide, I started up the server using docker-compose:
$ sudo docker-compose up -d
The docs weren’t super clear about creating the first user, but just browse to your URL and register the first user directly on the site. After that, I’d recommend disabling registrations in plausible-conf.env using:
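DISABLE_REGISTRATION=true
(Restart the containers, e.g. sudo docker-compose down && sudo docker-compose up -d, for the change to take effect.)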
Finally, set up your Apache virtual host config by copying the existing sample HTTPS configuration:
$ cd /opt/bitnami/apache/conf/vhosts
$ sudo cp sample-https-vhost.conf.disabled plausible-https-vhost.conf
Here’s my Apache vhost file for Plausible (using an X-Robots-Tag header to keep the analytics site out of search indexes):
<VirtualHost 127.0.0.1:443 _default_:443>
  ServerName analytics.<domain>.com
  ServerAlias *
  SSLEngine on
  Header Set X-Robots-Tag "noindex, noarchive, nosnippet"
  SSLCertificateFile "/opt/bitnami/apache/conf/analytics.<domain>.com.crt"
  SSLCertificateKeyFile "/opt/bitnami/apache/conf/analytics.<domain>.com.key"
  DocumentRoot /opt/bitnami/apps/plausible
  # BEGIN: Configuration for letsencrypt
  Include "/opt/bitnami/apps/letsencrypt/conf/httpd-prefix.conf"
  # END: Configuration for letsencrypt
  # BEGIN: Support domain renewal when using mod_proxy without Location
  <IfModule mod_proxy.c>
    ProxyPass /.well-known !
  </IfModule>
  # END: Support domain renewal when using mod_proxy without Location
  <Directory "/opt/bitnami/apps/plausible">
    Options -Indexes +FollowSymLinks -MultiViews
    AllowOverride All
    Require all granted
  </Directory>
  ProxyPass / http://localhost:8000/
  ProxyPassReverse / http://localhost:8000/
  # BEGIN: Support domain renewal when using mod_proxy within Location
  <Location /.well-known>
    <IfModule mod_proxy.c>
      ProxyPass !
    </IfModule>
  </Location>
  # END: Support domain renewal when using mod_proxy within Location
</VirtualHost>
Following this suggestion, I created a robots.txt file in /opt/bitnami/apps/plausible with the following contents. Note that /js/ needs to be allowed so crawlers can still fetch the Plausible script on pages that use it:
User-agent: *
Allow: /js/
Disallow: /
I then updated the docker-compose.yaml, adding a volumes mapping to this local file under the plausible service:
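Something along these lines; the in-container path is an assumption that varies by Plausible version, so point it at wherever your image actually serves static assets from:
services:
  plausible:
    # ...existing service definition from the hosting repo...
    volumes:
      # In-container path is an assumption; adjust to your Plausible image
      - ./robots.txt:/app/lib/plausible-<version>/priv/static/robots.txt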
Lastly, restart both Plausible and Apache, and you should be well on your way to having your Google Analytics alternative running:
$ sudo docker-compose down --remove-orphans && sudo docker-compose up -d
$ sudo /opt/bitnami/ctlscript.sh restart apache
All in all, the above took me less than an hour to get a working, self-hosted instance of Plausible up and running for $20/month on a 4GB Lightsail instance.
So far, I’ve been running this instance for six days and it looks pretty good. I’m getting a 500 error trying to import my old GA data, so I’m not quite sure what’s going on there. I did manage to get Google Search Console enabled, but I’m not seeing where that data shows up yet. I’ve also set up email reports and I’m curious to see what those look like.
Lastly, this project is funded by paid subscriptions to its cloud-hosted version, and assuming it works out for me I’ll be sure to contribute via their sponsorship page.
Over the past several months I’ve been building what’s become a large Google Sheet (now at the 2M-cell limit) tracking Find&Save Cash Dash offers. Like a lot of data projects, this one started out small but over many months grew to a point where it no longer made sense to cut and paste SQL data into a Google Sheet for pivot tables and charts. I wanted a solution that could pull data from REST APIs, MSSQL, and Postgres and present it in an internal dashboard. Unfortunately, Google Sheets’ JDBC support doesn’t include Postgres; otherwise Google Apps Script might have been a viable choice for automating data collection, size limitations notwithstanding. Finally, I couldn’t seem to get pivot tables to resize automatically as more data was added. So, three strikes and Google Sheets was out.
Part of building mobile web apps is understanding the myriad mobile analytics available, and in part visualizing the data to shed light on trends that may otherwise be difficult to see in tabular data or even a colorful cohort table. I’ve been building a dashboard using R, RStudio, Shiny, and Shiny Dashboards that aggregates data from MSSQL, Postgres, Google Analytics, and Localytics.
Below is the main function to fetch the Localytics sample data and convert it into a data frame suitable for plotting. Admittedly, I’m not an R expert, so there may well be better ways to slice this JSON response, but this is a fairly straightforward approach. Essentially, it fetches the data, converts the JSON to an R object, extracts the weeks, preallocates a matrix, and then iterates over the data filling the matrix to build a data frame.
# Requires RCurl (getURL) and jsonlite (fromJSON)
library(RCurl)
library(jsonlite)

retentionDF <- function() {
  # Example data from: http://docs.localytics.com/dev/query-api.html#query-api-example-users-by-week-and-birth_week
  localyticsExampleJSON <- getURL('https://gist.githubusercontent.com/strefethen/180efcc1ecda6a02b1351418e95d0a29/raw/1ad93c22488e48b5e62b017dc5428765c5c3ba0f/localyticsexampledata.json')
  cohort <- fromJSON(localyticsExampleJSON)
  weeks <- unique(cohort$results$week)
  numweeks <- length(weeks)
  # Take the JSON response and convert it to a retention matrix (all numeric
  # for easy conversion to a data frame) like so:
  #   Weekly.Cohort Users Week.1
  # 1    2014-12-29  7187   4558
  # 2    2015-01-05  5066     NA
  i <- 1
  # Create a matrix big enough to hold all of the data
  m <- matrix(nrow=numweeks, ncol=numweeks + 1)
  for (week in weeks) {
    # Get data for all weeks of this cohort
    d <- cohort$results[cohort$results$birth_week==week,][,2]
    lencohort <- length(d)
    for (n in 1:lencohort) {
      # Skip the first column using "+ 1" below which will be Weekly.Cohort (date)
      m[i,n + 1] <- d[n]
    }
    i <- i + 1
  }
  # Convert matrix to a data frame
  df <- as.data.frame(m)
  # Set values of the first column to the cohort dates
  df$V1 <- weeks
  # Set the column names accordingly
  colnames(df) <- c("Weekly.Cohort", "Users", paste0("Week.", rep(1:(numweeks-1))))
  return(df)
}
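For completeness, here’s one way you might chart the resulting data frame (this isn’t from the original dashboard; it assumes the ggplot2, tidyr, and scales packages are installed and uses the column names produced by retentionDF()):
library(ggplot2)
library(tidyr)

df <- retentionDF()
# Reshape to long format: one row per cohort / week-number pair
long <- pivot_longer(df, cols = starts_with("Week."),
                     names_to = "Week", names_prefix = "Week.",
                     values_to = "Retained", values_drop_na = TRUE)
long$Week <- as.integer(long$Week)
# Plot each cohort's retention as a percentage of its starting users
ggplot(long, aes(x = Week, y = Retained / Users, colour = Weekly.Cohort)) +
  geom_line() +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Weeks since cohort start", y = "Retention", colour = "Cohort")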
To make things easy I put together a gist, and if you’re using R you can runGist it yourself. It requires several other packages, so be sure to check the sources in case you’re missing any. Fair warning: the Localytics API demo has very limited data, so the chart is, let’s just say, simplistic; given many weeks’ worth of data, however, it will fill out nicely (see example below).