
Log based analytics are still useful

A long time ago, most modern website analytics software shifted from relying on server logs to using client-side code snippets to gather information about the user; Google Analytics and Piwik are examples of this second category. This paradigm collects more detailed information about the website’s visitors and gives developers more flexibility. However, it can also be seen as website owners imposing the execution of code on the user’s computing device, against their will and at the expense of their privacy (some people go as far as putting it in the same category as malware). Log-based analytics software, last time I checked, is seen as a museum relic from the 90s and early 00s.

However, as explained in a blog post named: Why “Ad Blockers” Are Also Changing the Game for SaaS and Web Developers, and further discussed by the Hacker News community, we might need to look again at the server-side approach. The recent trend of using ad blockers (which is entirely legitimate, given the excesses of the industry) may be undermining the usefulness of the client-side method, since most of the time the loading of the snippet and the extra requests it requires are blocked. This is why server-side analytics can be very handy again, allowing us to measure the “Ghost Traffic”, as it is called in the article.

A very high level overview of both methods can be described like this:

Client-side:

  • Pros:
    • Lots of information
    • Easy to set up
  • Cons:
    • Extra requests and traffic
    • Can be blocked by browser extensions
    • The use of a third-party entity raises some privacy concerns

Server-side:

  • Pros:
    • Cannot be blocked
    • Does not pose a privacy concern, since it only records the requests for the website “pages” made by the user
  • Cons:
    • Less detailed information
    • If the server is behind a CDN, not all requests will hit the server

The main issues with log-based tools are that they look ancient, some haven’t seen an update for a while, and they can take some work to set up. Nevertheless, they can definitely be very useful for understanding the extent of blocker usage among visitors, and even for cases where we just need simple numbers. They also put aside the privacy discussion, since they only monitor the activity of the servers.
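As a rough illustration of how the extent of blocking could be estimated, here is a minimal Python sketch that compares the page views seen in the server logs with the page views reported by a client-side tool for the same period. The function name and the numbers are hypothetical, and the result is only a ballpark figure: it assumes bots and crawlers have already been filtered out of the server-side count, and that no CDN is absorbing part of the requests.

```python
def ghost_traffic_share(server_page_views: int, client_page_views: int) -> float:
    """Fraction of server-logged page views that the client-side tool never saw.

    Both counts are assumed to cover the same pages and the same time window.
    """
    if server_page_views <= 0:
        raise ValueError("server_page_views must be positive")
    # If the client-side tool reports more views than the logs, treat the
    # difference as zero instead of returning a negative share.
    missing = max(server_page_views - client_page_views, 0)
    return missing / server_page_views


if __name__ == "__main__":
    # Made-up numbers, just to show the calculation.
    share = ghost_traffic_share(server_page_views=1000, client_page_views=720)
    print(f"Estimated share of blocked visits: {share:.0%}")
```

With these made-up numbers, 280 of the 1000 logged page views never reached the client-side tool.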

That’s the case with this blog: I do not run any analytics software here (because I do not see the need, given its purpose), and when I’m curious about the traffic I use a very cool tool called GoAccess, which goes over the nginx logs and generates some nice reports.
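To give an idea of what “going over the logs” means, here is a small Python sketch that counts page views from access log lines. It is not how GoAccess works internally, just an illustration. It assumes nginx is writing its default “combined” log format; the list of asset suffixes and the sample lines (with documentation-range IP addresses) are made up for the example.

```python
import re
from collections import Counter

# nginx "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: requests for these file types are assets, not "pages".
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico", ".svg")


def count_page_views(lines):
    """Count successful GET requests per path, ignoring static assets."""
    views = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match.group("method") != "GET" or match.group("status") != "200":
            continue
        path = match.group("path").split("?", 1)[0]  # drop the query string
        if path.lower().endswith(ASSET_SUFFIXES):
            continue
        views[path] += 1
    return views


SAMPLE_LINES = [
    '203.0.113.7 - - [10/Oct/2016:13:55:36 +0000] "GET /blog/post-1/ HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2016:13:55:37 +0000] "GET /static/style.css HTTP/1.1" 200 812 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2016:14:02:10 +0000] "GET /blog/post-1/?ref=hn HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2016:14:02:45 +0000] "GET /about/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '192.0.2.9 - - [10/Oct/2016:14:05:01 +0000] "GET /missing HTTP/1.1" 404 162 "-" "curl/7.50"',
    '192.0.2.9 - - [10/Oct/2016:14:05:09 +0000] "POST /comment HTTP/1.1" 200 31 "-" "curl/7.50"',
]

if __name__ == "__main__":
    for path, hits in count_page_views(SAMPLE_LINES).most_common():
        print(hits, path)
```

On a real server the lines would come from the access log file instead of a hard-coded list, for example by passing an open file object to `count_page_views`. A dedicated tool covers much more than this (visitors, referrers, status codes, and so on), which is why I prefer to use one.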

Give it a look. Perhaps you don’t need Google Analytics everywhere, or its results might not be as accurate as you think, especially if your audience has a significant percentage of tech-savvy people.

By Gonçalo Valério

Software developer and owner of this blog. More in the "about" page.