EU-Free and Open Source Software Auditing project

Today I stumbled on this blog post about a poll for the EU-FOSSA. I'm not familiar with all aspects of this pilot project, but from the information I could gather, it seems to be a really great idea.

Most of us regularly use, to a certain degree, several pieces of free (as in freedom) software on a daily basis. Many of these projects are essential to ensure the security of our communications, documents and work. European institutions and countries make use of these tools as well, so why not spend a little time and money to ensure they meet certain quality goals and are free of major bugs that could undermine the safety of their users?

This will also increase the public's trust in these tools, so they can become standards instead of their proprietary counterparts, which we are unable to review or modify according to our needs, something that raises many security questions.

One of its components is a sample review of one open-source project, and until the 8th of July you can give your opinion on which one it should be. Go there, it only takes a minute, and it will help them understand that this is an important issue. Here is the link.

Test driving ZeroNet

A few weeks ago the "Decentralized Web Summit" took place in San Francisco. Even though there was a video stream available at the time, I wasn't able to watch it, but later I saw some excerpts. One of the talks that caught my attention was about a new thing called ZeroNet. It seemed to be some kind of network where the assets and contents of websites are fetched from your peers, while introducing clever mechanisms to give the owners control and to allow the existence of user-generated content. It borrows concepts from both bitcoin and bittorrent, but for a better explanation, below is an introduction by the creator of this technology:

The presentation is very high level, so on the website I found some slides with more details about how it works, and I must say it is very interesting from a technical perspective. It even has an address naming system (".bit") if you don't want gibberish in the address bar.

Watching the video, things seemed to be working pretty well (for something being presented for the first time), so I decided to join the network and give it a try. For those using docker it happens to be pretty easy, just run:

$ docker run -d -v <local_data_folder>:/root/data -p 15441:15441 -p 43110:43110 nofish/zeronet

then your node will be available on: http://127.0.0.1:43110/

After using it for two weekends, I have to say the level of polish of this project is amazing: all the pre-built apps work pretty well and are easy to use, the websites load super fast (at least compared with my expectations) and changes show up in real time. The most interesting aspect of all was the number of people trying and using it.

You may ask, what are the great advantages of using something like this? Based on what I’ve seen during these few days there are 3 points/use cases where this network shines:

  • Websites cannot be taken down: as long as there are peers serving them, they will be online.
  • Zero infrastructure costs (or pretty close) to run a website there: you create and sign the content, and the peers deliver it.
  • Websites that you visit remain available while you are offline.

So to test this network further, I will do an experiment. During the next few weeks/months I will mirror this blog and make the new contents available on ZeroNet, starting with this post. The address is:

http://127.0.0.1:43110/1PLZ7PjfX91VSMmzU5revwswmrEkTz6Mpk

Note: In this initial stage it might not always be available, since at the moment I'm the only peer, serving it from my laptop.

To know more about it, check the repository on Github.

Blogs, web feeds and the open web

As I have written before (twice), I'm a big supporter of an "ancient" and practically dead technology, or at least that's what many like to call it, that can still be found on the Internet. It is RSS, a very useful standard that is one of the foundations for publishing content to the Web in an open way (the way it was initially supposed to be).

Today, following a recent blog post by Seth Godin about reading more blogs and an easy way to get their new content, I want to get back to the subject and share a few thoughts I have on the matter. Without advocating a single solution, I want to explore and extend a few points made in that article.

First, I want to start with a basic explanation of how it works, at least from the user's point of view. The basic idea is that content creators, whether professionals or hobbyists, alongside the content displayed on their website, also publish a structured file that is not meant to be read by humans, with information about that content (and sometimes parts of the content itself). The usefulness of these files is that other entities can watch them and get a linear view of what was published over time. This way, people who want to follow or consume that content can use cool little apps that track their favorite authors and let them know when there is new stuff, along with other features, such as keeping track of what you have already read.
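To make this more concrete, here is a minimal sketch of what a feed reader does behind the scenes, using the feedparser library (the feed URL is just a placeholder):

import feedparser

# Fetch and parse the structured file (RSS/Atom) that the website publishes.
feed = feedparser.parse("https://example.com/feed.xml")

print(feed.feed.title)
for entry in feed.entries:
    # Each entry is one piece of content, in the order it was published.
    print(entry.title, "->", entry.link)

A real reader basically does this periodically for every feed you subscribe to and keeps track of which entries you have already seen.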

[RSS logo]

If you look for the icon shown above, you will find it on many websites; it is hugely used in many areas, from newspapers to blogs, from podcasts to iTunes, etc.

Overall, it is broadly used and lets people do cool things with it. However (this is where I start to converge on the topic of Seth's post), most big players don't have an interest in an open web and slowly, over time, they started dropping support for it and undermining its usefulness, because they want people to publish and consume content only inside their platforms, locking everyone's content into their services. Two examples are given in the post, Google and Facebook, but I'm sure there are others.

There are many benefits of following your favorite authors using these feeds, such as:

  • You control the content that you read. Let's face it, letting any middleman manipulate what kind of information you have access to is never a good thing, and it is not uncommon.
  • You are not stuck with a single interface where the content gets lost over time. You can choose your app and organize the content your own way.
  • A clear distinction between what you have already read and what you haven't.
  • Even if the content gets taken down, you might have your own copy. There is no risk of a service going out of business and everybody losing all their content.

One thing I've been seeing more often is people writing really nice content on a blog or platform that does not expose RSS feeds (like this one). This saddens me, because I do not remember every good source; if I can't add it to my feed reader, I will often forget to check for new content. It is relatively easy to add, most systems support it, and if yours is a custom one, there are plenty of libraries available to help you with that.

I understand the need to make the person visit the site in order to monetize the content; in those cases there is always the option of only adding the title and a small excerpt to the feed, and the reader will follow the link to reach the rest of the content.
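Since many posts here are about Django, here is a rough sketch of how such a "title plus excerpt" feed could be exposed using Django's built-in syndication framework (the Post model and its fields are made up for the example, and it is assumed to define get_absolute_url()):

from django.contrib.syndication.views import Feed
from django.utils.text import Truncator

from .models import Post  # hypothetical blog post model


class LatestPostsFeed(Feed):
    title = "My blog"
    link = "/blog/"
    description = "New posts from my blog."

    def items(self):
        # The most recent posts, newest first.
        return Post.objects.order_by("-published_at")[:20]

    def item_title(self, item):
        return item.title

    def item_description(self, item):
        # Only a small excerpt; the reader follows the link for the full text.
        return Truncator(item.body).words(50)

Then it only needs to be wired to a URL in the urlconf, and any feed reader can subscribe to it.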

Django Friday Tips: Timezone per user

Adding support for time zones to your website, so that its users can work using their own timezone, is a "must" nowadays. So in this post I'm going to try to show you how to implement a simple version of it. Even though Django's documentation is very good and complete, the only example given is how to store the timezone in the user's session after detecting (somehow) the user's timezone.

What if the user wants to store their timezone in their settings and have it used from then on every time they visit the website? To solve this, I'm going to take the example given in the documentation and, together with the simple django-timezone-field package/app, implement this feature.

First we need to install the dependency:

$ pip install django-timezone-field==2.0rc1

Add to the INSTALLED_APPS of your project:

INSTALLED_APPS = [
    ...,
    'timezone_field',
    ...
]

Then add a new field to the user model:

from timezone_field import TimeZoneField

class User(AbstractUser):
    timezone = TimeZoneField(default='UTC')

Handle the migrations:

$ python manage.py makemigrations && python manage.py migrate

Now we need to use this information. Based on the example in Django's documentation, we can add a middleware class that will get this information on every request and activate the desired timezone. It should look like this:

from django.utils import timezone


class TimezoneMiddleware():
    def process_request(self, request):
        if request.user.is_authenticated():
            timezone.activate(request.user.timezone)
        else:
            timezone.deactivate()

Add the new class to the project middleware:

MIDDLEWARE_CLASSES = [
    ...,
    'your.module.middleware.TimezoneMiddleware',
    ...
]

Now it should be ready to use: all your forms will convert the received input (in the user's timezone) to UTC, and templates will convert values from UTC to the user's timezone when rendered. For other conversions and more complex implementations, check the available methods.
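For code paths outside forms and templates, a small illustrative sketch of doing the conversion manually with the same utilities:

from django.utils import timezone

# timezone.now() returns an aware datetime in UTC (with USE_TZ = True).
created_at = timezone.now()

# Convert it to the currently active timezone, which is the one activated
# by the middleware above for the logged-in user.
local_created_at = timezone.localtime(created_at)
print(local_created_at.isoformat())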

Receive PGP encrypted emails, without the sender needing to know how to do it

One common problem for people trying to secure their email communications with PGP is that, more often than not, the other end doesn't know how to use this kind of tool. I'll be honest, at its current state the learning curve is too steep for the common user. This causes a great deal of trouble when you want to receive/send sensitive information in a secure manner.

I will give you an example: a software development team helping a customer build their web business or application may need to receive a wide variety of access keys for external services and APIs, which are in the customer's possession and are required (or useful) for the project.

Let's assume that the customer is not familiar with encryption tools; the probability of that sensitive material being shared in an insecure way is very high, as they might send it through a clear-text email or post it in some shared document (or file). Both of the previous situations are red flags, either because the communication channel is not secure enough or because multiple copies of the information may end up in different places with doubtful security, all of them in clear text.

In our recent "Whitesmith Hackathon", one of the projects tried to address this issue. We thought of a more direct approach to this situation, based on the assumption that you will not be able to convince the customer to learn this kind of thing. We called it Hawkpost; essentially it's a website that makes use of OpenPGP.js, where you create unique links containing a form that the other person uses to submit any information, which is then encrypted in their browser with your public key (without the need to install any extra software) and forwarded to your email address.

You can test and use it at https://hawkpost.co, but the project is open source, so you can change it and deploy it on your own server if you prefer. It's still in a green state at the moment, but we will continue improving the concept according to the feedback we receive. Check it out and tell us what you think.
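Just to illustrate the underlying idea (this is not what Hawkpost does, since it encrypts in the browser with OpenPGP.js), here is a rough sketch of encrypting a secret for someone's public key on your own machine with the python-gnupg wrapper; the key file name is made up and a local GnuPG installation is assumed:

import gnupg

gpg = gnupg.GPG()

# Import the recipient's public key (hypothetical file name).
with open("developer_public_key.asc") as key_file:
    import_result = gpg.import_keys(key_file.read())

# Encrypt the secret so only the holder of the matching private key can read it.
encrypted = gpg.encrypt(
    "the API key or password",
    import_result.fingerprints[0],
    always_trust=True,  # skip the trust check for this imported key
)

# The ASCII-armored output is what ends up being sent by email.
print(str(encrypted))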

Log based analytics are still useful

A long time ago, most modern website analytics software shifted from relying on server logs to using client-side code snippets to gather information about the user; in this last category we can include Google Analytics and Piwik as examples. This paradigm allows collecting more detailed information about the website's visitors and gives developers more flexibility, however it can also be seen as website owners imposing the execution of code on the user's device against their will, undermining their privacy (some people go as far as putting it in the same category as malware). Log-based analytics software, last time I checked, is seen as a museum relic from the 90s and early 00s.

However, as explained in a blog post named "Why 'Ad Blockers' Are Also Changing the Game for SaaS and Web Developers" and further discussed by the Hacker News community, we might need to look again at the server-side approach, since the recent trend of using ad blockers (which is entirely legitimate, given the excesses of the industry) can undermine the usefulness of the client-side method, given that most of the time the loading of the snippet and the extra requests it requires are blocked. This is why server-side analytics can be very handy again, allowing us to measure the "Ghost Traffic", as it is called in the article.

A very high level overview of both methods can be described like this:

Client-side:

  • Pros:
    • Lots of information
    • Easy to setup
  • Cons:
    • Extra requests and traffic
    • Can be blocked by browser extensions
    • The use of a third party entity raises some privacy concerns.

Server-side:

  • Pros:
    • Cannot be blocked
    • Does not pose a privacy concern since it only records the requests for the website “pages” made by the user.
  • Cons:
    • Less detailed information
    • If the server is behind a CDN, not all requests will hit the server.

The main issues with log-based tools are that they look ancient, some haven't seen an update for a while and they can take some work to set up. Nevertheless, they can definitely be very useful to understand the extent of blocker usage among your visitors, and even for the cases when we just need simple numbers. They also set aside the privacy discussion, since they only monitor the activity of the servers.

That's the case with this blog: I do not run any analytics software here (because I do not see the need, given its purpose) and, when I'm curious about the traffic, I use a very cool tool called GoAccess, which goes over the nginx logs and generates some nice reports.
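As an example, generating an HTML report from the logs can be as simple as the following command (assuming the default nginx access log location):

$ goaccess /var/log/nginx/access.log --log-format=COMBINED -o report.html

Running it without the -o flag opens an interactive dashboard in the terminal instead.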

Give it a look; perhaps you don't need Google Analytics everywhere, or its results might not be as accurate as you think, especially if your audience has a significant percentage of tech-savvy people.

Django Friday Tips: Secret Key

One thing that is always generated for you when you start a new django project is the SECRET_KEY string. This value is described in the documentation as:

A secret key for a particular Django installation. This is used to provide cryptographic signing, and should be set to a unique, unpredictable value.

The rule book mandates that this value should not be shared or made public, since that would defeat its purpose and undermine many security features used by the framework. Given that in any modern web development process we have multiple environments, such as production and staging, or cases where we deploy the same codebase several times for different purposes, we need to generate distinct values for this variable, so we can't rely solely on the one generated when the project was started.

There is no official way to generate new values for the secret key, but with a basic search on the Internet you can find several sources and code snippets for this task. So which one should you use? The Django implementation uses a length of 50 characters, chosen randomly from an alphabet of size 50 as well, so we might start with that as a requirement. Better yet, why not call the same function that django-admin.py uses?

So for a new project, the first thing to do is to replace this:

SECRET_KEY = "uN-pR3d_IcT4~ble!_Str1Ng..."

With this:

SECRET_KEY = os.environ.get("SECRET_KEY", None)

Then for each deployment we can generate a distinct value for it using a simple script like this one:

from django.utils.crypto import get_random_string

chars = 'abcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*(-_=+)'
print("SECRET_KEY={}".format(get_random_string(50, chars)))

Usage:

$ python script_name.py >> .env

Some people think the default function is not random enough and have proposed a different alternative (that also works); if you feel the same way, check this script.
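If you prefer not to depend on Django's helper at all, a rough equivalent using only the standard library (a sketch, not the script linked above) could be:

import random
import string

# Use the operating system's CSPRNG directly, via SystemRandom.
rng = random.SystemRandom()
chars = string.ascii_lowercase + string.digits + '!@#$%^&*(-_=+)'
print("SECRET_KEY={}".format("".join(rng.choice(chars) for _ in range(50))))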

0 A.D: a pleasant surprise

When I was younger, I was a great fan of real-time strategy games, especially those based on history. One of the main reasons I was really happy when I got my first computer was that, from that moment, I would be able to play the first "Age of Empires" game, which my dad bought together with the computer. For months I saved 100% of my allowance, just to be able to buy the first expansion pack, "Rise of Rome". In the years that followed, I also bought the second version of the game and its expansion pack, spending countless hours playing them.

More than a decade later, which I went through without playing games (at least not on a regular basis), I decided to find some RTS of this genre to play. Since the Age of Empires series does not run on Linux-based operating systems, I had to start looking for similar alternatives. It didn't take long to find the first contender, called 0 A.D.; the game is open source and, from the content shown on the website, it looked like just what I was looking for.

In the game you can choose between 8 factions/civilizations from ancient times (the website says that in the final release there will be 12), each of them with special characteristics, strengths and weaknesses. The idea is that these civilizations had their peak between 500 B.C. and 500 A.D., leaving many more contenders on the waiting list to be added to the possible choices.

The game is in 3D, you have control over the camera and can adjust it to the best angle in any given situation. The graphics look pretty good, making the game a nice experience. Another aspect that I really liked is that, even though there are specialized units, many of them can assume roles in both worlds (the military and the civilian), which opens up a whole range of possibilities.

According to the development team, the game is still in "alpha", or in other words "far from completion", however it is already playable in both single and multi-player (during the few hours I spent playing it I didn't find any annoying issues).

So if you like this kind of game, give it a try. The official page of the game, where you can download the latest version, is play0ad.com. On Debian (testing) you can install it with apt, since the repositories are up to date.
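If I'm not mistaken, the Debian package is simply called 0ad, so installing it should be just:

$ sudo apt install 0ad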

Managing secrets

A few hours ago, I published a small article on Whitesmith's blog about sharing and managing secrets inside a software development environment. First I dig a little into this very common problem, and later I explain how we are addressing these issues. You can check it through the following link:

Managing Secrets (www.whitesmith.co/blog/managing-secrets/)

I mention some tools in the article that are very interesting in this area, but a more detailed analysis or walk-through was left for a future post, as we become more familiar with them.

Browsing folders of markdown files

If you are like me, you have a bunch of notes and documents written in markdown spread across many folders. Even the documentation of some projects involving many people is done this way and stored, for example, in a git repository. While it is easy to open a text editor to read these files, it is not the most pleasant experience, since the markup language was made to later generate readable documents in other formats (e.g. HTML).

For many purposes, setting up the required configuration for documentation generation tools (like mkdocs) is not practical, nor was that the initial intent when the documents were written. So last weekend I took a couple of hours and built a rough (and dirty) tool to help me navigate and read markdown documents with a more pleasant experience, using the browser (with GitHub-like styling).

I called it mdvis and it is available for download through "pip".
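Installing it should be as simple as:

$ pip install mdvis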

It does not provide many features and is somewhat "green", but it serves my current purposes. The program is open source, so you can check it here in case you want to help improve it.