
Django Friday Tips: Adding RSS feeds

Following my previous posts about RSS and its importance for an open web, this week I will try to show how we can add syndication to our websites and other apps built with Django.

This post is divided into two parts. The first one covers the basics:

  • Build an RSS feed based on a given model.
  • Publish the feed.
  • Attach that RSS feed to a given webpage.

The second part contains more advanced concepts that allow subscribers of our page/feed to receive real-time updates without having to continuously check our feed. It will cover:

  • Adding a WebSub / PubSubHubbub hub to our feed
  • Publishing the new changes/additions to the hub, so they can be sent to subscribers

So let's go.

Part one: Creating the Feed

The framework already includes tools to handle this, all of them well documented in the official Django documentation. Nevertheless, I will do a quick recap and leave a base example here that can be reused for the second part of this post.

So let's suppose we have the following models:

from django.db import models


class Author(models.Model):

    name = models.CharField(max_length=150)
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        verbose_name = "Author"
        verbose_name_plural = "Authors"

    def __str__(self):
        return self.name


class Article(models.Model):

    title = models.CharField(max_length=150)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    short_description = models.CharField(max_length=250)
    content = models.TextField()

    class Meta:
        verbose_name = "Article"
        verbose_name_plural = "Articles"

    def __str__(self):
        return self.title

As you can see, this is for a simple “news” page where certain authors publish articles.

According to the Django documentation about feeds, generating an RSS feed for that page requires adding the following Feed class to views.py (even though it can be placed anywhere, this file sounds appropriate):

from django.urls import reverse_lazy
from django.contrib.syndication.views import Feed
from django.utils.feedgenerator import Atom1Feed

from .models import Article


class ArticlesFeed(Feed):
    title = "All articles feed"
    link = reverse_lazy("articles-list")
    description = "Feed of the last articles published on site X."

    def items(self):
        return Article.objects.select_related().order_by("-created_at")[:25]

    def item_title(self, item):
        return item.title

    def item_author_name(self, item):
        return item.author.name

    def item_description(self, item):
        return item.short_description

    def item_link(self, item):
        return reverse_lazy('article-details', kwargs={"id": item.pk})


class ArticlesAtomFeed(ArticlesFeed):
    feed_type = Atom1Feed
    subtitle = ArticlesFeed.description

In the snippet above, we set some of the feed’s global properties (title, link, description), define in the items() method which entries will be placed in the feed, and finally add the methods that retrieve the contents of each entry.

So far so good, but what is the other class? Besides the standard RSS feed, Django can also generate an equivalent Atom feed. Since many people like to provide both, that is what we do there.

The next step is to add these feeds to our URLs, which is also straightforward:

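from django.urls import path

# Assumes this urls.py lives in the same app as the views.py above.
from .views import ArticlesAtomFeed, ArticlesFeed

urlpatterns = [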
    ...
    path('articles/rss', ArticlesFeed(), name="articles-rss"),
    path('articles/atom', ArticlesAtomFeed(), name="articles-atom"),
    ...
]

At this moment, if you try to visit one of those URLs, an XML response will be returned containing the feed contents.
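To get a feel for what a reader program does with that response, here is a minimal sketch using only the Python standard library. The sample document is hand-written in the general shape of an RSS 2.0 feed and is an assumption for illustration, not the exact output Django generates. The `read_entries` helper is a hypothetical name as well.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the general shape of an RSS 2.0 document.
# It is illustrative only, not the exact output produced by Django.
SAMPLE_RSS = """<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>All articles feed</title>
    <link>https://example.com/articles/</link>
    <description>Feed of the last articles published on site X.</description>
    <item>
      <title>First article</title>
      <link>https://example.com/articles/1/</link>
      <description>A short description.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/articles/2/</link>
      <description>Another short description.</description>
    </item>
  </channel>
</rss>
"""


def read_entries(rss_text):
    """Return a list of (title, link) tuples, one per <item> in the feed."""
    root = ET.fromstring(rss_text.encode("utf-8"))
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


entries = read_entries(SAMPLE_RSS)
for title, link in entries:
    print(f"{title}: {link}")
```

A real reader would download the XML from the feed URL on a schedule and compare the entries with the ones it has already seen.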

So, how can users find out that we have these feeds, which they can use in their reader software to get the new content of our website/app?

That is the final step of this first part. Either we provide the links to the user or we include them in the respective HTML page, using specific tags in the head element, like this:

<link rel="alternate" type="application/rss+xml" title="{{ rss_feed_title }}" href="{% url 'articles-rss' %}" />
<link rel="alternate" type="application/atom+xml" title="{{ atom_feed_title }}" href="{% url 'articles-atom' %}" />
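To illustrate how those tags are consumed, the sketch below shows one way a reader could perform auto-discovery with the Python standard library. The `FeedLinkParser` class, the `discover_feeds` helper, the sample page and the `example.com` domain are all assumptions made for this example. Real readers are likely more tolerant of malformed HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types that identify the two feed formats used in this post.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect <link rel="alternate"> tags that advertise a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "alternate" in rel and attributes.get("type") in FEED_TYPES and href:
            # Resolve relative hrefs against the URL of the page.
            self.feeds.append((attributes["type"], urljoin(self.base_url, href)))


def discover_feeds(html, base_url):
    """Return (mime_type, absolute_url) pairs for the feeds found in the page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


# Hypothetical rendered page, with the template tags already resolved.
PAGE = """<html><head>
<link rel="alternate" type="application/rss+xml" title="Articles (RSS)" href="/articles/rss" />
<link rel="alternate" type="application/atom+xml" title="Articles (Atom)" href="/articles/atom" />
<link rel="stylesheet" href="/static/site.css" />
</head><body></body></html>"""

feeds = discover_feeds(PAGE, "https://example.com/")
print(feeds)
```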

And that’s it, this first part is over. We currently have a feed and a mechanism for auto-discovery, things that other programs can use to fetch information about the data that was published.

Part Two: Real-time Updates

The feed works great, however readers need to continuously check it for new updates, which isn’t the ideal scenario. Neither for them, because if they forget to check regularly they will not be aware of the new content, nor for your server, since it will have to handle all of this extra workload.

Fortunately there is the WebSub protocol (previously known as PubSubHubbub), a “standard” used to deliver notifications to subscribers when there is new content.

It works like this: your server notifies an external hub (which handles the subscriptions) of the new content, and the hub then notifies all of your subscribers.
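As a rough sketch of the first half of that flow, the snippet below builds (but does not send) the kind of "publish" ping that PubSubHubbub-style hubs conventionally accept: a form-encoded POST carrying `hub.mode=publish` and the feed URL in `hub.url`. As far as I know the WebSub specification leaves this publisher-to-hub step up to each hub, so treat the parameter names as an assumption to confirm against your hub's documentation. The `build_publish_ping` helper and the `example.com` URL are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

HUB_URL = "https://pubsubhubbub.appspot.com"


def build_publish_ping(hub_url, topic_url):
    """Build (without sending) the POST request telling a hub that a feed changed."""
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_publish_ping(HUB_URL, "https://example.com/articles/atom")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
# Actually sending it would look like: urllib.request.urlopen(request, timeout=10)
```

The hub is then expected to fetch the feed itself and forward the new entries to every subscriber.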

Since this is a common standard, as you might expect there are already some Django packages that might help you with this task. Today we are going to use django-push with https://pubsubhubbub.appspot.com/ as the hub, to keep things simple (but you could/should use another one).

The first step, as always, is to install the new package:

$ pip install django-push

And then add the package’s Feed class to our views.py (and use it on our Atom feed):

from django_push.publisher.feeds import Feed as HubFeed

...

class ArticlesAtomFeed(ArticlesFeed, HubFeed):
    subtitle = ArticlesFeed.description

The reason I’m only applying this change to the Atom feed is that this package only works with this type of feed, as explained in the documentation:

… however its type is forced to be an Atom feed. While some hubs may be compatible with RSS and Atom feeds, the PubSubHubbub specifications encourages the use of Atom feeds.

This no longer seems to be true for the more recent protocol specifications, however for this post I will continue only with this type of feed.

The next step is to set up which hub we will use. In the settings.py file let's add the following line:

PUSH_HUB = 'https://pubsubhubbub.appspot.com'

With this done, if you make a request for your Atom feed, you will notice that the following element was added as a child of the root element of the XML response:

<link href="https://pubsubhubbub.appspot.com" rel="hub"></link>

Subscribers will use that information to subscribe for notifications on the hub. The last thing we need to do is to tell the hub when new entries/changes are available.

For that purpose we can use the ping_hub function. In this example, the easiest way to accomplish this task is to override the save() method of the Article model in the models.py file:

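from django.conf import settings
from django.urls import reverse_lazy

from django_push.publisher import ping_hub

...

class Article(models.Model):
    ...
    def save(self, *args, **kwargs):
        super().save(*args, **kwargs)
        # DOMAIN is a custom setting (not built into Django) holding the site's host name.
        ping_hub(f"https://{settings.DOMAIN}{reverse_lazy('articles-atom')}")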

And that’s it. Our subscribers can now be notified in real-time when there is new content on our website.
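For completeness, here is a minimal sketch of the subscriber's side, using only the standard library: it reads the `hub` and `self` links from an Atom document and prepares the form body of a WebSub subscription request (`hub.mode`, `hub.topic`, `hub.callback`). The sample feed is hand-written, and the helper names and the `reader.example` callback URL are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hand-written sample, trimmed to the links that matter for discovery.
SAMPLE_ATOM = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>All articles feed</title>
  <link href="https://example.com/articles/atom" rel="self"></link>
  <link href="https://pubsubhubbub.appspot.com" rel="hub"></link>
</feed>
"""


def find_links(atom_text, rel):
    """Return the href of every top-level <link> with the given rel value."""
    root = ET.fromstring(atom_text.encode("utf-8"))
    return [
        link.get("href")
        for link in root.findall(f"{ATOM_NS}link")
        if link.get("rel") == rel
    ]


def build_subscription_body(topic_url, callback_url):
    """Form-encoded body a subscriber would POST to the hub to subscribe."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
    })


hubs = find_links(SAMPLE_ATOM, "hub")
topics = find_links(SAMPLE_ATOM, "self")
body = build_subscription_body(topics[0], "https://reader.example/websub/callback")
print(hubs[0])
print(body)
```

After receiving such a request, the hub verifies the subscriber's intent by calling the callback URL, and only then starts delivering updates to it.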

By Gonçalo Valério

Software developer and owner of this blog. More in the "about" page.