Tag: worker-planet

  • Meet the InfoSec Planet

    If you are a frequent reader of this blog, you might already know that I created a small tool called worker-planet, which generates a simple webpage plus an RSS feed from the content of multiple other RSS sources.

    This type of tool is often known as a “planet”:

    In online media a planet is a feed aggregator application designed to collect posts from the weblogs of members of an internet community and display them on a single page.

    Wikipedia

    While the tool is open-source, a person needs to deploy it before being able to see it in action. Not great.

    This brings us to last week. I was reading a recent issue of a popular newsletter when I found an OPML file containing 101 infosec-related sources curated by someone else.

    Instead of adding them to my newsreader, which, to be honest, already contains a lot of cruft that I never read and should remove anyway, I saw a great opportunity to build a demo site for `worker-planet`.

    Preparing the sources

    The first step was to extract all the valid sources from that file. This is important because many of the items might no longer be working, or even online at all, since the file is more than 2 years old.

    A quick Python script can help us with this task:

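    import xml.etree.ElementTree as ET
    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen
    
    opml_file = "sources.opml"  # placeholder path to the OPML file
    
    # Extract existing URLs
    urls = []
    tree = ET.parse(opml_file)
    for element in tree.getroot().iter("outline"):
        if url := element.get("xmlUrl"):
            urls.append(url)
    
    # Make sure they are working: the request must succeed
    # and the response body must parse as XML
    def check_feed(url):
        try:
            response = urlopen(url)
            if 200 <= response.status < 300:
                body = response.read().decode("utf-8")
                ET.fromstring(body)
                return url
        except Exception:
            pass
    
    # Check the feeds concurrently, since most of the time is spent waiting on the network
    working_urls = []
    with ThreadPoolExecutor(max_workers=20) as executor:
        for result in executor.map(check_feed, urls):
            if result:
                working_urls.append(result)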

    As expected, of the 101 sources present in the file, only 54 seem to be working.

    Deploying

    Now that we have the inputs we need, it is time to set up and deploy our worker-planet.

    Assuming there aren’t any customizations, we just have to copy wrangler.toml.example to a new wrangler.toml file and fill in the settings as desired. Here’s the one I used:

    name = "infosecplanet"
    main = "./worker/script.js"
    compatibility_date = "2023-05-18"
    node_compat = true
    account_id = "<my_id>"
    
    workers_dev = true
    kv_namespaces = [
        { binding = "WORKER_PLANET_STORE", id = "<namespace_id_for_prod>", preview_id = "<namespace_id_for_dev>" },
    ]
    
    [vars]
    FEEDS = "<all the feed urls>"
    MAX_SIZE = 100
    TITLE = "InfoSec Planet"
    DESCRIPTION = "A collection of diverse security content from a curated list of sources. This website also serves as a demo for \"worker-planet\", the software that powers it."
    CUSTOM_URL = "https://infosecplanet.ovalerio.net"
    CACHE_MAX_AGE = "300"
    
    [triggers]
    crons = ["0 */2 * * *"]
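
    The FEEDS value above has to be built from the list of working URLs. A minimal sketch of that step follows. It assumes FEEDS takes a comma-separated list, which is my assumption rather than something stated here, so check wrangler.toml.example for the exact format. The helper name `build_feeds_value` and the sample URLs are hypothetical.

```python
def build_feeds_value(urls):
    """Join feed URLs into one comma-separated string, dropping duplicates.

    Assumption: the FEEDS setting expects a comma-separated list.
    """
    seen = set()
    unique = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    return ",".join(unique)


# Hypothetical sample standing in for the `working_urls` list built earlier.
sample_urls = [
    "https://example.com/feed.xml",
    "https://example.org/rss",
    "https://example.com/feed.xml",
]
print(f'FEEDS = "{build_feeds_value(sample_urls)}"')
```

    The printed line can then be pasted into the [vars] section of wrangler.toml.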

    Then run npm run build followed by npm run deploy. And it is done: the new planet should now be accessible through my workers.dev subdomain.

    All that is left is to wait for the cron job to execute (every two hours with the schedule above) and to configure any custom routes or domains on Cloudflare’s dashboard.

    The final result

    The new “InfoSec Planet” is available at https://infosecplanet.ovalerio.net and lists the latest content from those infosec-related sources. A unified RSS feed is also available.

    In the coming weeks, I will likely refine the list of sources to improve the overall quality of the content.

    One thing I would like to highlight is that I took special care not to include the full content of the feeds in the InfoSec Planet’s output.

    I did it this way because I didn’t ask all those authors for permission to include the contents of their public feeds in the page. So only a small snippet is shown together with the title.
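
    To illustrate the idea of showing only a snippet, here is a rough Python sketch. It is not the code worker-planet uses; the function name `make_snippet` and the 200-character limit are assumptions made for the example.

```python
import html
import re


def make_snippet(content, max_chars=200):
    """Return a short plain-text excerpt of an HTML feed entry."""
    text = re.sub(r"<[^>]+>", " ", content)  # drop HTML tags
    text = html.unescape(text)               # decode entities such as &amp;
    text = " ".join(text.split())            # collapse runs of whitespace
    if len(text) <= max_chars:
        return text
    # Cut at the last word boundary that fits, then mark the truncation.
    cut = text[:max_chars].rsplit(" ", 1)[0]
    return cut + "..."


print(make_snippet("<p>Hello &amp; <b>welcome</b> to the planet</p>", max_chars=15))
```

    A simple tag-stripping regex like this is enough for a preview, but it is not a sanitizer and should not be relied on for untrusted HTML that gets rendered as markup.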

    Nevertheless, if any author wishes to have their public feed removed from the page, I will gladly do so once notified (by email?).

  • New release of worker-planet

    Two years ago, I made a small tool on top of Cloudflare’s Workers to generate a single feed by taking input from multiple RSS sources: a kind of aggregator, or “planet” software, as it was usually known a few years ago. You can read more about it here and here.
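
    For readers unfamiliar with the concept, the core of a planet can be sketched in a few lines of Python. This is only an illustration of the general technique (merge the entries of several RSS 2.0 feeds, newest first, up to a limit) and not worker-planet's actual code, which runs on Workers. The names `parse_items`, `merge_feeds` and `max_size` are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import timezone
from email.utils import parsedate_to_datetime


def parse_items(rss_xml):
    """Yield (published, title, link) for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        if not pub_date:
            continue  # entries without a date cannot be ordered
        try:
            published = parsedate_to_datetime(pub_date)
        except (TypeError, ValueError):
            continue  # skip entries with an unparseable date
        if published.tzinfo is None:
            # Treat dates without a usable offset as UTC so they stay comparable.
            published = published.replace(tzinfo=timezone.utc)
        yield (
            published,
            item.findtext("title", default=""),
            item.findtext("link", default=""),
        )


def merge_feeds(documents, max_size=100):
    """Combine several RSS documents into one list, newest entries first."""
    items = [entry for doc in documents for entry in parse_items(doc)]
    items.sort(key=lambda entry: entry[0], reverse=True)
    return items[:max_size]


feed_a = (
    "<rss><channel><item><title>A1</title><link>https://a.example/1</link>"
    "<pubDate>Mon, 01 May 2023 10:00:00 GMT</pubDate></item></channel></rss>"
)
feed_b = (
    "<rss><channel>"
    "<item><title>B1</title><link>https://b.example/1</link>"
    "<pubDate>Tue, 02 May 2023 09:00:00 +0000</pubDate></item>"
    "<item><title>B2</title><link>https://b.example/2</link>"
    "<pubDate>Sun, 30 Apr 2023 08:00:00 +0000</pubDate></item>"
    "</channel></rss>"
)
for published, title, link in merge_feeds([feed_a, feed_b]):
    print(published.isoformat(), title, link)
```

    Atom feeds use different element names and namespaces, so a real aggregator needs more than this sketch covers.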

    This is a basic tool that is meant to be easy to deploy. The codebase itself doesn’t need too much maintenance.

    However, after all this time, the code had become outdated to the point where it could soon become unusable, since the ecosystem has moved on.

    So during this last week, I did a few upgrades and released a new version. The changes include:

    • It now uses a recent version of wrangler.
    • The development workflow was updated.
    • A new example template was added.
    • A new template helper and new context data were added to help with the development of new templates.

    You can grab a copy here. If you find any bugs or have ideas for improvements, feel free to create new issues on the GitHub repository or contribute patches.