Local AI to the rescue

The last couple of years have been dominated by advancements in the field of Artificial Intelligence (AI). Many of us have witnessed, and are still experiencing, some sort of AI renaissance.

It started with generated images from prompts, then it was all types of written content, and in the last few weeks we’ve seen astonishing videos completely generated from a prompt.

At the same time, many narrower tasks and fields started to benefit from specialized uses of these technologies.

Like any other tool ever produced by human ingenuity, it can be used for “good” and for “bad”. However, that’s not what I want to discuss in this post, just a general observation.

Like many others, I was curious to experiment with these new tools, to see where they can help me in my daily life, either at work or at a more personal level.

One thing that quickly caught my attention was that many of the most well-known products are only accessible through the internet. You send your inputs to “Company X” servers, they run the trained models on their end, and eventually the result is transmitted back to you.

While this is understandable, given the massive hardware requirements of these models, I find it unsettling that the trend of sharing all your data and interactions with a remote company continues.

Let’s take programming as a simple example, an area where some companies are betting strongly on AI helpers, such as GitHub’s Copilot. I think many employers wouldn’t be too happy knowing that their proprietary code was being leaked to a third party through developer interactions with an assistant.

Even though the above example might not apply to everyone, it is a real concern, and in many places adopting such a tool would require a few discussions with the security and legal teams.

That is why I turned my attention to how a person can run this stuff locally. The main obstacles to this approach are:

The models that are freely available might not be the best ones.
Your hardware might not be powerful enough.

Regarding the first problem, a few companies have already released models that you can freely use, so we are good. They might not be as good as the big ones, but they don’t need to give you all the right answers, nor do the job for you, to be useful in some way. They just need to help you break barriers with less effort, as a recent study shows:

Instead, it lies in helping the user to make the best progress toward their goals. A suggestion that serves as a useful template to tinker with may be as good or better than a perfectly correct (but obvious) line of code that only saves the user a few keystrokes.

This suggests that a narrow focus on the correctness of suggestions would not tell the whole story for these kinds of tooling.
Measuring GitHub Copilot’s Impact on Productivity

The hardware issue is a bigger limitation when running larger, more general models locally. However, my experience has shown me that smaller or more specific models can also bring value to the table.

As proof that this is viable, we have the example of two web browsers that started integrating AI functionality, each for different reasons:

With the case for Local AI on the table, the next question is: how?

My local setup

Next I’ll list and describe the tools I ended up with after some research and testing. It is very likely that this setup will change soon, since things are moving really fast nowadays. Nevertheless, for now, these tools have been working fine for me on all sorts of tasks.

I mostly rely on four pieces of software:

Ollama: To run the Large Language Models (LLMs) on my computers and provide a standard API that other apps can use.
Continue.dev plugin for my text editor/IDE: It presents a nice interface to the LLMs and easily attaches context to the session.
ImaginAIry: For generating images and illustrations. It can also generate video, but I never explored that part.
Fabric: A tool that provides “prompts” for common tasks you would like the AI to do for you.
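
To make the “standard API” point more concrete, here is a minimal sketch of how another program could talk to Ollama over HTTP. It assumes Ollama is running locally on its default port (11434) and that the model you name has already been downloaded. The helper names build_request and ask are mine, purely for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # With "stream": False the server answers with a single JSON object
    # whose "response" field holds the generated text.
    return body["response"]
```

With the server running, something like ask("codellama:7b", "Explain list comprehensions") should return the answer as a plain string, and nothing leaves your machine.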

All of them work well, even on my laptop that doesn’t have a dedicated GPU. It is much slower there than on my far more powerful desktop, but still usable.

To improve that situation, I installed smaller models on the laptop: for example, codellama:7b instead of codellama:34b, and so on.
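
One way to keep the same scripts working on both machines is to prefer the bigger model when it is installed and fall back to the smaller one otherwise. The sketch below is only an illustration of that idea: it assumes Ollama’s default local port and uses its /api/tags endpoint to list the installed models. The function names and the preference order are my own assumptions, not part of any of the tools above.

```python
import json
import urllib.request
from typing import List, Optional, Sequence

# Assumption: Ollama's default local address; /api/tags lists installed models.
TAGS_URL = "http://localhost:11434/api/tags"

# Illustrative preference order: biggest model first, smallest last.
PREFERRED = ("codellama:34b", "codellama:7b")


def installed_models(timeout: float = 5.0) -> List[str]:
    """Return the names of the models available on the local server."""
    with urllib.request.urlopen(TAGS_URL, timeout=timeout) as resp:
        body = json.load(resp)
    return [model["name"] for model in body.get("models", [])]


def pick_model(
    installed: Sequence[str], preferred: Sequence[str] = PREFERRED
) -> Optional[str]:
    """Return the first preferred model that is installed, or None."""
    available = set(installed)
    for name in preferred:
        if name in available:
            return name
    return None
```

On the laptop, where only the 7b variant is installed, pick_model(installed_models()) would select codellama:7b, while the desktop would get codellama:34b.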

And this is it for now. If you have other suggestions and recommendations for local AI tools that I should try, please let me know. I’m well aware that better things are showing up almost every day.
