Python Software Development

Why you shouldn’t remove your package from PyPI

Nowadays most software developed using the Python language relies on external packages (dependencies) to get the job done. Correctly managing this “supply chain” is very important and has a big impact on the end product.

As a developer you should be cautious about the dependencies you include in your project, as I explained in a previous post, but you are always dependent on the job done by the maintainers of those packages.

As a public package owner/maintainer, you also have to be aware that the code you write, your decisions and your actions will have an impact on the projects that depend directly or indirectly on your package.

With this small introduction we arrive at the topic of this post, which is “What to do as a maintainer when you no longer want to support a given package?” or “How to properly rename my package?”.

In both of these situations you might think “I will start by removing the package from PyPI”. I hope the next lines will convince you that this is the worst thing you can do, for two reasons:

  • You will break the code or the build systems of all projects that depend on the current or past versions of your package.
  • You will free the namespace for others to use and if your package is popular enough this might become a juicy target for any malicious actor.

TLDR: you will screw your “users”.

The left-pad incident, while it didn’t happen in the Python ecosystem, is a well-known example of the first point and shows what happens when a popular package gets removed from the public index.

Malicious actors usually register packages using names that are similar to other popular packages with the hope that a user will end up installing them by mistake, something that already has been found multiple times on PyPI. Now imagine if that package name suddenly becomes available and is already trusted by other projects.

What should you do then?

Just don’t delete the package.

I admit that on some rare occasions it might be required, but most of the time the best thing to do is to leave it there (especially for open-source ones).

Adding a warning to the code and informing the users in the README file that the package is no longer maintained or safe to use is also a nice thing to do.
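As a minimal sketch of what such a warning could look like, the snippet below uses the standard library's `warnings` module. The names `old_package` and `new_package` and both helper functions are hypothetical placeholders, not taken from any real project.

```python
import warnings

# Hypothetical names, used only for illustration.
OLD_NAME = "old_package"
NEW_NAME = "new_package"


def deprecation_message(old_name: str, new_name: str) -> str:
    """Build the text users will see, pointing them at the replacement."""
    return (
        f"{old_name} is no longer maintained. "
        f"Please use {new_name} instead."
    )


def warn_deprecated(old_name: str = OLD_NAME, new_name: str = NEW_NAME) -> None:
    """Emit a DeprecationWarning attributed to the caller of this function."""
    warnings.warn(
        deprecation_message(old_name, new_name),
        category=DeprecationWarning,
        stacklevel=2,  # report the caller's line, not this helper
    )
```

Calling `warn_deprecated()` once at the top of the old package's `__init__.py` would be one way to make sure anyone who imports it triggers the message. The `stacklevel` value may need adjusting depending on where the call is placed.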

A good example of this process being done properly was the renaming of model-mommy to model-bakery. As a user, I found it painless. Here’s an overview of the steps they took:

  1. A new source code repository was created with the same contents. (This step is optional)
  2. After doing the required changes a new package was uploaded to PyPI.
  3. Deprecation warnings were added to the old code, mentioning the new package.
  4. The documentation was updated mentioning the new package and making it clear the old package will no longer be maintained.
  5. A new release of the old package was created, so the user could see the deprecation warnings.
  6. All further development was done on the new package.
  7. The old code repository was archived.

So here is what is shown every time the test suite of an affected project is executed:

/lib/python3.7/site-packages/model_mommy/ DeprecationWarning: Important: model_mommy is no longer maintained. Please use model_bakery instead:

In the end, even though I didn’t update right away, everything kept working and I was constantly reminded that I needed to make the change.
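One caveat: by default Python only displays a `DeprecationWarning` when it is triggered directly by code running in `__main__`, which is why these messages tend to show up in test runs (test runners such as pytest usually enable them) rather than in normal execution. The sketch below shows one way a dependent project could collect such warnings explicitly. Both function names and the message text are hypothetical, and `fake_old_package_import` is only a stand-in for importing a deprecated package.

```python
import warnings


def collect_deprecations(func):
    """Run func with all warnings enabled and return its DeprecationWarning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # override Python's default filters
        func()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


def fake_old_package_import():
    """Stand-in (hypothetical) for importing a package that warns on import."""
    warnings.warn(
        "old_package is no longer maintained. Please use new_package instead.",
        DeprecationWarning,
        stacklevel=2,
    )
```

For example, `collect_deprecations(fake_old_package_import)` returns a list containing the single message above, which a project could log or assert on in its own checks.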

By Gonçalo Valério

Software developer and owner of this blog. More in the "about" page.

9 replies on “Why you shouldn’t remove your package from PyPI”

My problem was on npm and was because packages were not deleted. In particular it was with cryptojs or Crypto-js (and many case sensitive variations) – all turned out to be broken in some way. Some had additional suffixes and I eventually tracked down one that responded to bug reports. Since the Python version worked I was able to compare and port the changes to that particular JS package. There is now a way to mark packages as deprecated for npm but delete no longer works.
The problem I see is really that broken packages can occupy the namespace and never get updated, security-fixed or removed. I think this applies to any package system and single level names make it worse.
The alternative to reuse after deletion is to lock that package name as unusable, but only if the package was detected to have security vulnerabilities or was malware. Although given the number of dependencies in typical programs these days and the regularity of security vulnerability detection this will likely cause chaos.

This is simply an important variant on the “Once it’s on the internet, LEAVE IT IN PLACE” meme. Right down to mimicking the “This page is out of date please look ☞ over there ☞” thing.

Good to think about!

I’m not a package maintainer, but this raises an important problem that I’ve seen using packages.

If I maintain package “X”, and it has a hard dependency on packages “a”, “b”, and “c” – what do I do when one or more packages disappear and/or fail to be maintained?

Case #1:
Both I, and the users of my package are just plain effed. As a consequence, my package is as useful as teats on a boar hog and anyone who depends on it now comes after ME because my package is “broken”.

Case #2 is essentially a restatement of case #1 except things fail for different reasons and the users go after the other maintainer with guns, knives and pitchforks.

How do we solve this?
AFAIK, the only way to guarantee this is to either hope someone else takes over the project, or take it over myself – which I may not have the time or skill to do – especially if it’s a security-sensitive package.

So what’s a poor user/maintainer to do?

You can also “yank” a release on PyPI. This removes that version from a generic install of the package but still allows those who pip install the package with that exact version number to get it. Pip will warn you that the release has been yanked.
