The house of Cards that is our software ecosystem
A somewhat unstructured rant about current trends in the IT sector, and how we are about to fall face first.
Krisztián Pintér, 2026
pinterkr@gmail.com
Disclaimer ⇡
This writing is highly technical, very argumentative, and a bit of a rambling. But it still offers value to the non-technical reader, although maybe with a low signal to noise ratio. Stick around if you feel up to it.
Inciting Incident ⇡
In the second half of March 2026, the threat actor TeamPCP initiated a series of high profile attacks against popular software libraries, injecting malwares that collect sensitive information, keys and access codes from millions of users.
Despite the severity of the breach, the overall reaction can be best described as lax. Mainstream media coverage was nowhere to be seen, only security related sites reported on it. Very few people reads those, thus most people, including software developers weren't even informed. If you were affected, there is a high probability you would never have learned about it at all.
Most of the modern languages have some kind of package manager which can download and install libraries with a single command. These package managers also automatically install any other libraries that the one requested needs, called dependencies. And then the dependencies of those. Even if you read a discussion forum or blog related to your main library of interest, there are many dozens you don't even know you have.
This cuts the other way too. Any given library might have millions of users that are unknown to the developer team. There is no user registration. There are no mechanisms to notify them.
So the question naturally arises: how do I know if my libraries are hacked? What can I do to not be in the dark? Chasing the answer led me down a rabbit hole I didn't expect.
Auditing the Auditor ⇡
If we want notifications about relevant software vulnerabilities, the proposed (and basically only) solution today is to install some software that regularly goes through the list of libraries on the machine, and checks if they are listed in vulnerability databases.
There are many such databases. For example much of the world's source code is hosted by GitHub, which offers the Github Advisory Database. Vulnerabilities are usually reported hours after discovery. For Python, the very similar Python Packaging Advisory Database exists. There is also a Common Vulnerabilities and Exposures (CVE) database that covers all platforms. And there are many others.
Since my code base is in Python, I turned to a dedicated solution called pip_audit. My project runs in an environment where small deployment size is valued, so I was somewhat surprised to see that 20MB was added. It is acceptable, but jeez.
I immediately ran a scan mostly as a test. There were a handful of issues listed, and one of them was in a package called pygments. I didn't remember installing or using this package, so I went ahead and looked it up. This is a syntax highlighter that can nice-format source codes written in different programming languages, and common data formats like html. Why do I have this? I've never needed syntax highlighting in any of my projects. It must be one of those dependencies I mentioned, a requirement for some other package. We can ask the installer about such dependencies, so this is what I did.
And sure enough, pygments is a requirement for a package named rich. I didn't remember this package either. Let's see what it does. Turns out this is a library for creating colorful and nicely formatted text in the terminal, which is the text interface programmers/admins often use to interact with the computer. Although I do use some coloring in my command line tools, never used this library, nor I knew about it, nor it occurred to me to even look for it. Then why is it there? See its dependents.
The dependent is pip_audit.
That's right. A tool doing security audit installed an actual vulnerability on my machine, in an attempt to impress me with beautiful visuals. To be honest, especially considering the 20MB installation size, I'm not too impressed. I posit that simple monochrome text would have been enough.
Out of curiosity, I listed all installed packages. There were some two dozen. A library named boolean.py caught my eye, not for any other reason than its unusual name. Typically, we don't add a '.py' ending to library names, and boolean is a very basic data type which is unlikely to need an entire library to work with. Let's check.
"This library helps you deal with boolean expressions and algebra with variables and the boolean functions AND, OR, NOT. You can parse expressions from strings and simplify and compare expressions. You can also easily create your custom algebra and mini DSL and create custom tokenizers to handle custom expressions."
Okay, this is getting out of hand. Under no circumstances should something like this be on my box. Who the hell needed this? You know the drill, let's look it up!
The requester is license-expression, which "is a comprehensive utility library to parse, compare, simplify and normalize license expressions (such as SPDX license expressions) using boolean logic". Seriously? I don't even understand what this means. Okay, who is the culprit for this one?
It would be cyclonedx-python-lib. I already don't like the name, but let's see what do we have here. "OWASP CycloneDX is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction. This Python package provides data models, validators and more, to help you create/render/read CycloneDX documents." Mindblowing. Okay, let's go on, I can't wait to see where it ends.
Where it ends is: pip_audit. Again! Turns out, this is because pip_audit supports a 'cyclonedx' output format. So that's where the 20MB went. I wanted to install a utility that checks for vulnerabilities, and ended up with a pretty printing toolset, a syntax highlighter, a customizable boolean expression evaluator, something about licenses, and some behemoth document manager infrastructure. And a vulnerability as cherry on top.
Software Jenga ⇡
Something is deeply wrong with how do we do software development these days. We are seeing software packages growing out of the ground like mushrooms after rain. Going from inception to hundreds of millions of users in five years, adding dozens of features monthly, releasing weekly, targeting many platforms. The rate at which new software is coming out is staggering.
Start where the current chain of events began: the Trivy security scanner. The irony, isn't it? The promise of this software is that it looks through your project, and finds vulnerabilities. By finding vulnerabilities, they mean it has a large list of typical mistakes, like "don't include security keys in repositories" or "don't allow public access to your file storages", and so on. Basically it is an automated "best practices" list, which you don't need to read, because the program actually looks at your code, and raises the alarm.
It might come in handy indeed. However, think about it this way. In earlier times, we simply learned the best practices together with each technology we decided to employ. Trivy seems to serve the new world in which we don't learn new technology, but vibe code it. Whether we use AI or copy and paste, code snippets end up in our projects without us properly understanding them. And then Trivy comes in, and points to a "best practices" entry whenever we violated one. Truly a 21st century solution. Oh, one more note: Trivy, as of now, stands at version 0.69.3.
Another victim of this campaign was a Python library called LiteLLM. This module lets you talk to different LLMs using the same "language", so to speak. Each LLM offers some API, which is a way for computers to talk to each other. In this case, it is your software talking to the LLM. Since the offerings are similar, these APIs are typically somewhat similar, but not the same. Think of different order forms different companies might hand out, all will have basically the same data, but different layout, and maybe some minor differences. LiteLLM gives you a unified interface and translates your requests to the format these LLMs want.
In a way, the same pattern repeats. This is but a convenience item. Instead of studying a new LLM you eye, you just trust LiteLLM developers to handle that for you. We want to move fast. By the time someone learns a new API, however similar it is to the familiar ones, the modern programmer has already finished the project.
And this is the kind of pattern we discovered in pip_audit as well. If you want colorful display, use a library. If you want an output file format, no matter how straightforward it is, just use a library. There is a library for everything, no matter now arcane or how simple and unnecessary it is. Nobody has time to properly design and properly implement these libraries. Many of them has a single author, often maintaining multiple if not dozens of other libraries, while having a full time job. But even those maintained by companies often prioritize for volume rather than quality, and ignore what is not directly related to the product, like code quality, size, efficiency or security.
To beat this horse some more, lets take the packege PyJWT. Jwt is a way for web pages and other software to authenticate a user. It is similar to a ticket, which can be acquired from an authentication server, and then can easily and quickly be checked for validity by simple clients. Being deeply related to security, one might expect extra scrutiny and pedantry in its implementation. The package is widely used, and has a version number of 2.12. Yet, last month a security issue was found: it disregarded one of the mandatory requirements listed in the standard. This was not a bug. This was an omission of a feature that the documentation clearly demands to be implemented (and the wording is, in true RFC fashion: MUST, in all caps). The feature in question is a data element called "crit", which tells the recipient that the elements listed in it are important, and must not be ignored. As an example, imagine your jwt ticket have a data element called "color" with the value "red", and the "crit" element having "color" in it. But you have no idea what color means with relation to a jwt ticket. In this case, you must reject the ticket. This library disregarded the crit field altogether. How could such a violation of the standard survive more than 40 releases, well into version 2? The code base is maintained by, you guessed it, a single person who works on many other projects while having a day job. How is this acceptable to the industry?
Imagine the situation. You are a developer, and you want to validate a jwt token. Do a Google search, and see the PyJWT coming up. If you look it up, you see what I just described, single developer doing it as a hobby for all you know. There are no software audits, no team, no test suites. Do you decide to use this package? Why?
Notice that there is no fraud or misconduct here. Security or compliance often isn't even promised, but even if promised, it is standard legal babbling with no substance. It is not the fault of the implementor, it is entirely on us, users. We are installing unvetted software purely because it is easy and because a lot of other people do too. We are installing megabytes upon megabytes of software we don't really need, out of pure convenience. We are playing Jenga of a spectacular magnitude, and we have no idea how stable the tower currently is. We do know that it is tall.
Unsolicited Advice ⇡
In this section, I'm dispensing some advice you probably shouldn't follow. Not playing the jenga is safe. But when everyone else is playing it, you are not any safer staying out, it will collapse on your head regardless. So the optimal behavior is to just go with the flow. However, it is wise to know the rules before violating them, so please keep these rules somewhere in sight. Occasionally obey one or the other, out of pure defiance. Remember, when we go to programmer heaven, at the gates we will be asked if we made anything beautiful, not just useful. We will be judged accordingly.
Rule #1: before you add a feature, justify it. Don't develop swiss army knife libraries, separate concerns. When considering a nonessential feature, your first thought must be to reject it. Only add it, if the benefit clearly outweighs the cost, and the cost should include development time, code complexity, maintenance requirement, attack surface, and above all, disgust. Learn to ignore user requests.
Rule #2: before you add a dependency, justify it. You don't need a library to emit a formatted text file, unless the format specification is enterprisey nonsense. But even so spend a day or two understanding it before giving up. You don't need a wrapper around an API, again, unless the specification is incomprehensible. Log to text files, and have some external tool reading those into reporting systems. Write your own SQL. Forget the batteries included approach, you don't need to act as middleman. Return raw data, and trust your users to pass it to other libraries, which they install themselves.
Rule #3: thoroughly vet any library before using. It is not enough to have a hundred stars in GitHub, or a million downloads a week. Look for signs. Is the code a bloat? Are they adding ten new features a month? The developer is a reputable group or company? Beware, a nice website or 100 million angel investment is not reputation. Do they have vulnerability reporting? Is there anything there already? Do they have a version number starting with zero? What is the ratio of buzzwords on their website? How many dependencies they bring, and what are those? Vet them too.
Rule #4: use library versions at least two weeks old. Can be much older, unless there are security updates or relevant fixes. Consider using curated repositories, like those provided by Linux distros. Before installing a version, check vulnerability databases.
Rule #5: keep monitoring vulnerability databases regularly, preferably automated. Security issues can be found at any time, and you want to know about them immediately. Set up some alarm system.