Packages with infostealer found in PyPI repository | Kaspersky official blog

Our Global Research and Analysis Team (GReAT) experts have discovered two malicious packages in the Python Package Index (PyPI) – a popular third-party software repository for Python. According to their descriptions, the packages were libraries for working with popular LLMs (large language models). In fact, they imitated the declared functionality using a demo version of ChatGPT, and their main purpose was to install the JarkaStealer malware.

The packages were available for download for more than a year. Judging by the repository’s statistics, during this time they were downloaded more than 1700 times by users from more than 30 countries.

Malicious packages and what they were used for

The malicious packages were uploaded to the repository by one author and, in fact, differed from each other only in name and description. The first was called “gptplus” and allegedly allowed access to the GPT-4 Turbo API from OpenAI; the second was called “claudeai-eng” and, according to the description, also promised access to the Claude AI API from Anthropic PBC.

The descriptions of both packages included usage examples that explained how to create chats and send messages to language models. In reality, the code of these packages contained a mechanism for interacting with a ChatGPT demo proxy in order to convince the victim that the package was working. Meanwhile, the __init__.py file in each package decoded data embedded in it and downloaded the JavaUpdater.jar file from a GitHub repository. If Java was not found on the victim’s machine, the code also downloaded and installed the Java Runtime Environment (JRE) from Dropbox. The jar file itself contained the JarkaStealer malware, which was used to compromise the development environment and to exfiltrate stolen data undetected.
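Because the malicious logic ran from __init__.py at import time, a quick static look at that file before installing a package can surface a decode-download-execute pattern. The following is a minimal sketch of such a check, not a reconstruction of how the packages were actually detected. The watch list of modules is an assumption (the article does not say which modules the malicious code imported), and a hit is only a prompt for manual review, since many legitimate packages import the same modules.

```python
import ast

# Hypothetical watch list: modules that often appear in decode-download-execute
# chains. The article does not state which modules the malicious code imported.
SUSPICIOUS_MODULES = {"base64", "subprocess", "urllib", "requests", "socket"}


def suspicious_imports(source: str) -> set:
    """Return the watch-listed top-level modules imported by the given source text."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            # Not an import, or a relative import such as "from . import x".
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in SUSPICIOUS_MODULES:
                found.add(top_level)
    return found


# Illustrative input only; in practice, read the text of an unpacked __init__.py.
sample = "import base64\nfrom urllib.request import urlretrieve\nimport json\n"
print(sorted(suspicious_imports(sample)))
```

The source is parsed with ast rather than imported, so nothing in the inspected file is executed.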

What is JarkaStealer malware, and why is it dangerous?

JarkaStealer is malware, presumably written by Russian-speaking authors, which is used primarily to collect confidential data and send it to the attackers. Here’s what it can do:

Steal data from various browsers;
Take screenshots;
Collect system information;
Steal session tokens from various applications (including Telegram, Discord, Steam, and even a Minecraft cheat client);
Interrupt browser processes to retrieve saved data.

The collected information is archived, sent to the attacker’s server, and then deleted from the victim’s machine.

The malware authors distribute it through Telegram using the malware-as-a-service (MaaS) model. However, we also found the source code of JarkaStealer on GitHub, so it’s possible that this campaign didn’t involve the original authors of the malware.

How to stay safe

We promptly informed PyPI administrators about the malicious implants in the gptplus and claudeai-eng packages, and as of now they’ve already been removed from the repository. However, there’s no guarantee that this (or a similar) trick won’t be pulled on some other platform. We continue to monitor activity related to the JarkaStealer malware and look for other threats in open source software repositories.

For those who downloaded and used one of the malicious packages, the main recommendation is to immediately delete it. The malware doesn’t have persistence functionality, so it’s launched only when the package is used. However, all passwords and session tokens that were used on a victim’s machine could have been stolen by JarkaStealer, and so should be immediately changed or reissued.
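To check whether either package is present, the installed distributions can be compared against the two names given above. This is a minimal sketch using the standard library's importlib.metadata. It inspects only the Python environment it runs in, so each virtual environment or interpreter needs its own check. The helper names are illustrative.

```python
import re
from importlib import metadata

# Package names reported in this article.
MALICIOUS_NAMES = {"gptplus", "claudeai-eng"}


def normalize(name: str) -> str:
    """Normalize a distribution name: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_matches(installed_names, blocklist=MALICIOUS_NAMES):
    """Return the sorted, normalized names from installed_names that are on the blocklist."""
    blocked = {normalize(n) for n in blocklist}
    return sorted({normalize(n) for n in installed_names if n and normalize(n) in blocked})


def installed_distribution_names():
    """List the names of distributions visible to the current interpreter."""
    return [dist.metadata["Name"] for dist in metadata.distributions()]


hits = find_matches(installed_distribution_names())
if hits:
    print("Found malicious packages:", ", ".join(hits))
else:
    print("Neither package is installed in this environment.")
```

A clean result here does not mean the machine was never affected: if a package was installed and used earlier, the credentials should still be rotated as described above.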

We also recommend that developers be especially vigilant when working with open source software packages, and inspect them thoroughly before integrating them into their projects. This includes a detailed analysis of the dependencies and the respective supply chain of software products – especially when it comes to such a hyped topic as the integration of AI technologies.

In this case, the creation date of the author’s PyPI profile could have been a red flag. If you look closely at the screenshot above, you can see that both packages were published on the same day, while the account that published them was registered just a couple of days earlier.
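This red flag can be expressed as a simple rule: an account whose packages all appeared within a few days of its registration deserves a closer look. The sketch below assumes the dates are read from the PyPI pages by hand. The function name, the seven-day threshold, and the sample dates are all illustrative assumptions, since the article gives no exact dates.

```python
from datetime import date, timedelta


def account_age_red_flag(account_created, first_uploads, max_gap_days=7):
    """Return True if every package was first uploaded within max_gap_days of account creation."""
    if not first_uploads:
        return False
    window = timedelta(days=max_gap_days)
    return all(
        timedelta(0) <= uploaded - account_created <= window
        for uploaded in first_uploads
    )


# Hypothetical dates: account registered two days before both uploads.
print(account_age_red_flag(date(2024, 1, 1), [date(2024, 1, 3), date(2024, 1, 3)]))

# Hypothetical dates: long-standing account publishing a new package.
print(account_age_red_flag(date(2019, 5, 1), [date(2024, 1, 3)]))
```

New accounts publish legitimate packages all the time, so this signal is best used to prioritize manual inspection rather than to block anything automatically.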

To minimize the risks of working with third-party open source software packages and avoid a supply chain attack, we recommend including the Kaspersky Open Source Software Threats Data Feed in DevSecOps processes. It is designed specifically for monitoring the open source components in use in order to detect threats that might be hidden inside.
