Effective AI adoption for optimizing SOC analysts’ work
There are various ways artificial intelligence can be used in cybersecurity – from threat detection to simplifying incident reporting. However, the most effective uses are those that significantly reduce human workload without requiring large, ongoing investments to keep the machine learning models up to date and performing well.
In a previous article, we discussed how difficult and labor-intensive it is to maintain a balance between reliable cyberthreat detection and low false-positive rates in AI models. So can AI replace security experts? The answer is clear: it can't – but it can take some of the load off them by handling "simple" cases. Moreover, as the model learns over time, the range of these "simple" cases expands. To genuinely save cybersecurity staff time, we need to identify areas of work where change happens more slowly than in direct cyberthreat detection. One promising candidate for automation is the processing of suspicious events (triage).
The detection funnel
To gather enough data to detect complex threats, the SOC of a modern organization has to collect millions of events daily from sensors across the network and connected devices. After grouping and initial filtering by SIEM algorithms, these events are distilled into thousands of alerts about potentially malicious activity. These alerts usually have to be investigated by humans, yet only a small fraction of them involve real threats. According to Kaspersky MDR data for 2023, our clients' infrastructures generated billions of events daily, yielding 431,512 alerts about potentially malicious activity over the course of the year; of these, only 32,294 were linked to genuine security incidents. In other words, machines sifted through hundreds of billions of events and passed only a tiny percentage on to humans for review. Even then, analysts immediately flag 30 to 70% of these alerts as false positives, and only around 13% are confirmed as incidents after deeper investigation.
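To make those proportions concrete, here is the funnel as a back-of-the-envelope calculation, using only the 2023 figures cited above:

```python
# Back-of-the-envelope view of the 2023 detection funnel cited above.
total_alerts = 431_512      # alerts about potentially malicious activity per year
incident_alerts = 32_294    # alerts linked to genuine security incidents

print(f"Share of alerts tied to real incidents: {incident_alerts / total_alerts:.1%}")
# -> Share of alerts tied to real incidents: 7.5%
```

In other words, even after the SIEM has discarded the overwhelming majority of raw events, the bulk of the alerts that reach an analyst still turn out to be noise.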
Role of “Auto-Analyst” in the SOC
The Kaspersky MDR team has developed an "Auto-Analyst" for the initial filtering of alerts. This supervised machine-learning system trains on alerts from the SIEM system, paired with the SOC's verdict on each alert. The goal of the training is for the AI to confidently identify false positives generated by legitimate network activity. Because this area changes more slowly than threat detection itself, it lends itself better to machine learning.
Machine learning here is based on CatBoost – a popular gradient-boosting library. The trained "Auto-Analyst" filters alerts and forwards for human review only those whose probability of being a real incident exceeds a specified threshold, which is set according to the acceptable error rate. As a result, around 30% of alerts are handled by the Auto-Analyst, freeing up the SOC team for more complex tasks.
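As an illustration, here is a minimal sketch of such a triage classifier. The alert fields, the file name, and the 0.05 threshold are assumptions made for the example – the real feature set and error budget are chosen and tuned by the SOC team, as described below:

```python
import pandas as pd
from catboost import CatBoostClassifier, Pool

# Hypothetical training set: one row per SIEM alert, labeled with the
# SOC analyst's verdict (1 = genuine incident, 0 = false positive).
alerts = pd.read_csv("labeled_alerts.csv")
cat_features = ["rule_id", "event_type", "src_ip", "process_name"]  # assumed fields
X = alerts[cat_features + ["event_count"]]
y = alerts["verdict"]

model = CatBoostClassifier(iterations=500, depth=6, eval_metric="AUC", verbose=False)
model.fit(Pool(X, y, cat_features=cat_features))

# Triage: alerts scoring below the threshold are auto-closed as false
# positives; everything else is escalated to a human analyst. The
# threshold follows from the error rate the SOC is willing to tolerate.
THRESHOLD = 0.05  # assumed value for illustration
incident_probability = model.predict_proba(X)[:, 1]
escalated = alerts[incident_probability >= THRESHOLD]
auto_closed = alerts[incident_probability < THRESHOLD]
```

In practice, the threshold would be calibrated on a held-out validation set so that the share of genuine incidents among auto-closed alerts stays within the agreed error budget.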
Practical nuances of the Auto-Analyst’s work
Processes are paramount in SOC operations, and new technologies require adapting existing processes or building new ones around them. For AI systems, these processes include:
Controlling training data. To ensure that the AI learns from the correct data, the training set needs to be thoroughly reviewed in advance to confirm that the analysts’ verdicts therein were accurate.
Prioritization of incoming data. Every alert contains numerous information fields, but their importance varies. Part of the training involves assigning "weights" to these fields. The feature vector used by the machine-learning model is built from fields that experts select from SIEM alerts, and the field list depends on the specific alert type. Note that the model can perform such prioritization on its own, but the results should be supervised.
Selective review of results. The SOC team double-checks approximately 10% of the Auto-Analyst's verdicts to ensure the AI isn't making errors (especially false negatives). If such errors exceed a certain threshold (for example, more than 2% of the verdicts), the AI needs retraining; a sketch of this review loop follows the list. Incidentally, selective reviews are also conducted for the human analysts' verdicts in the SOC – because people often make mistakes as well.
Interpreting the results. The ML model should be equipped with interpretation tools so we can understand the rationale behind its verdicts and the factors that influenced them. This helps adjust the training dataset and input weights. For example, one case that required adjustment was when the AI started flagging network communications as "suspicious" without considering the "Source IP address" field. Analyzing the AI's work with such interpretation tools is an essential part of the selective review; the sketch after this list shows one way to do it.
Excluding certain alerts from AI analysis. Some detection rules are so critical that even a small chance of the AI filtering them out is unacceptable. In such cases, the rule should carry an "exclude from AI processing" flag, and there should be a process for prioritizing these alerts.
Optimizing filtering. Another regular process the AI analyst needs in order to work effectively in the SOC is identifying similar alerts. If the AI analyst rejects dozens of similar alerts, there should be a process for promoting these verdicts to filtering rules within the SIEM. Ideally, the AI analyst itself generates a request to create a filtering rule, which is then reviewed and approved by a responsible SOC analyst; a sketch of this step also follows below.
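Here is a rough sketch of the selective-review and interpretation steps, reusing model, X, y, cat_features, and auto_closed from the earlier example. The 10% sample and the 2% retraining trigger come from the text above; everything else is an assumption. CatBoost's built-in SHAP values serve as the interpretation tool here:

```python
import numpy as np
from catboost import Pool

REVIEW_FRACTION = 0.10  # share of the Auto-Analyst's verdicts re-checked by humans
MAX_ERROR_RATE = 0.02   # retraining trigger: more than 2% erroneous verdicts

def needs_retraining(auto_closed, recheck_verdicts):
    """Sample ~10% of auto-closed alerts, compare them against the human
    re-check verdicts (1 = it was a real incident after all), and flag
    the model for retraining if the false-negative rate is too high."""
    sample = auto_closed.sample(frac=REVIEW_FRACTION, random_state=0)
    error_rate = recheck_verdicts.loc[sample.index].mean()
    return error_rate > MAX_ERROR_RATE

# Interpretation: per-alert, per-field contributions to each verdict.
# A field whose contribution is near zero across all alerts (as happened
# with "Source IP address" in the example above) is a red flag.
shap_values = model.get_feature_importance(
    Pool(X, y, cat_features=cat_features), type="ShapValues"
)
mean_influence = np.abs(shap_values[:, :-1]).mean(axis=0)  # last column is the baseline
for field, influence in zip(X.columns, mean_influence):
    print(f"{field}: {influence:.3f}")
```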
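And one way the last item could look: group the Auto-Analyst's rejected alerts by a coarse signature and draft SIEM filtering-rule requests for a human to approve. The signature fields and the group-size cutoff are illustrative assumptions:

```python
from collections import Counter

def draft_filter_rule_requests(auto_closed, min_group_size=30):
    """Group auto-closed alerts by (rule, source, process) and propose a
    SIEM filtering rule for every sufficiently large group of lookalikes."""
    signatures = Counter(
        zip(auto_closed["rule_id"], auto_closed["src_ip"], auto_closed["process_name"])
    )
    return [
        {"rule_id": rule, "src_ip": ip, "process_name": proc, "hits": count,
         "action": "propose SIEM filtering rule for analyst approval"}
        for (rule, ip, proc), count in signatures.items()
        if count >= min_group_size
    ]
```

Every proposed rule still lands on a human analyst's desk, which keeps the SIEM's filtering logic auditable.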
To effectively counter cyberthreats, organizations need to acquire deeper expertise in various technological areas, including storing and analyzing vast amounts of data, and now machine learning, too. For those who want to quickly compensate for a shortage of skilled personnel or other resources, we recommend getting this expertise in a ready-made form with the Kaspersky Managed Detection and Response service. This service provides continuous threat hunting, detection and response for your organization.