How machine learning helps IT admins (and hackers, too)

16 / 05 / 2022

Artificial intelligence (AI) and machine learning (ML) are two of the most discussed topics in IT. When it comes to digital security, some hope they will one day provide the ultimate protection against malware, while others fear their promise might give way to more sophisticated cyberattacks. Both perspectives are correct.

Artificial intelligence is not just machine learning

Artificial intelligence and machine learning have both been around for quite a while. However, the potential for change that these technologies bring is not yet fully known. One thing is certain: the artificial intelligence we see on movie screens remains in the future.

The terms artificial intelligence and machine learning are often used interchangeably, which is incorrect. Artificial intelligence refers to a machine that can learn and act independently and "intelligently," without human interaction and based solely on external inputs. Machine learning, in turn, uses data processing algorithms to perform specific tasks independently. The computer can quickly identify structures and anomalies in large amounts of data and break them down into smaller units essential to the problem (model generation). Even so, ML is usually treated as the core of AI.

Humans and androids? The dream team that can defeat hackers

Machine learning is of great importance in the fight against cybercrime, especially when it comes to malware detection. Using huge amounts of data, an ML model is trained to sort files and samples into the categories "harmless" and "malicious." New and unknown elements can then be assigned automatically to one of the two categories. This requires a huge amount of input data, and each piece of it must be correctly labeled. It is often incorrectly claimed that an algorithm can label new elements perfectly just because it was given large amounts of data. In reality, a human still needs to perform a final check whenever the results are questionable.
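The idea of automatic sorting plus a human check for questionable results can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two toy features, the sample values, the nearest-centroid model, and the min_margin cutoff. Real detection engines use far richer features and models.

```python
from math import dist

# Hypothetical toy features per sample: (byte entropy, number of suspicious API imports).
# Features and values are invented for illustration only.
LABELED_SAMPLES = [
    ((3.1, 0.0), "harmless"),
    ((4.0, 1.0), "harmless"),
    ((3.5, 2.0), "harmless"),
    ((7.2, 9.0), "malicious"),
    ((7.8, 12.0), "malicious"),
    ((6.9, 8.0), "malicious"),
]


def train_centroids(samples):
    """Average the feature vectors of each category (a nearest-centroid model)."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def triage(features, centroids, min_margin=0.25):
    """Return a category, or 'needs_human_review' when the result is questionable."""
    d_harmless = dist(features, centroids["harmless"])
    d_malicious = dist(features, centroids["malicious"])
    total = d_harmless + d_malicious
    # Relative margin in [0, 1]: 0 means equidistant from both centroids.
    margin = abs(d_harmless - d_malicious) / total if total else 0.0
    if margin < min_margin:
        return "needs_human_review"
    return "harmless" if d_harmless < d_malicious else "malicious"


centroids = train_centroids(LABELED_SAMPLES)
clear_cut = triage((7.5, 10.0), centroids)   # close to the malicious centroid
borderline = triage((5.4, 5.3), centroids)   # roughly halfway between both centroids
```

Raising min_margin sends more samples to the analyst and fewer to the automatic verdict; where to set it is a trade-off between analyst workload and the risk of a wrong label.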

Humans are still better than machines at learning from context and acting creatively. That is an area where algorithms have room for improvement. For example, professional malware developers can cleverly disguise the real purpose of their code. Malicious code might be concealed in the pixels of a clean image file, and snippets of malicious code may hide in particular files. The malicious effect may only present itself once the individual elements are combined. An ML algorithm may be unable to identify this process. In contrast, a human “virus hunter” might recognize the danger more accurately based on training, experience, and gut feeling. Therefore, humans and machines must work together to actively prevent harmful actions.

Machine learning is only a small part of the IT security strategy

Machine learning has been an important part of IT security strategies since the 1990s. The past digital decade has taught us that there are no simple solutions to complex problems. This is especially true in cyberspace, where conditions can change within seconds. In today's world, it would be unwise to rely solely on one technology to build a resilient cyber defense. IT decision-makers need to realize that while ML is an undoubtedly valuable tool in the fight against cybercrime, it should only be one part of an organization's overall security strategy. Implementing sophisticated IT solutions still calls for the expertise of real people: security officers as well as IT admins.

Cyber criminals are also keeping up with the "smart" era

Machine learning is also popular in the cybercrime industry. More and more hackers are using ML to locate and exploit potential victims or to steal valuable data via spam and phishing campaigns. Machine learning can also be used to find security gaps and weak points in target systems. Criminals even employ machine learning algorithms to protect their own IT infrastructure (e.g., botnets).

Companies that use machine learning on a larger scale are sometimes particularly attractive to attackers. By tampering with (poisoning) input data sets, for example, cybercriminals can cause otherwise functional systems to produce faulty results and drive poor strategic choices. The consequences are chaos, operational disruptions, and sometimes even irreparable damage.
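A toy example shows how manipulated training data can shift a model's decisions. The one-dimensional "suspicion score," the sample values, and the midpoint-threshold model below are all hypothetical and chosen only to keep the effect visible.

```python
from statistics import mean


def fit_threshold(samples):
    """Learn a 1-D decision threshold: the midpoint between the two class means."""
    harmless = [score for score, label in samples if label == "harmless"]
    malicious = [score for score, label in samples if label == "malicious"]
    return (mean(harmless) + mean(malicious)) / 2


def classify(score, threshold):
    """Scores at or above the threshold are treated as malicious."""
    return "malicious" if score >= threshold else "harmless"


# Hypothetical clean training data: (suspicion score, label).
clean = [
    (1.0, "harmless"), (2.0, "harmless"), (3.0, "harmless"),
    (7.0, "malicious"), (8.0, "malicious"), (9.0, "malicious"),
]

# Poison: high-scoring samples that the attacker slipped in with a "harmless" label.
poison = [(9.0, "harmless"), (9.5, "harmless"), (10.0, "harmless")]

clean_threshold = fit_threshold(clean)              # midpoint of 2.0 and 8.0
poisoned_threshold = fit_threshold(clean + poison)  # harmless mean is dragged upward

suspicious_score = 6.5
before = classify(suspicious_score, clean_threshold)
after = classify(suspicious_score, poisoned_threshold)
```

With the clean data the sample scoring 6.5 is flagged; after three mislabeled records are added, the same sample passes as harmless. This is one reason the provenance and labeling of training data deserve the same scrutiny as the model itself.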

Malware built on ML: Emotet

Emotet, malware based on machine learning, has been circulating on the internet for years. Hackers have used it to automatically download unwanted applications, such as banking Trojans, onto their victims' computers. Thanks to machine learning, Emotet is able to select its victims very specifically. At the same time, the malware is surprisingly good at evading botnet trackers and honeypots.

As part of the attack, Emotet collects telemetry data from potential victims and sends it to the attacker's command and control (C&C) server for analysis. In return, it receives commands or binary modules from the server. Based on this data, the software selects only the modules that correspond to its orders. It also appears to be able to distinguish real human actors from the virtual machines and automated environments used by cybersecurity researchers and investigators.

Particularly striking is Emotet's ability to learn the difference between legitimate and artificial processes. Initially, the latter are accepted, but they are blacklisted within a few hours. A real victim's device keeps transmitting data, whereas on blacklisted machines and bots the malicious code falls into a kind of dormant mode and ceases any malicious activity.

Machine learning and IoT

From the beginning, the Internet of Things (IoT) has been a popular target for attackers, and the number of routers, surveillance cameras, and other smart devices keeps increasing. In many cases, however, these devices are poorly secured and can be spied on or otherwise misused. That typically happens because of default factory settings, weak passwords, or other well-known vulnerabilities.

With the help of ML algorithms, attackers can take advantage of security vulnerabilities. For example, they may:

Find previously unknown vulnerabilities in IoT devices and collect large amounts of data about traffic and user behavior, which can then be used to train algorithms that improve the malware's stealth mechanisms.
Learn standard behaviors and processes of certain types of rival malware in order to remove them if necessary or to use them for their own purposes.
Create yearly training sets of the most commonly used passwords, based on millions of leaked passwords and passphrases. This can make it even easier for attackers to penetrate comparable IoT devices.
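The password point in the list above cuts both ways: the same frequency counting lets defenders check their own devices before an attacker does. The sketch below is a simplified, defensive take on it. The leaked list, device names, and passwords are invented, and a real audit would work with hashed credentials and a much larger corpus.

```python
from collections import Counter

# Hypothetical stand-in for a corpus of leaked passwords.
LEAKED_PASSWORDS = [
    "123456", "admin", "123456", "password", "admin",
    "123456", "letmein", "password", "admin", "123456",
]


def most_common_passwords(leaked, top_n):
    """Return the top_n most frequent passwords, most frequent first."""
    return [password for password, _count in Counter(leaked).most_common(top_n)]


def find_weak_devices(device_passwords, common_passwords):
    """Return the sorted names of devices whose password is on the common list."""
    common = set(common_passwords)
    return sorted(name for name, pw in device_passwords.items() if pw in common)


# Hypothetical inventory: device name -> configured password.
DEVICES = {
    "camera-01": "admin",
    "router-01": "T7#kq!9zLm",
    "dvr-02": "123456",
}

top_passwords = most_common_passwords(LEAKED_PASSWORDS, top_n=3)
weak_devices = find_weak_devices(DEVICES, top_passwords)
```

Any device that ends up in weak_devices is using a password an attacker would try first, so changing factory defaults on those devices is the obvious next step.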

How to withstand online threats

Thanks to big data and improved computing power, machine learning has been widely used across various areas in recent years, including IT security. But the world of digital security is constantly evolving, making it impossible to protect your company infrastructure against the ever-changing threats solely with ML algorithms. A combination of multilayered solutions and skilled, experienced staff is your best option for staying ahead of the hackers.