Artificial intelligence (AI) and machine learning (ML) are among the most discussed topics in IT. When it comes to digital security, some hope they will one day provide the ultimate protection against malware, while others fear they will enable more sophisticated cyberattacks. Both perspectives are correct.
Artificial intelligence is not just machine learning
Artificial intelligence and machine learning have both been discussed for a long time. However, these technologies’ potential for change is not yet fully known. But one thing is for sure: the artificial intelligence we see on movie screens remains in the future.
The terms artificial intelligence and machine learning are often wrongly used as synonyms. Artificial intelligence refers to a machine that can learn and act independently and "intelligently," without human interaction and based solely on external inputs. Machine learning, in turn, uses data processing algorithms to perform specific tasks independently. The computer can quickly identify structures and anomalies in large amounts of data and break them down into the smaller units that matter for the problem at hand (model generation). Even so, ML is generally treated as the core of AI.
Humans and androids? The dream team that can defeat hackers
Machine learning is of great importance in the fight against cybercrime, especially in malware detection. Using vast amounts of data, an ML model is trained to sort files and samples into the categories "harmless" and "malicious." Once trained, it can automatically assign new and unknown elements to one of the two categories. Training requires a considerable amount of input data, and each piece of information must be correctly labelled. It is often incorrectly stated that an algorithm can label new elements perfectly just because it was provided with large amounts of data. In truth, a final check by a human remains necessary whenever the results are questionable.
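The division of labour described above, where the model handles clear cases and a person checks the questionable ones, can be sketched roughly as follows. This is a minimal illustration, not how any particular product works: the `triage` function, the `Verdict` type, and both threshold values are assumptions made up for this example, and the score is assumed to be a model's estimated probability that a sample is malicious.

```python
from dataclasses import dataclass

# Hypothetical thresholds (assumptions); real values would be tuned on validation data.
MALICIOUS_THRESHOLD = 0.90
HARMLESS_THRESHOLD = 0.10


@dataclass
class Verdict:
    sample_id: str
    score: float  # assumed: model's estimated probability that the sample is malicious
    label: str    # "malicious", "harmless", or "needs_human_review"


def triage(sample_id: str, score: float) -> Verdict:
    """Auto-label only confident predictions; route everything else to an analyst."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0 and 1")
    if score >= MALICIOUS_THRESHOLD:
        label = "malicious"
    elif score <= HARMLESS_THRESHOLD:
        label = "harmless"
    else:
        # Questionable result: the final check stays with a human.
        label = "needs_human_review"
    return Verdict(sample_id, score, label)


if __name__ == "__main__":
    for sid, s in [("a.exe", 0.97), ("b.pdf", 0.03), ("c.png", 0.55)]:
        print(triage(sid, s))
```

Widening the gap between the two thresholds sends more samples to human analysts; narrowing it trusts the model more. Where to set them is a judgment call that depends on how costly a wrong automatic decision would be.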
Humans are still better than machines at learning from context and acting creatively. That is an area where algorithms have room for improvement. For example, professional malware developers can cleverly disguise the real purpose of their code. Malicious code might be concealed in the pixels of a clean image file, or split into snippets hidden across separate files, so that the malicious effect only appears once the individual elements are combined. An ML algorithm may fail to recognise this process and make a wrong decision. In contrast, a human “virus hunter” might recognise the danger more accurately based on their training, experience, and gut feeling. Therefore, humans and machines must work together to actively prevent harmful actions.
Machine learning is only a small part of the IT security strategy
Machine learning has been essential to IT security strategies since the 1990s. The past digital decade has taught us that there are no simple solutions to complex problems. This is especially true in cyberspace, where conditions can change within seconds. In today's world, relying solely on one technology to build a resilient cyber defence would be unwise. IT decision-makers need to realise that while ML is an undoubtedly valuable tool in the fight against cybercrime, it should only be one part of an organisation’s overall security strategy. Implementing sophisticated IT solutions still calls for the expertise of real people: security officers and IT admins.
Cyber criminals are also keeping up with the "smart" era
Machine learning is also popular in the cybercrime industry. More and more hackers are using ML to locate and exploit potential victims or to steal valuable data via spam and phishing campaigns. They also use it to find gaps and weak points in their targets' defences. In addition, criminals employ machine learning algorithms to protect their own IT infrastructure (e.g., botnets).
Companies that use machine learning on a larger scale are sometimes particularly attractive to attackers. By tampering with input data sets, cyber criminals cause otherwise functional systems to produce faulty results. Companies that rely on those results then make poor strategic choices, which can cause chaos, operational disruptions, and sometimes even irreparable damage.
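A toy example can show how tampered training data leads to faulty results. The sketch below uses a deliberately simple nearest-centroid classifier over a single made-up numeric feature; the data, the labels, and the helper names (`train`, `predict`, `CLEAN`, `POISON`) are all invented for illustration and do not describe any real detection system.

```python
def train(samples):
    """Compute one centroid (mean feature value) per label from (feature, label) pairs."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in by_label.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


# Invented, correctly labelled training data.
CLEAN = [
    (0.1, "harmless"), (0.2, "harmless"), (0.3, "harmless"),
    (0.7, "malicious"), (0.8, "malicious"), (0.9, "malicious"),
]

# Attacker-injected records: malicious-looking feature values mislabelled as harmless.
POISON = [(0.9, "harmless")] * 4

if __name__ == "__main__":
    suspicious = 0.65
    print("clean model:   ", predict(train(CLEAN), suspicious))
    print("poisoned model:", predict(train(CLEAN + POISON), suspicious))
```

With the clean data, the "harmless" centroid sits near 0.2 and the sample at 0.65 is classified as malicious. The four mislabelled records drag that centroid to about 0.6, so the same sample is now classified as harmless, even though the classifier code itself never changed.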
Malware built on ML: Emotet
Emotet, a malware based on machine learning, has been circulating on the internet for years. Hackers have used it to automatically download unwanted applications, such as banking Trojans, onto their victims' computers. Thanks to machine learning, Emotet can select its victims very specifically. At the same time, the malware is surprisingly good at evading botnet trackers and honeypots.
As part of the attack, Emotet collects telemetry data from potential victims and sends it to the attacker's command and control (C&C) server for analysis. In return, it receives commands or binary modules from the server. Based on this data, the software only selects those modules corresponding with its orders. It also appears to distinguish real human actors from virtual machines and automated environments used by cybersecurity researchers and investigators.
Emotet's ability to learn the difference between legitimate and artificial processes is particularly striking. Artificial processes are accepted at first but blacklisted within a few hours. Real victims' devices keep transmitting data, while on blacklisted machines and bots the malicious code enters a dormant mode and ceases any malicious activity.
Machine learning and IoT
From the beginning, the Internet of Things (IoT) has been a popular target for attackers, and the number of routers, surveillance cameras, and other smart devices keeps increasing. In many cases, however, these devices are poorly secured and can often be spied on or otherwise misused. That typically happens because of default factory settings, weak passwords, or other well-known vulnerabilities.
With the help of ML algorithms, attackers can take advantage of security vulnerabilities. For example, they may:
- Find previously unknown vulnerabilities in IoT devices and collect large amounts of data about traffic and user behaviour, which can then be used to train algorithms that improve their malware's stealth mechanisms.
- Learn standard behaviours and processes of certain rival malware in order to remove them if necessary or to use them for their own purposes.
- Create training sets every year with the most used passwords, based on millions of leaked passwords and passphrases. This might make it even easier for attackers to penetrate comparable IoT devices.
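The password item in the list above rests on simple frequency counting, and defenders can apply the same idea to audit their own devices. The sketch below is a minimal illustration under stated assumptions: the "leaked" list and the device inventory are invented sample data, and the function names `most_common_passwords` and `weak_devices` are hypothetical helpers, not part of any real tool.

```python
from collections import Counter


def most_common_passwords(leaked, top_n):
    """Return the top_n most frequent non-empty passwords in a leaked list."""
    counts = Counter(p.strip() for p in leaked if p.strip())
    return [pw for pw, _ in counts.most_common(top_n)]


def weak_devices(device_passwords, common):
    """Return the sorted names of devices whose password is in the common list."""
    common_set = set(common)
    return sorted(name for name, pw in device_passwords.items() if pw in common_set)


# Invented sample data for illustration only.
LEAKED_SAMPLE = ["admin", "123456", "admin", "password", "admin", "123456", "letmein"]
DEVICES = {"camera-01": "admin", "router-01": "t7#Qv!92xLm", "nvr-01": "123456"}

if __name__ == "__main__":
    common = most_common_passwords(LEAKED_SAMPLE, top_n=2)
    print("most common:", common)
    print("devices to fix:", weak_devices(DEVICES, common))
```

Any device flagged by such a check is using a password an attacker would likely try first, which is why changing factory defaults and weak passwords is the obvious starting point for IoT hardening.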
How to withstand online threats
Thanks to big data and improved computing power, machine learning has been widely used across various areas in recent years, including IT security. But the world of digital security is constantly changing, and ML algorithms alone cannot always protect your company infrastructure against frequently changing threats. Multilayered solutions combined with talented and skilled people are the only way to remain one step ahead of hackers.