June 20, 2018

Episode 1: Phishing-AI

The very first episode of CypherOwl Security Experience. Nikoloz reviews new phishing AI research, an AI that allegedly generates phishing URLs and beats modern defense mechanisms.

Transcript

PHISH-AI Research

Today’s topic is a piece of security research in the area of phishing AI. I’ve read an article and looked at the research by scientists from Cyxtera Technologies, who have built a piece of code named DeepPhish: machine-learning software that allegedly generates phishing URLs that beat modern defense mechanisms.

So, the scientists used a database of one million phishing URLs collected from PhishTank (a service operated by OpenDNS that collects massive amounts of phishing attempts for research purposes). PhishTank helped them identify three threat actors by following URLs with similar patterns hosted on the same compromised domains. They took those domains and clustered them to better understand the strategies used by these malicious actors. The goal was to analyze how the best phishers succeed and which patterns allow their URLs to bypass AI phishing detection algorithms. This analysis would later allow the security researchers to create a pseudo-malicious AI for conducting phishing attacks.
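To make the idea of "URLs with similar patterns hosted on the same compromised domains" a bit more concrete, here is a minimal Python sketch. It is not the researchers' code: the function names `group_by_host` and `common_extension` and the sample URLs are made up for illustration, and the real clustering was presumably far richer than a shared file extension.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(urls):
    """Map each hostname to the list of URL paths seen on that host."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        if parts.hostname:  # skip strings that have no host at all
            groups[parts.hostname].append(parts.path)
    return dict(groups)


def common_extension(paths):
    """Return the file extension shared by every path, or None if they differ."""
    extensions = {posixpath.splitext(path)[1] for path in paths}
    if len(extensions) == 1 and "" not in extensions:
        return extensions.pop()
    return None


# Hypothetical sample data, invented for this sketch.
sample_urls = [
    "http://compromised-shop.example/wp-content/login/verify.html",
    "http://compromised-shop.example/wp-content/secure/account.html",
    "http://other-site.example/images/update.php",
]

for host, paths in group_by_host(sample_urls).items():
    print(host, len(paths), common_extension(paths))
```

Grouping by host first and only then looking for a repeated pattern mirrors the order described above: same compromised domain, then similar URL shape.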

It is worth noting that, according to various statistics, some modern AI defensive systems identify phishing attempts with 98.7% accuracy, meaning that the vast majority of phishing attacks fail against modern AI. That is a remarkable score.

However, here comes the interesting part: by using DeepPhish, this pseudo-malicious AI, the team at Cyxtera was able to increase the effectiveness of phishing campaigns by about 18%.

I will give a little background on how Cyxtera trained the AI, as it might be interesting for some of you. After analyzing the success rate of malicious URLs from various sources, they fed this information into an LSTM. LSTM, or long short-term memory, is a block used for building the layers of a recurrent neural network, so think of it as something that helps build the neural network itself. So, the scientists fed this information about the phishing attacks to the LSTM so it could learn the general structure and extract features from the malicious URLs; for example, the second threat actor commonly used an .html file extension in its addresses. All the text from the effective URLs was concatenated into a sentence, encoded as vectors and fed into the LSTM, which is trained to predict the next character given the preceding ones. Over time it learns to generate a stream of text that simulates a list of pseudo-URLs similar to the ones used as input. Of course, the overall effectiveness rate isn’t very high, as a lot of what comes out of the LSTM is gibberish containing strings of forbidden characters, but this research is already something: it shows us the potential.
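For those who like to see it in code, here is a rough Python sketch of the data preparation step only: joining URLs into one long "sentence", turning each character into a vector, and pairing a short window of characters with the character that follows. It is an illustration under my own assumptions, not DeepPhish itself: the names `build_vocab`, `one_hot` and `make_training_pairs`, the one-hot encoding, the newline separator and the window size are all choices made for this example, and the LSTM that would consume these pairs is left out.

```python
def build_vocab(text):
    """Assign an integer index to every distinct character in the text."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def one_hot(index, size):
    """Return a list of zeros with a single 1 at the given index."""
    vector = [0] * size
    vector[index] = 1
    return vector


def make_training_pairs(urls, window=4):
    """Build (context, next character) pairs for next-character prediction.

    The URLs are joined with newlines into one long text (an assumption of
    this sketch), and a window slides over it one character at a time.
    """
    text = "\n".join(urls)
    vocab = build_vocab(text)
    size = len(vocab)
    inputs, targets = [], []
    for start in range(len(text) - window):
        context = text[start:start + window]
        next_char = text[start + window]
        inputs.append([one_hot(vocab[ch], size) for ch in context])
        targets.append(one_hot(vocab[next_char], size))
    return vocab, inputs, targets


# Hypothetical URLs, invented for this sketch.
vocab, inputs, targets = make_training_pairs(
    ["http://a.example/login.html", "http://b.example/verify.html"],
    window=8,
)
print(len(vocab), len(inputs), len(targets))
```

Each entry in `inputs` is a window of encoded characters and the matching entry in `targets` is the encoded character that comes next, which is the shape of data a character-level recurrent model is typically trained on.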

So, let's think for a moment about AI as a tool that will definitely become a core part of major cyber-attacks, and I do not think this is a matter of the distant future. Quite the opposite: I believe that most state-sponsored actors will start using it very soon, if they are not using it already. Especially if we consider the financial resources and the wide variety of skills required for creating, feeding, maintaining, training and properly utilizing an AI, the first customers for, let's say, malicious AI would probably be state-sponsored actors such as advanced persistent threats and intelligence agencies.

But keep in mind that the history of information security teaches us two very serious lessons. First, if there is an advanced tool used by governments or clandestine groups, it's only a matter of time before it becomes available to others, such as cyber criminals. Second, in most cases those others are more sophisticated with such tools.

So, I cannot wait to see the developments in this field, and I am quite confident that with the advancement of AI and machine learning, the massive adoption rate of the Internet of Things and the development of quantum computing, we will quickly move into a new era of cyber security.
