Cyrill Gössi
Identifying Attackers

Identifying attackers by using machine learning on unstructured cyber threat intelligence


Automated processing of cyber threat intelligence requires structured data as input. On the Internet, however, cyber threat intelligence often appears as unstructured cyber threat reports (CTRs) written by cyber security researchers. Such CTRs often describe the techniques and tools an adversary applied during an attack without identifying the actual adversary behind it. Because they lack structure, CTRs are difficult to work with, and information must be extracted manually. In this work, we developed a machine learning based tool that helps identify the adversary who executed the attack described in a CTR. The tool takes a CTR as input and extracts, among other information, the techniques and tools that the report describes as having been used during the attack. Based on this information, the tool calculates the similarity to a set of previously learned adversary group profiles and outputs the similarities as a sorted ranking. The higher an adversary group ranks, the more closely the information in the CTR resembles the profile previously learned for that group. An evaluation over 57 adversary groups with a total of 227 CTRs shows that the tool places the correct adversary group at rank 1 with a probability of 63.54%. Furthermore, because the tool allows weights to be assigned to different parts of a profile, we could confirm that assigning weights in accordance with the hierarchy proposed by the Pyramid of Pain outperforms assigning equal weights. We see this as an interesting confirmation of the concept proposed by the Pyramid of Pain.
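To make the ranking step concrete, the following is a minimal sketch of a weighted profile comparison. It is not the tool's actual implementation: the use of Jaccard similarity, the category names `techniques` and `tools`, the weight values, and the group and indicator names are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set, Tuple

# A profile maps a category (e.g. "techniques", "tools") to a set of items.
# Category names and the set-based representation are illustrative assumptions.
Profile = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(report: Profile, profile: Profile, weights: Dict[str, float]) -> float:
    """Weighted average of per-category Jaccard similarities."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(
        w * jaccard(report.get(cat, set()), profile.get(cat, set()))
        for cat, w in weights.items()
    )
    return score / total


def rank_groups(
    report: Profile, profiles: Dict[str, Profile], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (group, similarity) pairs sorted from most to least similar."""
    scores = [(group, similarity(report, p, weights)) for group, p in profiles.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: item and group names are placeholders, not real findings.
report = {"techniques": {"t1", "t2"}, "tools": {"x"}}
profiles = {
    "group_a": {"techniques": {"t1", "t2", "t3"}, "tools": {"y"}},
    "group_b": {"techniques": {"t4"}, "tools": {"x"}},
}

# Assumed weights: techniques sit above tools in the Pyramid of Pain,
# so they are given a larger weight here. The values are arbitrary.
pyramid_weights = {"techniques": 2.0, "tools": 1.0}
equal_weights = {"techniques": 1.0, "tools": 1.0}

ranking = rank_groups(report, profiles, pyramid_weights)
ranking_equal = rank_groups(report, profiles, equal_weights)

for group, score in ranking:
    print(f"{group}: {score:.3f}")
```

In this toy data the two weightings put different groups at rank 1, which shows only that the choice of weights can change the ranking; it says nothing about which weighting is more accurate on real reports.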