Beyond CWEs: Mapping Weaknesses in Unstructured Threat Intelligence Text

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

In real-world cyberattacks, adversaries frequently exploit a combination of vulnerabilities, bugs, and misconfigurations to compromise systems. To systematically analyze the root causes behind these issues, the Common Weakness Enumeration (CWE) framework provides a standardized taxonomy of software weaknesses.

While vulnerability databases are central to cataloging known issues, many security-relevant descriptions first appear in informal sources such as blog posts, cyber threat intelligence (CTI) reports, and social media. Although these sources predominantly offer broader cybersecurity insights, they occasionally yield details that may indicate underlying weaknesses not captured in formal databases.

We propose a two-step approach to extract these security-related descriptions from unstructured threat intelligence and automatically map them to their corresponding CWE categories. First, a binary classifier detects sentences resembling CVE descriptions, identifying information relevant to security teams. Then, we apply a self-supervised learning model to predict the most appropriate CWE, enabling structured analysis even in the absence of formal vulnerability tracking.
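The two-step flow described above can be sketched as a filter followed by a mapper. This is only an illustration of the pipeline shape, not the authors' models: the keyword lists, the function names (`looks_like_cve_description`, `predict_cwe`, `map_weaknesses`), and the `Finding` type are all hypothetical stand-ins, and simple substring matching replaces both the binary classifier and the self-supervised CWE predictor.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Finding:
    sentence: str
    cwe_id: str


# Hypothetical cue words standing in for the step-1 binary classifier.
VULN_CUES = ("overflow", "injection", "bypass", "arbitrary code", "improper")

# Hypothetical phrase table standing in for the step-2 CWE predictor.
CWE_HINTS = {
    "CWE-89": ("sql injection",),
    "CWE-79": ("cross-site scripting", "xss"),
    "CWE-120": ("buffer overflow",),
}


def looks_like_cve_description(sentence: str) -> bool:
    """Step 1 (toy): keep sentences that resemble a CVE description."""
    lowered = sentence.lower()
    return any(cue in lowered for cue in VULN_CUES)


def predict_cwe(sentence: str) -> Optional[str]:
    """Step 2 (toy): return the first CWE whose hint phrase appears."""
    lowered = sentence.lower()
    for cwe_id, hints in CWE_HINTS.items():
        if any(hint in lowered for hint in hints):
            return cwe_id
    return None


def map_weaknesses(sentences: Iterable[str]) -> List[Finding]:
    """Run both steps: filter first, then label only the kept sentences."""
    findings = []
    for sentence in sentences:
        if not looks_like_cve_description(sentence):
            continue
        cwe_id = predict_cwe(sentence)
        if cwe_id is not None:
            findings.append(Finding(sentence, cwe_id))
    return findings


blog_post = [
    "The vendor released its quarterly report.",
    "A SQL injection in the login form allows attackers to read the database.",
]
for finding in map_weaknesses(blog_post):
    print(finding.cwe_id, "|", finding.sentence)
```

Running the filter first means the mapper only sees sentences already judged security-relevant, which mirrors the order of the two steps in the abstract.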

As no ground truth exists for this task, we conduct expert-driven validation. Our results show strong performance, with an F1-score of 98.17% for correctly assigning CWE labels, an improvement of at least 64 percentage points over state-of-the-art reasoning LLMs. This demonstrates the feasibility of automating weakness classification in unstructured cybersecurity text.
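For readers unfamiliar with the metric, the sketch below computes F1 from raw counts using the standard definition (the harmonic mean of precision and recall). The abstract does not say which averaging scheme (micro, macro, or weighted) was used across CWE classes, so the single-class form here is an assumption for illustration only. The second helper, `implied_baseline_ceiling`, is a hypothetical name for the simple subtraction implied by the reported gap.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_baseline_ceiling(reported_f1_pct: float, min_gain_points: float) -> float:
    """Highest baseline score consistent with a gain of at least min_gain_points."""
    return reported_f1_pct - min_gain_points


# Toy counts, not taken from the paper.
print(round(f1_score(tp=8, fp=2, fn=2), 2))

# Figures from the abstract: 98.17% F1, gain of at least 64 percentage points.
print(round(implied_baseline_ceiling(98.17, 64), 2))
```

Read this way, a gap of at least 64 percentage points means the compared LLM baselines scored no higher than roughly 34% on the same measure.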

Original language: English
Title of host publication: Cryptology and Network Security
Subtitle of host publication: 24th International Conference, CANS 2025, Osaka, Japan, November 17–20, 2025, Proceedings
Editors: Yongdae Kim, Atsuko Miyaji, Mehdi Tibouchi
Place of Publication: Singapore
Publisher: Springer
Pages: 493-517
Number of pages: 25
Edition: 1
ISBN (Electronic): 978-981-95-4434-9
ISBN (Print): 978-981-95-4433-2
Publication status: Published - 14 Nov 2025
Event: 24th International Conference on Cryptology and Network Security, CANS 2025 - Osaka International Convention Center, Osaka, Japan
Duration: 17 Nov 2025 to 20 Nov 2025
Conference number: 24
https://cy2sec.comm.eng.osaka-u.ac.jp/miyaji-lab/event/cans2025/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 16351
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Conference on Cryptology and Network Security, CANS 2025
Abbreviated title: CANS 2025
Country/Territory: Japan
City: Osaka
Period: 17/11/25 to 20/11/25
Keywords

  • CTI
  • CVE-like extraction
  • CWE mapping
  • Security blog posts
  • Vulnerability analysis
