TY - JOUR
T1 - AIOps for log anomaly detection in the era of LLMs
T2 - A systematic literature review
AU - De la Cruz Cabello, Miguel
AU - Prince Sales, Tiago
AU - Machado, Marcos R.
N1 - Publisher Copyright: © 2025 The Authors.
PY - 2025/12
Y1 - 2025/12
AB - Modern IT systems generate large volumes of log data that challenge timely and effective anomaly detection. Traditional methods often require intensive feature engineering and struggle to adapt to dynamic operational environments. This Systematic Literature Review (SLR) analyzes how Artificial Intelligence for IT Operations (AIOps) benefits from advanced language models, emphasizing Large Language Models (LLMs) for more effective log anomaly detection. By comparing state-of-the-art frameworks with LLM-driven methods, this study reveals that prompt engineering – the practice of designing and refining inputs to AI models to produce accurate and useful outputs – and Retrieval-Augmented Generation (RAG) boost accuracy and interpretability without extensive fine-tuning. Experimental findings demonstrate that LLM-based approaches significantly outperform traditional methods across evaluation metrics that include F1-score, precision, and recall. Furthermore, the integration of LLMs with RAG techniques has shown strong adaptability to changing environments. The applicability of these methods also extends to the military industry. Consequently, the development of specialized LLM systems with RAG tailored for the military industry represents a promising research direction to improve the operational effectiveness and responsiveness of defense systems.
KW - UT-Gold-D
KW - Large Language Models
KW - Log anomaly detection
KW - Retrieval-Augmented Generation
KW - AIOps
UR - https://www.scopus.com/pages/publications/105022056227
U2 - 10.1016/j.iswa.2025.200608
DO - 10.1016/j.iswa.2025.200608
M3 - Review article
AN - SCOPUS:105022056227
SN - 2667-3053
VL - 28
JO - Intelligent Systems with Applications
JF - Intelligent Systems with Applications
M1 - 200608
ER -