Comparative Analysis of Traditional Machine Learning and Sequential Deep Learning Models for Spam Email Classification

Figure: AI Illustration

FORMOSA NEWS - Blitar - LSTM and GRU Models Outperform Traditional Algorithms in Short-Text Spam Filtering. Digital communication channels are increasingly flooded with sophisticated spam, malware, and phishing attempts that threaten individual and organizational cybersecurity. To address this challenge, researchers Harliana and Achmad Alvi Yudanuari from Universitas Nahdlatul Ulama Blitar, collaborating with Hartatik from Universitas AMIKOM Yogyakarta, conducted a comprehensive comparative study evaluating how different artificial intelligence (AI) models classify text-based spam. Published in the 2026 edition of the Formosa Journal of Computer and Information Science (FJCIS), this timely research establishes a standardized evaluation framework to determine whether traditional machine learning or modern sequential deep learning models filter short messages more effectively. The findings provide critical insights for software engineers and cybersecurity firms developing next-generation communication filters to secure digital spaces.

The Evolving Threat of Unstructured Spam Messages
As digital communication continues to grow rapidly, malicious actors constantly change their tactics. Modern spam messages no longer rely on simple keywords; instead, they use complex linguistic variations, structural manipulation of text, and content obfuscation to mimic legitimate human conversations. Traditional rule-based security systems fail to adapt to these unstructured and dynamic text patterns, making automated machine learning solutions necessary.

For years, IT systems deployed traditional models such as Logistic Regression combined with statistical features such as TF-IDF. While computationally efficient and stable, these systems treat words independently and ignore word order and sentence structure. Sequential deep learning models, such as Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU), were designed to address this limitation by capturing long-term dependencies within a text. However, prior to this study by Harliana, Hartatik, and Yudanuari, the field lacked a fair evaluation framework, as previous experiments used different datasets, inconsistent preprocessing techniques, and varying evaluation metrics.
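The point about word order can be illustrated with a minimal sketch. It uses plain word counts rather than TF-IDF weights (TF-IDF rescales the counts but, with single-word features, is equally blind to order), and the two messages are hypothetical examples, not taken from the study's dataset.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase whitespace-separated tokens, discarding their order."""
    return Counter(text.lower().split())


# Two hypothetical messages built from the same words in a different order.
msg_a = "call me now to claim your free prize"
msg_b = "claim your free prize now to call me"

# The order-free representation cannot tell the two messages apart.
same_features = bag_of_words(msg_a) == bag_of_words(msg_b)
print(same_features)
```

A sequential model reads the tokens one at a time, so the two messages would produce different internal states. Adding word n-grams to a count-based model is a common partial workaround.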

Standardizing the Experimental Framework
To resolve the inconsistencies in existing literature, the research team from Universitas Nahdlatul Ulama Blitar and Universitas AMIKOM Yogyakarta built a controlled empirical evaluation framework. They used a publicly available SMS spam dataset consisting of 5,572 messages. The dataset exhibited a stark class imbalance typical of real-world scenarios, containing 4,825 legitimate messages (ham) and only 747 spam messages. The data was split 80:20 into training and testing subsets, and the team addressed the imbalance by applying an oversampling technique exclusively to the training subset. This balanced the class distribution seen during training while keeping resampled data out of the testing subset, which prevents data leakage and preserves experimental integrity. The text preprocessing pipeline was standardized across all four tested configurations:

Logistic Regression (with TF-IDF): Served as the statistical, non-sequential baseline model.
Recurrent Neural Network (RNN): A basic sequential deep learning model.
Long Short-Term Memory (LSTM): An advanced recurrent network whose gating mechanisms help it retain information across longer sequences.
Gated Recurrent Unit (GRU): A streamlined, computationally efficient alternative to LSTM.
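The split-then-oversample procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the article does not name the oversampling technique or say whether the split was stratified, so the sketch assumes simple random oversampling and an unstratified shuffle. The function name `split_then_oversample` and the placeholder messages are hypothetical; only the class counts come from the article.

```python
import random


def split_then_oversample(messages, labels, test_ratio=0.2, seed=0):
    """Hold out a test set first, then balance only the training portion."""
    rng = random.Random(seed)
    indices = list(range(len(messages)))
    rng.shuffle(indices)

    n_test = int(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    # Group training indices by class label.
    by_class = {}
    for i in train_idx:
        by_class.setdefault(labels[i], []).append(i)

    # Random oversampling (an assumption): duplicate minority examples until
    # every class matches the largest one. Test indices are never touched.
    target = max(len(idx) for idx in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)
        balanced.extend(rng.choices(idx, k=target - len(idx)))
    rng.shuffle(balanced)

    train = [(messages[i], labels[i]) for i in balanced]
    test = [(messages[i], labels[i]) for i in test_idx]
    return train, test


# Synthetic stand-in data using the class counts reported in the article.
labels = ["ham"] * 4825 + ["spam"] * 747
messages = [f"message {i}" for i in range(len(labels))]

train, test = split_then_oversample(messages, labels)
train_counts = {c: sum(1 for _, y in train if y == c) for c in ("ham", "spam")}
print(len(test), train_counts)
```

Because the test indices are set aside before any duplication happens, no copy of a test message can end up in the training data, which is the leakage the article refers to.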

Key Performance Findings and Trade-Offs

The controlled experiments showed that all four models classified the messages with high accuracy, with overall scores ranging from 0.98 to 0.99. The traditional Logistic Regression baseline excelled at catching actual spam, with a recall of 0.95. However, it registered the lowest precision (0.93), generating 11 false positives. This suggests that keyword-dependent traditional machine learning tends to misclassify legitimate messages as spam when they share vocabulary with spam.

Conversely, the basic RNN proved highly conservative. It achieved a precision of 0.99 with only 1 false alarm, but it missed more actual spam: its recall fell to 0.92, a shortfall attributed to the vanishing gradient problem.

The strongest performers were the LSTM and GRU networks, both achieving F1-scores of 0.97. By using gating structures to retain contextual information across text sequences, they maintained a precision of 0.99 while raising recall to 0.94. Analysis of the training curves indicated that the GRU model offered slightly better stability and faster training convergence than the LSTM, making it a reliable architecture for text classification.
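For readers less familiar with the metrics, the precision and recall trade-off discussed here follows directly from confusion-matrix counts. The sketch below uses the standard definitions; the counts are hypothetical and chosen only to contrast an aggressive filter with a conservative one. They are not the paper's confusion matrices.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1) for the positive (spam) class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for two filters on the same test set (not from the paper).
aggressive = precision_recall_f1(tp=95, fp=10, fn=5)    # flags more, misses less
conservative = precision_recall_f1(tp=90, fp=1, fn=10)  # flags less, misses more

for name, (p, r, f1) in [("aggressive", aggressive), ("conservative", conservative)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so a model scores well on it only when both values are high.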

Real-World Impact and Industrial Applications
The research by Harliana, Hartatik, and Yudanuari is relevant to digital enterprises, software vendors, and cybersecurity policymakers. In professional email systems and messaging applications, a false positive, where an essential business message or notification is incorrectly flagged as spam, severely damages the user experience. By showing that LSTM and GRU models reduced false positives to a single occurrence in this experiment, the study gives communication platforms a clear architectural direction for maximizing filtering reliability. The study also confirms that traditional statistical models remain highly competitive: for startups, small businesses, or edge-computing environments with restricted processing power, Logistic Regression remains a practical, low-cost solution.

Research Profiles
Harliana, S.Kom., M.Kom. is a faculty member and leading computer science researcher at Universitas Nahdlatul Ulama Blitar, specializing in Natural Language Processing (NLP) and machine learning applications.
Hartatik, S.Si., M.Cs. is an academician and data science expert affiliated with Universitas AMIKOM Yogyakarta, focusing on computational intelligence and advanced algorithms.
Achmad Alvi Yudanuari, S.Kom., M.M. is a researcher at Universitas Nahdlatul Ulama Blitar whose professional work centers on information systems and data analytics.

Source
Harliana, Hartatik, Achmad Alvi Yudanuari (2026). Comparative Analysis of Traditional Machine Learning and Sequential Deep Learning Models for Spam Email Classification. Formosa Journal of Computer and Information Science (FJCIS), Vol. 5, No. 1, 2026.
DOI: https://doi.org/10.55927/fjcis.v5i1.16502
URL: https://journal.formosapublisher.org/index.php/fjcis

Posted by Sarah Hutabarat
