VOLUME 15, ISSUE 6, JUNE 2026
This work is licensed under a Creative Commons Attribution 4.0 International License.
Multi-Modal AI Agent for Intelligent Email Categorization and Auto-Reply
RANJINI, J.LIN EBY CHANDRA
Abstract: The dramatic increase in daily email volumes across enterprise, healthcare, and e-governance sectors has created an urgent need for intelligent systems capable of autonomous email understanding, classification, and response generation. This paper proposes MMEA-Net (Multi-Modal Email Agent Network), a novel deep learning framework that integrates transformer-based language models, visual document encoders, and metadata-driven contextual reasoning to perform fine-grained email categorization and context-aware auto-reply generation. Unlike prior work relying solely on email body text, MMEA-Net processes three complementary modalities: textual content encoded via DeBERTa-v3-Large, visual layout of attached documents processed through LayoutLMv3, and structural metadata including sender reputation scores, thread depth, and temporal patterns encoded by a dedicated MLP module. The three modality streams are fused through a Gated Cross-Modal Attention (GCMA) mechanism that dynamically weights each modality's contribution based on input context. A reinforcement-learning-based Auto-Reply Generator (ARG) then produces professional, intent-aligned responses conditioned on the predicted category and a domain-specific policy knowledge base. Experiments on the Enron Email Dataset, TREC 2007, and a newly constructed Healthcare Email Corpus demonstrate that MMEA-Net achieves 95.3% overall accuracy, 94.1% macro-F1, BLEU-4 of 41.2, and human acceptability of 89.6%, outperforming all evaluated baselines by statistically significant margins.
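The abstract does not give the exact form of the GCMA mechanism, so the following is only a minimal sketch of the general idea of input-dependent gating over three modality streams. It is a simplified softmax-gated fusion, not the authors' cross-modal attention. The function names, the embedding size, the random vectors standing in for the DeBERTa-v3-Large, LayoutLMv3, and metadata-MLP outputs, and the gate weights are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(streams, gate_weights):
    """Fuse equal-length modality vectors with input-dependent gates.

    streams: dict mapping modality name -> embedding (list of floats).
    gate_weights: dict mapping modality name -> weight vector over the
        concatenation of all embeddings (hypothetical learned parameters).
    Returns (fused_vector, gates), where the gates sum to 1.
    """
    names = sorted(streams)
    # The joint context is the concatenation of every modality embedding.
    context = [x for name in names for x in streams[name]]
    # One scalar gate logit per modality, computed from the joint context,
    # so the weighting changes with the input rather than being fixed.
    logits = [
        sum(w * c for w, c in zip(gate_weights[name], context))
        for name in names
    ]
    gates = dict(zip(names, softmax(logits)))
    dim = len(streams[names[0]])
    # The fused vector is the gate-weighted sum of the modality embeddings.
    fused = [sum(gates[n] * streams[n][i] for n in names) for i in range(dim)]
    return fused, gates


# Stand-in embeddings (assumption: all three projected to the same size).
random.seed(0)
DIM = 4
streams = {
    "text": [random.gauss(0, 1) for _ in range(DIM)],
    "layout": [random.gauss(0, 1) for _ in range(DIM)],
    "metadata": [random.gauss(0, 1) for _ in range(DIM)],
}
gate_weights = {
    name: [random.gauss(0, 0.1) for _ in range(3 * DIM)] for name in streams
}

fused, gates = gated_fusion(streams, gate_weights)
print("gates:", gates)
print("fused:", fused)
```

With all gate weights set to zero, every modality receives the same gate of 1/3 and the fused vector reduces to a plain average; non-zero weights let the contribution of each stream vary per email.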
Keywords: Multi-Modal Learning; Email Categorization; Auto-Reply Generation; Transformer; Gated Cross-Modal Attention; DeBERTa; Reinforcement Learning from Human Feedback
How to Cite:
[1] RANJINI, J.LIN EBY CHANDRA, "Multi-Modal AI Agent for Intelligent Email Categorization and Auto-Reply," International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2026.15687
