Abstract: This paper focuses on unraveling the inner workings of the transformer architecture, a cornerstone of modern large language models (LLMs) that enables parallel processing and long-range dependency capture. From this seminal work, we adopt the core attention mechanism formula (Q × Kᵀ)/√dₖ and the multi-head attention mechanism. While transformers have driven breakthroughs in natural language processing through self-attention mechanisms, their internal operations remain complex and opaque. Using GPT-2 as an illustrative case study, we develop an interactive visualization framework to map information flow, display attention patterns, and illustrate token embeddings and layer interactions. These visualizations aim to deepen comprehension of transformer mechanics, enhance model transparency, and guide future advancements in AI design.
Keywords: Transformer Architecture (TA): Neural network architecture based on self-attention mechanisms; Large Language Models (LLMs): Advanced AI models trained on vast text datasets; Natural Language Processing (NLP): AI technology for understanding and processing human language; Self-Attention Mechanism (SAM): Method allowing models to weigh importance of different input elements.
DOI: 10.17148/IJARCCE.2025.14442
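The attention patterns the abstract refers to are the per-head weight matrices produced by the scaled dot-product formula softmax((Q × Kᵀ)/√dₖ). As a minimal sketch of how such weights can be pulled out of GPT-2 for visualization, the snippet below uses the Hugging Face transformers library; this tooling is an assumption for illustration only and is not necessarily the framework developed in the paper.

```python
# Minimal sketch (assumed tooling, not the authors' framework): extract
# GPT-2 attention weights, i.e., the softmax((Q x K^T)/sqrt(d_k)) matrices
# that an attention-pattern visualization would display.
import torch
from transformers import GPT2TokenizerFast, GPT2Model

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_attentions=True)
model.eval()

inputs = tokenizer("The transformer attends to every token", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer (12 for GPT-2 small),
# each shaped (batch, heads, seq_len, seq_len); every row is a softmax
# distribution over the preceding tokens.
layer0_head0 = outputs.attentions[0][0, 0]
print(layer0_head0.shape)        # (seq_len, seq_len)
print(layer0_head0.sum(dim=-1))  # each row sums to ~1.0
```

Heatmapping a matrix like layer0_head0 against the token strings is the kind of layer-by-layer, head-by-head view the abstract describes for mapping information flow through the model.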