Abstract
The Model Context Protocol (MCP) has quickly become the standard for enabling agentic AI systems to interact with external tools, data sources, and services. Since its debut in 2024, MCP has been adopted by companies such as Google, Apple, Meta, and IBM. While this integration greatly improves the capabilities of large language models (LLMs), it also creates a new attack surface that the security community is only beginning to understand systematically.
A key architectural challenge is MCP's fundamental reliance on implicit trust: servers often run locally with high privileges, tool descriptions are accepted without question, and external servers are presumed trustworthy by default. This approach opens the door to various exploits, including prompt injection, tool poisoning, privilege escalation, and cross-server lateral movement.
This paper analyzes MCP security by categorizing known vulnerabilities under the SAFE-MCP framework and systematically mapping eleven existing mitigation strategies, tools, and benchmarks onto those categories. This mapping reveals critical security gaps, particularly in lateral movement, credential access, defense evasion, and command-and-control, and the paper discusses the architectural implications of moving from implicit trust to a zero-trust model. It concludes with specific recommendations for future research and stresses the urgent need to rigorously evaluate proposed defenses before they are widely deployed in enterprise environments.
Faculty Advisor/Mentor
Yue Xiao
Document Type
Presentation
Disciplines
Digital Communications and Networking
DOI
10.25776/5f25-zk03
Publication Date
4-14-2026
Landscaping of MCP: An Overview of MCP Mitigations and Tools