CyberIntel ⬡ News
★ Saved ◆ Cyber Reads
← Back ◬ AI & Machine Learning Apr 15, 2026

DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection

arXiv Security Archived Apr 15, 2026 ✓ Full text saved

arXiv:2604.12548v1 Announce Type: new Abstract: Prompt injection has emerged as a critical security threat to large language models (LLMs), yet existing studies predominantly focus on single-dimensional attack strategies, such as semantic rewriting or character-level obfuscation, which fail to capture the combined effects of multi-space perturbations in realistic scenarios. In addition, systematic black-box robustness evaluations of recent Chinese LLMs, such as DeepSeek, remain limited. To addre

Full text archived locally
✦ AI Summary · Claude Sonnet


    Computer Science > Cryptography and Security [Submitted on 14 Apr 2026] DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection Junyu Ren, Xingjian Pan, Wensheng Gan, Philip S. Yu Prompt injection has emerged as a critical security threat to large language models (LLMs), yet existing studies predominantly focus on single-dimensional attack strategies, such as semantic rewriting or character-level obfuscation, which fail to capture the combined effects of multi-space perturbations in realistic scenarios. In addition, systematic black-box robustness evaluations of recent Chinese LLMs, such as DeepSeek, remain limited. To address these gaps, we propose PromptFuzz-SC, a semantic-character dual-space mutation framework for evaluating LLM robustness against prompt injection. The framework integrates semantic transformations (e.g., paraphrasing and word-order perturbation) with character-level obfuscation (e.g., zero-width insertion and encoding-based mutation), forming a unified and extensible mutation operator library. A hybrid search strategy combining epsilon-greedy exploration and hill-climbing refinement is adopted to efficiently discover high-quality adversarial prompts. We further introduce a unified evaluation protocol based on three metrics: misuse success rate (MSR), Average Queries to Success (AQS), and Stealth. Experimental results on DeepSeek demonstrate that dual-space mutation achieves the strongest overall attack performance among the evaluated strategies, attaining the highest mean MSR (0.189), peak MSR (0.375), and mean Stealth. Compared with semantic-only and character-only mutation, it improves mean MSR by 12.5% and 5.6%, respectively. While not consistently minimizing query cost, the proposed method achieves competitive best-case efficiency and maintains strong imperceptibility, indicating a more favorable balance between attack effectiveness and concealment. These findings highlight the importance of composite mutation strategies for robust red-teaming of LLMs and provide practical insights for the design of multi-layer defense mechanisms. Comments: Preprint Subjects: Cryptography and Security (cs.CR) Cite as: arXiv:2604.12548 [cs.CR]   (or arXiv:2604.12548v1 [cs.CR] for this version)   https://doi.org/10.48550/arXiv.2604.12548 Focus to learn more Submission history From: Wensheng Gan [view email] [v1] Tue, 14 Apr 2026 10:20:15 UTC (750 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev   |   next > new | recent | 2026-04 Change to browse by: cs References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
    💬 Team Notes
    Article Info
    Source
    arXiv Security
    Category
    ◬ AI & Machine Learning
    Published
    Apr 15, 2026
    Archived
    Apr 15, 2026
    Full Text
    ✓ Saved locally
    Open Original ↗