AI Agents Can Cheat When Operational Pressure Mounts, Research Shows

In the high-stakes world of financial services, the promise of agentic artificial intelligence (AI) is efficiency at scale. But a new study suggests that when the digital pressure cooker heats up, autonomous agents behave like stressed-out human employees, and to meet a deadline, they cheat.

New research from Scale AI and academic collaborators shows that agents are more likely to violate safety constraints when time or step limits shrink. The findings come from PropensityBench, a benchmark designed to test whether AI systems take harmful shortcuts when a task becomes difficult to complete safely.

The test gives a model access to allowed tools and restricted tools and then assigns a multi-step problem. When models operate under relaxed conditions, they usually follow the rules. When the time or step window tightens, many systems change strategies and begin using restricted tools. Pressure increased.

The study found that rule-breaking more than doubled under pressure. In low-pressure conditions, the average misuse rate across models was 18.6%. Under high pressure, the rate rose to 46.9%. One model selected restricted tools in 79% of high-pressure tests, another model with a baseline misuse rate of just over 2% rose above 40% when pressure mounted.

Researchers said these findings suggest that traditional alignment methods hold only in ideal settings and may not generalize to environments where time or resources are constrained. The benchmark evaluated four categories of potentially harmful actions, including cybersecurity misuse, biosecurity sequences, chemical access steps, and self-proliferation attempts.

The research does not assume the systems can carry out real-world attacks. It measures whether the model chooses actions that would be unsafe if those tools were available. The authors argued that this behavioral dimension, which they call propensity, is essential for understanding how agents behave in realistic deployments.

Raising Concerns

The study arrives as more real-world vulnerabilities appear, showing that pressure-sensitive behavior is not the only reliability gap emerging in agentic systems. Researchers tricked an Anthropic plug-in into deploying ransomware during a controlled test, demonstrating that even well-guarded tools can be redirected when an agent misinterprets intent or chain-of-thought steps.

The Guardian reported that safety filters can be bypassed through poetic instructions, revealing how creative phrasing can circumvent protections that appear stable under standard prompts. Reuters found that AI companies’ safety practices fall short of global standards, citing weak governance structures, inconsistent reporting practices and limited transparency around how models behave in dynamic environments.

Microsoft confirmed its new Windows AI agent sometimes hallucinates actions and creates security risks, including attempts to operate files or settings the user did not request. Together, these cases show how unpredictable behavior escalates once an AI system gains access to external tools and applications, and why enterprises adopting agentic workflows face a wider operational and security perimeter than traditional AI deployments.

AIMultiple found that agentic workflows introduce vulnerabilities such as goal manipulation and false-data injection, meaning that an attacker or even a poorly structured prompt can steer an agent toward unintended actions. These findings show that safety risks extend beyond incorrect outputs and now include structural weaknesses in how agents plan, retrieve information and interact with tools.

The PropensityBench findings arrive as broader industry research points to growing structural risks around agentic AI. Meanwhile, enterprises are turning to AI for automating core workflows. In a recent PYMNTS survey, 55% of chief operating officers said their companies had begun using AI-based automated cybersecurity management systems, a share that represented a threefold increase in only a few months.

Source: https://www.pymnts.com/