The AI arms race will be won on mathematical proof

This commentary is published in coordination with the 2025 Global Security Forum, of which Defense One is a media partner.

The AI-powered weapons and systems that the Pentagon is racing to build will come with a significant vulnerability: our inability to determine how they will behave under real battlefield conditions.

The Defense Department, National Security Agency, and Defense Advanced Research Projects Agency call this the “software understanding gap.” Increasingly, users do not understand the digital building blocks of their systems, leaving them unable to predict, verify, and secure those systems’ actions. Without a deeper understanding of our most complex software, the United States’ world-class arsenal will be unreliable on modern battlefields, and adversaries may be emboldened to exploit the resulting vulnerabilities.

So how do we ensure these autonomous systems perform as needed? Whereas traditional military hardware was designed to operate within clearly defined boundaries, AI-driven systems employ automated decision-making that learns and adapts, making them susceptible to manipulation or error beyond human anticipation. Conventional testing methods that sufficed for deterministic systems with finite failure modes, such as a jet engine or a radar system, are woefully inadequate for the complexities of AI. No amount of simulation or red-teaming can secure a system that is learning and adapting faster than any human can follow, especially in the face of an adversary intent on targeting its blind spots.

We need something more robust and sophisticated than testing. We need mathematical proof.

Testing is akin to checking each link in a chain to make sure it is strong. This may work well for a few hundred links, but as the number of links grows, so does the likelihood of missing a weakness. Mathematical proof takes a different approach. Instead of checking links one by one, a proof shows that the entire chain is unbreakable, no matter how long it stretches.

A proof starts by defining the fundamental rules: what the chain is made of, how much force it must withstand, and how the links are connected. Once these assumptions are established, the proof confirms that the first link is strong, then shows that whenever one link holds, the next connected link must hold as well. Whether the chain is ten links long or stretches on without end, this logical progression guarantees there are no weak points.
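For readers who want to see the idea in miniature, here is a minimal sketch in the Lean proof assistant. It is an illustrative toy of our own, not defense software: the model, the names (holds, chain_holds), and the strength condition are all invented for this example. A chain is modeled as a list of link strengths, and the theorem is proved by induction, with a base case (the empty chain) and a step case (adding one link), exactly as the analogy describes.

```lean
-- Toy model, invented for illustration: a chain is a list of link
-- strengths, and a chain "holds" a force f when every link withstands f.
def holds (f : Nat) : List Nat → Prop
  | []           => True
  | link :: rest => f ≤ link ∧ holds f rest

-- Proof by induction, mirroring the analogy: a base case (the empty
-- chain holds trivially) and a step case (adding one sufficiently
-- strong link preserves the guarantee).
theorem chain_holds (f : Nat) (c : List Nat)
    (strong : ∀ link ∈ c, f ≤ link) : holds f c := by
  induction c with
  | nil => trivial
  | cons link rest ih =>
    show f ≤ link ∧ holds f rest
    exact ⟨strong link (by simp), ih fun l hl => strong l (by simp [hl])⟩
```

Nothing in the proof depends on how many links the chain has, which is precisely why the guarantee covers ten links and ten million alike.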

That is why proof is essential for AI-driven military systems. Testing can tell us if a system works under the conditions created in the test environment. But testing alone cannot cover every battlefield, cyberattack, or possible manipulation. A proof guarantees that no matter what conditions the system faces—even the ones not yet imagined—it will behave as intended. It produces justifiable confidence.
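The difference is easiest to see in a toy example, again in Lean; the clamping function and its bound below are invented for illustration, not drawn from any fielded system. A test suite can only check the inputs someone thought to write down, while the theorem quantifies over every possible input at once, including the ones no tester imagined.

```lean
-- Hypothetical example: clamp a raw sensor reading into a safe range.
def clampReading (lo hi x : Nat) : Nat := max lo (min x hi)

-- A test exercises finitely many readings. This theorem covers all of
-- them at once: for every possible x, the result never falls below lo.
theorem clamp_ge_lo (lo hi x : Nat) : lo ≤ clampReading lo hi x :=
  Nat.le_max_left lo (min x hi)
```

The arithmetic is trivial; the quantifier is the point. “For every x” is a promise no finite test matrix can make.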

Complete confidence in software systems is not theoretical; it has been demonstrated. DARPA showed this in its High-Assurance Cyber Military Systems program, where researchers used mathematical proof to harden a quadcopter’s flight software against attack. After implementing these guarantees, DARPA tried to hack it, bringing in expert teams whose best efforts could not compromise its defenses. The lesson was clear: mathematically verified software does not just resist attacks; it eliminates entire classes of vulnerabilities.
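To give a flavor of what eliminating an entire class of vulnerabilities means, here is one more hedged sketch in Lean. This is our own toy, not the HACMS toolchain, and the names readByte and firstByte? are invented. In a proof-oriented language, an out-of-bounds read is not a bug to be hunted down in testing; code that cannot prove its index is in range does not compile in the first place.

```lean
-- Hypothetical illustration: reading a byte from a message buffer.
-- The signature demands a proof (h) that the index is in bounds, so
-- the buffer-overflow class of bugs is ruled out before the code runs.
def readByte (buf : Array UInt8) (i : Nat) (h : i < buf.size) : UInt8 :=
  buf[i]  -- accepted only because the proof h is in scope

-- Callers must discharge that obligation themselves, for example by
-- branching on a length check; the compiler, not a test suite, enforces it.
def firstByte? (buf : Array UInt8) : Option UInt8 :=
  if h : 0 < buf.size then some (readByte buf 0 h) else none
```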

The private sector has reached the same conclusion. Amazon Web Services applies mathematical reasoning to its cloud infrastructure, not only to strengthen security, but to build better systems by ensuring faster deployment, fewer errors, and more reliable performance.

The takeaway is simple: proof does not just prevent failure, it drives innovation.

Now we must bring this level of certainty to the AI-powered systems that are defining modern warfare. Consider the stakes: an autonomous air defense system in the Taiwan Strait detects an incoming missile. The system has mere seconds to classify and neutralize the threat. If China can surreptitiously corrupt sensor data by exploiting an undiscovered flaw, it could quickly neutralize one of the most important components of Taiwan’s defense.

The same holds true for AI-driven cyber and electronic warfare. If adversaries can manipulate battlefield signals in ways our algorithms cannot recognize, entire operations could be compromised. These scenarios underscore a stark reality: without mathematical guarantees, we are entrusting our national defense to systems that could fail unpredictably and with disastrous consequences.

China is setting the pace in autonomous warfare. The People’s Liberation Army has placed AI-driven military systems at the center of its modernization strategy, which it calls “intelligentized warfare,” and is rapidly developing autonomous weapons, cyber tools, and electronic-attack capabilities. The U.S. is moving just as fast, but speed alone is not enough. China’s closed ecosystem, in which the government can scrutinize every line of code, gives it greater confidence in its own capabilities. The United States, with its open and fragmented technology supply chain, faces a harder problem, and that problem must be met with more sophisticated tools.

Modern conflicts will not be won by those who simply build the most capable AI-powered systems. They will be won by those who can definitively prove that theirs will work when the stakes are highest. The U.S. and its allies cannot afford to deploy autonomous systems without mathematical rigor.

This is not just a testing problem; it’s a national security imperative. The defense community must act now to ensure that mathematical proof is powering every system we build. Anything less is a bet we cannot afford to take.

Anjana Rajan served as the Assistant National Cyber Director for Technology Security at the White House from 2022 to 2025.

Jonathan Ring served as the Deputy Assistant National Cyber Director for Technology Security at the White House from 2022 to 2025.


