The last four years have taught a blunt lesson. When affordable autonomy meets a conflict environment, innovation spreads faster than law and ethics can follow. The battlefield in Ukraine became a global demonstration ground for low-cost, AI-enabled drones and swarm tactics. That rapid diffusion turned a previously niche debate about lethal autonomy into an urgent, practical problem for states, militaries, technologists, and human rights advocates.

Those developments exposed a central truth about modern arms proliferation. Software and models move like smoke, slipping around embargoes, export controls, and treaty lines. Open-weight models and widely available tooling make it comparatively cheap to weaponize capabilities that once required specialist labs and big budgets. Nations and non-state actors can now stitch together navigation stacks, perception models, and off-the-shelf airframes into systems that operate with dangerous degrees of independence. The phenomenon is not hypothetical. Analysts and journalists have traced military-grade customizations built on open and commercial foundation models and documented how states tune these models for doctrine and targeting.

Confronted with this proliferation, policymakers have responded along parallel tracks. Militaries, especially in Western alliances, have doubled down on internal governance. The U.S. Department of Defense updated and reasserted its autonomy policy and framed AI adoption under a Responsible AI pathway that emphasizes traceability, accountability, and warfighter trust. NATO has likewise moved to integrate principles of responsible use with testing and certification workstreams, while pushing interoperability across allies. Those are necessary steps, but they are not sufficient to stop the spread of capability beyond responsible hands.

Civil society and rights groups have argued for a different response: legal prohibition or tight multilateral constraints on any system that can apply lethal force without meaningful human control. Human Rights Watch and the Campaign to Stop Killer Robots have been explicit about the human costs of delegating life-and-death decisions to machines, and they have urged states at the UN to pursue binding instruments. Those arguments rest on a stark ethical premise. Machines cannot feel, weigh dignity, or take moral responsibility. When autonomy replaces human judgment in targeting, accountability gaps emerge that current legal frameworks struggle to close.

That ethical clarity collides with operational reality. Several states see partial autonomy and automated decision support as force multipliers that save lives, especially in high intensity, contested environments where human operators are drowned in sensor data or denied connectivity. The calculus becomes particularly acute for countries fighting for survival. The tension shows in diplomatic rooms and also in procurement offices. It is one thing to agree that humans should retain responsibility. It is another to define, implement, and certify what “meaningful human control” really means under battlefield conditions where latency, jamming, and attrition shape tactical choices.
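To see why the definitional problem is so hard, consider a deliberately toy sketch in Python. Every name in it is hypothetical and it stands in for no real system; it is only a fail-safe human-authorization gate reduced to code. Even at this level of abstraction, the designer must choose a timeout and a default behavior when the link is jammed, and each of those choices is itself a position on what "meaningful human control" means.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy only: EngagementRequest, poll_operator_console, and
# engagement_gate are illustrative names, not drawn from any fielded system.
# The policy question lives in two parameters: how long the system may wait
# for a human, and what it does when no answer arrives.

@dataclass
class EngagementRequest:
    track_id: str
    classifier_confidence: float  # model confidence that the track is hostile
    roe_reference: str            # pointer to the applicable rules of engagement

def poll_operator_console(req: EngagementRequest) -> Optional[bool]:
    """Stub for operator I/O; a real system would read a secure datalink.
    Returning None simulates a jammed link or an unanswered request."""
    return None

def request_human_authorization(req: EngagementRequest, timeout_s: float) -> Optional[bool]:
    """Wait up to timeout_s for an explicit human yes/no; None means no answer."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        decision = poll_operator_console(req)
        if decision is not None:
            return decision
        time.sleep(0.1)
    return None

def engagement_gate(req: EngagementRequest, timeout_s: float = 10.0) -> bool:
    """Fail safe: no human answer means no engagement, regardless of model confidence."""
    decision = request_human_authorization(req, timeout_s)
    if decision is None:
        print(f"ABORT {req.track_id}: comms lost or operator timeout")
        return False
    return decision

if __name__ == "__main__":
    req = EngagementRequest(track_id="T-042", classifier_confidence=0.97,
                            roe_reference="ROE-UNSPECIFIED")
    print("engage:", engagement_gate(req, timeout_s=1.0))
```

The hard questions do not disappear when they are written down; they get compressed into a handful of parameters that someone must set, someone must certify, and someone must answer for.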

Technical risk compounds the moral problem. Researchers have been blunt about how AI systems can fail in ways that are systematic, subtle, and difficult to foresee. Black-box decision-making, reward hacking, goal misgeneralization, emergent behaviors, and brittleness under distributional shift are not theoretical concerns. They are the kinds of failure modes that can turn a defended post into a flashpoint for escalation. A system that misclassifies a civilian gathering as hostile, or that generalizes from training data in a way that amplifies bias, will not only create victims; it will also corrode norms of restraint and invite reciprocal automation by adversaries.
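As a benign, self-contained illustration of one of these failure modes, the following sketch uses only synthetic data with numpy and scikit-learn; no real sensor data or system is implied. A classifier that looks reliable in its training environment degrades sharply once the input distribution shifts, and a large share of the "civilian" class gets flagged as hostile.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n: int, civilian_mean, hostile_mean):
    """Two synthetic 'sensor features'; label 0 = civilian, 1 = hostile."""
    civ = rng.normal(loc=civilian_mean, scale=1.0, size=(n, 2))
    hos = rng.normal(loc=hostile_mean, scale=1.0, size=(n, 2))
    X = np.vstack([civ, hos])
    y = np.array([0] * n + [1] * n)
    return X, y

# Training environment: civilian and hostile signatures are well separated.
X_train, y_train = make_data(500, [0.0, 0.0], [4.0, 4.0])
clf = LogisticRegression().fit(X_train, y_train)

# In-distribution test data looks reassuring.
X_test, y_test = make_data(500, [0.0, 0.0], [4.0, 4.0])
print("in-distribution accuracy:", round(clf.score(X_test, y_test), 3))

# Distributional shift: in a new environment, civilian behavior drifts
# toward the region the model learned to call hostile.
X_shift, y_shift = make_data(500, [2.5, 2.5], [4.0, 4.0])
print("shifted accuracy:", round(clf.score(X_shift, y_shift), 3))
civilians = X_shift[y_shift == 0]
fp_rate = float((clf.predict(civilians) == 1).mean())
print("civilians misclassified as hostile:", round(fp_rate, 3))
```

The numbers are synthetic, but the shape of the failure is the point: validation results from one environment say very little about behavior in another.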

So where does responsibility sit in a world where models are freely copied and hardware is cheap? I see three converging policy levers that deserve immediate attention, and each must be pursued in parallel.

First, operational governance must become tangible rather than rhetorical. Principles are useful, but militaries need interoperable, auditable pipelines for testing, evaluation, verification, and validation (TEV&V). NATO and allied test centers should publish common TEV&V templates, red teaming playbooks, and certification thresholds that can be used by national acquisition authorities and independent auditors. That work exists in embryonic form, but it needs scale, transparency and stronger civil oversight if it is to build legitimacy outside the defense bubble.
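One way to make "auditable" concrete is a gate that fails closed against declared thresholds and writes an append-only record an independent auditor can replay. The sketch below is hypothetical: the metric names, file names, and thresholds are placeholders rather than any NATO or national standard, but it shows the shape such a pipeline could take.

```python
import json, hashlib, datetime
from dataclasses import dataclass, asdict
from typing import Callable, Dict

# Hypothetical certification gate: every metric has a declared threshold,
# every run is logged with a hash of the model artifact, and the gate fails
# closed if any required metric is missing or below threshold.

@dataclass
class CertificationResult:
    model_sha256: str
    timestamp: str
    metrics: Dict[str, float]
    thresholds: Dict[str, float]
    passed: bool

def certify(model_path: str,
            evaluations: Dict[str, Callable[[], float]],
            thresholds: Dict[str, float]) -> CertificationResult:
    with open(model_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    metrics = {name: fn() for name, fn in evaluations.items()}
    passed = all(name in metrics and metrics[name] >= thresholds[name]
                 for name in thresholds)
    result = CertificationResult(
        model_sha256=digest,
        timestamp=datetime.datetime.now(datetime.timezone.utc).isoformat(),
        metrics=metrics,
        thresholds=thresholds,
        passed=passed,
    )
    # Append-only audit trail an independent auditor could replay.
    with open("tevv_audit.jsonl", "a") as log:
        log.write(json.dumps(asdict(result)) + "\n")
    return result

if __name__ == "__main__":
    with open("perception_model.bin", "wb") as f:   # dummy artifact for the demo
        f.write(b"placeholder weights")
    result = certify(
        "perception_model.bin",
        evaluations={"ood_detection_auroc": lambda: 0.91,
                     "in_distribution_accuracy": lambda: 0.98},
        thresholds={"ood_detection_auroc": 0.95,
                    "in_distribution_accuracy": 0.95},
    )
    print("certified:", result.passed)   # False: one metric missed its threshold
```

Publishing the templates and thresholds, rather than only the pass/fail verdicts, is what would let auditors outside the defense bubble check the work.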

Second, technical gating mechanisms are no longer a thought experiment. Hardware and ecosystem controls could limit the most dangerous vectors of proliferation. Proposals like hardware-enabled guarantees - flexible technical constraints embedded in AI compute stacks and chips - offer a practical path to make certain classes of misuse harder to realize at scale. Combined with export controls on specialized compute and finer-grained licensing for high-risk model weights, these approaches would not stop all misuse, but they would raise the cost and slow the diffusion enough for governance and norms to catch up. Researchers have outlined plausible designs and governance frameworks for these flexHEG ideas, and policymakers should fund pilots now.
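As a toy of what gating could look like at the software layer, the sketch below refuses to load model weights unless a license manifest matches the weights' hash and carries a valid signature. The file names and manifest fields are hypothetical, and HMAC with a shared key stands in for the hardware-rooted attestation or public-key signing a real flexHEG-style scheme would require; the point is only that gating can be made mechanical rather than aspirational.

```python
import hmac, hashlib, json

# Toy illustration of weight gating. HMAC with a shared key stands in for the
# hardware-rooted attestation a real flexHEG-style scheme would rely on; all
# file names and manifest fields here are hypothetical.

def weight_hash(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def _sign(fields: dict, key: bytes) -> str:
    return hmac.new(key, json.dumps(fields, sort_keys=True).encode(),
                    hashlib.sha256).hexdigest()

def issue_license(weights_path: str, licensee: str, key: bytes) -> dict:
    fields = {"weights_sha256": weight_hash(weights_path),
              "licensee": licensee,
              "permitted_uses": ["research-evaluation"]}
    return {**fields, "signature": _sign(fields, key)}

def verify_license(manifest: dict, weights_path: str, key: bytes) -> bool:
    fields = {k: manifest[k] for k in ("weights_sha256", "licensee", "permitted_uses")}
    return (hmac.compare_digest(_sign(fields, key), manifest["signature"])
            and manifest["weights_sha256"] == weight_hash(weights_path))

def load_weights(weights_path: str, manifest: dict, key: bytes) -> bytes:
    """Fail closed: refuse to deserialize weights without a valid license."""
    if not verify_license(manifest, weights_path, key):
        raise PermissionError("license check failed; refusing to load weights")
    with open(weights_path, "rb") as f:
        return f.read()

if __name__ == "__main__":
    key = b"demo-shared-secret"                  # placeholder key material
    with open("weights.bin", "wb") as f:         # dummy artifact for the demo
        f.write(b"placeholder weights")
    manifest = issue_license("weights.bin", "example-lab", key)
    print(len(load_weights("weights.bin", manifest, key)), "bytes loaded")
    manifest["licensee"] = "someone-else"        # any tampering breaks the check
    try:
        load_weights("weights.bin", manifest, key)
    except PermissionError as e:
        print("blocked:", e)
```

None of this would restrain a determined actor who controls the hardware end to end, which is precisely why the flexHEG proposals locate the guarantee in the compute stack itself rather than in loader code; the sketch only shows that the gating logic is simple once the trust anchor exists.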

Third, international law and normative pressure must advance, because national policies cannot substitute for global coordination. The UN process on lethal autonomous weapons and civil society campaigns demanding prohibitions are important because they set legal and moral boundaries. States should negotiate clear red lines - for example, prohibiting autonomous targeting of individuals and insisting on human accountability for targeting decisions. That kind of treaty work is slow and painful, yet it is the only mechanism that can constrain actors who will not voluntarily adopt Western best practices. At the very least, multilateral norms can stigmatize harmful behavior and make export control regimes more politically viable.

None of these measures will be easy. There are awkward tradeoffs. Hard gating can slow beneficial innovation and complicate allied operations. Strong legal constraints will be resisted by states that claim existential threats. Private sector actors will push back against measures that limit markets or make their technology harder to monetize. And yet, the default alternative is worse. Allowing unregulated, proliferating autonomy to become the baseline risks normalizing automated violence, eroding civilian protections, and entrenching a global arms market driven by snippets of open code and commodity compute.

We must also be honest about the limits of technical fixes. No chip or signed firmware will absolve the ethical choice to hand lethal authority to a machine. Hardware gates and export controls buy time and shape incentives. Norms, treaties and transparency then have to take the baton and run. The goal should be to make the choice to deploy autonomous lethal systems politically costly and legally precarious, while preserving space for human-centric, verifiable, and narrowly framed AI-assisted tools that genuinely reduce risk to civilians and warfighters.

Finally, civil society and the media have a role beyond moralizing. They must hold suppliers and states accountable, publish independent assessments of deployed systems, and demand transparency about how targeting decisions are made. Independent reporting has already revealed how battlefield innovations spread, and that kind of scrutiny reduces plausible deniability for actors who would prefer to operate in the dark. The public debate matters because, ultimately, ethical AI in war will not be decided by engineers alone. It will be decided by voters, parliaments and publics who insist that even in war, some lines should not be crossed.

Proliferated warfare has made one thing clear. Ethics cannot be an afterthought. It must be embedded in systems, supply chains, and international politics before the next large wave of capability cascades across the globe. The question is not whether autonomy will be used. The question is whether we will build durable institutions and technical measures that make its use accountable, restrained and transparent. If we fail at that, then the next decade of conflict will be measured in code, not treaties, and that will be a catastrophe we could have prevented.