Widespread access to defense-dominant technologies can still increase risk.
Or otherwise be suboptimal policy.
A hot topic in the current AI policy discourse is the offense–defense balance of specific AI models or capabilities.1 While there are many definitions of the offense–defense balance,2 it generally refers to “the relative ease of carrying out and defending against attacks,”3 often as measured by the ratio of resources between defenders and attackers.4
For the purposes of this post, we can borrow from Garfinkel & Dafoe (2019) and define the offense–defense balance of some AI capability C as “the ratio of the defender’s investment to the minimum offensive investment that would allow the attacker to secure some expected level of success,”5 when both the attacker and the defender have access to C. We can say that C is defense-dominant if, once both attacker and defender gain access to C, the defender needs to invest fewer resources to successfully defend against an attacker with a constant investment level (relative to the amount the defender would have needed to invest before both had access to C).
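To make this concrete, here is a minimal numeric sketch of the post's operational test for defense-dominance. All investment figures are invented for illustration; nothing here comes from Garfinkel & Dafoe's paper.

```python
# Minimal sketch of the operational test described above.
# All investment figures are hypothetical and chosen purely for illustration.

def od_ratio(defender_investment: float, min_attacker_investment: float) -> float:
    """Offense-defense balance: the defender's investment divided by the minimum
    offensive investment that lets the attacker secure some expected level of success."""
    return defender_investment / min_attacker_investment

# Hold the attacker's investment constant (say, 10 units) and compare how much
# the defender must invest to reach the same expected level of defensive success.
attacker_investment = 10.0
defense_cost_before_C = 20.0  # hypothetical: cost to defend before either side has C
defense_cost_with_C = 5.0     # hypothetical: cost to defend once both sides have C

print(od_ratio(defense_cost_before_C, attacker_investment))  # 2.0
print(od_ratio(defense_cost_with_C, attacker_investment))    # 0.5
# The defender's required investment falls once both sides have C, so on these
# made-up numbers C would count as defense-dominant.
```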
The reason the concept has relevance to AI is the debate over how widely accessible certain AI systems ought to be, whether in the form of the ability to use systems with C or of direct access to the weights of some model M that has capability C.6 If C is defense-dominant, by definition defenders with C will need to invest less to defend against attackers with access to C.
It is therefore tempting to conclude that, if C is defense-dominant, we should, as a policy matter, encourage widespread access to it.
C being defense-dominant is a very good pro tanto reason7 to enable widespread access to C. In many—perhaps most—cases where C is defense-dominant, it will make sense as a policy matter to allow, encourage, or perhaps even require widespread access to C.8
But I also think that there theoretically could be circumstances in which:
C is defense-dominant, but
Widespread access to C would overall increase risk, or otherwise be a bad policy response.
Thus, before allowing/encouraging/requiring widespread access to C, policymakers may wish to consider the following points, in addition to the offense–defense balance.
Defense-dominant technologies may raise the social costs of attacks
If C is defense-dominant, by definition defenders with C will need to invest less to defend against attackers with access to C (as compared to the status quo ante, and holding attackers’ resources constant). But this does not imply that no attacks will succeed. Factors other than the offense–defense balance determine whether any given attack is likely to succeed.
In particular, an attacker may have more resources than a defender, such that the defender may not have sufficient resources to mount a successful defense.
It is also worth keeping in mind that the offense–defense balance is defined in terms of a probability of success.9 Luck will play some role, and if there are many attempted attacks, each with a nontrivial probability of success, some will likely succeed.
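A quick back-of-the-envelope calculation illustrates the point; the per-attack success probability and the number of attempts below are invented purely for illustration.

```python
# Even a low per-attack success probability compounds over many attempts.
# Both numbers below are illustrative assumptions, not estimates.
p_success = 0.05   # assumed chance any single attack succeeds
n_attacks = 100    # assumed number of attempted attacks

p_at_least_one = 1 - (1 - p_success) ** n_attacks
print(round(p_at_least_one, 3))  # ~0.994: at least one success is very likely
```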
This is all true both before and after access to C is widespread. But C may increase the social cost of an attack that does succeed.
The classic example in warfare is nuclear weapons. Counterintuitively, nuclear weapons are defense-dominant (between nuclear powers), because they make it extremely costly for attackers to attack a nuclear-armed defender.10 But as we know, the costs of a successful nuclear attack (that is, nuclear war) would be much higher than those of a conventional total war. Accordingly, proliferation of nuclear weapons could increase risk overall despite their defense-dominance, by raising the social costs of a nuclear conflict.
Together, this all implies that increased access to some capability C might simultaneously:
Make attacks less likely to succeed, but
Make attacks more socially costly.
Which of these effects will dominate may be hard to predict. In principle, the increase in the social cost of a successful attack could outweigh the reduction in the likelihood of success in some cases.
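As a stylized illustration (all probabilities and costs below are invented), widespread access to C can lower the chance that any given attack succeeds while still raising expected harm, if each successful attack becomes costly enough:

```python
# Stylized comparison of expected harm from a single attempted attack,
# before and after widespread access to C. All numbers are invented.

def expected_harm(p_success: float, cost_of_success: float) -> float:
    return p_success * cost_of_success

before = expected_harm(p_success=0.5, cost_of_success=100)  # 50.0
after = expected_harm(p_success=0.2, cost_of_success=400)   # 80.0

print(before, after)
# Attacks succeed less often once access spreads (0.5 -> 0.2), but each success
# is costlier (100 -> 400), so expected harm rises on these numbers (50 -> 80).
```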
Not all defenders are good guys
Defense-dominance is good if defenders are good people doing good things. In a context in which virtually everyone is a possible defender, most defenders will be good.11
But there is no rule of the universe that says that all defenders will be engaging in morally good activities.12 One can in principle imagine a capability that is defense-dominant but primarily useful to bad guys who wish to defend their bad activities from interference by good guys. For example, you can imagine an AI with the capability “laundering money already in one’s possession.”13
Admittedly, this example is contrived. I expect some will object that such capabilities are either so unlikely as to be irrelevant, or else bound up with (or a special case of) capabilities that are much more general and therefore more likely to be usable by good-guy defenders. Maybe so. But these seem like empirical questions. Some AI systems really are pretty narrow, not general-purpose systems. Even some AI systems built on more general foundations, such as LLMs, have been specialized to perform best on relatively narrow tasks. And there really are some conceivable defensive tasks for which most defenders would be bad guys. We therefore should not a priori rule out the possibility that capabilities that are simultaneously defense-dominant and mostly useful to bad guys could exist.14
It may be more cost-effective to give access to a few identifiable defenders
It is one thing to say that, if nearly everyone has access to a capability, defenders will gain on net. This does not imply, however, that widespread access is the optimal policy. This is because we may be able to further increase the benefits to defenders while reducing the extent to which we aid attackers. One way to do this is to differentially allocate the technology to a small set of identifiable defenders, while otherwise restricting access to it.15
This is most likely to be an attractive policy option when potential defenders are few and easily identifiable, but attackers are many and diffuse.
This is how many countries regulate guns: they give guns to presumed good guys who are easy to select/identify (police, military), and dramatically reduce access to guns for everyone else.
One can imagine AI systems whose capabilities are similarly useful mainly to a small set of identifiable defenders. For example, suppose you had an AI system that was very good at finding vulnerabilities in US governmental cyber infrastructure, but had limited applications in other security domains (perhaps because governmental infrastructure is so specialized). This could be used by the government to find and remedy vulnerabilities, or by attackers to exploit them. Even if such a system were defense-dominant, it may make sense for the government to restrict access to it.
Conclusion
The offense–defense balance of some AI capability, C, is a very important input into the expected costs and benefits associated with widespread access to that capability. In many cases, policy should encourage broad access to defense-dominant capabilities. However, there are some cases where it may nevertheless be harmful to enable widespread access to a defense-dominant capability. These may include cases where C also raises the costs of a successful attack, or where most defenders are themselves morally bad.
This post does not claim that such exceptions are very likely. In fact, I think they will be unlikely for any given defense-dominant capability. It just seems important to remain open to their possibility when deciding optimal policy for some AI capability.
Where the defenders in a position to leverage C are good and few, and C may also be usefully employed by attackers, policymakers may also consider whether those defenders can be differentially advantaged, such as by granting access to those defenders and no one else. Again, I do not claim that this will be a frequent occurrence. It is just worth noting as a theoretical possibility.
At a broader level, this discussion shows that the offense–defense balance of a capability is a useful, but limited, tool for deciding whether and how to regulate that capability. Offense–defense theory was developed to explain the behavior of states in strategic competition with each other and to predict the likelihood that they would go to war given prevailing technological conditions. It is not designed to give us an all-things-considered view of how good or bad widespread, individual-level access to a new technology will be. This is because many of the harms that can come from such access differ in important ways from interstate competition and conflict: individuals face severe resource constraints, and society can rely on ordinary law enforcement to counter them.
E.g., Ben Garfinkel & Allan Dafoe, How does the offense-defense balance scale?, 42 J. Strategic Stud. 736 (2019), https://doi.org/10.1080/01402390.2019.1631810; Andrew Lohn & Krystal Jackson, Will AI Make Cyber Swords or Shields (Aug. 2022), https://cset.georgetown.edu/publication/will-ai-make-cyber-swords-or-shields/; Jeremy Howard, AI Safety and the Age of Dislightenment, Fast.ai (July 10, 2023), https://www.fast.ai/posts/2023-11-07-dislightenment.html; Sayash Kapoor & Rishi Bommasani et al., On the Societal Impact of Open Foundation Models (2024) (preprint), https://arxiv.org/abs/2403.07918.
See Garfinkel & Dafoe, supra note 1, at 738–39.
Id. at 738.
See id. at 740–41.
See id. at 741.
See Howard, supra note 1; Kapoor & Bommasani et al., supra note 1; Elizabeth Seger et al., Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives (2023) (preprint), https://arxiv.org/abs/2311.09227; Markus Anderljung et al., Frontier AI Regulation: Managing Emerging Risks to Public Safety (2023) (preprint), https://arxiv.org/abs/2307.03718.
“In ethics we use the term ‘pro tanto’, meaning ‘to that extent’, to refer to things that have some bearing on what we ought to do but that can be outweighed.” Amanda Askell, In AI ethics, “bad” isn’t good enough, Amanda Askell (Dec. 14, 2020), https://askell.io/posts/2020/12/bad-isnt-good-enough.
Note that the question of whether and how to regulate access to some specific capability C is distinct from the question of whether and how to regulate access to the weights of a model M with capability C, especially if the model has multiple capabilities with policy significance. This post only focuses on the question of access to C. Optimal policy for M will consider all significant capabilities that M has, and arrive at some aggregated conclusion based on the ideal policy for each such capability.
See Garfinkel & Dafoe, supra note 1, at 740–41.
E.g., Keir A. Lieber, Grasping the Technological Peace: The Offense-Defense Balance and International Security, 25 Int’l Sec. 71, 96–97 (2000), https://www.jstor.org/stable/2626774.
Accord Howard, supra note 1 (“There will still be Bad Guys looking to use [AI models] to hurt others or unjustly enrich themselves. But most people are not Bad Guys. Most people will use these models to create, and to protect.”).
Cf. Richard K. Betts, Must War Find a Way?: A Review Essay, in Offense, Defense, and War 333, 336–37 (Michael E. Brown et al. eds., 2004) (reviewing Stephen Van Evera, Causes of War: Power and the Roots of Conflict (1999)) (challenging Van Evera’s implicit assumption that strategic stability over the status quo is normatively desirable).
This is a defensive capability because the launderer is already in possession of the money. Law enforcement wishing to seize the money would need to “attack”—i.e., use the legal process—to get the money back.
At the same time, when making policy for a model with many capabilities, it is likely a mistake to focus solely on narrow capabilities, or narrow use-cases for capabilities. A lot of bad encryption policy was made by focusing on the narrow use-case of bad guys encrypting their communications, without corresponding attention to the benefits of secure private communication to beneficial economic and social activity.
See Seger et al., supra note 6, at 25.
What about straightforward negative externalities and arms races? These can imply that it would be better to coordinate to restrict a technology even if it would be individually advantageous to use it freely.
(They don't seem to be directly addressed in your article.)
As far as negative externalities, consider a defense dominant technology which also emits lead fumes. You'd potentially prefer to coordinate so that no one uses this technology.
As far as arms races, imagine a new technology which has military value, but no direct economic usefulness. Further suppose the technology only interacts with itself and undefended targets (e.g. it can defend against other applications of the technology, it can attack undefended targets, and it can (with sufficient resources) take down defenses). In this case the only use of the technology for a defender is defending against an attacker, so we'd prefer to coordinate to avoid proliferation.
In the case of AI, my main concern is that the potential for rogue/powerseeking AI poses a negative externality from the perspective of humanity.