Chatbots are not the only AI models to have advanced in recent years. Specialized models trained on biological data have similarly leapt forward, and could help to accelerate vaccine development, cure diseases, and engineer drought-resistant crops. But the same qualities that make these models beneficial introduce potential dangers. For a model to be able to design a vaccine that is safe, for instance, it must first know what is harmful.
That is why experts are calling for governments to introduce mandatory oversight and guardrails for advanced biological models in a new policy paper published Aug. 22 in the peer-reviewed journal Science. While today’s AI models probably do not “substantially contribute” to biological risk, the authors write, future systems could help to engineer new pandemic-capable pathogens.
“The essential ingredients to create highly concerning advanced biological models may already exist or soon will,” write the authors, who are public health and legal professionals from Stanford School of Medicine, Fordham University, and the Johns Hopkins Center for Health Security. “Establishment of effective governance systems now is warranted.”
“We need to plan now,” says Anita Cicero, deputy director at the Johns Hopkins Center for Health Security and a co-author of the paper. “Some structured government oversight and requirements will be necessary in order to reduce risks of especially powerful tools in the future.”
Humans have a long history of weaponizing biological agents. In the 14th century, Mongol forces are thought to have catapulted plague-infested corpses over enemy walls, potentially contributing to the spread of the Black Death in Europe. During the Second World War, several major powers experimented with biological weapons such as plague and typhoid, which Japan used on several Chinese cities. And at the height of the Cold War, both the U.S. and the Soviet Union ran expansive biological weapons programs. But in 1972, both sides, along with much of the rest of the world, agreed to dismantle such programs and ban biological weapons, resulting in the Biological Weapons Convention.
This international treaty, while largely considered effective, did not fully dispel the threat of biological weapons. As recently as the early 1990s, the Japanese cult Aum Shinrikyo repeatedly tried to develop and release bioweapons such as anthrax. These efforts failed because the group lacked technical expertise. But experts warn that future AI systems could compensate for this gap. “As these models get more powerful, it will lower the level of sophistication a malicious actor would need in order to do harm,” Cicero says.
Not all pathogens that have been weaponized can spread from person to person, and those that can tend to become less lethal as they become more contagious. But AI might be able to "figure out how a pathogen could maintain its transmissibility while retaining its fitness," Cicero says. A terror group or other malicious actor is not the only concern. Even a well-intentioned researcher, without the right protocols in place, could accidentally develop a pathogen that gets "released and then spreads uncontrollably," says Cicero. Bioterrorism continues to attract global concern, including from the likes of Bill Gates and U.S. Commerce Secretary Gina Raimondo, who has been leading the Biden administration's approach to AI.
The gap between a virtual blueprint and a physical biological agent is surprisingly narrow. Many companies allow you to order biological material online, and while there are some measures to prevent the purchase of dangerous genetic sequences, they are applied unevenly both within the U.S. and abroad, making them easy to circumvent. "There's a lot of little holes in the dam, with water spurting out," Cicero explains. She and her co-authors encourage mandatory screening requirements, but note that even these would be insufficient to fully guard against the risks of biological AI models.
To date, 175 people—including researchers, academics, and industry professionals from Harvard, Moderna, and Microsoft—have signed a set of voluntary commitments contained in the Responsible AI x Biodesign community statement, published earlier this year. Cicero, who is one of the signatories, says she and her co-authors agree that while these commitments are important, they are insufficient to protect against the risks. The paper notes that we do not rely on voluntary commitments alone in other high-risk biological domains, such as laboratory research involving live Ebola virus.
The authors recommend governments work with experts in machine learning, infectious disease, and ethics to devise a “battery of tests” that biological AI models must undergo before they are released to the public, with a focus on whether they could pose “pandemic-level risks.”
Cicero explains, "there needs to be some kind of floor. At the very minimum, the risk-benefit evaluations and the pre-release reviews of biological design tools and highly capable large language models would include an evaluation of whether those models could lead to pandemic-level risks, in addition to other things."
Because testing for such abilities in an AI system can be risky in itself, the authors recommend creating proxy assessments—for example, whether an AI can design a new benign pathogen as a proxy for its ability to design a deadly one. On the basis of these tests, officials can decide whether access to a model should be restricted, and to what extent. Oversight policies will also need to address the fact that open-source systems can be modified after release, potentially becoming more dangerous in the process.
The authors also recommend that the U.S. create a set of standards to guide the responsible sharing of large-scale datasets on "pathogenic characteristics of concern," and that a federal agency be empowered to work with the recently created U.S. AI Safety Institute. The U.K. AI Safety Institute, which works closely with its U.S. counterpart, has already conducted safety testing, including for biological risks, on leading AI models; however, this testing has largely focused on assessing the capabilities of general-purpose large language models rather than biology-specific systems.
“The last thing we want to do is cut the industry off at the knees and hobble our progress,” Cicero says. “It’s a balancing act.” To avoid hampering research through over-regulation, the authors recommend regulators initially focus only on two kinds of models: those trained with very large amounts of computing power on biological data, and models of any size trained on especially sensitive biological data that is not widely accessible, such as new information that links viral genetic sequences to their potential for causing pandemics.
Over time, the scope of concerning models may widen, particularly if future AIs are capable of doing research autonomously, Cicero says. Imagine “100 million Chief Science Officers of Pfizer working round the clock at 100 times the speed of the real one,” says Cicero, pointing out that while this could lead to incredible breakthroughs in drug design and discovery, it would also greatly increase risk.
The paper emphasizes the need for international collaboration to manage these risks, particularly given that they endanger the entire globe. Even so, the authors note that while harmonizing policies would be ideal, “countries with the most advanced AI technology should prioritize effective evaluations, even if they come at some cost to international uniformity.”
Due to predicted advances in AI capabilities and the relative ease of both procuring biological material and hiring third parties to perform experiments remotely, Cicero thinks that biological risks from AI could manifest "within the next 20 years, and maybe even much less," unless there is proper oversight. "We need to be thinking not just of the current version of all of the available tools, but the next versions, because of the exponential growth that we see. These tools are going to be getting more powerful," she says.