AI and policy leaders debate web of effective altruism in AI security | The AI Beat


Last month, I reported on the widening web of connections between the effective altruism (EA) movement and AI security policy circles — from top AI startups like Anthropic to DC think tanks like RAND Corporation. These connections link EA, with its laser focus on preventing what its adherents say are catastrophic risks to humanity from future AGI, to a wide swath of DC think tanks, government agencies and congressional staff.

Critics of the EA focus on this existential risk, or ‘x-risk,’ say it comes at the expense of a necessary focus on current, measurable AI risks — including bias, misinformation, high-risk applications and traditional cybersecurity.

Since then, I’ve been curious about what other AI and policy leaders outside the effective altruism movement — but who are also not aligned with the polar opposite belief system, effective accelerationism (e/acc) — really think about this. Do other LLM companies feel equally concerned about the risk of LLM model weights getting into the wrong hands, for example? Do DC policymakers and watchers fully understand EA’s influence on AI security efforts?

At a moment when Anthropic, well known for its wide range of EA ties, is publishing new research about “sleeper agent” AI models that dupe safety checks meant to catch harmful behavior, and even Congress has expressed concerns about a potential AI research partnership between the National Institute of Standards and Technology (NIST) and RAND, this seems to me to be an important question.

In addition, EA made worldwide headlines most recently in connection with the firing of OpenAI CEO Sam Altman, as the nonprofit’s non-employee board members all had EA connections.

What I discovered in my latest interviews is an interesting mix: deep concern about EA’s billionaire-funded ideological bent and its growing reach and influence over the AI security debate in Washington DC, along with an acknowledgement by some that AI risks beyond the short term are an important part of the DC policy discussion.

The EA movement, which began as an effort to ‘do good better,’ is now heavily funded by tech billionaires who consider preventing an AI-related catastrophe their number one priority, particularly through funding AI security (also described as AI ‘safety’) efforts — especially in the biosecurity space.

In my December piece, I detailed the concerns of Anthropic CISO Jason Clinton and two researchers from RAND Corporation about the security of LLM model weights in the face of threats from opportunistic criminals, terrorist groups or highly resourced nation-state operations.

Clinton told me that securing the model weights for Claude, Anthropic’s LLM, is his number one priority. The threat of such actors getting hold of the weights of the most sophisticated and powerful LLMs is alarming, he explained, because “if an attacker got access to the entire file, that’s the entire neural network.”

RAND researcher Sella Nevo told me it is plausible that within two years AI models will have significant national security importance, such as the possibility that malicious actors could misuse them for biological weapon development.

All three, I discovered, have close ties to the EA community, and the two organizations are also interconnected thanks to EA — for example, Jason Matheny, RAND’s CEO, is also a member of Anthropic’s Long-Term Benefit Trust and has longtime ties to the EA movement.

My coverage was prompted by Brendan Bordelon’s ongoing Politico reporting on this issue, including a recent article which quoted an anonymous biosecurity researcher in Washington calling EA-linked funders “an epic infiltration” in policy circles. As Washington grapples with the rise of AI, Bordelon wrote, “a small army of adherents to ‘effective altruism’ has descended on the nation’s capital and is dominating how the White House, Congress and think tanks approach the technology.” 

Cohere pushes back on EA fears about LLM model weights

First, I turned to Nick Frosst, co-founder of Cohere, an OpenAI and Anthropic competitor which focuses on developing LLMs for the enterprise, for his take on these issues. He told me in a recent interview that he does not think large language models pose an existential threat, and that while Cohere protects its model weights, the company’s concern is the business risk associated with others getting access to the weights, not an existential one. 

“I do want to make the distinction…I’m talking about large language models,” he said. “There’s lots of interesting things you could talk about that are philosophical, like I think one day we might have true artificial general intelligence. I don’t think it’s happening soon.” 

Cohere has also criticized the effective altruism movement in the past. For example, CEO Aidan Gomez reportedly criticized the “self righteousness” of the effective altruism movement and those overly concerned with the threat of an AI doomsday in a letter to his staff. 

Frosst said that EA “doesn’t seem to exist much beyond its AI focus these days” and pushed back on its belief system. “If you find yourself in a philosophical worldview that ultimately provides moral justification, indeed, moral righteousness, for the massive accumulation of personal wealth, you should probably question that worldview,” he said.

A big flaw in effective altruism, he continued, is to “assume that you can look at the good you’re doing and assign a number and know exactly how effective this is. It ends up in weird places like, hey, we should make as much money as possible. And we should put it all [towards combating] the existential risk of AI.”  

AI21 Labs co-founder says model weights are not ‘key enabler’ of bad actors

Meanwhile, Yoav Shoham, co-founder of another Anthropic and OpenAI competitor, the Tel Aviv-based AI21 Labs, also said his company has kept its model weights secret for trade-secret reasons.

“We are very sensitive to potential abuse of technology,” he said. “That said, we tend to think that model weights aren’t necessarily the key enabler of bad actors.” 

He pointed out that in an era of a geopolitical AI race, “only certain aspects can be dealt with via policy.” Instead, he explained, “we’re doing our bit with strict terms of use, focus on task-specific models which by their very nature are less prone to abuse, and close collaboration with our enterprise customers, who share our commitment to beneficial uses of AI.” 

Shoham emphasized that he and AI21 are not members of the EA movement. “As outsiders, we see there is a combination of thoughtful attention to responsible use of AI, [along] with less grounded fear-mongering.”  

RAND researcher says EA beliefs ‘not particularly helpful’

While RAND Corporation has been in the crosshairs of criticism over its EA connections, there are also researchers at RAND pushing back. 

Marek Posard, a RAND researcher and military sociologist, spoke out last month on the RAND blog about how AI philosophical debates like effective altruism and e/acc are a ‘distraction’ for AI policy. 

“This is a new technology and so there’s a lot of unknowns,” he told me in a recent interview. “There’s a lot of hype. There’s a lot of bullshit. I would argue there’s a lot of real, very real concerns in flux. There’s all of these beliefs and ideologies, philosophies, theories that are floating around that, I think, essentially, people are latching on to.”

But neither EA nor e/acc is “particularly helpful,” he added. “They’re also assumptions of what a small group thinks the world is. The reality is we know there are very real problems today.”

Still, Posard did not suggest that EA voices are unwelcome at RAND. In fact, he maintained that RAND promotes diversity of thought, which he said is the “secret sauce” of the nonprofit global policy think tank.

“It’s about diversity of thought, of people’s backgrounds, disciplines and experiences,” he said. “I invite anyone to try to push an ideological agenda — because it is not set up to do that.”

Traditional cybersecurity is focused on present-day risks

While many (including myself) may conflate AI security and traditional cybersecurity — and their techniques do overlap, as RAND’s recent report on securing LLM model weights makes clear — I wonder whether the traditional cybersecurity community is fully aware of the EA phenomenon and its impact on AI security policy, especially since the industry tends to focus on present-day risks as opposed to existential ones. 

For example, I spoke to Dan deBeaubien, who leads AI research and chairs both the AI policy and product working groups at the SANS Institute, a Rockville, MD-based company specializing in cybersecurity training and certification. While he knew of the EA movement and said that “it’s definitely a force that’s out there,” deBeaubien didn’t seem to be fully aware of the extent of effective altruism’s focus on the existential catastrophic risks of AI — and saw it more as an ethical AI organization. 

“We don’t have a lot of effective altruism conversations per se,” he said, pointing out that he was more concerned about understanding the current security risks related to people’s usage of LLM chatbots within organizations. “Do I lie awake worrying that somebody is going to pull a lever and AI is going to take over — I guess I don’t really think much about that.”

Some experts seem to be coexisting with EA concerns

Other DC-focused policy experts, however, seemed well aware of the EA influence on AI security, but appeared focused on coexisting with the movement rather than speaking out strongly on the record.

For example, I spoke to Mark Beall, former head of AI policy at the U.S. Department of Defense, who is now the co-founder and CEO at Gladstone AI, which offers AI education and AI test and evaluation solutions to government and industry entities. He emphasized that Gladstone has not accepted any venture capital or philanthropic funding. 

Beall said that the risks of AI are clear — so the conventional tech approach of ‘move fast and break things’ is reckless. Instead, DC requires common sense safeguards, driven by technical realities, that bridge the policy-tech divide, he explained. 

“I helped set up the Joint AI Center at the Pentagon, and the fact is, many of those charged with safeguarding American interests have been working on AI long before self-promoted ‘effective altruists’ stumbled into Washington policymaking circles,” he said. “At DoD, we established responsible AI policy and invested heavily in AI safety. Our mission has always been to accelerate responsibly. And those on the fringes who think that US officials haven’t been independently tracking AI risks — or that they are somehow being duped — are wrong.”

‘Ungoverned AI’ was named a top geopolitical risk

I also reached out to Ian Bremmer, president and founder of Eurasia Group, which last week published its list of the top geopolitical risks of 2024 — with ‘ungoverned AI’ in the number four spot. 

Bremmer focused squarely on present-day risks like election disinformation: “GPT-5 is going to come out ahead of the US elections, and will be so powerful it will make GPT-4 look like a toy in comparison,” he predicted. “Not even its creators truly understand its full potential or capabilities.”

That said, he maintained there is a “legitimate debate” about the value of open vs closed source, and the importance of securing model weights. “I think it would be wrong to assume, as many do, that the push to secure model weights is motivated purely by cynical business calculations,” he said.

However, if effective altruism’s focus is really altruism, Bremmer added that “we need to make sure that AI isn’t aligning with business models that undermine civil society — that means testing models not just for misuse but also to see how normal expected use impacts social behavior (and the development of children—a particular concern).” Bremmer added that he has “seen very little of that from the EA movement to date.” 

The problem with EA, he concluded, is that “when you start talking about the end of the world as a realistic possibility—logically every other kind of risk pales into insignificance.”



