A group of researchers at Stanford University and Google has created a miniature RPG-style virtual world similar to The Sims, where 25 characters, controlled by ChatGPT and custom code, live out their lives independently and behave with a high degree of realism. They wrote about their experiment in a preprint academic paper released on Friday.
“Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day,” write the researchers in their paper, “Generative Agents: Interactive Simulacra of Human Behavior.”
To pull this off, the researchers relied heavily on a large language model (LLM) for social interaction, specifically the ChatGPT API. In addition, they created an architecture that simulates minds with memories and experiences, then let the agents loose in the world to interact. And humans can interact with them, too.
“Users can observe and intervene as agents plan their days, share news, form relationships, and coordinate group activities,” they write. It’s the work of Joon Sung Park, Joseph C. O’Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein.
Computer and video games have included computer-controlled characters since the 1970s, but never before have those characters been able to simulate a social environment with the natural-language complexity that generative AI models like ChatGPT may now make possible. While the group’s research is not necessarily a “game,” it could be a prototype of a future where dynamic RPG characters interact in complex and unexpected ways.
“Imagine killing an NPC and coming back to the city and seeing a funeral for them,” joked a Twitter user named Dennis Hansen when replying to a thread about the paper’s emergent implications. Judging by this research, that might not be a far-fetched scenario.
Life in Smallville
To study the group of AI agents, the researchers set up a virtual town called “Smallville,” which includes houses, a cafe, a park, and a grocery store. For human interaction purposes, the world is represented on screen from an overhead view using retro-style pixel graphics reminiscent of a classic 16-bit Japanese RPG.
Smallville is home to a community of 25 distinct individuals, each represented by a basic sprite avatar. To capture each agent’s identity and their connections with other members of the community, the researchers created a paragraph of natural language description as a seed memory. These descriptions include details about each agent’s occupation and relationships with other agents. For example, here’s an excerpt of one such seed memory provided in the paper:
John Lin is a pharmacy shopkeeper at the Willow Market and Pharmacy who loves to help people. He is always looking for ways to make the process of getting medication easier for his customers. John Lin is living with his wife, Mei Lin, who is a college professor, and son, Eddy Lin, who is a student studying music theory. John Lin loves his family very much.
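To make the idea of a seed memory more concrete, here is a minimal Python sketch of how a description like the one above could be turned into an agent's starting memories. This is an illustration only, not the researchers' code: the `Agent` class, the `seed_agent` function, and the choice to split the paragraph into one memory per sentence are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent record: a name plus a list of memory strings."""
    name: str
    memories: list[str] = field(default_factory=list)


def seed_agent(name: str, description: str) -> Agent:
    """Create an agent whose first memories come from a seed description.

    Assumption: each sentence of the paragraph becomes one memory.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", description.strip())
    sentences = [p.strip() for p in parts if p.strip()]
    return Agent(name=name, memories=sentences)


SEED = (
    "John Lin is a pharmacy shopkeeper at the Willow Market and Pharmacy "
    "who loves to help people. He is always looking for ways to make the "
    "process of getting medication easier for his customers. John Lin is "
    "living with his wife, Mei Lin, who is a college professor, and son, "
    "Eddy Lin, who is a student studying music theory. John Lin loves his "
    "family very much."
)

john = seed_agent("John Lin", SEED)
for memory in john.memories:
    print("-", memory)
```

In this sketch, the excerpt yields four separate memories, each of which could later be passed to a language model as context when the agent decides what to do.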
As a virtual environment, Smallville is broken into both areas and objects. Human users can enter the world as an existing or new agent, and both users and agents can influence the state of objects through actions. Human users can also interact with AI agents through conversation or by issuing directives as an “inner voice.” In conversation, users write in natural language and specify a persona that the agent perceives them as; as an inner voice, they can steer the agent’s actions directly.
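The split between areas and objects, with actions changing an object's state, can be pictured with a small Python sketch. Everything here is hypothetical and written for illustration: the `Area` and `WorldObject` classes, the `act` function, and the coffee machine example are assumptions, not details of the researchers' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class WorldObject:
    """Hypothetical object in the world with a free-text state."""
    name: str
    state: str


@dataclass
class Area:
    """Hypothetical area (such as a cafe) that contains objects."""
    name: str
    objects: dict[str, WorldObject] = field(default_factory=dict)

    def add(self, obj: WorldObject) -> None:
        self.objects[obj.name] = obj


def act(area: Area, object_name: str, new_state: str, actor: str) -> str:
    """Apply an action by a user or an agent and describe what changed."""
    obj = area.objects[object_name]
    old_state = obj.state
    obj.state = new_state
    return (
        f"{actor} changed the {obj.name} in the {area.name} "
        f"from '{old_state}' to '{new_state}'"
    )


cafe = Area("cafe")
cafe.add(WorldObject("coffee machine", "idle"))

# A human user and an AI agent go through the same function,
# since both can influence object state.
event = act(cafe, "coffee machine", "brewing coffee", actor="John Lin")
print(event)
```

Keeping state as plain text, as this sketch does, would make it easy to describe the world to a language model, though the actual system may represent it differently.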