During a research expedition in Hawaii back in 2018, Yuening Zhang SM ’19, PhD ’24 quickly recognized the challenges of coordinating a ship effectively. The urgent need to map underwater landscapes turned the mission into a pressure-filled experience, as team members often had varying interpretations of their responsibilities amidst dynamic conditions. It was during this trip that Zhang envisioned how an AI companion could have streamlined their work and bolstered cooperation among her crew.
Fast forward six years, and as a research assistant at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), Zhang has acted on that vision. Collaborating with her colleagues, she has developed an innovative AI assistant designed to enhance communication among team members, ensuring everyone is on the same page and working toward a common objective. Their findings were showcased at the International Conference on Robotics and Automation (ICRA) and subsequently published on IEEE Xplore on August 8. This system oversees a blend of human and AI agents, stepping in to improve teamwork effectiveness in critical fields such as search-and-rescue operations, surgical procedures, and competitive gaming.
The CSAIL team has pioneered a “theory of mind” model for AI agents, allowing them to grasp how humans perceive each other’s potential actions during collaborative tasks. By analyzing the behaviors of various agents, the AI team coordinator can predict their intentions and gauge their understanding of one another based on previous beliefs. When discrepancies arise, the assistant intervenes, harmonizing their understanding and guiding their actions while asking clarifying questions as necessary.
Take a search-and-rescue operation, for instance: teams must make split-second decisions while triaging victims based on their perspectives of one another’s roles and progress. This type of planning—termed epistemic planning—can be significantly enhanced by CSAIL’s technology, which communicates agents’ intentions to help prevent redundancy and ensure comprehensive coverage. For example, the AI assistant might inform the team that an agent has already entered a specific area or highlight a location that hasn’t yet been searched.
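To make the search-and-rescue example concrete, here is a minimal Python sketch of the kind of message such a coordinator might produce. It is an illustration only, not the CSAIL system: the `Agent` class, the `coordinator_messages` function, and the area labels are all hypothetical, and the sketch assumes the coordinator already knows who has searched which area.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical team member: where they plan to search and what they think is done."""
    name: str
    intended_area: str
    believed_searched: set = field(default_factory=set)


def coordinator_messages(agents, searched_by, all_areas):
    """Return short messages aimed at avoiding redundant or missed searches.

    searched_by maps an area to the name of the agent who already searched it.
    """
    messages = []
    for agent in agents:
        target = agent.intended_area
        searcher = searched_by.get(target)
        # Redundancy: the agent plans to search an area someone else has
        # already covered, and the agent does not appear to know that.
        if searcher and searcher != agent.name and target not in agent.believed_searched:
            messages.append(f"{agent.name}: {searcher} has already searched area {target}.")
    # Coverage gap: an area nobody has searched and nobody plans to search.
    planned = {agent.intended_area for agent in agents}
    for area in sorted(all_areas):
        if area not in searched_by and area not in planned:
            messages.append(f"Team: area {area} has not been searched yet.")
    return messages


# Toy scenario (made-up data): the scout has searched area A, the medic does not know.
team = [
    Agent("medic", intended_area="A"),
    Agent("scout", intended_area="B", believed_searched={"A"}),
]
messages = coordinator_messages(team, searched_by={"A": "scout"}, all_areas={"A", "B", "C"})
for message in messages:
    print(message)
```

In this toy scenario the medic is told that area A is already covered, and the team is reminded that area C has neither been searched nor claimed, mirroring the two kinds of updates described above.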
“Our research acknowledges the complex belief structures in team dynamics,” emphasizes Zhang, who now serves as a research scientist at Mobi Systems. “Imagine being part of a team and wondering what each member is doing or how they know what you’re planning. We model how individuals perceive the larger framework of the team’s goals and how they communicate their tasks to achieve collective success.”
AI: The Game-Changer
In high-stakes scenarios, clarity regarding roles is crucial for both human and robotic agents to minimize confusion and error. This is particularly critical during search-and-rescue missions, where locating an endangered individual requires swift action over expansive territories. With the support of the newly developed AI assistant, communication among search teams could improve significantly, allowing them to share updates about their search areas and strategies more effectively.
Surgery presents another vital application. In this setting, coordination must be flawless; for example, a nurse brings a patient to the operating room, an anesthesiologist administers medication, and surgeons work in tandem while constantly monitoring the patient. The team's AI coordinator is designed to oversee these interactions and step in to clarify roles and tasks when confusion arises.
Even in gaming, such as the popular title “Valorant,” where teamwork is essential for navigating attacks and defenses online, an AI assistant could alert players to potential misunderstandings regarding their objectives.
Prior to her contributions to this AI model, Zhang developed EPike, a computational framework that allows an AI to function as a team participant. In a 3D simulation, this algorithm directed a robotic agent that needed to match a drink container to a human's choice. Capable as it was, the agent sometimes ran into trouble when it failed to grasp its human counterpart's intentions. The new AI coordinator can proactively correct such misunderstandings among agents so that the task is completed correctly. It conveyed the human's actual intentions to the robot, leading to successful outcomes.
“In our exploration of human-robot collaboration, we’re continually inspired by the adaptability of human partners,” states Brian C. Williams, MIT professor of aeronautics and astronautics and senior author of the study. “Consider a couple coordinating their morning routine with children: one parent can sense the need to quickly adjust their actions based on what the other is doing, all without explicit communication. Our research on epistemic planning seeks to replicate this fluidity in human interaction.”
The team's method uses probabilistic reasoning and recursive mental modeling to enable the AI assistant to make informed decisions under uncertainty. It focuses on understanding the agents' actions and plans, complementing previous efforts to model beliefs about the environment or current tasks. The AI assistant currently infers beliefs from established past scenarios, but the MIT team aims to integrate machine learning techniques to generate new hypotheses in real time. They also aim to streamline rich plan representations and to keep reducing computation costs.
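As a rough illustration of probabilistic reasoning about what one teammate believes another will do, the Python sketch below applies Bayes' rule to a fixed set of candidate beliefs. It is a simplified, assumed setup rather than the team's actual algorithm: the hypothesis names, the probabilities, the 0.5 threshold, and the functions `posterior` and `should_intervene` are all invented for the example.

```python
def posterior(prior, likelihood, observed_action):
    """Bayes' rule over candidate beliefs, given one observed action."""
    unnormalized = {
        hypothesis: p * likelihood[hypothesis].get(observed_action, 0.0)
        for hypothesis, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed action is impossible under every hypothesis")
    return {hypothesis: value / total for hypothesis, value in unnormalized.items()}


def should_intervene(belief_posterior, true_plan, threshold=0.5):
    """Flag a likely misunderstanding: low probability on the teammate's real plan."""
    return belief_posterior.get(true_plan, 0.0) < threshold


# Hypothetical setup: what does agent B believe agent A plans to do?
prior = {"A_goes_north": 0.5, "A_goes_south": 0.5}

# Assumed behavior model: B tends to search wherever it thinks A is NOT going.
likelihood = {
    "A_goes_north": {"B_goes_north": 0.1, "B_goes_south": 0.9},
    "A_goes_south": {"B_goes_north": 0.9, "B_goes_south": 0.1},
}

# B is seen heading north, which suggests B thinks A is heading south.
belief = posterior(prior, likelihood, "B_goes_north")

# A actually plans to go north, so the two agents are about to overlap.
intervene = should_intervene(belief, true_plan="A_goes_north")
print(belief, intervene)
```

Here the candidate beliefs are supplied up front, which loosely parallels inferring beliefs from a given set of possibilities; generating new hypotheses on the fly would require replacing the fixed `prior` with something learned.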
Joining Zhang and Williams in this groundbreaking research are Dynamic Object Language Labs President Paul Robertson, Johns Hopkins University Assistant Professor Tianmin Shu, and former CSAIL affiliate Sungkweon Hong PhD ’23. Their project receives substantial backing from the U.S. Defense Advanced Research Projects Agency (DARPA) under the Artificial Social Intelligence for Successful Teams (ASIST) program.
Photo credit & article inspired by: Massachusetts Institute of Technology