A.I.
Penn Engineers recreate Star Trek’s Holodeck using ChatGPT and video game assets
UNIVERSITY OF PENNSYLVANIA SCHOOL OF ENGINEERING AND APPLIED SCIENCE
In Star Trek: The Next Generation, Captain Picard and the crew of the U.S.S. Enterprise leverage the holodeck, an empty room capable of generating 3D environments, to prepare for missions and to entertain themselves, simulating everything from lush jungles to the London of Sherlock Holmes. Deeply immersive and fully interactive, holodeck-created environments are infinitely customizable, using nothing but language: the crew has only to ask the computer to generate an environment, and that space appears in the holodeck.
Today, virtual interactive environments are also used to train robots prior to real-world deployment in a process called “Sim2Real.” However, virtual interactive environments have been in surprisingly short supply. “Artists manually create these environments,” says Yue Yang, a doctoral student in the labs of Mark Yatskar and Chris Callison-Burch, Assistant and Associate Professors in Computer and Information Science (CIS), respectively. “Those artists could spend a week building a single environment,” Yang adds, noting all the decisions involved, from the layout of the space to the placement of objects to the colors employed in rendering.
That paucity of virtual environments is a problem if you want to train robots to navigate the real world with all its complexities. Neural networks, the systems powering today’s AI revolution, require massive amounts of data, which in this case means simulations of the physical world. “Generative AI systems like ChatGPT are trained on trillions of words, and image generators like Midjourney and DALL-E are trained on billions of images,” says Callison-Burch. “We only have a fraction of that amount of 3D environments for training so-called ‘embodied AI.’ If we want to use generative AI techniques to develop robots that can safely navigate in real-world environments, then we will need to create millions or billions of simulated environments.”
Enter Holodeck, a system for generating interactive 3D environments co-created by Callison-Burch, Yatskar, Yang and Lingjie Liu, Aravind K. Joshi Assistant Professor in CIS, along with collaborators at Stanford, the University of Washington, and the Allen Institute for Artificial Intelligence (AI2). Named for its Star Trek forebear, Holodeck generates a virtually limitless range of indoor environments, using AI to interpret users’ requests. “We can use language to control it,” says Yang. “You can easily describe whatever environments you want and train the embodied AI agents.”
Holodeck leverages the knowledge embedded in large language models (LLMs), the systems underlying ChatGPT and other chatbots. “Language is a very concise representation of the entire world,” says Yang. Indeed, LLMs turn out to have a surprisingly high degree of knowledge about the design of spaces, thanks to the vast amounts of text they ingest during training. In essence, Holodeck works by engaging an LLM in conversation, using a carefully structured series of hidden queries to break down user requests into specific parameters.
Just like Captain Picard might ask Star Trek’s Holodeck to simulate a speakeasy, researchers can ask Penn’s Holodeck to create “a 1b1b apartment of a researcher who has a cat.” The system executes this query by dividing it into multiple steps: first, the floor and walls are created, then the doorway and windows. Next, Holodeck searches Objaverse, a vast library of premade digital objects, for the sort of furnishings you might expect in such a space: a coffee table, a cat tower, and so on. Finally, Holodeck queries a layout module, which the researchers designed to constrain the placement of objects, so that you don’t wind up with a toilet extending horizontally from the wall.
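The staged pipeline described above can be sketched in miniature. Everything below is illustrative, not Holodeck's actual code: the tiny catalog stands in for Objaverse, the keyword match stands in for the hidden LLM queries, and the `place` function stands in for the researchers' layout module that constrains where objects may go.

```python
from dataclasses import dataclass, field

@dataclass
class Scene:
    width: float
    depth: float
    objects: list = field(default_factory=list)  # (name, x, y) triples

def propose_objects(description: str, catalog: dict) -> list[str]:
    """Stand-in for the LLM + Objaverse search: pick assets whose
    keywords appear in the user's description."""
    return [name for name, keywords in catalog.items()
            if any(k in description.lower() for k in keywords)]

def place(scene: Scene, name: str, x: float, y: float) -> bool:
    """Stand-in for the layout module: accept a placement only if it
    satisfies simple constraints (inside the room, not overlapping)."""
    if not (0 <= x <= scene.width and 0 <= y <= scene.depth):
        return False
    if any(abs(x - ox) < 0.5 and abs(y - oy) < 0.5 for _, ox, oy in scene.objects):
        return False
    scene.objects.append((name, x, y))
    return True

# A tiny mock catalog standing in for Objaverse's millions of assets.
CATALOG = {
    "cat tower": ["cat"],
    "desk": ["researcher", "office"],
    "coffee table": ["apartment"],
}

scene = Scene(width=4.0, depth=3.0)
wanted = propose_objects("a 1b1b apartment of a researcher who has a cat", CATALOG)
for i, name in enumerate(wanted):
    place(scene, name, x=1.0 + i, y=1.0)

print(sorted(n for n, _, _ in scene.objects))
```

The key design idea the sketch preserves is the separation of concerns: object *selection* is driven by language, while object *placement* is validated against hard constraints, which is how Holodeck avoids toilets extending horizontally from walls.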
To evaluate the realism and accuracy of Holodeck’s output, the researchers generated 120 scenes using both Holodeck and ProcTHOR, an earlier tool created by AI2, and asked several hundred Penn Engineering students to indicate their preferred version without knowing which scenes were created by which tool. For every criterion — asset selection, layout coherence and overall preference — the students consistently rated the environments generated by Holodeck more favorably.
The researchers also tested Holodeck’s ability to generate scenes that are less typical in robotics research and more difficult to manually create than apartment interiors, like stores, public spaces and offices. Comparing Holodeck’s outputs to those of ProcTHOR, which were generated using human-created rules rather than AI-generated text, the researchers found once again that human evaluators preferred the scenes created by Holodeck. That preference held across a wide range of indoor environments, from science labs to art studios, locker rooms to wine cellars.
Finally, the researchers used scenes generated by Holodeck to “fine-tune” an embodied AI agent. “The ultimate test of Holodeck,” says Yatskar, “is using it to help robots interact with their environment more safely by preparing them to inhabit places they’ve never been before.”
Across multiple types of virtual spaces, including offices, daycares, gyms and arcades, Holodeck had a pronounced and positive effect on the agent’s ability to navigate new spaces.
For instance, whereas the agent successfully found a piano in a music room only about 6% of the time when pre-trained using ProcTHOR (which involved the agent taking about 400 million virtual steps), the agent succeeded over 30% of the time when fine-tuned using 100 music rooms generated by Holodeck.
“This field has been stuck doing research in residential spaces for a long time,” says Yang. “But there are so many diverse environments out there — efficiently generating a lot of environments to train robots has always been a big challenge, but Holodeck provides this functionality.”
In June, the researchers will present Holodeck at the 2024 Institute of Electrical and Electronics Engineers (IEEE) and Computer Vision Foundation (CVF) Computer Vision and Pattern Recognition (CVPR) Conference in Seattle, Washington.
This study was conducted at the University of Pennsylvania School of Engineering and Applied Science and at the Allen Institute for Artificial Intelligence (AI2).
Additional co-authors include Fan-Yun Sun, Jiajun Wu, and Nick Haber at Stanford; Ranjay Krishna at the University of Washington; Luca Weihs, Eli Vanderbilt, Alvaro Herrasti, Winson Han, Aniruddha Kembhavi, and Christopher Clark at AI2.
Enhanced AI tool TeXGPT powers up academic writing
Digital Science announces an update to Writefull for Overleaf, which uses AI to help academic authors write better, faster and with more confidence in LaTeX
DIGITAL SCIENCE
Digital Science is further broadening its range of AI innovations in a major new release from AI-based academic language service Writefull, which is to be used in the collaborative authoring tool Overleaf.
The enhanced feature in Writefull for Overleaf is a new AI ‘context menu’ user interface, integrated directly in Overleaf, with two key updates. First, it makes TeXGPT more accessible: users can now activate it simply by pressing ‘space’ on a new line or by clicking the popup menu when text is highlighted. Second, it adds new AI options that work on any selected text, offering to paraphrase, change style, split or join sentences, or even summarize or explain whole paragraphs. Combined, these enhancements make communicating research clearly much simpler and more effective for authors.
These updates to Writefull for Overleaf build on the initial release of TeXGPT in early 2023, which helps with most aspects of creating documents in Overleaf, such as generating LaTeX code for formatting tables, figures, and formulas. TeXGPT makes the whole process more efficient, working in a similar way to the now-familiar prompt-and-response interactions of ChatGPT.
This updated Writefull feature is the latest AI-powered release from Digital Science, embedding AI into Digital Science products to support existing and new use cases like (among others):
- Dimensions Research GPT – providing users with AI-generated answers to research-related questions on the GPT platform, informed by Dimensions’ huge database, making ChatGPT more research-specific for topic exploration
- AI-driven summarization in the Dimensions web application – enabling all users to accelerate the identification of the most relevant content for their research questions, with short, concise summaries available for every record in a given search result list with just one click
- The Papers AI Assistant – providing Papers users the ability to use AI to chat with their publications and documents.
Digital Science’s responsible development of AI tools is designed to harness the power of AI for researchers.
Commenting on the update, Digital Science Chief Product Officer Christian Herzog says: “This update represents how Digital Science is innovating on several fronts with AI, moving forward both responsibly and at pace to deliver genuinely useful technologies to users across the research ecosystem. Digital Science is committed to building technologies that empower researchers at every step of their journey, delivering fast, efficient ways to harness the power of AI to solve problems and accelerate progress. Each development we release from Digital Science brings that vision closer.”
About Digital Science
Digital Science is an AI-focused technology company providing innovative solutions to complex challenges faced by researchers, universities, funders, industry and publishers. We work in partnership to advance global research for the benefit of society. Through our brands – Altmetric, Dimensions, Figshare, ReadCube, Symplectic, IFI CLAIMS Patent Services, Overleaf, Writefull, OntoChem, Scismic and metaphacts – we believe when we solve problems together, we drive progress for all. Visit www.digital-science.com and follow @digitalsci on Twitter/X or on LinkedIn.
About Writefull
Writefull offers services to help researchers with their English writing and to help publishers improve efficiencies across their submission, copy editing, and quality control workflows. Writefull’s AI language models are trained via machine learning on billions of journal article sentences; its Deep Learning algorithms give researchers feedback on their full texts, helping them to convey research in a more precise and meaningful way. Visit writefull.com and follow @Writefullapp on X.
About Overleaf
Overleaf is the market-leading scientific and technical writing platform – it’s a LaTeX editor that’s easy enough for beginners and powerful enough for experts. Loved by over 16 million users, it’s trusted by top research institutions and Fortune 500 companies around the world. Users can collaborate easily with colleagues, track changes in real-time, write in LaTeX code or a visual editor, and work in the cloud or on-premises. With Overleaf, anyone can write smarter – creating complex, beautifully formatted documents with ease.
Developing best practices for human-AI collaboration in engineering design
Lehigh University researcher A. Emrah Bayrak earns NSF CAREER award for project that could help determine how the engineering workforce can most effectively team up with artificial intelligence on complex design tasks
LEHIGH UNIVERSITY
As artificial intelligence is inevitably woven into the workplace, teams of humans will increasingly collaborate with robots on complex design problems, such as those in the auto, aviation, and space industries.
“Right now, design is mainly done by humans, and it’s based on their expertise and intuitive decision-making, which is learned over time,” says A. Emrah Bayrak, an assistant professor of mechanical engineering and mechanics in Lehigh University’s P.C. Rossin College of Engineering and Applied Science. “Usually, you’re not creating something totally new. You take something that works already, understand how it works, and make incremental changes. But introducing AI could make the process a lot faster—and potentially more innovative.”
However, best practices for integrating AI in a way that maximizes both productivity and the job satisfaction of human workers remain unclear. Bayrak recently won support from the National Science Foundation’s Faculty Early Career Development (CAREER) program for his proposal to allocate portions of complex design problems to human and AI teams based on their capabilities and preferences.
The prestigious NSF CAREER award is given annually to junior faculty members across the U.S. who exemplify the role of teacher-scholars through outstanding research, excellent education, and the integration of education and research. Each award provides stable support at the level of approximately $500,000 for a five-year period.
Bayrak will explore the problem of dividing a complex task between human designers and AI from both a computational and experimental perspective. For the former, he’ll use models that predict how a rational human being would explore the design of, say, the powertrain in an electric vehicle.
“We know that decision-making is a sequential process,” he says. “People will make a decision, look at the outcome, and revise their next decision accordingly. In order to maximize the range of an EV, when humans consider the design of the powertrain, they have to make decisions about gear ratios, motor size, and battery size. These are all mathematical variables that we can feed into a model to predict what the next decision should be if a human is a rational person.”
AI, in contrast, makes decisions based on training data. Feed it data on good decisions regarding gears, motors, and batteries, and it can then estimate possible vehicle designs that will yield an acceptable range. AI could also use that knowledge to think about what the next design decision should be.
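The sequential decision process described above can be illustrated with a toy model. This is not Bayrak's actual model: the objective function, the variable ranges, and the greedy one-variable-at-a-time strategy are all assumptions chosen to show how gear ratio, motor size, and battery size become mathematical variables whose next value can be predicted for a rational designer.

```python
def ev_range(gear_ratio: float, motor_kw: float, battery_kwh: float) -> float:
    """Illustrative stand-in objective: range grows with battery size but
    is penalized by motor mass and by mismatch from an ideal gear ratio."""
    return 6.0 * battery_kwh - 0.3 * motor_kw - 25.0 * abs(gear_ratio - 8.0)

# Candidate values the designer may pick for each decision variable.
choices = {
    "gear_ratio": [6.0, 8.0, 10.0],
    "motor_kw": [100.0, 150.0, 200.0],
    "battery_kwh": [60.0, 75.0, 90.0],
}

# A "rational" designer decides sequentially: each decision greedily
# maximizes the objective given the decisions already made.
design = {"gear_ratio": 8.0, "motor_kw": 150.0, "battery_kwh": 75.0}  # defaults
for var, options in choices.items():
    design[var] = max(options, key=lambda v: ev_range(**{**design, var: v}))

print(design, round(ev_range(**design), 1))
```

Under these assumptions the model predicts the designer settles on the ideal gear ratio, the smallest motor, and the largest battery; richer models of the kind the article describes would replace the hand-written objective with learned predictions and add archetypes such as trust in AI.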
Bayrak’s model will also contain different human archetypes. For example, a person who trusts AI completely versus one who does not, and those who hover somewhere in the middle. The model will combine the mathematical variables that represent decision-making with the full range of archetypes to determine strategies for the division of labor between humans and AI.
Bayrak will then test those findings experimentally. Study participants will be asked to work together with AI to design a vehicle in a virtual environment.
“We give them a design problem and tell the people which decisions they’re responsible for making and which are the responsibility of the AI. They work together, and the goal is to collect the data and see if the computational results reflect what happens in the experimental findings. In other words, do designers act as predicted by the computational models or do those designers who don’t fully trust AI end up satisfied with the division of labor?” says Bayrak.
The ultimate goal, he says, is not to replace humans in the workplace. Rather, it’s to develop principles for how and to what extent AI should be integrated into complex design projects. And those guidelines will reflect different priorities—for example, a team may want to incorporate AI as merely an assistant, or it may want to give it significant responsibility. Teams may want to prioritize quick decision-making, innovation, or job satisfaction.
“The idea is that we’ll have quantitative evidence that reveals which practices work well to achieve specific objectives and which do not,” he says. “This work could potentially shape how organizations are structured in the future, and that is very exciting.”
About A. Emrah Bayrak
Alparslan Emrah Bayrak is an assistant professor in the Department of Mechanical Engineering and Mechanics in the P.C. Rossin College of Engineering and Applied Science at Lehigh University. He joined Lehigh in January 2024.
Bayrak’s research focuses on bridging computational methods and human cognition to develop human-computer collaboration architectures for the design and control of smart products and systems. He is particularly interested in developing artificial intelligence systems that can effectively collaborate with humans considering unique capabilities of humans and computational systems. His research uses methods from design, controls, game theory, and machine learning, as well as human-subject experiments on virtual environments such as video games.
Bayrak earned his MS and PhD in mechanical engineering from the University of Michigan and a BS in mechatronics engineering from Sabanci University (Turkey).
Related Links
- Faculty Profile: A. Emrah Bayrak
- NSF Award Abstract (# 2339546): CAREER: Problem Partitioning and Division of Labor for Human-Computer Collaboration in Engineering Design
USC and Capital One establish new center for responsible AI in finance
Housed under the USC School of Advanced Computing, the center aims to advance responsible AI for financial services
UNIVERSITY OF SOUTHERN CALIFORNIA
USC and Capital One announced today the USC-Capital One Center for Responsible AI and Decision Making in Finance (CREDIF). Supported by a $3-million gift from Capital One, the joint research center will focus on advancing foundations for algorithmic, data and software innovations for artificial intelligence (AI) and its applications to finance.
The center, which combines USC’s world-class research with Capital One’s domain expertise, is the first launched under the auspices of the USC School of Advanced Computing (SAC), a unit of the USC Viterbi School of Engineering. A cornerstone of USC President Carol Folt’s Frontiers of Computing “moonshot,” the SAC serves as a nexus for advanced computing research and education across the university. The center will stand as its inaugural beacon, said Gaurav Sukhatme, director of the SAC and executive vice dean of the USC Viterbi School of Engineering.
“This new center is emblematic of the fast-moving and far-reaching impact of computing today,” said Sukhatme, a professor of computer science and electrical and computer engineering. “Responsible, human-centered decision making—a cornerstone of the USC School of Advanced Computing—is at its very heart.”
Home to some of the world’s leading minds in advanced computing, the center will explore how emerging technologies in AI and analytics can be applied to financial systems and services at scale, advancing cross-disciplinary knowledge between finance and technology.
Prem Natarajan, chief scientist and head of enterprise AI at Capital One, who spearheaded the idea for the center with Sukhatme, said the new center will help Capital One leverage the immense resources within the SAC to address complex challenges and opportunities in the financial sector.
“At Capital One, we believe multidisciplinary partnerships and initiatives like CREDIF can advance the state of the art in AI while also ensuring diverse perspectives and equities when developing, testing, and deploying AI capabilities,” said Natarajan. “USC’s leading-edge faculty, students, and research resources in combination with Capital One’s mission-driven focus and world-class industry talent create a unique opportunity to leverage AI to solve some of the most challenging problems in financial services and provide differentiated value to millions of customers.”
Amongst its goals, the center will support research projects focused on the development of cutting-edge technology and approaches to improve business and finance innovation. Each year, USC faculty members will be invited to submit proposals for faculty-led research efforts. An annual fellowship for doctoral students, named Capital One Fellows, will equip students with the skills and knowledge needed to excel in the field of AI in finance. In addition, USC and Capital One will also host an annual joint research symposium and workshops to share insights with the wider community.
“We are thrilled to partner with Capital One to advance responsible AI and decision making in finance,” said Yannis C. Yortsos, dean of the Viterbi School of Engineering. “Our talented USC Viterbi students and faculty will use their outstanding technical computational competence to provide human- and societal-centric financial decision making. Such strong partnerships allow the development of extraordinary new solutions to real-world problems that help advance innovation, productivity and the pursuit of human flourishing.”
Petros Ioannou, a professor of electrical and computer engineering, aerospace and mechanical engineering, and industrial and systems engineering, will serve as the center’s inaugural director.
Ioannou, the director of the Center for Advanced Transportation Technologies, holds the A.V. ‘Bal’ Balakrishnan Chair in Engineering and was recently appointed as a University Professor, USC’s highest academic honor. In 2008, Ioannou developed USC’s Master of Science in Financial Engineering program in collaboration with the USC Marshall School of Business, one of Viterbi’s most successful master’s programs.
“I am delighted to be part of this important research center, where cutting-edge computing and AI techniques will be applied to solving complex financial problems,” said Ioannou. “My priority is to motivate and attract researchers and PhD students and strengthen our capabilities in financial engineering. The potential for solving complex financial problems using computational techniques and AI is enormous.”
A member of the USC faculty community since 1982, Ioannou is a leading authority in control systems, neural networks, nonlinear systems, and intelligent transportation systems. He was recently inducted into the National Academy of Engineering and made a Fellow of the European Academy of Science. He is also a member of the Academia Europaea and a Fellow of the National Academy of Inventors. He has published 9 books, more than 170 journal articles and book chapters, nearly 250 conference papers, and holds 3 patents.
“With his rich history of contributions to the applications of engineering to many disciplines including networks and transportation, and his experience establishing and directing the successful master’s program in financial engineering, Professor Ioannou is well-positioned to serve as the inaugural director of this multidisciplinary center,” said Sukhatme.
ERC wants to see what shapes the stories AI tells us
Artificial Intelligence - Humanities
Professor Jill Walker Rettberg, Co-Director of the Centre for Digital Narrative at the University of Bergen, has been awarded an ERC Advanced Grant for the project AI STORIES. The grant provides €2.5 million over five years. This is Rettberg's second ERC Grant.
“The AI STORIES project builds on the premise that storytelling is central to human culture, with narratives shaping our understanding of the world. We will study artificial intelligence and how it creates new narratives,” says Rettberg.
Generative AI has been dubbed a “stochastic parrot”: a mimic of language patterns that doesn't really understand what it is saying. AI STORIES posits that the large language models (LLMs) that form the foundations of ChatGPT are more influenced by deep narrative structures than previously recognized. To manage AI bias, we need to consider the underlying narratives in the training data, not just the proximity of words and images.
When Microsoft's AI chatbot expressed its love for a journalist in 2023, was it really in love? Most likely not. Generative AI is, after all, a statistical game with no actual feelings. This new research will test the hypothesis that the AI said “I love you” because it is trained on so many of our sci-fi stories in which AI gains consciousness and human emotions.
AI stories are our stories, but to what end?
Rettberg has previously talked about how AI can replace or homogenise stories from certain storytelling traditions, like the Norwegian children's story When the Robbers Came to Cardamom Town:
"This story is more than a shared cultural reference – it supports the Norwegian criminal justice system’s priority of rehabilitation over punishment. It is distinct from Disney movies, with their unambiguous villains who are punished at the end, and from Hollywood bank heists and gangster movies that glorify criminals.”
The project will conduct case studies on Scandinavian, Australian, and Indian or Nigerian narratives, contrasting them with the dominant American and English-speaking narratives in LLMs.
“Generative AI might well bury stories like Cardamom Town by stuffing chatbot responses and search results worldwide with homogenized American narratives,” says Rettberg.
A new narratology for AI
“I think what we need is a new narratology, to see how narrative theory shapes and can be used when we develop and use AI,” says Rettberg.
The new narratology will inform policymakers, developers, and educators on the future direction of AI. Current LLMs have mainly been developed by computer scientists and linguists, but Rettberg posits that narratologists may be just as important to the AI future.
Rettberg and her colleagues will cooperate with industry and developer communities.
“I congratulate Jill Walker Rettberg, who solidifies her position at the top of her field, demonstrating both quality and originality. Her achievements serve as an inspiration for others at UiB,” says University of Bergen's Rector Margareth Hagen.
With Rettberg's project, researchers at the University of Bergen have so far secured a total of 50 ERC grants spanning the period from 2010 to 2024.
1 of 255 selected from 1,829 proposals – strong competition
Rettberg is among the 255 selected outstanding research leaders in Europe to receive this grant, according to a press release from the ERC.
The competition attracted 1,829 proposals which were reviewed by panels of internationally renowned researchers. The funding is amongst the EU’s most prestigious and competitive, providing leading senior researchers with the opportunity to pursue ambitious, curiosity-driven projects that could lead to major scientific breakthroughs.
The new grants are part of the EU’s Horizon Europe programme.
ERC grants
Prestigious research funding with applicants from all over the world. Awarded by the European Research Council (ERC).
ERC Advanced Grants are awarded to established world-class researchers with up to €2.5 million over 5 years.
Large Language Models (LLM)
Large language models are algorithmic foundations for generative AI chatbots like OpenAI’s ChatGPT and Google’s Bard. LLMs are fed vast datasets, including articles, books, and internet resources, with the goal of generating human-like responses to questions or prompts from the users.
Based on their training data, LLMs can be used to generate everything from wedding speeches to fact boxes to frameworks for the coming robot uprising.
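The "predict the next word" idea underlying LLMs can be illustrated with a deliberately tiny toy: a bigram model that counts which word follows which in some training text and predicts continuations from those counts. Real LLMs use neural networks trained on vastly more data, but the training objective is analogous; the sample text here is invented for illustration.

```python
from collections import Counter, defaultdict

training_text = (
    "the crew asked the computer to generate an environment "
    "and the computer generated the environment"
)

# Count word -> next-word frequencies over the training text.
follows = defaultdict(Counter)
words = training_text.split()
for w, nxt in zip(words, words[1:]):
    follows[w][nxt] += 1

def most_likely_next(word: str) -> str:
    """Greedy next-word prediction from the bigram counts."""
    return follows[word].most_common(1)[0][0]

print(most_likely_next("the"))
```

In this sample text, "the" is most often followed by "computer", so that is the toy model's prediction; scale the same statistical idea up by many orders of magnitude and you get the fluent, human-like responses described above.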
Generative AI has uses across a wide range of industries, but it also poses potential challenges and risks, such as cybercrime, the creation of fake news or deepfakes that can deceive or manipulate people.
Jill Walker Rettberg and ERC
AI STORIES is Professor Rettberg's second ERC Grant; the first was an ERC Consolidator Grant for the project Machine Vision, which ended in 2023. She is also Co-Director of the Centre for Digital Narrative.
Centre for Digital Narrative (CDN)
Humanities-driven research in electronic literature, games studies, digital culture, and computation to advance understanding of digital narrative.
CDN is a Norwegian Centre of Research Excellence funded by the Norwegian Research Council from 2023 to 2033. CDN focuses on algorithmic narrativity, new environments and materialities, and shifting cultural contexts.