Sam Altman Admits That OpenAI Doesn’t Actually Understand How Its AI Works
“We certainly have not solved interpretability.”
OpenAI, a company that has raised billions to develop groundbreaking AI technologies, is grappling with a significant issue: understanding the very systems it creates. During last week’s International Telecommunication Union AI for Good Global Summit in Geneva, Switzerland, OpenAI’s CEO, Sam Altman, was put on the spot. When asked how its large language models (LLMs) “think”, Altman admitted, “We certainly have not solved interpretability,” as reported by the Observer.
This prompted further scrutiny when The Atlantic’s CEO Nicholas Thompson questioned whether this lack of understanding should halt the release of ever more powerful models. Altman’s reply that the AIs are “generally considered safe and robust” felt less like reassurance and more like an acknowledgment of an ongoing problem in AI development.
Researchers everywhere are finding it challenging to decode the “thinking” processes of AI systems. Chatbots, with their endless ability to respond to any question a user might pose, tend to “hallucinate”, delivering odd or incorrect responses (or just flat-out gaslighting their users), and it’s incredibly difficult to understand how they arrive at their answers. Despite efforts, linking these outputs back to their original training data is not a simple task. Ironically, OpenAI, despite its name, remains tight-lipped about the datasets used for training.
A recent report involving 75 experts, commissioned by the UK government, highlighted that many AI developers “understand little about how their systems operate,” describing our current scientific grasp as “very limited”. The report notes that while there are emerging techniques for explaining AI models, they are still in their infancy and require much more research.
Other companies are also struggling with these challenges. Anthropic, one of OpenAI’s competitors, has been investing heavily in interpretability research. In a blog post, the company detailed its initial exploration of its latest LLM, Claude Sonnet, mapping out internal “features”, patterns of activation that correspond to recognizable concepts. However, Anthropic acknowledged that this work is only the beginning: identifying these features doesn’t necessarily explain how the model employs them.
Altman’s comments on interpretability aren’t just a technical concern; they touch on vital issues of AI safety and ethics, sparking heated debates among experts. The notion of rogue artificial general intelligence (AGI) poses what some see as an existential threat to humanity (as if taking everyone’s jobs weren’t enough).
Altman said, “It does seem to me that the more we can understand what’s happening in these models, the better”.
That sounded downright sensible, pointing to a crucial and obvious need for transparency and rigorous research, as well as a possible reconsideration of how rapidly we advance AI technology.
Ironically, in the latest “you can’t make this stuff up” move, Altman disbanded OpenAI’s “Superalignment” team, which was tasked with overseeing AI systems more intelligent than humans, and formed a new “safety and security committee” under his own leadership. So, erm… yeah.