He’s used it as his own teacher’s assistant, for help with crafting a syllabus, a lecture, an assignment and a grading rubric for MBA students.
“You can paste in entire academic papers and ask it to summarize it. You can ask it to find an error in your code and correct it and tell you why you got it wrong,” he said. “It’s this multiplier of ability, that I think we are not quite getting our heads around, that is absolutely stunning.”
A convincing — yet untrustworthy — bot
But the superhuman virtual assistant — like any emerging AI tech — has its limitations. ChatGPT was created by humans, after all. OpenAI has trained the tool using a large dataset of real human conversations.
“The best way to think about this is you are chatting with an omniscient, eager-to-please intern who sometimes lies to you,” Mollick said.
It lies with confidence, too. Despite its authoritative tone, there have been instances in which ChatGPT won’t tell you when it doesn’t have the answer.
That’s what Teresa Kubacka, a data scientist based in Zurich, Switzerland, found when she experimented with the language model. Kubacka, who studied physics for her Ph.D., tested the tool by asking it about a made-up physical phenomenon.
“I deliberately asked it about something that I thought that I know doesn’t exist so that they can judge whether it actually also has the notion of what exists and what doesn’t exist,” she said.
ChatGPT produced an answer so specific and plausible-sounding, backed with citations, she said, that she had to investigate whether the fake phenomenon, “a cycloidal inverted electromagnon,” was actually real.
When she looked closer, the alleged source material was also bogus, she said. The citations listed the names of well-known physics experts, but the titles of the publications they supposedly authored were non-existent.
“This is where it becomes kind of dangerous,” Kubacka said. “The moment that you cannot trust the references, it also kind of erodes the trust in citing science whatsoever.”
Scientists call these fake generations “hallucinations.”
“There are still many cases where you ask it a question and it’ll give you a very impressive-sounding answer that’s just dead wrong,” said Oren Etzioni, the founding CEO of the Allen Institute for AI, who ran the research nonprofit until recently. “And, of course, that’s a problem if you don’t carefully verify or corroborate its facts.”
An opportunity to scrutinize AI language tools
Users experimenting with the free preview of the chatbot are warned before testing the tool that ChatGPT “may occasionally generate incorrect or misleading information,” harmful instructions or biased content.
Sam Altman, OpenAI’s CEO, said earlier this month it would be a mistake to rely on the tool for anything “important” in its current iteration. “It’s a preview of progress,” he tweeted.
The failings of another AI language model unveiled by Meta last month led to its shutdown. The company withdrew its demo for Galactica, a tool designed to help scientists, just three days after it encouraged the public to test it out, following criticism that it spewed biased and nonsensical text.
Similarly, Etzioni says ChatGPT doesn’t produce good science. For all its flaws, though, he sees ChatGPT’s public debut as a positive. He sees this as a moment for peer review.
“ChatGPT is just a few days old, I like to say,” said Etzioni, who remains at the AI institute as a board member and advisor. It’s “giving us a chance to understand what [it] can and cannot do and to begin in earnest the conversation of ‘What are we going to do about it?’”
The alternative, which he describes as “security by obscurity,” won’t help improve fallible AI, he said. “What if we hide the problems? Will that be a recipe for solving them? Typically — not in the world of software — that has not worked out.”