Science fiction author Charles Stross took Google’s “Bard” for a test drive. Bard is what popular culture calls “Artificial Intelligence” (AI), though it is more properly called a Large Language Model (LLM) or, to use Ted Chiang’s more general nomenclature, merely Applied Statistics.
In any case, Stross asked Google Bard to provide five facts about Charles Stross. Because he has an unusual name, he was fairly certain there were no other Charles Strosses to confuse Google Bard. The results? “Bard initially offers up reality-adjacent tidbits, but once it runs out of information it has no brakes and no guardrails: it confabulates without warning and confidently asserts utter nonsense.”
Stross concludes his post with a warning: “LLMs don’t answer your questions accurately — rather, they deliver a lump of text in the shape of an answer.” However, a commenter adds nuance to Stross’s warning: “Bard is clearly showing signs of prompt exhaustion, and that should have triggered a ‘this answer is out of confidence’ error and terminated the output. In a well-designed system you would not have seen those answers.” But even granting that Bard is a poorly designed LLM, how would the average user know which LLMs are well designed and which are not?
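For what it’s worth, the commenter’s “out of confidence” idea can be sketched in a few lines: a model already assigns a probability to each token it emits, so a wrapper could decline to return text whose average token log-probability falls below some threshold. The snippet below is purely illustrative Python under assumed inputs (a hypothetical list of token/log-probability pairs and an arbitrary cutoff), not a description of how Bard actually works:

```python
# Hypothetical illustration (not Bard's internals): gate an LLM's output on
# its average token log-probability and refuse to answer when the model's
# own confidence falls below a threshold.

CONFIDENCE_THRESHOLD = -1.5  # assumed cutoff; choosing it well is the hard part

def answer_or_refuse(tokens_with_logprobs):
    """tokens_with_logprobs: list of (token, logprob) pairs from a model."""
    if not tokens_with_logprobs:
        return "[no answer: empty generation]"
    avg_logprob = sum(lp for _, lp in tokens_with_logprobs) / len(tokens_with_logprobs)
    if avg_logprob < CONFIDENCE_THRESHOLD:
        return "[answer withheld: out of confidence]"
    return "".join(tok for tok, _ in tokens_with_logprobs)

# Toy example: a low-probability tail, like the confabulated "facts".
generation = [("Charles", -0.2), (" Stross", -0.3), (" won", -2.9),
              (" the", -2.4), (" Nobel", -4.1)]
print(answer_or_refuse(generation))  # -> "[answer withheld: out of confidence]"
```

Even then, the threshold and the confidence signal live inside the vendor’s system, which is exactly the problem: the user never sees them.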
LLMs deliver answer-shaped text, and the reader has no way to judge how accurate it is.