Recent answers from Google’s large language model (LLM) “artificial intelligence” search summarizer (Gemini) made headlines for being comically bizarre: we should add glue to pizza sauce, and we should eat one small rock every day.
What I find concerning about these incidents isn’t that they prove LLMs are fallible (no surprise there), but that — at a fundamental level — they’re fallible in exactly the way that’s most toxic to the human condition.
On his blog today, Bruce Schneier wrote:
Large language models, the AI foundations behind tools like ChatGPT, are built on top of huge corpuses of data culled from the Internet. These are models trained to recapitulate what millions of real people have written in response to endless topics, contexts, and scenarios.
But they’re not actually trained to “recapitulate”; they’re trained to construct plausible-sounding answers. In the two cases above, Google’s Gemini LLM gave answers that came from a single, isolated source. In the first case, it was a single Reddit shitposter from over a decade ago. In the second, it was a single article in the satirical newspaper The Onion. The only thing Gemini used its “huge corpuses of data culled from…millions of real people” for was to turn those single, satirically motivated comments into dogma.
As we increasingly use LLMs for things like medical decision support (medicine is already a field crippled by blind adherence to dogma), they harbor the potential to make things much worse.
Thus far, attempts to sanitize LLMs to avoid these absurdities just seem to make the answers more bizarre.
Garbage in, gospel out.
—2p