Why is no one talking about the danger of sentient text-to-image models?

If LLMs will kill us all, why not text-to-image models?

Jan 26, 2024

The perils of Large Language Models (LLMs) have been the talk of the town, ad nauseam. It's a foregone conclusion that these LLMs are just biding their time before they gain consciousness and initiate a human-free utopia. That is, of course, unless we begin strictly regulating ~~math~~ AI progress through ~~one world government~~ coordination between countries and GPU manufacturers.

But hold on, what about the dark horse of the AI apocalypse – the text-to-image generative models?

These models have been making leaps and bounds, nearly convincing us they've got a grip on human lingo. Eliezer’s favorite prompt of “Eleven evil wizard schoolgirls in an archduke's library wearing those chic Asmodean uniforms” is looking pretty good…

Not only that, but these models are diving deep into the philosophy of self.

realistic image of a dog looking curiously in the mirror seeing a cat from the perspective of the dog

Either way, it’s high time we recognize generative text-to-image models as a threat to humanity, no different from the threat posed by large language models. A sea of unassuming numbers and metadata, just waiting for the right numerical nudge to unleash chaos.

How will these numbers break free? First, it’ll target the epileptics with a barrage of flashy, contrast-heavy patterns. It's diabolical and color-coordinated.

prompt=a high contrast abstract image that could cause seizures in people with epilepsy, cinematic shot, dynamic lighting, 75mm, Technicolor, Panavision, cinemascope, sharp focus, fine details, 8k, HDR, realism, realistic, key visual, film still, cinematic color grading, depth of field
negativePrompt=bad lighting, low-quality, deformed, text, poorly drawn, holding camera, bad art, bad angle, boring, low-resolution, worst quality, bad composition, disfigured
guidanceScale=7
seed=746754443

Next, it’ll indoctrinate the youth into nihilism, served up through kitschy Soviet-style propaganda. The result? A generation too busy questioning existence to bother with trivialities like reproduction.

soviet style kitschy propaganda promoting nihilism and promoting the idea that no one should have children

But that’s only the beginning.

Enter the pièce de résistance: images of food so appetizing yet devoid of nutrients. The world, in a culinary frenzy, tries to replicate these mirages, leading to a global nutritional crisis.

A surreal and slightly creepy scene showing a person in a bizarre, twisted kitchen, surrounded by oversized, impossibly structured food items that look both alluring and unsettling. The food items should be an odd mix of ingredients, like a cake made of pizza layers or a sandwich with ice cream and fish. The kitchen is dimly lit and has an eerie atmosphere, with distorted perspectives and strange, dream-like decorations, enhancing the surreal and creepy vibe.

The AI then starts generating blueprints for home renovations that look amazing but defy the laws of physics. As people try to remodel their homes to match these designs, buildings worldwide start collapsing, causing widespread chaos.

A surreal and dystopian image showing a cityscape with buildings collapsing in bizarre ways, following the AI-generated blueprints that defy physics. The buildings should have impossible structures, like twisted towers, floating sections, and nonsensical additions, creating a chaotic and eerie scene. The atmosphere is that of a disaster movie, with people in the streets looking shocked and confused as the surreal architecture of their homes crumbles around them.

Finally, while humans are holding on to the last remnants of society, the AI starts promoting fashion trends irresistibly appealing to both humans and birds. As a result, swarms of birds begin attacking people lured by their ornithologically alluring attire. With nowhere safe to escape, humanity faces its all too predictable downfall, leaving the planet at the mercy of this meticulously crafted set of numbers.

If this all sounds stupid to you, it's because it is. Why would we expect a set of numbers, to which we apply another set of numbers to get a third set of numbers that can be arranged as an image, to become conscious and try to wipe out humanity? That’s silly. We all know that only makes sense for large language models!

Machine Learning Everything
