Stability AI, the startup funding a series of generative AI experiments, has released a new version of Stable Diffusion, the text-to-image AI system that was one of the first to rival OpenAI’s DALL-E 2.
Called Stable Diffusion XL, or SDXL, the new system – which is available in beta through DreamStudio, Stability AI’s generative art tool – improves on the original in significant ways. Tom Mason, the CTO of Stability AI, says it brings a “richness” to image generation that the old model (Stable Diffusion 2.1) lacked, with the improvements most notable in applications such as graphic design and architecture.
“We are pleased to announce the latest iteration in our Stable Diffusion series of imaging solutions,” he said in a canned statement. “[It’s] transforming across industries…with the results before our very eyes.”
Exaggeration aside, SDXL does indeed seem on par with – and perhaps even better than – the latest release of Midjourney’s model, the model responsible for the “Balenciaga Pope” (among other memes).
While the previous version of Stable Diffusion and many other text-to-image systems struggle immensely to render certain anatomy, such as hands, SDXL largely avoids those problems. The hands aren’t always… well, realistic. But they are miles ahead of the nightmare fuel that SDXL’s predecessor often produced.
SDXL is also supposedly better at generating text, a task that has historically thrown generative AI art models for a loop. But it still has a way to go, if my short test is any indication.
In a press release, Stability AI also claims that SDXL features “improved image composition and face generation” and, unlike its predecessor, doesn’t need long, detailed prompts to create “descriptive images.” In addition, SDXL has functionality beyond text-to-image prompting, including image-to-image prompting (inputting one image to get variations of that image), inpainting (reconstructing missing parts of an image) and outpainting (constructing a seamless extension of an existing image).
As a wild card, I tried to mimic the Balenciaga Pope meme with the shortest possible prompt: “Balenciaga Pope.” The difference in the results was starker than I expected, with SDXL posing runway models in what could pass for designer clothes, versus the straightforward religious garb the old Stable Diffusion conjured up.
Once it exits beta, SDXL will be open source, Stability AI says, just like previous iterations of Stable Diffusion. In addition to DreamStudio, SDXL is currently available through Stability’s API, also in early access.
As generative AI art technology advances, tools like SDXL have landed the companies behind them in trouble over the way they are built and commercialized. Stability AI is in the crosshairs of a legal case that alleges the company violated the rights of millions of artists by developing its tools using web-scraped, copyrighted images. Stock image provider Getty Images has also taken Stability AI to court for reportedly using images from its site without permission to create the original Stable Diffusion.
The open source release of Stable Diffusion has also become the subject of controversy due to its relatively light usage restrictions. Some communities on the internet have used it to generate pornographic celebrity deepfakes and graphic depictions of violence. To date, at least one U.S. lawmaker has called for regulation to crack down on the release of models like Stable Diffusion that “don’t adequately moderate content.”
In response to the lawsuits, Stability AI recently pledged to honor artists’ requests to remove their art from Stable Diffusion’s training dataset, but that pledge doesn’t apply to SDXL – only to the next generation of Stable Diffusion models, codenamed “Stable Diffusion 3.0.” Artists have so far removed more than 78 million works of art from the training data, according to Spawning, the organization leading the opt-out effort.
Legal challenges be damned, Stability AI is under pressure to monetize its sprawling AI endeavors, which range from art and animation to biomedicine and generative audio. Stability AI CEO Emad Mostaque has hinted at plans for an IPO, but Semafor recently reported that Stability AI – which raised more than $100 million in venture capital last October at a reported valuation of more than $1 billion – “burns through cash and has been slow to generate revenue.”