An image created from scratch by a video game designer using an AI tool recently won an art competition at the Colorado State Fair, as has been widely reported. Some artists are alarmed, but should they be?
For several years, AI has been built into tools that artists use daily, from computational photography in the Apple iPhone to image enhancement tools from Topaz Labs and Lightricks, and even open source applications. But because an image generated entirely by an AI tool has won a competition, some see this as a tipping point: a sign of a coming AI catastrophe that will lead to widespread job displacement for people in creative fields, including graphic design and illustration, photography, journalism, creative writing and even software development.
The winning image was generated using Midjourney, a cloud-based text-to-image tool developed by a small research lab of the same name that “explores new media of thought and expands the imagination of the human species.” Its product is a text-to-image generator, the result of neural networks trained on large numbers of images. The company hasn’t disclosed its tech stack, but CEO David Holz said it uses very large AI models with billions of parameters. “They are trained on billions of images.” Although Midjourney has only recently come out of stealth mode, hundreds of thousands of people are already using the service.
Suddenly there is a proliferation of similar tools, including OpenAI’s DALL-E and Imagen from Google. According to a Vanity Fair story, Imagen offers “photorealistic images [that] are not yet distinguishable from real.” Stable Diffusion from Stability.ai is another new text-to-image tool that is open source and can be run locally on a PC with a good graphics card. Stable Diffusion can also be used through art generator services, including Artbreeder, Pixelz.ai and Lightricks.
To use is to believe
As an avid hobby photographer who exhibits work in galleries, I have my own concerns that these tools could spell the end of photography. I decided to try Midjourney myself to see what it could produce and to think more about the possible consequences. The following image was generated by trying variations on these text prompts: “An emerald lake backed by craggy Canadian Rockies + a few patches of snow on the mountains + Soft morning light + mountains with green coniferous forest + Sunrise + 4K UHD.”
This seems like a great result for a novice user. The total time from my first use of the system to the final image was less than 30 minutes. I must admit that I felt a childlike wonder as I watched the image materialize in just seconds from the directions I gave. It recalled a 60-year-old quote from science fiction writer and futurist Arthur C. Clarke: “Any sufficiently advanced technology is indistinguishable from magic.” It felt like magic.
Other Midjourney users show far more sophistication. For example, one user produced an “alien cat” image from over 30 text prompts, including: “cat + alien with glittering rainbow scales, glowing, hyper-detailed, micro-detail, ultra-wide, octane rendering, realistic…” It turns out that more detailed prompts can result in more advanced, higher quality images.
These AI text-to-image tools are already good enough for commercial endeavors. Creative artist Karen X. Cheng was engaged to create an AI-produced cover image for Cosmopolitan. To help generate ideas and the final image, she used DALL-E, or more specifically the latest version, DALL-E 2. Cheng describes the process, including the search for the right set of prompts, noting that she generated thousands of images and tweaked the text prompts hundreds of times over many hours before finding one image that felt right.
Text-to-image: a new tool or a threat to a way of life?
In a LinkedIn post, Cheng noted: “I think the natural response is to fear that AI will replace human artists. Sure, that thought crossed my mind, especially at the beginning. But the more I use DALL-E, the less I see it as a substitute for people, and the more I see it as a tool for people to use – an instrument to play.”
I had the same feeling when using Midjourney. I posted the image of the Canadian Rockies on Flickr, an image sharing site for artists – primarily photographers and digital artists – and asked for opinions. Specifically, I wanted to know whether people viewed an AI image generator as an abomination and threat or just another tool. A professional replied: “I’ve also played with Midjourney. I’m a creative! How can I NOT mess with it to see what it can do? I believe that the results are art, even if it is AI-generated. A human imagination creates the prompt, then manages the results or tries to get another result from the system. I think it is beautiful.”
A common refrain in the debate about AI is that it will destroy jobs. The answer to this concern is often twofold: first, that many existing jobs will be expanded by AI so that humans and machines working together will produce better output by expanding, not replacing, human creativity; second, that AI will also create new jobs, possibly in fields that did not exist before.
Entrepreneur and influencer Rob Lennon predicted recently that AI text and image generators will lead to new career opportunities, citing “prompt engineering” in particular. Prompt crafting is the art of knowing how to write a prompt to get optimal results from an AI. The best prompts are concise and give the AI enough context to understand the desired result. PromptBase has already started to market this service. Its platform enables prompt engineers to “sell text descriptions that reliably produce a particular art style or topic on a specific AI platform.”
Megan Paetzhold, a photo editor at New York magazine, put DALL-E to the test with assignments she would normally give to artists on her team. In the end, calling it “a draw,” she noted, “DALL-E never gave me a satisfying picture on the first try – there was always a workshop process.” She added: “As I refined my techniques, the process started to feel shockingly collaborative; I was working with DALL-E rather than using it. DALL-E would show me its work and I would adjust my prompt until I was satisfied.”
Is there no dark side?
Obviously, these tools can be used to produce high quality content. While many creative jobs may eventually be under threat, for now, text-to-image generators are an example of humans and machines working together in a new field of artistic exploration. Ethically, the key is to reveal that an image or text was created using an AI generator so that people know that the content was produced by a machine. They may like the output or not, and it is no different from other creative endeavors in that regard.
This perspective will not please everyone. Many writers, photographers, illustrators and other creatives, even if they agree that AI generation tools lack sophistication, believe it’s only a matter of time before they, the creative professionals, are replaced by machines. Bloomberg technology editor Vlad Savov encapsulated these arguments, seeing these tools as both suffocating and depressing for artists. He may be right in the end, though one respondent to my Flickr question commented, “It’s a different kind of art, which isn’t necessarily bad and potentially allows for incredible creativity.” Another wrote: “I don’t feel threatened by AI. Everything changes.” It does. I think we thought there would be more time.
These tools may be just one more in the artist’s kit. They will be used to produce images and text that will be enjoyed and sold. As Jesus Diaz writes in Fast Company: “Once you try a text-to-image program, the joy of artificial intelligence seems undeniable, despite the many dangers that lie ahead.” This does not automatically mean that more traditional creative pursuits will disappear. Ironically, there may come a time in the not-too-distant future when “man-made” will have a cachet, and work produced without an AI image or text generator could command a premium.
Gary Grossman is the senior VP of technology practice at Edelman and global leader of the Edelman AI Center of Excellence.