I Experimented with Google's New AI Tool for Fun Image Generation

Dec 17, 2024 at 1:02 AM
Google has introduced an innovative AI tool called Whisk that takes image generation to the next level. Based on the latest Imagen 3 model, Whisk enables users to create desired images using other images as the base prompt, rather than relying solely on text prompts.

Unlock the Potential of Image Generation with Whisk

Experimental Phase and Easy Navigation

Whisk is currently in an experimental phase, yet it is fairly straightforward to navigate once set up. As detailed in a blog post, it is designed for "rapid visual exploration, not pixel-perfect edits." This gives users the freedom to quickly experiment and explore different visual ideas.

After going through the initial pages, which list important details about how the tool functions and provide options to sign up for emails and view the privacy policy, users are taken directly to the main page of Whisk. For example, one might see a prompt with a dinosaur plushie as the image style, along with other options like an enamel pin and a sticker. Choosing the first option allows users to begin the image generation process.

Uploading Images and Editing

Users are directed to upload an image for the subject. However, it's not always a seamless process. For instance, uploading a photograph of a smartwatch on a wrist didn't work initially. But a more cartoonish image from the hard drive worked, and plushie figurines of three mythical creatures loaded right away. Once the image is generated, an editing section with a text prompt area becomes available. Using the suggested prompt "the character is eating ice cream" produced additional images of the same creatures holding ice cream cones. This shows the flexibility that Whisk offers in the editing process.

Start from Scratch and My Library

Alternatively, users can scroll down below the main prompt creation and select "start from scratch." This allows them to upload their own images or enter their own text. They can also add text from the beginning to make their characters perform an action. If they are unsure about what images to add or text to type, clicking the "Inspire Me" button will fill in images. The tool also includes a My Library section where users can view all the images they've created. Here, they can enable or disable the library based on their preferences, and can delete images individually or delete the library data as a whole.

Comparison with Microsoft Designer

The Whisk tool is reminiscent of the Microsoft Designer prompt that allows users to create Funko Pop! figures. While Microsoft Designer generates a range of whimsical or realistic images using only text prompts, Whisk combines image-based prompts with the option to add text prompts. This gives users more control and flexibility in the image generation process. Google noted that text prompts are included to address the potential for the tool to "miss the mark," so users always have the option to fill in prompts when needed.