Stability AI introduces new Stable Cascade AI image generator

Stability AI has today launched Stable Cascade, its latest open-source AI image generator. The new model represents a significant leap forward in the ability to create realistic images and text, outpacing previous models such as Stable Diffusion and its larger counterpart, Stable Diffusion XL. What sets Stable Cascade apart is not just its performance but also its efficiency, which is crucial in the fast-paced realm of AI.

Würstchen architecture

The secret behind Stable Cascade’s impressive capabilities lies in its Würstchen architecture. This design choice effectively shrinks the size of the latent space, which is a technical term for the abstract representation of data within the model. By doing so, Stable Cascade can operate faster, reducing the time it takes to generate images, and also cut down on the costs associated with training the AI. Despite these efficiencies, the quality of the images produced remains high. In fact, the model boasts a compression factor of 42, a significant jump from the factor of 8 seen in Stable Diffusion, which is a testament to its enhanced speed and efficiency.
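To make these compression factors concrete, here is a minimal sketch of the arithmetic. It assumes a square 1024 x 1024 input image (a size chosen for illustration, not stated above) and that the factor applies to each spatial side; the function name `latent_side` is hypothetical.

```python
def latent_side(image_side: int, compression_factor: int) -> int:
    """Side length of the latent grid for a square image (rounded down)."""
    return image_side // compression_factor


IMAGE_SIDE = 1024  # assumed input resolution, for illustration only

sd_side = latent_side(IMAGE_SIDE, 8)        # Stable Diffusion, factor 8
cascade_side = latent_side(IMAGE_SIDE, 42)  # Stable Cascade, factor 42

print(f"Stable Diffusion latent grid: {sd_side} x {sd_side}")
print(f"Stable Cascade latent grid:   {cascade_side} x {cascade_side}")
print(f"Spatial positions: {sd_side ** 2} vs {cascade_side ** 2}")
```

Under these assumptions the diffusion model in Stable Cascade works on far fewer spatial positions per image, which is where the savings in generation time and training cost would come from.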

Stage A, Stage B and Stage C

Stable Cascade consists of three models, Stage A, Stage B and Stage C, which form a cascade for generating images, hence the name “Stable Cascade”. Stages A and B are used to compress images, similar to the job of the VAE in Stable Diffusion. However, as mentioned before, this setup achieves a much higher compression of images. Stage C is responsible for generating the small 24 x 24 latents from a text prompt. Note that Stage A is a VAE and both Stage B and Stage C are diffusion models.
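The order in which the three stages run at generation time can be sketched as a simple data flow. This is an illustrative stub, not the real models: the function names, the `Representation` type, the 1024-pixel output size and the Stage A factor of 4 are all assumptions made for the example. Only the 24 x 24 latent size and the role of each stage come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    side: int   # spatial width and height
    space: str  # which representation the data currently lives in


def stage_c(prompt: str) -> Representation:
    """Text-conditional diffusion model: prompt -> small 24 x 24 latent."""
    return Representation(side=24, space="stage_c_latent")


def stage_b(latent: Representation, target_side: int) -> Representation:
    """Diffusion model: expands the small latent into Stage A's latent space."""
    assert latent.space == "stage_c_latent"
    return Representation(side=target_side, space="stage_a_latent")


def stage_a(latent: Representation, image_side: int) -> Representation:
    """VAE decoder: Stage A latent -> pixels."""
    assert latent.space == "stage_a_latent"
    return Representation(side=image_side, space="pixels")


def generate(prompt: str, image_side: int = 1024, stage_a_factor: int = 4) -> Representation:
    # stage_a_factor is an assumed placeholder value for this sketch.
    small = stage_c(prompt)
    expanded = stage_b(small, target_side=image_side // stage_a_factor)
    return stage_a(expanded, image_side=image_side)


result = generate("a photo of a lighthouse at dusk")
print(result)
```

Training runs in the opposite direction for Stages A and B, which learn to compress images, while generation starts from the text prompt in Stage C and decodes outward.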

Stable Cascade open source AI image generator

One of the most exciting aspects of Stable Cascade is its open-source nature. The code for this AI image generator is freely available on GitHub, along with helpful scripts for training and using the model. This openness invites a community of developers and AI aficionados to contribute to the model’s development, potentially leading to even more advancements. However, it’s important to note that those looking to use Stable Cascade for commercial purposes will need to navigate licensing requirements.

For this release, Stability AI is offering two checkpoints for Stage C, two for Stage B and one for Stage A. Stage C comes in 1 billion and 3.6 billion parameter versions, but its development team highly recommends using the 3.6 billion version, as most of the finetuning work went into it.

The two versions of Stage B have 700 million and 1.5 billion parameters. Both achieve great results, but the 1.5 billion version excels at reconstructing small and fine details. You will therefore achieve the best results if you use the larger variant of each stage. Lastly, Stage A contains 20 million parameters and comes in a single, fixed version due to its small size.
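As a rough guide to what these checkpoint choices add up to, the sketch below totals the parameter counts quoted above for the smallest and largest combinations. The dictionary layout and function names are hypothetical, and the weight-size estimate assumes 2 bytes per parameter (16-bit weights), which is an assumption about how the checkpoints are loaded rather than a published figure.

```python
# Parameter counts as quoted in the article.
STAGE_PARAMS = {
    "C": {"small": 1_000_000_000, "large": 3_600_000_000},
    "B": {"small": 700_000_000, "large": 1_500_000_000},
    "A": {"fixed": 20_000_000},
}


def total_params(stage_c: str, stage_b: str) -> int:
    """Total parameters for a chosen Stage C / Stage B pair plus Stage A."""
    return (
        STAGE_PARAMS["C"][stage_c]
        + STAGE_PARAMS["B"][stage_b]
        + STAGE_PARAMS["A"]["fixed"]
    )


def weights_gib(params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the weights alone, assuming 16-bit storage."""
    return params * bytes_per_param / 2**30


largest = total_params("large", "large")   # 5,120,000,000
smallest = total_params("small", "small")  # 1,720,000,000

for name, count in (("largest", largest), ("smallest", smallest)):
    print(f"{name}: {count:,} parameters, ~{weights_gib(count):.1f} GiB of weights")
```

The estimate covers weights only; activations and any text encoder would add to the real memory footprint.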

Stable Cascade doesn’t just stop at its core technology; it offers a suite of extensions that can be used to fine-tune its performance. These include ControlNet, IP-Adapter and LCM (latent consistency model), among others. These tools give users the ability to tailor the model to their specific needs, whether that’s adjusting the style of the generated images or integrating the model with other software.

When compared to other AI models on the market, such as DALL-E 3 and Midjourney, Stable Cascade stands out. Its unique combination of features and capabilities positions it as a strong contender in the AI image generation field. This is not just about the technology itself but also about how accessible it is. Stability AI has made Stable Cascade available through various platforms, including Hugging Face and the Pinokio app, which means that a wide range of users, from hobbyists to professionals, can explore and leverage the advanced features of this model.
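For readers who want to try the model from Python, the following is a minimal sketch using the Hugging Face `diffusers` library. It assumes an installed `diffusers` version that ships `StableCascadePriorPipeline` and `StableCascadeDecoderPipeline`, a CUDA GPU, and the `stabilityai/stable-cascade-prior` and `stabilityai/stable-cascade` model repositories; the step counts and guidance values are illustrative settings, not recommendations from this article. The wrapper function `generate_image` is a hypothetical name.

```python
def generate_image(prompt: str, negative_prompt: str = ""):
    """Run Stage C (the prior), then Stages B and A (the decoder)."""
    # Imported lazily so that defining this function needs no GPU stack.
    import torch
    from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

    # Stage C: turns the text prompt into small image embeddings (latents).
    prior = StableCascadePriorPipeline.from_pretrained(
        "stabilityai/stable-cascade-prior",
        variant="bf16",
        torch_dtype=torch.bfloat16,
    ).to("cuda")

    # Stages B and A: decode those latents back into a full-size image.
    decoder = StableCascadeDecoderPipeline.from_pretrained(
        "stabilityai/stable-cascade",
        variant="bf16",
        torch_dtype=torch.float16,
    ).to("cuda")

    prior_output = prior(
        prompt=prompt,
        negative_prompt=negative_prompt,
        height=1024,
        width=1024,
        guidance_scale=4.0,
        num_inference_steps=20,
    )
    decoder_output = decoder(
        image_embeddings=prior_output.image_embeddings.to(torch.float16),
        prompt=prompt,
        negative_prompt=negative_prompt,
        guidance_scale=0.0,
        num_inference_steps=10,
        output_type="pil",
    )
    return decoder_output.images[0]
```

Calling `generate_image("an astronaut riding a horse").save("out.png")` would download the checkpoints on first use and write the result to disk, provided the assumptions above hold.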

Commercial Availability

Looking ahead, Stability AI has plans to offer a commercial use license for Stable Cascade. This move will open up new opportunities for businesses and creative professionals to utilize the model’s capabilities for their projects. But before that happens, the company is committed to a thorough period of testing and refinement to ensure the tool meets the high standards required for commercial applications.

The community’s role in the development of Stable Cascade cannot be overstated. Users are not just passive recipients of this technology; they are actively engaged in creating custom content and exploring the model’s possibilities. This collaborative environment is vital for innovation, as it allows for a sharing of ideas and techniques that can push the boundaries of what AI can achieve. Stability AI explains a little more about Stable Cascade’s achievements so far:

“Moreover, Stable Cascade achieves impressive results, both visually and evaluation wise. According to our evaluation, Stable Cascade performs best in both prompt alignment and aesthetic quality in almost all comparisons. The above picture shows the results from a human evaluation using a mix of parti-prompts (link) and aesthetic prompts. Specifically, Stable Cascade (30 inference steps) was compared against Playground v2 (50 inference steps), SDXL (50 inference steps), SDXL Turbo (1 inference step) and Würstchen v2 (30 inference steps).”

Stability AI’s Stable Cascade is a notable addition to the AI image generation landscape. With its efficient architecture, open-source accessibility, and extensive customization options, it offers a powerful tool for those looking to create realistic images and text. As the community continues to grow and contribute to the model’s evolution, the potential uses for Stable Cascade seem boundless. The excitement surrounding this new AI image generator is a clear indication that the field of artificial intelligence is not just growing—it’s thriving, with innovations that continue to surprise and inspire.
