Mon. Dec 23rd, 2024
Stable Diffusion 3 Arrives, Solidifying Early Lead in AI Imagery

Stability AI announced Stable Diffusion 3, the latest and most powerful version of the company’s image-generating AI model. Details are sparse, but it’s clear this is an attempt to fend off the hype surrounding recently announced competitors from OpenAI and Google.

We’ll be covering the technical details soon, but for now, know that Stable Diffusion 3 (SD3) is based on a new architecture and will work with a variety of hardware (though you’ll need something powerful). The model hasn’t been released yet, but you can join the waiting list.

SD3 uses an updated “diffusion transformer” technique, first developed in 2022, revised in 2023, and now reaching the point of scalability. OpenAI’s impressive video generator, Sora, appears to work on similar principles (Will Peebles, a co-author of the paper, went on to co-lead the Sora project). SD3 also employs another new technique, flow matching, which improves quality without adding much overhead.
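Stability hasn’t published SD3’s training details, but the core idea of flow matching is simple enough to sketch: rather than learning to denoise step by step, the network is trained to predict the velocity that carries a noise sample toward a data sample along a straight-line path. The toy code below (NumPy only; the `zero_model` placeholder stands in for a real neural network and is purely illustrative) shows the training objective:

```python
import numpy as np

def flow_matching_loss(model, x0, x1, t):
    """Conditional flow matching loss with a linear interpolation path.

    x0: samples from the noise distribution, shape (batch, dim)
    x1: samples from the data distribution, shape (batch, dim)
    t:  interpolation times in [0, 1], shape (batch, 1)
    """
    # Point on the straight-line path between noise and data.
    xt = (1.0 - t) * x0 + t * x1
    # For a linear path, the target velocity is constant along it: x1 - x0.
    target = x1 - x0
    pred = model(xt, t)
    # Mean squared error between predicted and target velocities.
    return float(np.mean((pred - target) ** 2))

# Placeholder "model" that always predicts zero velocity,
# standing in for a trained network.
def zero_model(xt, t):
    return np.zeros_like(xt)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 2))        # noise samples
x1 = rng.standard_normal((4, 2)) + 5.0  # "data" samples
t = rng.uniform(size=(4, 1))
loss = flow_matching_loss(zero_model, x0, x1, t)
```

Because the regression target is a simple velocity rather than a noise schedule, sampling can use fewer integration steps, which is where the “quality without much overhead” claim comes from.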

The suite of models ranges from 800 million parameters (fewer than the commonly used SD 1.5) to 8 billion parameters (more than SD XL), with the goal of running on a variety of hardware. You’ll probably need a setup intended for serious GPU and machine learning work, but you’re not limited to APIs the way you are with OpenAI’s or Google’s models. (Anthropic is not really part of this conversation, as it is not publicly focused on producing images or videos.)
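To give a rough sense of what those parameter counts mean in hardware terms, here is a back-of-the-envelope calculation of the VRAM needed just to hold the weights, assuming half precision (fp16, 2 bytes per parameter) and ignoring activations and framework overhead:

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold model weights alone.

    Assumes half precision (2 bytes/param) by default; ignores
    activations, text encoders, and framework overhead.
    """
    return n_params * bytes_per_param / 1e9

# The two SD3 sizes named in the announcement.
for n in (800e6, 8e9):
    print(f"{n / 1e9:.1f}B params -> ~{weight_memory_gb(n):.1f} GB")
```

By this estimate the 800M model fits comfortably on consumer GPUs, while the 8B model wants roughly 16 GB for the weights alone, which is why “something powerful” is the operative phrase.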

Emad Mostaque, head of Stability AI, said on X (formerly Twitter) that the new model is capable not only of video input and generation but also of multimodal understanding, features that competitors have highlighted in their API-based models. These capabilities are still theoretical, but there appears to be no technical barrier to their inclusion in future releases.

Of course, it is impossible to compare these models directly, since none have actually been released; all we have to go on are competing claims and hand-picked examples. But Stable Diffusion has one decisive advantage: it exists in the zeitgeist as the go-to model for doing any kind of image generation anywhere, with few inherent limitations on method or content. (Indeed, if SD3’s safety mechanisms are broken through, it will almost certainly usher in a new era of AI-generated pornography.)

Stability AI seems to be aiming for white-label generative AI that you can’t live without, rather than boutique generative AI that you may or may not need. To that end, the company is also upgrading its tools to lower the bar to entry, but like the rest of the announcement, those improvements are left to the imagination.

Interestingly, the company put safety at the forefront of its announcement, stating:

We have taken and will continue to take reasonable steps to prevent the misuse of Stable Diffusion 3 by malicious parties. Safety begins when we start training our model and continues through testing, evaluation, and deployment. In preparation for this early preview, we have put in place a number of safeguards. By continuously collaborating with researchers, experts, and the community, we hope to continue to innovate in our models as we approach public release.

What exactly are these safeguards? Undoubtedly the preview will reveal their contours to some extent, and the public release will then either refine them further or, depending on your perspective on these things, censor them. We’ll have more details soon; in the meantime, we’ll dive into the technical side to better understand the theory and methodology behind this new generation of models.