Stability AI has announced the open release of Stable Diffusion 3 Medium. It's the first open release in the SD3 series and the company's most advanced open image generation model to date.
The company says the open-source model outperforms the best AI image generators (including its own) in photorealism. Text rendering within images has also been improved.
Comprising two billion parameters, Stable Diffusion 3 Medium has a smaller VRAM footprint, which is intended to make it suitable for running on consumer GPUs as well as enterprise-tier GPUs.
Stability AI says the model overcomes common artifacts in hands and faces to deliver more photorealistic images. It says the model can understand complex prompts involving spatial relationships, compositional elements, actions, and styles, and can achieve "unprecedented results" for generating text without artifacts or spelling errors thanks to its Diffusion Transformer architecture.
The company also suggests that the model is ideal for customisation due to its ability to absorb nuanced details from small datasets.
Stability AI stresses that it has conducted extensive internal and external testing of the model. It plans to continuously improve Stable Diffusion 3 Medium based on user feedback and to expand its features.
Access for non-commercial use is via Hugging Face. The API is available on the Stability Platform, or you can try it out by signing up for a free three-day trial on Stable Assistant and on Discord via Stable Artisan.