Update README.md

[SDXL](https://arxiv.org/abs/2307.01952) consists of a mixture-of-experts pipeline for latent diffusion:
In the first step, the base model is used to generate (noisy) latents,
which are then further processed with a refinement model (available here: https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/) specialized for the final denoising steps.
Note that the base model can be used as a standalone module.
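
For concreteness, here is a minimal sketch of this base-to-refiner handoff using the `diffusers` library. The snippet is illustrative rather than part of the card: the 0.8 handoff fraction, the 40-step schedule, and the fp16 settings are example choices, not prescribed values.

```python
from diffusers import DiffusionPipeline
import torch

# Base model: usable on its own, or stopped early so it hands
# still-noisy latents to the refiner (the expert pipeline above).
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

# Refiner model, sharing the second text encoder and VAE with the base.
refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "An astronaut riding a green horse"

# The base model runs the first 80% of the denoising steps
# and returns (noisy) latents instead of a decoded image...
latents = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images

# ...and the refiner, specialized for the final denoising steps,
# finishes them off.
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=latents,
).images[0]
```
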
Alternatively, we can use a two-stage pipeline as follows:
First, the base model is used to generate latents of the desired output size.
In the second step, we use a specialized high-resolution model and apply a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as "img2img")
to the latents generated in the first step, using the same prompt. This technique is slightly slower than the first one, as it requires more function evaluations.
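
Again purely as an illustrative sketch (reusing `base` and `refiner` exactly as constructed in the snippet above), the two-stage variant might look like this:

```python
prompt = "An astronaut riding a green horse"

# Stage 1: the base model generates latents of the desired output
# size, this time running its full denoising schedule.
latents = base(prompt=prompt, output_type="latent").images

# Stage 2: the refiner is applied to those latents with the same
# prompt, SDEdit / img2img style.
image = refiner(prompt=prompt, image=latents).images[0]
```
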
Source code is available at https://github.com/Stability-AI/generative-models.

### Model Description

- **Developed by:** Stability AI
- **Model type:** Diffusion-based text-to-image generative model
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
- **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses two fixed, pretrained text encoders ([OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip) and [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main)).
- **Resources for more information:** Check out our [GitHub Repository](https://github.com/Stability-AI/generative-models) and the [SDXL report on arXiv](https://arxiv.org/abs/2307.01952).

### Model Sources
