r/StableDiffusion Sep 11 '22

A better (?) way of doing img2img by finding the noise which reconstructs the original image

913 Upvotes

3

u/crischu Sep 11 '22

Would it be possible to get a seed from the noise?

8

u/Aqwis Sep 12 '22

Probably not: all the possible seeds together can generate only a small fraction of the possible noise matrices. If you want to share a noise matrix with someone else, though, the matrix itself can be saved and shared as a file.
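
A minimal sketch of the "save and share it as a file" idea, using only the Python standard library. The 4x64x64 shape, the float32 format, and the file name `noise.bin` are assumptions for illustration, and the seeded generator here only produces stand-in data (the noise found by inverting an image would not come from a seed). A real pipeline would more likely save a tensor with its own framework's tools.

```python
import os
import random
import struct
import tempfile

# Assumed latent shape for illustration (channels, height, width).
SHAPE = (4, 64, 64)
COUNT = SHAPE[0] * SHAPE[1] * SHAPE[2]

def make_noise(seed):
    """Draw COUNT Gaussian samples from a seeded generator (stand-in data)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(COUNT)]

def save_noise(values, path):
    """Write the values as little-endian float32, one after another."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))

def load_noise(path):
    """Read the file back into a list of floats."""
    with open(path, "rb") as f:
        data = f.read()
    return list(struct.unpack(f"<{len(data) // 4}f", data))

noise = make_noise(seed=1234)
path = os.path.join(tempfile.gettempdir(), "noise.bin")  # hypothetical file name
save_noise(noise, path)
restored = load_noise(path)
file_size = os.path.getsize(path)
```

Under these assumptions the file is 65,536 bytes. That also illustrates the counting argument: a seed of n bits can select at most 2^n different matrices, while a file of that size can hold up to 2^524288 different bit patterns, so most matrices have no seed that produces them.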

3

u/Adreitz7 Sep 12 '22

How large is the noise matrix in comparison with the generated image? If you have to transmit a 512x512x8x8x8 (RGB) matrix to generate a 512x512 image, it would be better just to transmit the final image, especially considering that, for most normal images, lossless compression can reduce the size by a factor of two or more, while the noise matrix will likely be incompressible.
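
The compressibility point can be checked roughly with a general-purpose compressor. This is only a sketch under assumptions: `zlib` is not an image codec, the ramp is a stand-in for structured data rather than a real image, and the 4x64x64 float32 layout is assumed for illustration.

```python
import random
import struct
import zlib

rng = random.Random(0)
n = 4 * 64 * 64  # assumed number of values in the noise matrix

# Gaussian noise stored as little-endian float32.
noise_bytes = struct.pack(f"<{n}f", *(rng.gauss(0.0, 1.0) for _ in range(n)))

# A stepped ramp stored the same way, as a stand-in for structured data.
ramp_bytes = struct.pack(f"<{n}f", *(float(i // 256) for i in range(n)))

def ratio(raw):
    """Compressed size divided by raw size (lower means more compressible)."""
    return len(zlib.compress(raw, 9)) / len(raw)

noise_ratio = ratio(noise_bytes)
ramp_ratio = ratio(ramp_bytes)
print(f"noise: {noise_ratio:.2f}, ramp: {ramp_ratio:.2f}")
```

The noise should shrink only slightly, since most of each float's bits are effectively random, while the structured data shrinks a lot.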

2

u/muchcharles Sep 12 '22

Isn't the noise in latent space? 64x64x3 (bytes? floats?)

1

u/Adreitz7 Sep 12 '22

But isn’t the latent space on the order of 800,000,000 parameters? That is even larger than a 512x512 image.

1

u/muchcharles Sep 12 '22

Since latent diffusion operates on a low dimensional space, it greatly reduces the memory and compute requirements compared to pixel-space diffusion models. For example, the autoencoder used in Stable Diffusion has a reduction factor of 8. This means that an image of shape (3, 512, 512) becomes (3, 64, 64) in latent space, which requires 8 × 8 = 64 times less memory.

https://huggingface.co/blog/stable_diffusion
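
The arithmetic in the quoted passage can be written out as a short sketch. The 3-channel latent shape is taken from the quote; the 4-channel shape is an assumption added for comparison (as far as I know, Stable Diffusion v1 uses a 4-channel latent). The byte sizes assume an uncompressed 8-bit-per-channel image and a float32 latent.

```python
def num_values(shape):
    """Number of elements in a tensor with the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

image_shape = (3, 512, 512)   # RGB image, from the quoted passage
latent_quoted = (3, 64, 64)   # latent shape as quoted above
latent_4ch = (4, 64, 64)      # assumption: 4-channel latent

image_values = num_values(image_shape)
for name, shape in [("quoted", latent_quoted), ("4-channel", latent_4ch)]:
    values = num_values(shape)
    print(name, values, "values,", image_values / values, "times fewer than the image")

# Byte sizes under the stated assumptions (1 byte per image value, 4 per latent value).
image_bytes = image_values * 1
latent_bytes = num_values(latent_4ch) * 4
```

With the quoted shape the latent has 12,288 values against 786,432 for the image (64 times fewer); with 4 channels it has 16,384 (48 times fewer). Either way the noise matrix is a few tens of kilobytes, far smaller than the hundreds of millions of parameters in the model itself.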