r/StableDiffusion Sep 11 '22

A better (?) way of doing img2img by finding the noise which reconstructs the original image Img2Img

893 Upvotes


17

u/Adreitz7 Sep 11 '22

This is great! I like to see these innovations that dive into the inner workings of SD. This looks like a powerful feature. In your example mosaic, is the second image meant to be the base reconstruction, and the following images modifications of it? I’m asking because the second image looks most like the first, but I noticed that it is more vivid: the saturation has increased. It’s a minor thing here, but it could cause problems if it is a general effect of your technique. Any idea why this happened?

21

u/Aqwis Sep 11 '22

Yeah, the second image is basically the base reconstruction. In general, converting an image to its latent representation and then back to an image loses a little bit of information, so the two images won't be identical, but in most cases they will be very close. In this case, however, I think the difference in contrast is caused by what happens at the very end of find_noise_for_image, namely:

return (x / x.std()) * sigmas[-1]

This rescales the noise tensor so that its standard deviation equals sigmas[-1], which basically has the effect of increasing the contrast. It shouldn't be necessary, but if I don't do this, then in many cases the resulting noise tensor has a significantly lower standard deviation than a normal noise tensor, and an image generated from it is a blurry mess. It's quite possible that the need for this is caused by some sort of bug that I haven't discovered.
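
To make the effect of that return line concrete, here is a minimal sketch that uses plain Python lists in place of tensors. It is not the code from find_noise_for_image: the helper name rescale_to_sigma, the input spread of 0.6 and the target value 14.6 are all hypothetical stand-ins chosen for illustration. It uses the sample standard deviation, which is also what x.std() computes by default in PyTorch.

```python
import random
import statistics


def rescale_to_sigma(values, target_sigma):
    """Scale values so that their sample standard deviation equals target_sigma.

    Hypothetical helper mirroring `(x / x.std()) * sigmas[-1]` on a flat list.
    """
    current = statistics.stdev(values)  # sample std (Bessel's correction)
    return [v / current * target_sigma for v in values]


rng = random.Random(0)

# Hypothetical stand-in for a recovered noise tensor whose spread is too low.
recovered = [rng.gauss(0.0, 0.6) for _ in range(10_000)]

# Hypothetical stand-in for sigmas[-1], the largest noise level of the schedule.
target_sigma = 14.6

rescaled = rescale_to_sigma(recovered, target_sigma)

print("std before:", statistics.stdev(recovered))
print("std after: ", statistics.stdev(rescaled))
```

Note that the scaling is a single multiplication applied to every value: it stretches the spread without changing the mean's sign or the relative pattern of the values. That is consistent with the reconstruction keeping its structure while coming out with more contrast.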