
How does the pix2pix image to image model work?


Asked by Rodney Hill on Dec 05, 2021



The pix2pix model works by training on pairs of images, such as semantic label maps of building facades paired with photographs of those facades, and then attempts to generate the corresponding output image from any input image you give it. The idea comes straight from the pix2pix paper, which is a good read.
Regarding the architecture,
pix2pix uses a conditional generative adversarial network (cGAN). The reason is that a model trained with only a simple L1/L2 loss for a particular image-to-image translation task tends to average over the many plausible outputs for a given input, producing blurry results that miss the nuances of the images.
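To make that concrete, here is a minimal sketch of the combined objective in PyTorch. This is an illustration, not the reference implementation; the λ = 100 weight on the L1 term is the value used in the paper, and the function name generator_loss is made up for this example.

import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # adversarial (cGAN) term, applied to raw discriminator logits
l1 = nn.L1Loss()              # pixel-wise reconstruction term

def generator_loss(disc_fake_logits, fake_image, target_image, lam=100.0):
    # cGAN term: the generator wants the discriminator to label its fakes as real.
    adversarial = bce(disc_fake_logits, torch.ones_like(disc_fake_logits))
    # L1 term: keeps the output close to the paired ground-truth image.
    reconstruction = l1(fake_image, target_image)
    return adversarial + lam * reconstruction

The L1 term alone would give blurry averages, and the adversarial term alone can drift away from the target; the weighted sum pushes toward outputs that are both sharp and faithful to the training pair.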
Also, the Pix2Pix GAN is a general approach to image-to-image translation. It is based on the conditional generative adversarial network, in which a target image is generated conditional on a given input image.
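What makes the discriminator "conditional" is that it scores the input/output pair rather than the output alone. Below is a stripped-down sketch in PyTorch, assuming 3-channel images; the paper's actual discriminator is a deeper 70×70 PatchGAN, and the class name PairDiscriminator is invented for this example.

import torch
import torch.nn as nn

class PairDiscriminator(nn.Module):
    def __init__(self, in_channels=3, out_channels=3):
        super().__init__()
        # The input image and a candidate output are stacked along the channel axis,
        # so the network judges the pair, not the candidate in isolation.
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + out_channels, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # a grid of per-patch logits
        )

    def forward(self, input_image, candidate):
        return self.net(torch.cat([input_image, candidate], dim=1))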
Similarly,
in contrast to an ordinary GAN generator, which starts from random noise, the generator in pix2pix resembles an auto-encoder: it takes in the image to be translated and compresses it into a low-dimensional "bottleneck" representation, then learns how to upsample this representation into the output image. (In the paper the generator is in fact a U-Net, an encoder-decoder with skip connections between mirrored layers.)
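The compress-then-upsample idea can be sketched as a tiny encoder-decoder in PyTorch. This deliberately omits the skip connections of the real U-Net generator, and TinyGenerator is an invented name.

import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        # Encoder: downsample the input image to a low-dimensional representation.
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        # Decoder: learn to upsample the bottleneck back into an output image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, channels, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))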
In this respect,
Pix2Pix GAN is an implementation of the cGAN in which the generation of an image is conditional on a given image. Just as GANs learn a generative model of data, conditional GANs (cGANs) learn a conditional generative model.
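Putting the pieces together, one training step might look like the following, reusing the sketches above. Random tensors stand in for a real paired dataset, and while the Adam settings match those reported in the paper, this is illustrative rather than a faithful reproduction.

import torch

G = TinyGenerator()
D = PairDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = torch.nn.BCEWithLogitsLoss()

input_image = torch.randn(1, 3, 64, 64)   # stand-in for a label map
target_image = torch.randn(1, 3, 64, 64)  # stand-in for the paired real photo

# Discriminator step: real pairs should score high, fake pairs low.
fake = G(input_image).detach()
real_logits = D(input_image, target_image)
fake_logits = D(input_image, fake)
d_loss = (bce(real_logits, torch.ones_like(real_logits))
          + bce(fake_logits, torch.zeros_like(fake_logits)))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: fool the discriminator while staying close to the target (L1).
fake = G(input_image)
g_loss = generator_loss(D(input_image, fake), fake, target_image)
opt_g.zero_grad()
g_loss.backward()
opt_g.step()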