Generating Images Based on Text Prompts
We observe that the images become more detailed as the number of inference steps
increases. With very few steps (e.g. num_inference_steps=4), the outputs are still
mostly noise, and only the rocket is discernible. As num_inference_steps increases,
the prompt becomes recognizable in the image and the image also becomes
sharper and more detailed. Hallucinations in the images (in the form of mistaken
shades/colors) also disappear.
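The sweep behind the galleries below could be organized roughly as follows. This is only a sketch under stated assumptions: the text does not name the model, so the two-stage generator here assumes DeepFloyd IF loaded through Hugging Face diffusers, and the helper names (build_jobs, run_sweep, make_two_stage_generator), the fixed seed, and the choice to pass the same step count to both stages are hypothetical. The prompts and step counts are the ones listed in this section.

```python
from itertools import product

# Prompts and step counts taken from the galleries in this section.
PROMPTS = [
    "an oil painting of a snowy mountain village",
    "a man wearing a hat",
    "a rocket ship",
]
STEP_COUNTS = [4, 6, 10, 15, 20, 40, 100]


def build_jobs(step_counts, prompts):
    """Return (num_inference_steps, prompt) pairs, grouped by step count."""
    return list(product(step_counts, prompts))


def run_sweep(jobs, generate, seed=0):
    """Call generate(prompt, steps, seed) for each job and collect the results."""
    return {(steps, prompt): generate(prompt, steps, seed) for steps, prompt in jobs}


def make_two_stage_generator(device="cuda"):
    """Hypothetical helper. ASSUMPTION: DeepFloyd IF via diffusers.

    Imports are local so the sweep helpers above work without a GPU.
    """
    import torch
    from diffusers import DiffusionPipeline

    stage_1 = DiffusionPipeline.from_pretrained(
        "DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16
    ).to(device)
    stage_2 = DiffusionPipeline.from_pretrained(
        "DeepFloyd/IF-II-L-v1.0",
        text_encoder=None,  # stage 2 reuses the embeddings from stage 1
        variant="fp16",
        torch_dtype=torch.float16,
    ).to(device)

    def generate(prompt, steps, seed):
        # Re-seed per call so only num_inference_steps differs between runs.
        generator = torch.manual_seed(seed)
        prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)
        # Stage 1: low-resolution image from the text embedding.
        low_res = stage_1(
            prompt_embeds=prompt_embeds,
            negative_prompt_embeds=negative_embeds,
            num_inference_steps=steps,
            generator=generator,
            output_type="pt",
        ).images
        # Stage 2: upsample the stage 1 output, conditioned on the same prompt.
        high_res = stage_2(
            image=low_res,
            prompt_embeds=prompt_embeds,
            negative_prompt_embeds=negative_embeds,
            num_inference_steps=steps,
            generator=generator,
            output_type="pt",
        ).images
        return low_res, high_res

    return generate


if __name__ == "__main__":
    jobs = build_jobs(STEP_COUNTS, PROMPTS)
    print(f"{len(jobs)} jobs, first: {jobs[0]}")
    # With the models and a GPU available, one might run:
    # results = run_sweep(jobs, make_two_stage_generator())
```

Holding the seed fixed across step counts is a design choice in this sketch: it keeps the starting noise the same, so differences between galleries can be attributed to num_inference_steps alone.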
num_inference_steps=4
an oil painting of a snowy mountain village
After Stage 1
After Stage 2
a man wearing a hat
After Stage 1
After Stage 2
a rocket ship
After Stage 1
After Stage 2
num_inference_steps=6
an oil painting of a snowy mountain village
After Stage 1
After Stage 2
a man wearing a hat
After Stage 1
After Stage 2
a rocket ship
After Stage 1
After Stage 2
num_inference_steps=10
an oil painting of a snowy mountain village
After Stage 1
After Stage 2
a man wearing a hat
After Stage 1
After Stage 2
a rocket ship
After Stage 1
After Stage 2
num_inference_steps=15
an oil painting of a snowy mountain village
After Stage 1
After Stage 2
a man wearing a hat
After Stage 1
After Stage 2
a rocket ship
After Stage 1
After Stage 2
num_inference_steps=20
an oil painting of a snowy mountain village
After Stage 1
After Stage 2
a man wearing a hat
After Stage 1
After Stage 2
a rocket ship
After Stage 1
After Stage 2
num_inference_steps=40
an oil painting of a snowy mountain village
After Stage 1
After Stage 2
a man wearing a hat
After Stage 1
After Stage 2
a rocket ship
After Stage 1
After Stage 2
num_inference_steps=100
an oil painting of a snowy mountain village
After Stage 1
After Stage 2
a man wearing a hat
After Stage 1
After Stage 2
a rocket ship
After Stage 1
After Stage 2