Z-Image Prompt Assist

Benchmark comparison chart of all aesthetics offered via "PASS" and how Z-IMAGE failed most of them.

This benchmark runs the Z-Image Turbo model against the various Genre options found in my latest tool, Prompt Assistant v1.1, affectionately nicknamed PASS.

To keep the test fair, I am using exactly the same prompt, seed, sampler, scheduler and resolution of 824 x 1328 (1536 x 2048 would have been ideal). The only change is the initial aesthetic value, which is noted in each image caption.

I hope this comparison is useful for understanding what genre means in generative art. You are invited to play around with the PASS tool and use it to guide your future generations, with JSON and long-string text outputs ready to be copied and pasted into your workflow.

Prompt to be used

				
Photorealistic, Bright / Airy aesthetic. Subject: A 40yr female Mixed Latino-European minimalist (wearing a matte grey suit without shirt underneath, long trousers and high-heels), pose: leaning on her car, intense gaze expression. Featuring: a Ferrari (yellow with pristine body conditions). Scene set in a skyscrapper rooftop in Dubai with soft diffused light. Technical: high angle, shot on Hasselblad medium format, 50mm lens.
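For readers who want to reproduce the setup, the idea of "one prompt, one variable" can be sketched in a few lines of Python. This is only an illustration, not how PASS works internally: the seed, sampler and scheduler values are placeholders (the exact ones are not the point here), and the list of aesthetics is a small sample whose labels may not match the PASS options exactly. The prompt text is kept verbatim, including its original spelling.

```python
from dataclasses import dataclass

# The benchmark prompt, with the aesthetic value turned into a placeholder.
PROMPT_TEMPLATE = (
    "Photorealistic, {aesthetic} aesthetic. Subject: A 40yr female Mixed "
    "Latino-European minimalist (wearing a matte grey suit without shirt "
    "underneath, long trousers and high-heels), pose: leaning on her car, "
    "intense gaze expression. Featuring: a Ferrari (yellow with pristine body "
    "conditions). Scene set in a skyscrapper rooftop in Dubai with soft "
    "diffused light. Technical: high angle, shot on Hasselblad medium format, "
    "50mm lens."
)


@dataclass(frozen=True)
class RunSettings:
    """Everything that must stay identical across the whole benchmark."""
    seed: int
    sampler: str
    scheduler: str
    width: int = 824    # resolution used in this benchmark
    height: int = 1328


def build_runs(aesthetics, settings):
    """Return one run per aesthetic; only the aesthetic value in the prompt differs."""
    return [
        {
            "caption": aesthetic,
            "prompt": PROMPT_TEMPLATE.format(aesthetic=aesthetic),
            "settings": settings,
        }
        for aesthetic in aesthetics
    ]


# Placeholder values (assumptions): use whatever your own workflow uses.
SETTINGS = RunSettings(seed=12345, sampler="euler", scheduler="simple")
# Sample of aesthetics (assumed labels).
AESTHETICS = ["Bright / Airy", "Surreal", "Glitch", "Dystopian"]
RUNS = build_runs(AESTHETICS, SETTINGS)

for run in RUNS:
    print(run["caption"], "->", run["prompt"][:60])
```

Because the settings object is frozen and shared by every run, any difference between two images can only come from the aesthetic value.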

WHAT I LEARNED FROM THIS BENCHMARK

Z-Image Turbo excels at beautiful photorealistic images of everyday subjects, as long as it stays within a cinematic / editorial portraiture realm.

When experimenting with more artistic and edgy prompts that push the aesthetic far from everyday reality, Z-Image Turbo starts to fail and ignores those instructions, as if it were trained on only one particular aesthetic.

Z-Image cannot follow specific Aesthetic/Genre prompts.

Z-IMAGE BEST SAMPLER + SCHEDULER RESULTS

Sampler + scheduler comparison - all native combinations

Aesthetics alone is not enough

The benchmark showed that changing only the aesthetic value causes no identifiable change in the output. Z-Image cannot interpret the aesthetic value on its own; instead, you have to completely rewrite one portion of the prompt so that the aesthetic is described in more words.
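The difference between the two approaches can be sketched as follows. This is a rough illustration under my own assumptions: the rewritten portion is taken to be the opening style sentence, the descriptive wording is made up for this example and is not output from PASS, and the rest of the prompt is shortened for brevity.

```python
# Shortened remainder of the benchmark prompt (abbreviated for this sketch).
REST_OF_PROMPT = "Featuring: a Ferrari (yellow with pristine body conditions)."

# Hypothetical wording, written for illustration only (not produced by PASS).
AESTHETIC_DESCRIPTIONS = {
    "Glitch": (
        "glitch art look with RGB channel shifts, scan lines and "
        "blocky compression artifacts"
    ),
    "Dystopian": (
        "dystopian look with smog-filled air, decaying concrete and "
        "cold, desaturated colors"
    ),
}


def keyword_only(aesthetic: str) -> str:
    """Approach 1: swap just the aesthetic word and keep everything else."""
    return f"Photorealistic, {aesthetic} aesthetic. {REST_OF_PROMPT}"


def rewritten_portion(aesthetic: str) -> str:
    """Approach 2: replace the whole style sentence with a longer description."""
    description = AESTHETIC_DESCRIPTIONS[aesthetic]
    return f"{aesthetic} aesthetic: {description}. {REST_OF_PROMPT}"


print(keyword_only("Glitch"))
print(rewritten_portion("Glitch"))
```

The second function spells out what the style looks like in plain words instead of relying on the model to know what a single label means.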

Take a closer look at the next examples, which compare changing only the aesthetic word with rewriting a whole portion of the prompt.

Important note: Styles such as surreal, glitch, dystopian and similar ones that push a very specific look onto the end result are simply ignored by Z-Image, which renders this model useless if your intent is anything other than Editorial / Street / Cinematic or Photorealistic.

Final verdict

By no means am I claiming that Z-Image Turbo has severe flaws; rather, I am pointing out that this model seems to be built for a specific use: realism for editorial, cinematic and street photography featuring humans.

Rendering objects can be difficult for Z-Image, and logos and fine details on objects are not handled properly. For example, the vehicle is identifiable as a Ferrari, but the logo, emblem and the details on the wheels, handles and headlights are completely mishandled.

Some of these flawed details can be repaired with upscaling techniques or inpainting, but that requires an extra step in the process.