On Sunday, a Reddit user named “Ugleh” posted an AI-generated image of a spiral-shaped medieval village that quickly gained attention on social media for its striking geometric qualities. Follow-up posts garnered even more praise, including a tweet with more than 145,000 likes. Ugleh created the images using Stable Diffusion and a guidance technique called ControlNet.
Reactions to the artwork online have ranged from surprise and amazement to appreciation for something novel in generative AI art. “Never seen pictures like this. Something new in the art world,” wrote one X user. “Tbh, I’ve seen a lot of AI art, been in this space for a long time, and this is one of the most amazing pieces of art I’ve seen. You did great,” AI artist Kaliyug wrote on X.
Perhaps most notably, Y Combinator co-founder and frequent social media tech commentator Paul Graham wrote, “This was the point where AI-generated art passed the Turing test for me.” While Graham refers to the Turing test (which tests whether a machine’s behavior can be distinguished from that of a human) metaphorically rather than literally, he was clearly impressed.
Not everyone was impressed, of course, with some X users trying to pick apart the compositional elements of the AI-generated spiral village. “It’s great, but there are a lot of decisions a human can’t make,” wrote a graphic designer named Trent. “A lot of the shadows aren’t right, and there’s no point in putting chimneys above the windows. Even zooming in there are tell-tale noise patterns from the AI art.”
In June, we covered a technique that used the AI image synthesis model Stable Diffusion and ControlNet to create QR codes that double as rich, anime-inspired artwork. Ugleh took the same neural network that was optimized to generate QR codes (which are themselves geometric shapes) and fed it simple images of spirals and checkerboard patterns instead.
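For readers curious what such a guidance image looks like in practice, here is a minimal Python sketch that builds a checkerboard and a spiral as grayscale grids. It uses only the standard library, and the function names (`checkerboard`, `spiral`, `save_pgm`) and default sizes are assumptions made for illustration. These are not Ugleh's actual template files, just one plausible way to produce similar patterns.

```python
import math

def checkerboard(size=512, squares=8):
    """Return a size x size grid of 0/255 values laid out as a checkerboard."""
    cell = max(1, size // squares)  # width of one square in pixels
    return [[255 if ((x // cell) + (y // cell)) % 2 == 0 else 0
             for x in range(size)]
            for y in range(size)]

def spiral(size=512, turns=4):
    """Return a size x size grid of 0/255 values forming a two-tone spiral."""
    center = (size - 1) / 2
    grid = []
    for y in range(size):
        row = []
        for x in range(size):
            dx, dy = x - center, y - center
            # Radius normalized so it is about 1.0 at the middle of each edge.
            radius = math.hypot(dx, dy) / center
            # Angle expressed as a fraction of a full turn (-0.5 to 0.5).
            angle = math.atan2(dy, dx) / (2 * math.pi)
            # The phase grows with radius and shifts with angle, so bands of
            # equal phase wind outward; its fractional part picks the color.
            phase = (radius * turns - angle) % 1.0
            row.append(255 if phase < 0.5 else 0)
        grid.append(row)
    return grid

def save_pgm(grid, path):
    """Write a grid of 0-255 values to a binary PGM (P5) image file."""
    height, width = len(grid), len(grid[0])
    with open(path, "wb") as f:
        f.write(f"P5 {width} {height} 255\n".encode("ascii"))
        for row in grid:
            f.write(bytes(row))
```

Calling `save_pgm(spiral(), "spiral.pgm")` would write a file that most image tools can open and convert to PNG before it is handed to a ControlNet workflow as the control image.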
Guided by the prompt “A medieval village with busy streets and a castle in the distance (Excellence Sample:1.4), (High Quality), (Detailed),” ControlNet renders scenes where the artistic elements of the images match the perceptual shapes of spirals and checkerboards. In one image, clouds arc overhead and people stand in a gentle curve to match the spiral guidance. In another, a checkerboard-shaped scene of clouds, hedges, building faces, and wagon carts makes
The Magic of ControlNet
So how does it work? We’ve covered Stable Diffusion many times before. It is a neural network model trained on millions of images scraped from the internet. But the key here is ControlNet, which first appeared in a February 2023 paper by Lvmin Zhang, Anyi Rao, and Maneesh Agrawala titled “Adding Conditional Control to Text-to-Image Diffusion Models” and quickly became popular in the Stable Diffusion community.
Typically, a Stable Diffusion image is created using a text prompt (called text2image) or an image prompt (img2img). ControlNet introduces additional guidance that can take the form of information extracted from the source image, including pose detection, depth mapping, normal mapping, edge detection, and more. Using ControlNet, someone creating AI artwork can more closely replicate the shape or pose of a subject in an image.
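To make the idea of "information extracted from the source image" concrete, the following Python sketch computes a crude edge map from a grayscale image stored as a list of rows. It is a toy stand-in for real edge detectors such as Canny, not the preprocessor any ControlNet tool actually ships. The function name `edge_map` and the default threshold are assumptions chosen for illustration.

```python
def edge_map(gray, threshold=32):
    """Return a 0/255 edge map for a 2D list of grayscale values (0-255).

    A pixel counts as an edge when its brightness differs from the pixel
    to its right or the pixel below it by more than `threshold`.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            here = gray[y][x]
            # Clamp at the borders so the last row and column compare
            # against themselves and never register a false edge.
            right = gray[y][min(x + 1, width - 1)]
            below = gray[min(y + 1, height - 1)][x]
            if abs(right - here) > threshold or abs(below - here) > threshold:
                edges[y][x] = 255
    return edges
```

The resulting map keeps only the outlines of the source image. That outline is the kind of structural hint a ControlNet model takes alongside the text prompt, so the generated picture follows the same shapes while the prompt decides what fills them.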
With ControlNet and similar prompts, Ugleh’s work is easy to replicate, and others have done so to interesting effect, including checkerboard anime characters, animations, medieval village “goats” (surprisingly safe for work), and a medieval village version of “Girl with a Pearl Earring.”
Despite the massive attention and many offers to turn the artwork into an NFT, Ugleh has chosen to keep a low profile for now. On X, he said, “I appreciate all the positive feedback about AI art, I’m not looking to make money from my latest builds, and I won’t be doing any official interviews. I’m just a normal tech-savvy AI nerd who experimented with new ControlNet techniques.”
If you want to experiment with ControlNet, this site has a good tutorial. Also, Ugleh posted a step-by-step workflow with spiral and checkerboard template files on Imgur.
Although the artwork is remarkable, current US copyright policy indicates that the images do not meet the standards to receive copyright protection, so they may be in the public domain. While AI-generated artwork remains controversial for many on ethical and legal grounds, creative enthusiasts are using these new tools to push the boundaries of what is possible for the unskilled or untrained practitioner. It is still uncertain whether or how the law will recognize the essential human spark of motivation that makes such acts possible.