Imagine you have only one photograph of a kitchen, and the computer is able to create a 3D video as if you were walking through it with the camera rolling.
That is what GAUDI aims to do. It is a new system based on artificial intelligence, capable of generating images of three-dimensional spaces from a still image and a text prompt.
The project is still in the early stages and should evolve in the coming years.
Its name honors the Catalan architect Antoni Gaudí, responsible for iconic works in the city of Barcelona such as the Sagrada Família and Park Güell.
The researchers describe GAUDI as a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera.
In other words, starting from a single photo, it is possible to create moving images of a space, as if a video had been shot in that environment.

Given the prompt “walk to the kitchen” and a still image as a reference, GAUDI can generate views from different positions to artificially build a visual path of movement.
The research behind this project explains that the model uses a scalable two-stage approach. First, the system learns a representation of the environment that is suitable for different camera angles. The distribution of these representations is then modeled in a navigable space.
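For readers who like to see an idea in code, here is a minimal toy sketch of that two-stage pattern. It is not GAUDI's implementation: the "scene" is just two numbers, the "renderer" is a simple trigonometric formula, and the Gaussian used in the second stage is a stand-in chosen for this illustration. All function names (`render`, `fit_latent`, `fit_prior`, `sample_scene`) are hypothetical.

```python
import math
import random
import statistics


def render(latent, angle):
    """Toy 'renderer': what a scene looks like from a given camera angle."""
    a, b = latent
    return a * math.cos(angle) + b * math.sin(angle)


def fit_latent(views, steps=500, lr=0.1):
    """Stage 1: recover one scene's latent from (angle, observation) pairs
    by gradient descent on the mean squared rendering error."""
    a, b = 0.0, 0.0
    n = len(views)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for angle, obs in views:
            err = render((a, b), angle) - obs
            grad_a += 2 * err * math.cos(angle) / n
            grad_b += 2 * err * math.sin(angle) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


def fit_prior(latents):
    """Stage 2: model the distribution of the learned latents.
    Assumption for this sketch: an independent Gaussian per dimension."""
    return [(statistics.mean(dim), statistics.stdev(dim)) for dim in zip(*latents)]


def sample_scene(prior, rng):
    """Draw a brand-new scene latent from the fitted distribution."""
    return tuple(rng.gauss(mu, sigma) for mu, sigma in prior)


rng = random.Random(0)
angles = [2 * math.pi * k / 8 for k in range(8)]  # eight camera angles

# Made-up training scenes, each observed from every camera angle.
true_latents = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(20)]
dataset = [[(t, render(z, t)) for t in angles] for z in true_latents]

latents = [fit_latent(views) for views in dataset]  # stage 1
prior = fit_prior(latents)                          # stage 2

# Generate a new scene and "walk" the camera around it.
new_scene = sample_scene(prior, rng)
path = [render(new_scene, t) for t in angles]
```

The point of the split is that the first stage only has to explain the views it was given, while the second stage only has to learn which representations are plausible. Once both are in place, a new scene can be sampled and viewed from any camera angle.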
The project is still in its early stages, so the images are still low resolution. But it is already possible to get a sense of the technology and its capabilities.
Of course, this only reinforces the rumors that Apple is seriously working on virtual reality devices.
And judging by the way things are going, we can expect some really impressive things.
It’s a pretty technical subject, but if you’re interested in finding out more about the project, you can follow its progress on Apple’s official GitHub page.