Google’s Genie 2 “world model” reveal leaves more questions than answers



As podcaster Ryan Zhao put it on Bluesky, “The design process has gone wrong when what you need to prototype is ‘what if there was a space.’”

Gotta go fast

When Google revealed the first version of Genie earlier this year, it also published a detailed research paper outlining the specific steps taken behind the scenes to train the model and how that model generated interactive videos. No such research paper has been published detailing Genie 2’s process, leaving us guessing at some important details.

One of the most important of these details is model speed. The first Genie model generated its world at roughly one frame per second, a rate that was orders of magnitude slower than would be tolerably playable in real time. For Genie 2, Google only says that “the samples in this blog post are generated by an undistilled base model, to show what is possible. We can play a distilled version in real-time with a reduction in quality of the outputs.”

Reading between the lines, it sounds like the full version of Genie 2 operates at something well below the real-time interactions implied by those flashy GIFs. It’s unclear how much “reduction in quality” is necessary to get a distilled version of the model running with real-time controls, but given the lack of examples presented by Google, we have to assume that reduction is significant.

Oasis’ AI-generated Minecraft clone shows great potential, but still has a lot of rough edges, so to speak. Credit: Oasis

Real-time, interactive AI video generation isn’t exactly a pipe dream. Earlier this year, AI model maker Decart and hardware maker Etched published the Oasis model, showing off a human-controllable, AI-generated video clone of Minecraft that runs at a full 20 frames per second. However, that 500-million-parameter model was trained on millions of hours of footage of a single, relatively simple game, and focused exclusively on the limited set of actions and environmental designs inherent to that game.

When Oasis launched, its creators fully admitted the model “struggles with domain generalization,” showing how “realistic” starting scenes had to be reduced to simplistic Minecraft blocks to achieve good results. And even with those limitations, it’s not hard to find footage of Oasis degenerating into horrifying nightmare fuel after just a few minutes of play.

Global Gazette | © 2024 | News