Google has unveiled Veo 3, its latest AI-powered video generation model.
CNBC reported on Tuesday that Google introduced Veo 3 at its annual developer conference, I/O 2025, in Mountain View, California. The model’s standout feature is its ability to incorporate sound into the videos it generates, which sets it apart from previous versions. This capability will be accessible to enterprise users through the Gemini app, Flow, and Vertex AI platforms.
Eli Collins, Vice President of Product at Google DeepMind, said that Veo 3 goes beyond handling text and image inputs, accurately simulating physical laws and achieving precise lip-syncing. He noted that the model can reproduce conversations between characters and even animal vocalizations. Collins also emphasized the model’s comprehension, adding that when given a story prompt, it produces vivid clips that bring the narrative to life.
Alongside Veo 3, Google announced Imagen 4, an enhanced AI image generator. This upgraded tool produces higher-quality images based on user input, capturing intricate details such as complex textures, water droplets, and animal fur. Imagen 4 will be integrated into various Google products, including the Gemini app, Whisk, Vertex AI, and Workspace tools like Slides and Business.
Google has also updated its existing AI video generation model, Veo 2. The update allows users to add or remove objects within videos using text commands alone. In addition, Google has expanded access to Lyria 2, its AI music generation model, making it available to YouTube Shorts creators and to businesses that use Vertex AI.