👩🏽‍💻 OpenAI's Sora, Large World Models, Google AI's Genie, and MoodCaptureAI

Plus: prompt of the week, image upscaling, and this week’s best tools and AI news.

Welcome back to AI Logs! Last week Sora (OpenAI’s upcoming text-to-video model, which is taking the world by storm ahead of launch) only got a mention, but this week I want to break it down further, as there’s a lot to unpack beyond its incredible studio-quality videos. Additionally, Stable Diffusion 3 was just revealed, and its outputs are incredible, fueling speculation that, with more GPUs, StableVideo could be here soon and rival the quality Sora is already delivering in the countless demos surfacing everywhere.

But what makes Sora so remarkable is the model it is built on: a Large World Model (LWM). This is different from the common LLMs (large language models) that ChatGPT, Gemini, Mistral, Perplexity, Pi, etc. are built on, and it is potentially a huge step for AI and humanity. Last week in “Tools of the Week” I highlighted a paper by Runway introducing this concept of Large World Models, which are trained on physics, reasoning, nature, and much more. One application is building simulations to train robots; another is sensor reading. Even self-driving vehicles will benefit greatly from AI “understanding” the world around it and how it works.

In addition to LWMs, the handheld Rabbit R1, the darling of CES, is built on another new kind of model, the Large Action Model, which is intended to understand user intent effectively. Multimodal models are models that can process multiple types of input, and when we think about combining Large World Models with both Large Action Models and Large Language Models, the implications are staggering.

The second thing is Groq, a new AI company (not to be confused with Twitter’s Grok, or with the AI toy from Elon’s ex that hopes to read kids’ minds, also called Grok), and particularly its new technology, LPUs: a new type of chip and an alternative to GPUs that is incredibly quick, engineered and fine-tuned for AI, and a potential industry disruptor.

NEWS

MUST READ

Tim Rocktäschel from Google DeepMind’s Open-Endedness Team announced the development of another AI system: Genie.

Genie is the first-ever generative interactive AI application to be trained exclusively on 200,000 hours of internet videos. According to the announcement, the model can generate an endless variety of action-controllable 2D worlds from image prompts. This marks a significant leap in the world of AI.

PROMPT OF THE WEEK

This week, I’ll do things a little differently and feature a tool that can help you prompt better (it also appears in “Tools of the Week”): Say What You See by Google, part of the Google Labs suite and available via the Google website or mobile app.

It shows you AI-generated images (even though Google is not currently generating images after the debacle over “too diverse,” historically inaccurate depictions of people; see “AI Picture of the Week” for more) and has you write prompts to try to recreate them. It scores you on a number of criteria, and it is gamified, so the images get harder and harder to recreate.

I consider myself a pretty good prompt engineer, but it can be challenging to get images that are exactly what I’m going for, and this tool is already helping me level up significantly.

So, while this week’s prompt of the week isn’t a prompt itself, it’s intended to help you create superior prompts all on your own. Enjoy!

AI PICTURE OF THE WEEK

Ariel Zilber on Google Gemini, “generate a picture of one of the founding fathers of the USA.”

TUTORIAL

I’ll keep it short and sweet this week: how to dig into Google’s suite of AI tools.

The easiest way is via the Google mobile app. At the top left corner is a beaker icon; tapping it opens Google Labs (featured in “Tools of the Week”). At the top middle of the app is a Google “G Search” with a red diamond next to it. Tapping that red diamond switches over to Gemini, and now you’re in Google’s competitor to ChatGPT. Pro tip: sign up for two free months of ‘Google One’ and get Gemini Advanced, their best version and comparable to GPT-4, for free as well.

To access these from web/desktop, go to https://labs.google and https://gemini.google.com.

Written by

Cory Warfield

LinkedIn Top Voice/Influencer in AI
