OpenAI's Sora, Large World Models, Google AI's Genie, and MoodCapture AI
Plus: prompt of the week, image upscaling, and this week's best tools and AI news.
Welcome back to AI Logs! Last week Sora (OpenAI's upcoming text-to-video model, which is taking the world by storm before launch) only got a mention, but this week I want to break it down further, as there's a lot to unpack beyond its incredible studio-quality videos. Additionally, Stable Diffusion 3 was just revealed, and its outputs are impressive enough to fuel speculation that, with more GPUs, StableVideo could be here soon and rival the quality Sora is already delivering in the countless demos surfacing everywhere.
But what makes Sora so remarkable is the model it is built on, namely a Large World Model (LWM). This is different from the common LLMs (large language models) that ChatGPT, Gemini, Mistral, Perplexity, Pi, etc. are built on, and it is a potentially huge step for AI and humanity. Last week in "Tools of the Week" I highlighted a paper by Runway introducing this concept of Large World Models, which are trained on physics, reasoning, nature, and much more. One application is simulations for training robots, another is sensor reading; even self-driving vehicles will benefit greatly from AI "understanding" the world around it and how it works.
In addition to LWMs, the handheld Rabbit R1, the darling of CES, is built on another new kind of model, the Large Action Model, which is intended to understand user intent. Multi-modal models are models that can process multiple types of input, and when we think about combining Large World Models with both Large Action Models and Large Language Models, the implications are staggering.
The second thing I want to highlight is Groq, a new AI company (not to be confused with Twitter's Grok or Elon's ex's AI toy that hopes to read kids' minds, also called Grok), and particularly its new technology, LPUs: a new type of chip, an alternative to GPUs, that is incredibly quick, engineered and fine-tuned for AI, and could be an industry disruptor.
NEWS
Move aside Optimus, Figure AI's Figure 01 is here
Check out Figure AI's Figure 01 humanoid worker robot doing its thing around a warehouse.
Mistral AI: Microsoft invests $16 million in OpenAI's French doppelganger
The Microsoft-Mistral agreement draws the scrutiny of the European Commission.
Study: AI-powered MoodCapture scans facial expressions to detect depression
But you'll have to wait 5 years for this app.
MUST READ
Tim Rocktäschel from Google DeepMind's Open-Endedness Team announced the development of another artificial intelligence system: Genie.
Genie is the first-ever generative interactive AI application to be trained exclusively on 200,000 hours of internet videos. According to the announcement, the model can generate an endless variety of action-controllable 2D worlds from image prompts. This marks a significant leap in the world of AI.
OTHER IMPORTANT UPDATES
Using AI, political parties are bringing back their dead leaders
Over 60 nations are heading to the polls this year.
Samsung unveils "world's fastest" data processing AI chip to date
Samsung doubles down in HBM race with largest memory.
PROMPT OF THE WEEK
This week, I'll do things a little differently and feature a tool that can help you prompt better: Say What You See by Google, part of the Google Labs suite and available via Google's website or mobile app. I'll also feature it in "Tools of the Week".
It shows you AI-generated images it created (despite Google not currently creating images due to the recent debacle over "too diverse" and historically inaccurate depictions of humans; see "AI Picture of the Week" for more) and has you write prompts to try to recreate them. It scores you on a number of things and is gamified, so the challenges get harder and harder to solve.
I consider myself to be a pretty good prompt engineer, but it can be challenging to get images that are exactly what I'm going for, and this tool is already helping me level up significantly.
So, while this week's prompt of the week isn't a prompt itself, it's intended to help you create superior prompts all on your own. Enjoy!
AI PICTURE OF THE WEEK
Ariel Zilber on Google Gemini, "generate a picture of one of the founding fathers of the USA."
TUTORIAL
I'll keep it short and sweet this week: how to dig into Google's suite of AI.
The easiest way is via their mobile app. At the top left corner is a beaker icon; click it to open Google Labs (featured above in "Tools of the Week"). Also, at the top middle of the app is a Google "G Search" with a red diamond next to it. Click that red diamond to switch over to Gemini, and now you're in Google's competitor to ChatGPT. Pro tip: sign up for two free months of "Google One" and get Gemini Advanced, their best version and comparable to GPT-4, for free as well.
To access these from web/desktop, go to https://labs.google and https://gemini.google.com.