👁️ Meta's V-JEPA predictive vision model, AI race, Open AI's Sora and Eleven Labs

Plus: Prompt of the week, image upscaling, plus this week’s best tools and AI news.

February 21, 2024

A lot has happened in the world of AI since last week, so let’s jump right into it! Meta raised several eyebrows: it revealed its V-JEPA predictive vision model, teased a wristband that is said to “sense intentions” and let the wearer access AI with their mind (like a less invasive Neuralink), and Zuck, after playing with the Apple Vision Pro, said he believes the Quest is the superior product, regardless of price!

It seems as though Meta and even Apple are finally ready to start unveiling more of their AI programs and products, as the “race” between OpenAI/Microsoft and Google keeps heating up. To quote Nelly, “it’s getting hot in here”. We are also seeing serious advancements in autonomous AI agents, which I suspect OpenAI and Google (and everyone else) are either working on or eyeing acquisitions for as we speak. The Google/OpenAI race was evident this past week: Google announced that it has trained Gemini 1.5, and right afterward OpenAI revealed Sora, its new text-to-video and image-to-video generation tool, which, at least in the demo videos, looks significantly better than every other generative AI video product on the market. Oh, and OpenAI also announced that ChatGPT is getting “memory” in the near future, meaning it will remember things for users across multiple threads and sessions.

This past week we also saw Eleven Labs launch a feature that lets users monetize their voices (train Eleven Labs to speak in your voice, let others use it to say whatever they want, and get paid for it), and many other AI tools seem hyper-focused on letting users monetize their generations and work as well.

NEWS

🧤 MIT unveils adaptive smart glove that makes touch the teacher
MIT researchers unveil a revolutionary smart glove integrating tactile feedback for enhanced learning, robotics, and virtual reality interactions.
🧠 Elon Musk: Neuralink's first patient can control mouse via thought
Neuralink’s brain chip implant, Telepathy, has the potential to let paralyzed patients control technological devices with their thoughts.
💊 Prescription patrol: New AI spots dangerous drug combos before they harm you
The model identifies that doxycycline, an antibiotic, could interact with warfarin, a blood thinner.

MUST READ

Gobi desert renewables could be China's AI ace up the sleeve

According to The South China Morning Post (SCMP), China's renewable installations in the Gobi and western deserts, mainly solar and wind, now amount to half the power-generating capacity of the US. Northwestern China's installed capacity nears 500 GW, reaching 600 GW when the Gobi Desert is included.

Despite intermittency, over half of the region's energy facilities rely on renewables, boasting 95 percent efficiency. Northwest China, spanning 1.16 million square miles and including Xinjiang, historically underdeveloped due to its harsh terrain, now exemplifies China's renewable energy success.

OTHER IMPORTANT UPDATES

🤖 Meet ADAM: An advanced personal robotic partner for elderly support
The robot features a modular design and adaptability to suit indoor settings perfectly.
👫 AI can tell if you're a man or woman with 90% accuracy
A Stanford study may have just cracked the mind code.
🧠 Scientists harness AI to predict dementia 15 years before symptoms appear
The study used AI and blood samples from 50,000 people to predict dementia 15 years in advance, offering hope for early intervention.

PROMPT OF THE WEEK

I try to share prompts that are mostly platform/LLM agnostic. In the past I have shared “Professor Synapse” through several lenses here, a “super prompt” that can effectively turn ChatGPT, Gemini, Perplexity, etc. into “agents”. Today, though, I want to share a prompt hack (not a specific prompt) that is, at least currently, unique to Perplexity. It is new, powerful, and consistent with the theme of agents that I brought up in the intro and that will feature in the ‘Tools of the Week’ section. It is Perplexity Collections (their foray into agents). To create a Collection:

Library >> Collections plus sign (+)

Title >> Enter helpful text

Icon >> Add (resonant with prompt)

Description >> Enter helpful text:

* Prompt purpose reminder

* Prompt-prompt copy/paste solution

* How-to instructions

AI Prompt >> Enter your desired prompt

Privacy >> As desired

Create >> Select

AI PICTURE OF THE WEEK

What if emojis could be enhanced by the power of AI? This is the challenge Dogan Ural took on with eight animal emojis yesterday, with impressive results. Earlier in the week, he also used Magnific.ai to enhance face emojis, but some of them were frankly disturbing. Animals are far cuter.

TUTORIAL

Here is a tutorial idea Google Gemini came up with for real-time immersive story generation:

Idea: Real-Time Immersive Story Generation

Imagine a world where you can be a co-creator of a story that unfolds in a virtual environment right before your eyes.

How it works

The Seed: You provide a short prompt to Gemini. Example: "A lone astronaut stranded on Mars struggles to survive."

The Web of Imagination: Gemini generates a captivating narrative, including:

* Detailed plot development with unexpected twists

* Vivid character descriptions

* Dialogue that feels emotionally authentic

Visual Translation: Gemini collaborates with a visual AI engine (like DALL-E 2 or Midjourney) to translate textual descriptions into real-time generated 3D scenes.

The Immersive Experience: You're immersed in the story using VR technology. Scenes change around you based on Gemini's evolving narrative.

Dynamic Co-creation: You influence the direction of the story with decisions or new prompts for Gemini. The narrative and the visuals adjust accordingly.

Tutorial Outline:

1. Introduction

* The Power of Gemini: Explain Gemini's capability in advanced language modeling, multi-modal understanding, and generative reasoning.

* Visual AI: Concepts of text-to-image generation and 3D rendering.

2. Setting the Stage

* Tools: VR headset, Gemini interface, visual AI engine access

* Prompting Gemini: Tips for open-endedness and fostering creativity.

3. The Collaborative Loop

* Step 1: The Initial Prompt (Demonstration with the Mars example)

* Step 2: Visual AI Translation (Show the results in VR)

* Step 3: Gemini Extends the Narrative (Explain how new text is generated)

* Step 4: Your Influence (Provide examples of how a user can give input)
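
To make the collaborative loop a little more concrete, here is a minimal Python sketch of the four steps. It is only an illustration under stated assumptions: `generate_narrative` and `render_scene` are hypothetical stand-ins that return placeholder strings, not real Gemini or visual AI engine calls, and `StoryState` and `run_story` are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StoryState:
    """Running state of the co-created story."""
    seed: str
    passages: List[str] = field(default_factory=list)
    scenes: List[str] = field(default_factory=list)


def generate_narrative(seed: str, history: List[str], user_input: Optional[str]) -> str:
    """Hypothetical stand-in for a language model call (e.g. Gemini)."""
    turn = len(history) + 1
    if user_input is None:
        # Step 1: the first passage comes straight from the seed prompt.
        return f"[passage {turn}] Opening: {seed}"
    # Steps 3 and 4: later passages react to what the user decided.
    return f"[passage {turn}] The story continues after the user chose: {user_input}"


def render_scene(passage: str) -> str:
    """Hypothetical stand-in for a visual AI engine; returns a scene label."""
    return f"<scene for: {passage}>"


def run_story(seed: str, user_inputs: List[str]) -> StoryState:
    """Run one opening round plus one round per user input."""
    state = StoryState(seed=seed)
    for user_input in [None, *user_inputs]:
        passage = generate_narrative(state.seed, state.passages, user_input)
        state.passages.append(passage)
        # Step 2: translate the new passage into something to display.
        state.scenes.append(render_scene(passage))
    return state


if __name__ == "__main__":
    story = run_story(
        "A lone astronaut stranded on Mars struggles to survive.",
        ["search the wreckage for a radio"],
    )
    for passage, scene in zip(story.passages, story.scenes):
        print(passage)
        print(scene)
```

In a real build, the two stand-in functions would be swapped for calls to whichever language model and rendering engine you have access to; the loop structure itself would stay the same.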