Data is the new oil
Every AI model requires labeled training data as a raw ingredient. With so much money poured into AI, companies are now paying big sums to license images and videos.
Why it matters:
Early LLMs were trained on scraped internet text and YouTube videos (over 1M hours in GPT-4's case), but many sources now require licensing fees or block AI scraping altogether.
Now, companies are paying hundreds of dollars per hour for video and making multimillion-dollar licensing deals with newspapers.
We may "run out" of new, human-made training data in the next few years, though many researchers are trying to use synthetic alternatives.
Elsewhere in model mayhem:
Mistral AI launches Mixtral 8x22B, its latest "mixture of experts" model.
GPT-4 Turbo with Vision is now generally available.
And Meta confirms that Llama 3 is coming next month.
Google Cloud Next
This week, Google Cloud Next was once again packed with AI announcements. It’s clear that Google is 1) getting serious about GCP enterprise customers and 2) flexing its infrastructure muscles.
All the announcements I could find:
Gemini 1.5 Pro is now in public preview on Vertex AI, with up to 1M tokens in its context window.
Gemini Code Assist, Google's code-generating competitor to GitHub Copilot Enterprise.
Gemini in Databases, a bundle of AI-powered, database-focused tools for Google Cloud customers.
Gemini in Threat Intelligence, a component of the company’s Mandiant cybersecurity platform.
Gemini in Android Studio, a rebrand of the AI developer assistant "Studio Bot."
Imagen 2, a text-to-image model with in/outpainting, is generally available in Vertex AI.
Magic Editor and other AI tools are free for Google Photos users.
Vertex AI Agent Builder, a new tool to help companies build AI agents.
A collaboration with WPP to use Gemini to create ads for leading brands.
In partnership with Bayer, a new AI product for radiologists.
Paid AI add-ons for Google Workspace.
And Axion, a new ARM-based chip designed to power AI applications.
Elsewhere in the FAANG free-for-all:
Meta releases OpenEQA, a benchmark to measure an AI agent's "embodied question-answering" skills.
Apple researchers published a paper on Ferret-UI, a multimodal LLM optimized for mobile UI screens.
Microsoft announced Microsoft AI London, a London hub for its new consumer AI division.
And Amazon added Andrew Ng, a pioneer of AI research, to its board.
AI, eh
This week, Canada launched a $1.8 billion package to boost its AI sector, most of which will fund more compute and AI infrastructure.
Between the lines:
Governments have closely watched (and occasionally regulated) AI for the last year. Now, many are committing to new funding for research and data centers.
Of course, new government funding often sidesteps much of the conversation around AI’s environmental impacts.
And it doesn't mean regulation will disappear: UK watchdogs raised antitrust concerns while US legislators pushed for more transparency this week.
Elsewhere in AI infrastructure:
Meta unveiled the next generation of its Meta Training and Inference Accelerator (MTIA) chips for AI training, which are now in production.
Apple reportedly plans to overhaul its entire Mac lineup with AI-focused M4 CPUs.
And Intel debuted Gaudi 3, a new processor that Intel claims has 50% better inference and 40% better efficiency than Nvidia's H100s.
Things happen
TikTok is testing AI influencers to read ad prompts. Amazon CEO: "Generative AI may be the largest technology transformation since the cloud." Humane AI Pin review: not even close. Poe introduces new "price per message" monetization for chatbots. The lifecycle of a code AI completion. I’m still trying to generate an AI Asian man and white woman. Is Google's AI actually discovering 'millions of new materials?' Blind internet users struggle with error-prone AI aids. How some teachers are using ChatGPT to make lesson plans and grade essays. Building reliable systems out of unreliable agents. AI is coming for the OnlyFans chat industry. The Weather Channel is going to be making AI weather videos. After AI beat them, professional Go players got better and more creative. Jony Ive and Sam Altman's hardware startup is in talks to raise funding. You can't build a moat with AI. Texas is replacing thousands of human exam graders with AI. Spotify adds personalized AI playlists from custom prompts. "AI Instagram influencers" are deepfaking their faces onto real women’s bodies. More Agents Is All You Need. "Anyone got a contact at OpenAI. They have a spider problem."