DALL·E 3
OpenAI teased DALL·E 3, the next iteration of its text-to-image tool. It will be available to ChatGPT Plus and Enterprise customers in October, and via API later this year.
Between the lines:
The (likely cherry-picked) examples are very impressive, showcasing accurate text generation and very sophisticated prompt usage.
DALL·E 3's ChatGPT integration means millions of users can generate images inside of their existing workflows.
Plus, API access is a big deal - Midjourney is stuck behind Discord's interface, and has repeatedly declined to build an API.
The big picture:
It's being heralded as a Midjourney killer on Twitter, but I'm not so sure. Midjourney's image quality still seems to be the best in class.
And as clutch as a ChatGPT integration is, millions of people also sign into Discord every day - with Midjourney just one click away.
Ultimately, every other month brings massive improvements in AI image generation. Text designs, photorealism, and hands have all been mastered in just the past year.
Elsewhere in AI image generation:
Before DALL·E 3, Ideogram launched an AI image generator that could handle typography.
Controllism: the new genre of AI art made with ControlNet.
And a look at Poisson flow models, a physics-inspired alternative to diffusion models that is 10X-20X faster.
Satya and Sundar
Microsoft and Google announced some major AI updates and upgrades this week. Satya Nadella and Sundar Pichai keep pushing consumer AI forward, one launch at a time.
Google's Bard learns some new tricks:
The chatbot can now search Gmail, Drive, Maps, and YouTube when answering questions.
It can also fact-check itself against Google search results, flagging any statements that aren't supported by third-party information.
But it's worth taking Bard's upgrades with a grain of salt - so far this year, Google's AI launches have been decidedly underwhelming.
Microsoft's Copilot readies for takeoff:
Microsoft held a Surface event on Thursday, and released new details on when its Copilot(s) will be available.
The first desktop Copilot is launching next week, available on Windows 11. It's bringing together several previously separate AI assistants under one identity.
Plus, the company's 365 Copilot for Word, Excel, and Outlook is coming in November. But it won't be free: Microsoft plans to charge $30 per month per user for the AI tool.
YouTube goes all in on AI:
Likewise, YouTube held its "Made on YouTube" event this week, and unveiled a number of generative AI features.
Dream Screen lets creators generate photos and videos to use as the background for YouTube Shorts, the company's TikTok competitor.
And YouTube Creator Studio will soon be able to generate personalized video ideas, suggest music to go with the video, and automatically dub videos into other languages.
Break glass in case of catastrophe
Anthropic detailed its Responsible Scaling Policy (RSP) - a framework for managing catastrophic risks from AI. The system uses AI Safety Levels (ASL) to define potential risks and recommended safety protocols.
Why it matters:
We're all searching for a way to think about and manage AI safety, preferably sooner rather than later. This year has been one giant AI arms race, and that trend appears to be continuing.
Safety is an industry-wide concern: OpenAI launched its Red Teaming Network, a group of experts to advise on the company's AI risk mitigation strategies.
And if AI companies don't figure it out, governments will: the UK set new principles for regulating foundation models, with plans to engage top AI companies (including Anthropic and OpenAI) next year.
Elsewhere in AI anxiety:
Musicians are eyeing a legal shortcut to fight AI voice clones, as authors launch another class-action lawsuit against OpenAI.
Amazon limits self-publishing authors to three books a day, after a massive influx of AI-generated material.
And a look at the other AI arms race: between tools to detect AI-generated content, and tools to evade detection.
Things happen
OpenAI and ChatGPT lawsuit list. Alexa's new generative AI features. Data firms are hiring poets to improve their AI writing tools. Tether spends $420M on GPUs. Op-ed: The US should not regulate most AI. Q&A with DeepMind co-founder Mustafa Suleyman. AI announcements from Intel Innovation 2023. A profile of Microsoft CTO Kevin Scott. Generative AI's Act Two. Meet your AI executive assistant. We can't compete with AI girlfriends. Toyota's "Kindergarten for robots." Why open-source AI will win. Google and the DoD release AI-powered "Augmented Reality Microscope."
Bard could be a game-changer for me. I use it most of the time when researching, say for an article. Having a line of sight into my emails and Drive (being able to ask Bard questions about them) would be a HUGE game-changer. I'm sure I've had loads of conversations via email about whatever I want to write about, so grabbing that data is tremendous. I've been waiting for something like this from the Google suite for business reasons, too: we already have proprietary info shared w/Google, and I have zero interest in widening that footprint with, say, CodeInterpreter via ChatGPT.
Charlie, let me know if you're playing with this yet, or with DALL·E's integration into ChatGPT. DALL·E is another service I use every day.
Super excited to test DALL·E 3. Actually just wrote the draft of its 10X AI section earlier today (comparing that long prompt with Midjourney's rendering of it):
https://i.imgur.com/bDNhg1L.png
Not even close.
I love Midjourney, but I'm happy that OpenAI has opened a new dimension for competition other than just making the images look better and better. Improving natural language processing will open text-to-image up to even more users.
Lots of end-user-focused developments this week, so 10X AI is an all-news edition again.