Apple Intelligence has been out for barely 48 hours, and the internet is already full of hot takes.
One of the most misguided takes - pushed pretty loudly by Elon Musk (who totally, definitely doesn't have an axe to grind with OpenAI) - is that Apple is sending all of your data to ChatGPT.
To be clear, this is not what’s happening1. Though if you only took a cursory glance at the keynote, it would be easy to assume that it is.
But in the days since the keynote, I’ve had a chance to learn more about the architecture behind Apple’s new Intelligence, and it's more sophisticated than I would have guessed from the keynote alone.
Some necessary context
Just in case you haven't been paying attention, let's recap what Apple announced (feel free to skip this section if you're already caught up). Their website explains the new consumer features well, but I also appreciated other writers' recaps and MKBHD's video. To summarize:
Apple Intelligence. A new personal intelligence system that integrates generative AI across the operating system (available across the latest versions of iOS, iPadOS, and macOS). It creates personalized experiences based on your device's context.
With text, Writing Tools can edit, rewrite, summarize, and more across the OS; Smart Reply will provide reply suggestions in Mail and summarize email threads in the inbox. With images, Image Wand can now generate images to go with your notes; Image Playground lets you run a cartoony image generator from your device; Genmoji are fully custom emoji that you can prompt on the fly. Again, for the full list, check out Apple's landing page.
An upgraded Siri. Siri is gaining new abilities from Apple Intelligence:
Better language understanding: new language models mean fewer "Sorry, I didn't catch that" responses from Siri.
Onscreen awareness: the assistant can view your screen and work with what it sees.
Personal context: Siri can pull data from your notes, emails, texts, and more (more on this below).
Take action: you can now ask Siri to send emails, create notes, edit photos, and more. Apple plans to let app developers expose more actions to Siri (a rough sketch of what that might look like follows this recap).
ChatGPT integration: if Siri can’t answer your question or handle your request, it can hand the task off to ChatGPT (with your permission).
An emphasis on privacy. The company mentioned terms like "semantic index2" and "private cloud compute," but I had many, many questions about how exactly those work. It's clear that the company is aware of the general unease when it comes to sending LLMs your data, and it's trying to reinforce its brand as the privacy-conscious tech company.
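On the "take action" point: Apple's existing App Intents framework is how apps already expose actions to the system, and it looks like the natural hook for the new Siri behaviors too. Here's a minimal sketch of what exposing a "create note" action might look like - the intent name and the note-storage helper are hypothetical, and whether the new Siri uses exactly this shape is my assumption, not something spelled out in the keynote.

```swift
import AppIntents

// Hypothetical in-app storage used by the intent below (not a real Apple API).
final class NoteStore {
    static let shared = NoteStore()
    private(set) var notes: [String] = []
    func add(_ note: String) { notes.append(note) }
}

// A minimal intent exposing a "create note" action to the system.
// The names here are illustrative; only the AppIntent plumbing is real.
struct CreateNoteIntent: AppIntent {
    static var title: LocalizedStringResource = "Create Note"

    @Parameter(title: "Content")
    var content: String

    func perform() async throws -> some IntentResult {
        NoteStore.shared.add(content)   // a real app would persist the note here
        return .result()
    }
}
```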
While the keynote was exciting, I was more intrigued by the machine learning details released after it.
Apple's many models
Two pieces of information clarified how Apple Intelligence works behind the scenes: their Platforms State of the Union video, and their Machine Learning developer blog post.
Taken together, they paint a much more detailed picture of what's actually going on. First and foremost, Apple is pushing to have as much of the generation done on-device as possible. They've done a lot of work developing core foundation models that run on iPhones and iPads (and are restricting the new Apple Intelligence features to the latest devices, which can handle the memory requirements).
What's particularly interesting is their approach to balancing capability and performance. The more powerful a foundation model gets, the slower and more energy-hungry it becomes - not ideal for smartphone battery life. Instead, Apple is training a lightweight core foundation model and using "adapters" to improve performance in specific use cases.
The core foundation model (so far just called "Apple On-Device") is only 3B parameters, making it firmly a "small" language model3. It's in the same ballpark as Phi-3-mini, one of the leading "SLMs" from Microsoft at 3.8B parameters, and Apple's internal benchmarks show the two performing comparably.
That said, anyone who's used Phi-3-mini will tell you that its output is noticeably worse than that of leading LLMs like GPT-4 and Claude. One way Apple is trying to get around this is with "adapters":
We utilize adapters, small neural network modules that can be plugged into various layers of the pre-trained model, to fine-tune our models for specific tasks.
...
The adapter models can be dynamically loaded, temporarily cached in memory, and swapped — giving our foundation model the ability to specialize itself on the fly for the task at hand while efficiently managing memory and guaranteeing the operating system's responsiveness.
A rough analogy here is The Matrix, where Neo gets new skillsets instantly downloaded into his brain - "I know kung fu." With adapters, the base language model is hot-swapping the ability to summarize, proofread, respond to emails, rephrase tones, etc.
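To make the analogy concrete, here's a toy sketch of the idea. This is not Apple's implementation (their adapters are small trained modules attached to layers of the network, broadly in the spirit of LoRA-style fine-tuning); it just illustrates how a frozen base can be specialized by attaching and detaching a lightweight, task-specific delta.

```swift
// Toy illustration of adapter hot-swapping - not Apple's implementation.
// A "layer" keeps frozen base weights; a task-specific adapter contributes a
// small delta that can be attached or removed at runtime.
struct Adapter {
    let name: String
    let delta: [[Double]]   // in real adapters this would be a low-rank update, trained per task
}

struct Layer {
    let baseWeights: [[Double]]   // frozen, shared across every task
    var activeAdapter: Adapter?   // swapped in and out per request

    // Effective weights = base + adapter delta (when an adapter is loaded).
    func effectiveWeights() -> [[Double]] {
        guard let adapter = activeAdapter else { return baseWeights }
        var result = baseWeights
        for row in result.indices {
            for col in result[row].indices {
                result[row][col] += adapter.delta[row][col]
            }
        }
        return result
    }
}

// "I know kung fu": load the summarization adapter only when it's needed.
var layer = Layer(baseWeights: [[0.10, 0.20], [0.30, 0.40]], activeAdapter: nil)
layer.activeAdapter = Adapter(name: "summarization", delta: [[0.01, 0.00], [0.00, -0.02]])
print(layer.effectiveWeights())   // base weights nudged toward the summarization task
layer.activeAdapter = nil         // swap it back out to free the memory
```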
This approach has given Apple a pretty insane latency advantage: according to their data, the on-device model reaches a time-to-first-token latency of about 0.6 milliseconds per prompt token on an iPhone 15 Pro. It also generates around 30 tokens per second - well behind GPT-4o and Claude Haiku, but still very impressive considering it's happening on an iPhone.
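To put those numbers in perspective, here's a back-of-envelope calculation for a hypothetical request. The prompt and output sizes are made up, and this assumes the published figures scale linearly, which is a simplification.

```swift
// Back-of-envelope latency from Apple's published figures (illustrative only).
let promptTokens = 750.0        // assumed size: a longish email thread
let outputTokens = 150.0        // assumed size: a short summary
let msPerPromptToken = 0.6      // Apple's reported prompt-processing latency on iPhone 15 Pro
let tokensPerSecond = 30.0      // Apple's reported generation rate

let prefillSeconds = promptTokens * msPerPromptToken / 1_000   // ≈ 0.45 s to read the prompt
let generationSeconds = outputTokens / tokensPerSecond         // ≈ 5 s to write the summary
print("≈ \(prefillSeconds + generationSeconds) seconds end to end")
```

Most of the wait comes from generation rather than reading the prompt, which is why streaming the output token by token matters so much for perceived speed.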
Yet small language models have their limits. Apple hasn't said anything about how large a context window its on-device model has or what internal system prompts it uses. But a small language model simply won't cut it if Siri is trying to find answers from anything bigger than a handful of text messages or emails.
For these heavier workloads, Apple's relying on a second set of models hosted on a new kind of server infrastructure - the "Private Cloud Compute."
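Apple hasn't published how a given request gets routed between these tiers, so the following is purely a conceptual sketch of the architecture as described in this post - the names, thresholds, and decision order are all invented.

```swift
// Conceptual sketch of the three-tier routing described above.
// Apple has not published this logic; every name and threshold here is invented.
enum ModelTier {
    case onDevice             // ~3B model plus a task-specific adapter
    case privateCloudCompute  // larger Apple model running on PCC
    case thirdParty(String)   // e.g. ChatGPT, only with explicit user consent
}

struct AssistantRequest {
    let promptTokens: Int
    let needsWorldKnowledge: Bool    // broad questions Siri can't answer from your data
    let userApprovedThirdParty: Bool // the user is asked before anything is shared
}

func route(_ request: AssistantRequest) -> ModelTier {
    if request.needsWorldKnowledge && request.userApprovedThirdParty {
        return .thirdParty("ChatGPT")
    }
    if request.promptTokens > 4_000 {   // invented cutoff for "too heavy for on-device"
        return .privateCloudCompute
    }
    return .onDevice
}

print(route(AssistantRequest(promptTokens: 12_000, needsWorldKnowledge: false, userApprovedThirdParty: false)))
// → privateCloudCompute
```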
Capability vs privacy
One thing I noticed about the keynote is that Apple used "cloud" like a dirty word. Other tech giants run their AI in the cloud, where all of your data is sent along with it. Apple has private ML servers, where your data sits in a secure enclave, which is so impenetrable that not even Apple can access it.
I'm being glib, but that certainly seems to be the narrative that Apple wants to create. And honestly, it's not a bad approach.
Others have written about how every foundation model company must overcome a major consumer hurdle: trust. From one such piece:
What every one of these companies needs to succeed, however, is trust. There are a lot of reasons why people don’t trust AI companies; from their unclear use of training data to their plans for an AI-dominated future to their often-opaque management. But what most people mean by trust is the question of privacy (“will AI use what I give it as training data?”) and that has long been answered. All of the AI companies offer options where they agree to not use your data for training, and the legal implications for breaching these agreements would be dire. But Apple goes many steps further, putting extra work into making sure it could never learn about your data, even if it wanted to.
I'm sure that in an ideal world, Apple would do 100% of its AI work on your iPhone and would never send any data to the cloud. But that's not doable with today’s tech, so instead, it's trying to make those cloud workloads as private as possible. A few ways it's doing that:
A new server stack based on iOS and its Secure Enclave/Secure Boot features
Stateless computation and no persistent storage (user data is not accessible even to Apple administrators)
No privileged runtime access, remote shell access, or even server logging
Publicly-inspectable builds for security purposes
Apple is trying to build an extremely secure system for sending one-off, stateless requests to its more powerful server models. You might be inclined to scoff at this as mostly a marketing ploy, but if there's one giant tech company that I'd believe is earnestly trying to do this right, it's probably Apple. Privacy is a big part of their brand, and I'm sure they take the threat of your texts and emails leaking very seriously.
That said, focusing on privacy is also a good diversion tactic (for now). Apple is being very tight-lipped about its server models’ size, performance, and benchmarks. We know they’re “bigger” than the on-device models, but that’s about it.
The company has released results on some internal benchmarks, as well as on IFEval (instruction following), but nothing on the major ones like MMLU or HELM.
I think that's fine4 and probably smart, to be honest. Apple's not going to win the benchmark game, and it doesn't need to. The average iPhone user doesn’t know what MMLU is, and doesn't care. They care whether Apple Intelligence will Just Work.
And to that end, even Apple's server models aren't going to cut it for today’s consumer needs. The world has seen what ChatGPT, Gemini, and Claude can do. However good Apple's server models are, they're very, very unlikely to match the best models coming out of DeepMind and OpenAI (at least for now). So Apple is creating ways to leverage third-party AIs.
Strange bedfellows
Let’s revisit the integration between OpenAI and Apple. Many, many headlines were written about this partnership - some speculated about whether ChatGPT would replace Siri entirely.
From OpenAI's perspective, the partnership is great - ChatGPT gets deeper hooks into iOS, and millions of new users get introduced to what GPT-4o can do.
From Apple's perspective, the narrative was much more pessimistic - a trillion-dollar company forced to rely on a (relative) upstart because its AI teams couldn't get their act together. And I'm sure some parts of that narrative ring true on some level.
But Apple is positioning this integration as something closer to the App Store - Siri will have many AIs, from many companies. You, the user, must opt in to share your data with OpenAI. It's not clear exactly how fine-grained that control will be - making some of those big privacy declarations ring a little hollow - but there will be some amount of friction here.
Other observers noticed as much:
While the Apple partnership clearly represented a win for OpenAI after a bruising few weeks, I was also struck at the degree to which Apple played down its significance during the keynote. Sam Altman did not appear in the keynote, though he was present at the event. And at the iJustine event, Federighi took the unusual step of saying that other models — including Google Gemini — would likely be coming to the operating system.
I’m guessing that Apple doesn't want to be fully dependent on a single third-party AI provider, and wants to leave room for the day when (if?) it can replace ChatGPT with an AppleGPT of its own. With that in mind, it makes very little sense for Apple to hand over all of your data to ChatGPT.
Some data will be sent with each request - that much is obvious. It's impossible for ChatGPT to answer questions about your photos without Apple sending it your photos. But ultimately, what the company is doing is creating a unified framework for other models to operate in - and in doing so, creating a way for Apple users to access the latest and greatest in LLMs without Apple having to spend billions on GPUs.
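Apple hasn't published developer-facing details of that framework, but the App Store-like framing suggests something resembling a common interface any provider could implement. This is a hypothetical sketch of that idea, not a real Apple API.

```swift
// Hypothetical shape of a pluggable model provider - not a real Apple API,
// just an illustration of the "many AIs, from many companies" framing above.
protocol AssistantModelProvider {
    var name: String { get }
    var requiresUserConsent: Bool { get }   // the user is asked before data is shared
    func respond(to prompt: String) async throws -> String
}

struct ChatGPTProvider: AssistantModelProvider {
    let name = "ChatGPT"
    let requiresUserConsent = true
    func respond(to prompt: String) async throws -> String {
        // A real implementation would call the provider's API here; omitted in this sketch.
        return "(response from \(name))"
    }
}
```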
I think Ben Thompson's Stratechery put it well:
There are lots of things that broad-based models like ChatGPT are good at that Apple isn’t even attempting to take on, and why would they? These models cost an astronomical amount of money to train and inference, and whatever differentiation exists is predicated on factors that Apple doesn’t have a competitive advantage in. What Apple does have is a massive userbase that the model that wants to win will desire access to, so why not trade that access for the ability to leverage whichever model agrees to Apple’s terms?
So Elon’s hot take about Apple creating a massive security problem with its OpenAI integration is barking up the wrong tree. Users can send device data to ChatGPT, sure, but they could always do that - it’s just gotten a little easier.
One more thing
While I'm happy to nerd out about the nuts and bolts of Apple's tech and strategy, I want to take a step back and ask a bigger question: so what?
Apple Intelligence is very slick - it might be the thing that finally gets me to upgrade my version of MacOS. But it's also another stepping stone in the tech industry's mad dash to roll out AI to billions of users worldwide.
When Apple Intelligence is widely released this fall, it's not crazy to think there will be 100M devices capable of running it, for free, from day one. That might be the single biggest launch of OS-level LLM capabilities.
In comparison, Microsoft has Copilot Pro locked behind Microsoft 365 subscriptions, and Copilot in Windows is still only available in preview. Likewise, Gemini Nano is only available on Google Pixel 8 phones, which have a fraction of the market share of Apple’s devices. Both companies will undoubtedly expand their AI userbases, but those are likely to remain smaller than Apple’s for now.
But as we're all bombarded with new AI features and functionality, it's worth considering these tech companies' visions. So far, they've mostly looked fairly same-y: productivity features like auto-writing messages and summarizing email threads5.
It's interesting to contrast the real-world use cases of Apple Intelligence with the company's Responsible AI principles:
Empower users with intelligent tools: We identify areas where AI can be used responsibly to create tools for addressing specific user needs. We respect how our users choose to use these tools to accomplish their goals.
Represent our users: We build deeply personal products with the goal of representing users around the globe authentically. We work continuously to avoid perpetuating stereotypes and systemic biases across our AI tools and models.
Design with care: We take precautions at every stage of our process, including design, model training, feature development, and quality evaluation to identify how our AI tools may be misused or lead to potential harm. We will continuously and proactively improve our AI tools with the help of user feedback.
Protect privacy: We protect our users' privacy with powerful on-device processing and groundbreaking infrastructure like Private Cloud Compute. We do not use our users' private personal data or user interactions when training our foundation models.
Can we aim for more than summarization and copy-editing tools? Should we aim for more? We still don't know the full capabilities of today’s foundation models, and we're just starting to figure out their sharp corners6.
I don't have a great answer. But even if I did, I'm not sure it would matter. Clearly, we're going to be getting more AI on more devices, whether we like it or not.
1. To quote the keynote directly: “You’ll be able to access ChatGPT for free without creating an account. Your request and information will not be logged. And for ChatGPT subscribers, you’ll be able to connect your account and access paid features right within our experience. Of course, you’ll be in control of when ChatGPT is used and will be asked before any of your information is shared. ChatGPT integration will be coming to iOS 18, iPadOS 18, and macOS Sequoia later this year.”
2. I didn’t end up discussing it at length here, but Semantic Index mostly seems like Apple’s version of a RAG (retrieval-augmented generation) system - they’re indexing your device content and retrieving it with semantic search. I hope we get more details on the nuts and bolts at some point!
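For what it's worth, the retrieval step of such a system is easy to sketch. A real semantic index would use learned embeddings and a vector store; this toy word-overlap version just shows the flow - index snippets of on-device content, find the best match for a query, and hand that snippet to the model as context.

```swift
// Toy illustration of RAG-style retrieval over on-device snippets.
// A real semantic index would use learned embeddings; this just scores word overlap.
func words(_ text: String) -> Set<String> {
    Set(text.lowercased().split(whereSeparator: { !$0.isLetter }).map { String($0) })
}

func similarity(_ a: String, _ b: String) -> Double {
    let (wa, wb) = (words(a), words(b))
    guard !wa.isEmpty, !wb.isEmpty else { return 0 }
    return Double(wa.intersection(wb).count) / Double(wa.union(wb).count)
}

let snippets = [
    "Mom: my flight lands at 7pm on Tuesday",
    "Dentist appointment moved to Friday",
    "Reminder: renew passport next month",
]
let query = "when does mom's flight land"

// The best-matching snippet would then be added to the model's prompt as context.
if let best = snippets.max(by: { similarity($0, query) < similarity($1, query) }) {
    print(best)   // "Mom: my flight lands at 7pm on Tuesday"
}
```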
3. I'm talking about language capabilities here, but from what I can tell, the same thing is happening with image generation and a core diffusion model. Rather than adapters for specific writing and editing tasks, the diffusion model uses adapters for specific visual styles.
4. It’s fine partly because it’s a reasonable strategy for Apple, and partly because I’m of the opinion that there’s a lot of fudging going on with benchmarks anyway.
5. When you look at the Google I/O keynote and the Apple WWDC one, you could argue they have more in common than not. Though Siri's Actions framework points towards richer, agentic behavior on Apple's side.
6. Apple, cleverly, is trying to compensate for that by sandboxing the tech. The Apple Intelligence language tasks look pretty well-defined, and the image generation ones are limited to custom emoji and cartoonish images. That won't work forever, but I can imagine it helps mitigate bad PR if, say, your Genmoji of historical figures all come out as people of color.