the information just dropped a piece claiming that, and a lot more
the technique copies capabilities from a trillion-parameter model down into something light enough to run on-device. that's the foundation of their next ai push
• but here's the catch – the full gemini model is too heavy even for apple's own private cloud compute infrastructure, so some siri queries are getting routed to a licensed gemini instance running on google cloud
• to keep user data protected while using google's cloud, apple just approved a security feature from nvidia called confidential computing. it basically keeps everything encrypted – your query, the ai model processing it – while it's being processed on nvidia's chips. there's a small slowdown as a tradeoff, but it means apple can still claim your data is safe even when it's running on someone else's servers
• they're also shopping for acquisitions. liquid ai, a cambridge, massachusetts-based startup focused on efficient on-device inference, is reportedly on their radar. no deal closed yet
wwdc is when we get the feature layer on top of all this. the infra story is already leaking out

Addy Crezee