AI may be the latest buzzword on Android, but Apple’s iPhone is not yet seen as an AI-infused smartphone. That’s set to change, and we now know one way in which Tim Cook and his team plan to catch up.

The details come in a newly published research paper from Cornell University researchers working with Apple. Titled “Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs,” it describes a multimodal large language model (MLLM) built to understand what is displayed on a screen, specifically the elements of a mobile user interface such as an iPhone display.

Thanks to a large supply of training data, the model can pick out icons, find text, parse widgets and other interface elements, describe in plain language what is on screen, and interact with the display when guided by open-ended instructions and prompts.
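To make that list concrete, here is a minimal, purely hypothetical Swift sketch of how an app might frame those tasks as requests to a Ferret-UI-style model. The ScreenTask, ScreenQuery and FerretUIClient names are assumptions for illustration only; Apple has not published an API for this, and the paper describes the model rather than any developer interface.

```swift
import Foundation

// Illustrative only: none of these types correspond to a real Apple API.
// The task names, request shape, and client are assumptions based on the
// capabilities the Ferret-UI paper describes.

/// Screen-understanding tasks of the kind the paper covers.
enum ScreenTask {
    case findIcons                  // locate and label icons
    case findText                   // locate and read on-screen text
    case listWidgets                // enumerate buttons, sliders and other elements
    case describeScreen             // summarise the screen in natural language
    case followInstruction(String)  // open-ended prompt, e.g. "open the settings menu"
}

/// A single request: a screenshot plus the task to perform on it.
struct ScreenQuery {
    let screenshot: Data   // PNG or JPEG bytes of the current screen
    let task: ScreenTask
}

/// Hypothetical client; a real implementation would hand the image and a
/// task-specific prompt to the multimodal model and return its answer.
struct FerretUIClient {
    func run(_ query: ScreenQuery) -> String {
        switch query.task {
        case .describeScreen:
            return "A photo-editing app with a toolbar of five icons along the bottom."
        default:
            return "(model response)"
        }
    }
}

// Example call, using an empty image payload just to show the shape.
let client = FerretUIClient()
print(client.run(ScreenQuery(screenshot: Data(), task: .describeScreen)))
```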

The original Ferret was released in October 2023 and was designed to parse photos and images to recognise what was on show. The upgrade, snappily titled Ferret-UI, would offer several benefits to anyone using it on an iPhone and could easily fit into an improved, AI-powered Siri.

Being able to describe the screen, no matter the app, opens up a richer avenue for accessibility apps, removing the need to pre-program responses and actions. Someone looking to perform a complex task or find an obscure option could ask Siri to open the relevant app and use a function hidden away in the depths of its menu system.

Developers could use Ferret-UI as a testing tool, asking the MLLM to act as a 14-year-old with little experience of social networks and perform tasks, or to simulate a 75-year-old user trying to connect to FaceTime with their grandchildren.

Google publicly started the rush for AI-first smartphones on October 4th with the launch of the Pixel 8, a little more than three weeks after the launch of the iPhone 15. Tim Cook and his team made no notable announcements and drew no attention to the AI improvements tucked away in the iPhone’s photo processing and text auto-correction, giving Android a head start on AI and allowing Google’s mobile platform to set expectations.

Apple’s Worldwide Developers Conference takes place in June, and it will be the first moment Apple can engage with the public to discuss its AI plans as it lays the foundations for the launch of the iPhone 16 and iPhone 16 Pro in September.

Until then, we have the academic side of Apple’s AI approach to be going on with.

Now read why the iPhone’s approach to AI is disrupting the specs for the iPhone 16 and iPhone 16 Plus…
