Cross-Platform Harmony: Transferring Your Curated Playlists Between Apple Music and YouTube Music Unveiled
Unveiling Apple’s Revolutionary AI Tech to Transform Your iPhone Screen and Empower Siri
Despite not launching any AI models since the generative AI craze began, Apple is working on some AI projects. Just last week, Apple researchers shared a paper unveiling a new language model the company is working on, and insider sources reported that Apple has two AI-powered robots in the works. Now, the release of yet another research paper shows Apple is just getting started.
On Monday, Apple researchers published a research paper that presents Ferret-UI, a new multimodal large language model (MLLM) capable of understanding mobile user interface (UI) screens.
MLLMs differ from standard LLMs in that they go beyond text and can also interpret other kinds of input, such as images and audio. In this case, Ferret-UI is trained to recognize the different elements of a user’s home screen, such as app icons and small text.
Identifying app screen elements has been challenging for MLLMs in the past because those elements are small. To overcome that issue, according to the paper, the researchers added “any resolution” on top of Ferret, which allows the model to magnify details on the screen.
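To see why magnifying parts of the screen helps with small elements, here is a minimal sketch in Python. It is not Apple's implementation: the rule of splitting the screen in half along its longer side, the 336-pixel encoder input size, and the names `split_screen` and `scale_factor` are all assumptions made for illustration.

```python
# Illustrative sketch only: the split rule and the 336-pixel encoder input
# size are assumptions for this example, not details from the article.

ENCODER_SIDE = 336  # hypothetical fixed input size of an image encoder


def split_screen(width, height):
    """Split a screen into two crop boxes (left, top, right, bottom) along its longer side."""
    if height >= width:  # portrait: top and bottom halves
        mid = height // 2
        return [(0, 0, width, mid), (0, mid, width, height)]
    mid = width // 2  # landscape: left and right halves
    return [(0, 0, mid, height), (mid, 0, width, height)]


def scale_factor(width, height, side=ENCODER_SIDE):
    """Scale applied when an image is shrunk so its longer side equals `side`."""
    return side / max(width, height)


if __name__ == "__main__":
    screen_w, screen_h = 1170, 2532  # example portrait screen, in pixels
    icon_px = 120                    # example icon size on that screen

    full = scale_factor(screen_w, screen_h)
    print(f"whole screen: icon shrinks to about {icon_px * full:.1f} px")
    for left, top, right, bottom in split_screen(screen_w, screen_h):
        sub = scale_factor(right - left, bottom - top)
        print(f"crop {(left, top, right, bottom)}: icon keeps about {icon_px * sub:.1f} px")
```

With these example numbers, each half of the screen is shrunk only half as much as the whole screen would be, so a small icon or line of text keeps roughly twice as many pixels per side when the model looks at the crops.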
According to the paper, Apple’s MLLM also has “referring, grounding, and reasoning capabilities,” which allow Ferret-UI to comprehend UI screens fully and perform tasks when instructed based on the contents of the screen, as seen in the photo below.
[Figure omitted. Credit: K. You et al.]
To see how the model stacks up against other MLLMs, Apple researchers compared Ferret-UI with GPT-4V, OpenAI’s MLLM, on public benchmarks, elementary tasks, and advanced tasks.
Ferret-UI outperformed GPT-4V across nearly all tasks in the elementary category, including icon recognition, OCR, widget classification, find icon, and find widget tasks on iPhone and Android. The only exception was the “find text” task on the iPhone, where GPT-4V slightly outperformed the Ferret models, as seen in the chart below.
[Figure omitted. Credit: K. You et al.]
When it comes to conversations grounded in the contents of the UI, GPT-4V has a slight advantage, outperforming Ferret-UI 93.4% to 91.7%. However, the researchers note that Ferret-UI’s performance is still “noteworthy” since it generates raw coordinates instead of choosing from a set of pre-defined boxes, as GPT-4V does. You can find an example below.
[Figure omitted. Credit: K. You et al.]
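Generating raw coordinates is the harder task because a predicted box can be slightly off anywhere, while a pre-defined box is either the right one or not. One common way to score a predicted box against a reference box is intersection over union (IoU). The sketch below is an assumption for illustration: the article does not say which metric the researchers used, and the 0.5 threshold and the names `iou` and `is_hit` are hypothetical.

```python
# Illustrative sketch only: IoU and the 0.5 threshold are a common convention
# for scoring bounding boxes, assumed here rather than taken from the article.


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_hit(predicted, reference, threshold=0.5):
    """Count a prediction as correct when its IoU with the reference exceeds the threshold."""
    return iou(predicted, reference) > threshold


if __name__ == "__main__":
    reference = (0, 0, 10, 10)      # hypothetical ground-truth box for a button
    close_guess = (1, 1, 10, 10)    # raw-coordinate prediction, slightly off
    poor_guess = (5, 0, 15, 10)     # prediction shifted halfway off the target

    print(is_hit(close_guess, reference))
    print(is_hit(poor_guess, reference))
```

Under a rule like this, a model that outputs raw coordinates has to land close to the target on every side, which is one way to read the researchers’ point about the result being “noteworthy.”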
The paper does not address how Apple plans to use the technology, or whether it will at all. Instead, the researchers more broadly state that Ferret-UI’s advanced capabilities have the potential to positively impact UI-related applications.
“The advent of these enhanced capabilities promises substantial advancements for a multitude of downstream UI applications, thereby amplifying the potential benefits afforded by Ferret-UI in this domain,” the researchers wrote.
The ways in which Ferret-UI could improve Siri are easy to imagine. Because the model thoroughly understands a user’s app screen and knows how to perform certain tasks, it could be used to supercharge Siri so that it carries out tasks for you.
There’s certainly interest in an assistant that does more than just respond to queries. New AI gadgets such as the Rabbit R1 get plenty of attention for being able to carry out an entire task for you, such as booking a flight or ordering a meal, without you having to instruct them step by step.
- Title: Cross-Platform Harmony: Transferring Your Curated Playlists Between Apple Music and YouTube Music Unveiled
- Author: Stephen
- Created at : 2025-01-13 21:29:39
- Updated at : 2025-01-16 19:31:06
- Link: https://tech-recovery.techidaily.com/cross-platform-harmony-transferring-your-curated-playlists-between-apple-music-and-youtube-music-unveiled/
- License: This work is licensed under CC BY-NC-SA 4.0.