
Apple quietly launches its multimodal AI LLM rival, Ferret


When it comes to revolutionary AI development, Apple isn’t in the race, but it is quietly getting things done.



Artificial Intelligence (AI) researchers from Apple and Columbia University have quietly unveiled Ferret, an open-source, multimodal Large Language Model (LLM) that is said to use parts of images as queries. An LLM is the same basic construct that OpenAI uses to build its insanely popular ChatGPT and GPT-4 AIs.




According to VentureBeat, the release of Ferret on GitHub in November went completely under the radar, with no announcement being made. However, it has since gotten a lot of attention from AI researchers. Bart De Witte, who operates a non-profit focused on open-source AI in medicine, posted on X that the release of Ferret “solidifies Apple’s place as a leader in the multimodal AI space.” In my opinion it doesn’t, because when it comes to AI, Apple is still a comparative newbie despite having hundreds of billions of dollars at its disposal. Just look at Siri and I rest my case.




Anyway, the way Ferret works is that it examines a specific region of an image, determines which elements within it could be useful in answering a query, identifies those elements, and draws a bounding box around each of them. It can then use the identified elements as part of a query, which it answers in the usual way.
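To make that flow a little more concrete, here is a minimal Python sketch of the general idea: take a user-highlighted region, keep the detected elements that fall mostly inside it, and fold their bounding boxes into a text query. This is not Ferret's code or API. The `Box` and `Element` types, the `elements_in_region` and `build_prompt` helpers, the 0.5 overlap threshold, and the prompt wording are all hypothetical and exist only for illustration. It also assumes the detections are supplied by some other component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    def intersection(self, other: "Box") -> float:
        """Area shared by the two boxes, or 0.0 if they do not overlap."""
        w = min(self.x2, other.x2) - max(self.x1, other.x1)
        h = min(self.y2, other.y2) - max(self.y1, other.y1)
        return max(0.0, w) * max(0.0, h)


@dataclass(frozen=True)
class Element:
    """A detected element: a label plus the box drawn around it."""
    label: str
    box: Box


def elements_in_region(region, detections, min_overlap=0.5):
    """Keep detections whose box lies mostly inside the highlighted region.

    min_overlap is an assumed threshold: the fraction of the element's own
    area that must fall inside the region.
    """
    selected = []
    for det in detections:
        area = det.box.area()
        if area > 0 and region.intersection(det.box) / area >= min_overlap:
            selected.append(det)
    return selected


def build_prompt(question, elements):
    """Attach the selected elements and their boxes to a text question."""
    refs = ", ".join(
        f"{e.label} [{e.box.x1:.0f}, {e.box.y1:.0f}, {e.box.x2:.0f}, {e.box.y2:.0f}]"
        for e in elements
    )
    return f"{question} Referenced regions: {refs}"


# Made-up detections and a made-up user highlight, for illustration only.
detections = [
    Element("dog", Box(10, 10, 50, 50)),
    Element("tree", Box(200, 0, 300, 150)),
]
highlight = Box(0, 0, 100, 100)

selected = elements_in_region(highlight, detections)
prompt = build_prompt("What is the animal?", selected)
print(prompt)
```

In this toy version only the dog falls inside the highlighted area, so only its box ends up in the query. A real multimodal model would work on image features rather than ready-made labels, so treat this purely as a picture of the data flow.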




For instance, if a user highlights an animal within a larger image and then asks the LLM what the animal is, it will respond by identifying the species. It can then use the other elements it detects within the image to answer follow-up questions or to provide context on what the animal is doing.

The open-source Ferret model is a system that can “refer and ground anything anywhere at any granularity,” said Apple AI research scientist Zhe Gan in an earlier post on X. AI researchers claim the release of Ferret is important as it demonstrates a surprising openness from Apple, which is in direct contrast to the company’s usual secretive nature.




The open-source approach may suit Apple in the AI industry, however, as the company is struggling to compete with rivals such as Google, Microsoft, and OpenAI due to a lack of computing resources. According to tech blogger Ben Dickson, Apple’s infrastructure is not designed to serve up LLMs at scale, which means the company cannot expect to compete with models such as ChatGPT. Apple therefore has to choose between partnering with a cloud hyperscaler on its AI efforts or sharing its work with the open-source community, similar to the approach taken by Meta.

Either way, while Apple IMHO is still way behind, it’s an interesting development. Over the long term, however, it remains to be seen whether Apple can make, or even wants to make, any kind of dent in the AI space.
