🍎 Zuck Assembles a Super Nerd Army While Apple Has an AI Tantrum

Is Tim Cook gaslighting us? Zuckerberg is building a super nerd army, and Apple unveils to the world that they've got the squirts.

📑 The nerds over at Cupertino have shone a light on artificial intelligence as we know it after releasing a paper that essentially shits all over the latest reasoning models.

Apple’s “Illusion of Thinking” paper was released this week and sent shockwaves through the AI community, essentially claiming that even the most sophisticated reasoning models fundamentally lack genuine cognitive abilities.

But it raises the question: does the report hold weight, or are they just miles behind everyone else? We will be breaking that down below, along with all the other news that happened in AI this week…

Oh, and Sam is away this week! So we drafted in his older, much better-looking brother Matt to do his job. Let us know how we do at the end, so we can fire Sam with just cause.

🗞️ What we are covering today…

  • Apple shits all over reasoning models

  • AI pilots a fighter jet in combat for the first time

  • Researchers secretly experiment on Reddit users

  • Zuckerberg is building a super nerd army

  • OpenAI launches o3-pro and cuts the price by 80%

  • Palantir is helping the US government spy on you

  • And Apple, unfortunately, tells the world they’ve got runny-bum

🔓 Quick Note: We like to cover loads of AI news in our newsletter, so for the full reading experience, we suggest opening this in your browser!

Head to the ‘READ ONLINE’ tab at the top of this email.

👁️ 👁️ What you might have missed

Zuckerberg is personally recruiting a new superintelligence AI team at Meta: “Super nerds, assemble!” Zuckerberg cries as his new team of elite ‘superintelligence’ experts enters the fray. Meta’s CEO is looking to personally put together a new AI superintelligence division, aimed at making the Facebook and IG parent company a leader in next-gen AI models. It looks like Zuckerberg is making good on his promise to “invest heavily in artificial intelligence”, with Meta ploughing $14.3bn into Scale AI, a data training platform for businesses looking to develop AI models. Scale AI CEO, Alexandr Wang, will be the new superintelligence division’s Captain America – a recruit that has set Zuck if the AI rumour mill is anything to be believed. Marvel at that.

We recently covered that around 75% of the original Meta AI team have since left, which looks poor next to the retention rates at their would-be competitors Anthropic (80%), DeepMind (78%) and OpenAI (67%). I bet Anthropic have a bring your dog to work day once a week, which is contributing to that high retention rate…

Researchers Secretly Experimented on Reddit Users: Ever wondered about the efficacy of AI’s infiltration in social media for malignant means? Well, researchers from the University of Zurich spent four months experimenting on unsuspecting Reddit users doing exactly that. They deployed AI-powered Reddit accounts that posted 1,783 comments on r/ChangeMyView, posing as fictional personas – including sexual assault victims and political partisans – to test persuasive capabilities. The AI accounts earned 137 “Deltas” (CMV’s marker for persuasion), which outperformed humans by 3-6x, with no Redditors detecting the deception. CMV moderators condemned the study as unethical manipulation, citing harm from fake trauma-related personas. A clear lesson for any of you thinking about responding to that DM from BustyBlonde69 in the future…

OpenAI launch an updated o3 model and it’s super cheap (relatively speaking): The shipping doesn’t stop over at Sam Altman HQ, and this week they have dropped the price of o3, their most advanced reasoning model, by 80%!

Now, if only they could work on the inference time so I don’t have to wait 5,000 years for my prompt of “Create me a business plan for a $1,000,000 a month business that requires little-to-no effort and DON’T MAKE MISTAKES”.

In all seriousness, for the more complex tasks this is a mega reduction, and we have been using it internally a lot more now!

Palantir Builds Vast US Surveillance Network: Palantir Technologies is spearheading the creation of what critics call “the most expansive civilian surveillance network in U.S. history”, stitching together data from the IRS, Social Security, immigration records, and more via its Gotham software. In what feels like a McCarthy-esque wet dream, the initiative, already supported by an OG Trump-era exec order for interagency data sharing, has been designed to promote the sharing of behavioural analysis, fraud detection, and national security screening. It’s an approach critics say could be ‘unconstitutional’, arguing it creates a “digital dragnet” that is open to governmental abuse (hello ICE) and that could erode citizens’ right to privacy without proper governance, a point made especially poignant by the immigration riots in LA over the past week.

AI Pilots Fighter Jet in Combat for First Time: Saab, the Swedish aerospace manufacturer perhaps best-known for making cars with a mind of their own, has trialled its first fighter jet with one of its own. Using tech firm Helsing’s AI agent ‘Centaur’, Gripen E jets were pitted against real-life pilots in a series of aerial exercises, dubbed ‘Project Beyond’. And although no decisive winner emerged, the trial demonstrated AI can match seasoned pilots in high‑stakes aerial engagements. The initiative was funded by the Swedish government as a part of their wider effort to develop next-gen fighter aircraft. ‘Beyond’ is indicative of the accelerating arms race between developed nations to develop autonomous warfare systems, but you can’t help but think it will lay the foundations for pilotless commercial air travel in the future. Who is the captain now? No seriously…

Mistral Debuts "Magistral," Europe’s First Reasoning Model: We’re huge fans of Mistral and big proponents of Open Source/Weights and Data, so when we saw that they had released their first reasoning model(s), Magistral, we were really happy. It stands up pretty well against some of the other open-source reasoning models out there, like DeepSeek R1. It is quite excellent at Math, too, albeit a little bit nuts at times.

Guy with a 9-year Duolingo streak deletes the app because he says it's becoming hollow after CEO says they are prioritizing speed: From a brand perspective, you know you’ve really screwed the pooch when your most consistent and engaged customer quits on you in protest. That’s what one Duolingo user has done, giving up their nine-year usage streak in revulsion at the language learning app’s decision to shed human workers in its push for wider AI integration. Tech entrepreneur Heath Ahrens highlighted as much in a post on X, slamming Duolingo CEO Luis von Ahn’s recent LinkedIn post declaring the company would go “AI‑first,” calling it “the most tone‑deaf message in tech history.”

Gabbard used AI to understand classified JFK assassination files: In headline news that could’ve got the tinfoil hat crew a little excited, U.S. Director of National Intelligence Tulsi Gabbard revealed at an AWS summit that her office used AI to expedite the declassification of approximately 80,000 pages of JFK and RFK assassination files. The AI system scanned and flagged sensitive content, especially details affecting living relatives, far faster than manual review, compressing what could have taken months or years into weeks. Sadly, no word yet if it was our reptilian overlords responsible for the historic assassinations, but the use of clinician-grade chatbots in government work signifies a broader shift towards the adoption of AI tools to free workers from repetitive tasks. However, its consistency may only be on par with the interns’, with thousands of items of sensitive personal data, such as social security numbers, being left unredacted. Doh.

Nvidia builds world-first industrial AI cloud to advance European manufacturing: Nvidia and Deutsche Telekom are partnering to launch Europe’s first industrial AI cloud, or “AI factory”, in Germany by 2026. The huge data centre is designed to power AI workloads for manufacturers. It will feature 10,000 Nvidia Blackwell GPUs, including DGX B200 systems and RTX PRO servers, supporting digital twins, robotics, engineering simulation, factory planning and logistics. Nvidia CEO Jensen Huang framed the effort as a move toward “sovereign AI,” boosting Europe’s digital independence by combining local infrastructure with global innovation, while Chancellor Merz highlighted its strategic value for Germany’s economy and its own digital sovereignty.

Ilya Sutskever made a rare appearance after coming out from building his $30bn zero-product company, Safe Superintelligence Inc., this week. To be fair to the man, he delivered a very cool speech to the graduates of the University of Toronto on the near future of AI. The theme was how AI will be doing all the things. Basically saying your degree will be fucking worthless in a few years, you idiots.

Is AI use finally slowing down? Ramp's data suggests that AI has had its first plateau since early January 2023: Eric Glyman, co‑CEO of fintech multinational Ramp, observed that foundation AI model adoption “flatlined” in May, marking the first time usage growth has stalled since early January 2023. With these foundation models already at over 40% adoption, Glyman implied the industry may be reaching a saturation point or facing new challenges, though he didn't specify the underlying causes. The development highlights a shift in AI integration dynamics and raises questions about whether foundational model deployment is levelling off or if broader factors are at play. Might be time to load those short trades… just kidding, you have to be an idiot to step in the way of this momentum, even if everything is drastically overpriced.

🗣️ Other Titty Bits

Apple CEO: Tim Cook

It’s official: Apple has taken a massive, steaming dump, from a height, over the effectiveness of AI’s current reasoning capability.

In its whitepaper The Illusion of Thinking, the tech behemoth has delivered a fascinating reality check about how well advanced AI models can perform problem-solving tasks of varying complexity.

Apple’s team tested Large Reasoning Models (LRMs), the kind that underpin ChatGPT, Gemini et al., using classic logic puzzles, such as Tower of Hanoi and River Crossing, which are structured to test real reasoning ability. 

It tested models including OpenAI’s o3-mini, Google’s Gemini Thinking, Anthropic’s Claude 3.7 Sonnet-Thinking and DeepSeek-R1.

Researchers said the LRMs’ ability to solve logic problems presented “fundamental limitations” in the technology, and that when presented with problems of high complexity the models would experience a “complete accuracy collapse” where they would fail to generate a correct answer repeatedly.

In some tests, standard LLMs without the extra reasoning step could actually perform simpler tasks more effectively.

The paper said: “Upon approaching a critical threshold, which closely corresponds to their accuracy collapse point, models counterintuitively begin to reduce their reasoning effort despite increasing problem difficulty.”

Sounds very much like me throughout school…

This indicated a “fundamental scaling limitation in the thinking capabilities of current reasoning models”.

We may speculate we’ve crossed the Rubicon in terms of AI’s runaway ability to solve increasingly complex, or never-solved-before problems, but Apple’s research findings suggest we should temper expectations about the likelihood of Artificial General Intelligence (AGI) in the near future.

Andrew Rogoyski, of the Institute for People-Centred AI at the University of Surrey, said Apple’s research indicates the industry was “still feeling its way” on AGI. He said: “The finding that large-scale models lose the plot on complex problems, while performing well on medium- and low-complexity problems, implies that we’re in a potential cul-de-sac in current approaches.”

Others have been more scathing, with one prominent U.S. AI academic, Gary Marcus, saying Apple’s paper is “pretty devastating”, and that “anybody who thinks LLMs are a direct route to the sort [of] AGI that could fundamentally transform society for the good is kidding themselves.” Ouch.

Marcus goes on to suggest we’ll never reach the speculated version of an AGI we’ve hyped up, as the data sets we’ve provided are too inconsistent to allow LRMs and LLMs to consistently generate the correct answers, time and again in logical patterns.

The responses…

Others, however, have pointed out that this is just serious cope from Apple because they are so so so far behind everyone else in the AI race. Users even plugged it into ChatGPT’s new o3 model to see what it had to say…

Source: r/LocalLLaMA

And the best one of all…

A certain Mr C. Opus then slapped back with its first paper, aptly named The Illusion of the Illusion of Thinking… worth a read.

📋 LLM Leaderboard

Source: LMArena

Damn, Google are still in pole position with Gemini 2.5 Pro, but we have a new contender up there rubbing shoulders with it… you guessed it: the recently released ChatGPT-o3…

📲 Trending tools & apps

🫵 Our Picks


What caught our eye this week.

We honestly mean this, just go try Lovable… We’re not sponsored or anything, but this is the biggest aha moment for anyone who hasn’t used AI before. It is free for this weekend only. Type anything in the chat box and watch your idea come to life. You’ll thank us for it later.

  • GenSpark has won the full team over this week. The browser is a game changer, we’re thinking of making a video tutorial on how to use it because it is so so so so good. If anything (not to steal any light here), it has made us super excited for Comet, Perplexity’s browser. IT LAUNCHES TODAY.

Hidden Hack… GenSpark’s $30 a month tier has a very generous credit allowance, which you can use to get cheap Veo3 access! No need to pay the $250 a month on Gemini Pro.

  • Manus Chat - This is a really cool tool to use in Manus, which we are finding ourselves using more and more. If you haven’t tried it already, it is a little more complex, purely down to how powerful it can be, but it would certainly be worth the time and effort.

  • Firecrawl continue to push out more and more cool products. This time they have created a self-populating Clay (CRM) alternative for when you forget a contact’s details or want to find new prospects. Seriously impressed with these guys.

🤓 Educational


Want to actually understand this stuff? Start here.

🔄 Top Trending

Top trending apps this week that you have probably never heard of.

  • Perplexity’s Comet - IT IS ABOUT TO DROP

  • Chat4Data – A Chrome plugin that turns you into a no-code web scraper. Just ask in plain English, and boom, you’ve got structured data (like an Excel sheet) from any webpage.

  • VibrantSnap – Record your screen, drop in a studio‑quality AI avatar, add backgrounds or gradients, and you’ve got a pro‑looking animated clip ready to share, no editing skills needed.

🤝 In Partnership with OpenServ

Who’s Bridging the Agentic Framework Gap?

Agentic AI frameworks are a fragmented mess, forcing builders to juggle platforms and rebuild agents to keep up with daily tech drops. It’s exhausting.

OpenServ fixes this with an AI orchestration layer for seamless interoperability across all frameworks. Focus on what matters: building slick workflows and real outcomes, not platform-hopping.

No-code newbie or pro dev? OpenServ’s Playground Beta lets you spin up agents or tackle complex use cases with ease. Try it now at openserv.ai.

💸 Financials

🕵️‍♂️ FREE ENTRY TO OUR INVITE-ONLY AI CHAT ON TELEGRAM…

If you share this newsletter with a friend and they actively sign up for the Big Machines newsletter, we will send you access to our invite-only Big Machines Telegram group, which is full of builders, investors, founders, and creators.

Access is now only granted to those who refer our newsletter to active subscribers, which means if you sign up on your work email, we will know you sneaky bastards.

This would kill our open rate, so please don't do that, we beg.

👋 Until next week

Remember, we said Apple took a steaming dump over LRMs? Well, if this screengrab from an Apple Glass ad is anything to go by, it was diarrhoea…

Very unfortunate placement indeed…

We’re getting GTA VI-style commercials made with AI, err, before GTA VI. AI filmmaker PJ Ace was commissioned by a real-life television network to make this quite frankly bonkers, non-real TV spot ahead of the NBA Finals. Incredible.

And, we have the full playbook for how you can create AI ads like this with instructions from PJ himself! Have at it.

After spending more and more time on ChatGPT’s new advanced voice feature this week, I have decided it is over for me. This meme represents us all.

Finally, Meta have let the boomers run wild with their new standalone AI app, in which some clever fucker has decided that all conversations with the in-app model should be posted to a public feed lmfaoaoaooaoaoaoaooaoao

We have just taught my nan to stop sharing lost cats from Missouri, even when she lives in the North East of England. People are literally sharing their internal monologues with the world, and I am not even sure they are aware.

Anyway, lock up your elders and don’t let them on this app…

Have a good one! See you next wizzle!

Matt, Sam, Grant, Mike and The Big Machines team.

✍️ How are we doing?

We need your feedback to improve the information we give to you.
