mastodon.online is one of the many independent Mastodon servers you can use to participate in the fediverse.
A newer server operated by the Mastodon gGmbH non-profit.

#aihype

6 posts · 6 participants · 0 posts today

Who could have predicted this? 🙄 State-of-the-art LLMs score less than 5% on average on the 2025 USA Math Olympiad, despite having been trained extensively on past editions:

arxiv.org/abs/2503.21934

arXiv.org · Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Recent math benchmarks for large language models (LLMs) such as MathArena indicate that state-of-the-art reasoning models achieve impressive performance on mathematical competitions like AIME, with the leading model, o3-mini, achieving scores comparable to top human competitors. However, these benchmarks evaluate models solely based on final numerical answers, neglecting rigorous reasoning and proof generation which are essential for real-world mathematical tasks. To address this, we introduce the first comprehensive evaluation of full-solution reasoning for challenging mathematical problems. Using expert human annotators, we evaluated several state-of-the-art reasoning models on the six problems from the 2025 USAMO within hours of their release. Our results reveal that all tested models struggled significantly, achieving less than 5% on average. Through detailed analysis of reasoning traces, we identify the most common failure modes and find several unwanted artifacts arising from the optimization strategies employed during model training. Overall, our results suggest that current LLMs are inadequate for rigorous mathematical reasoning tasks, highlighting the need for substantial improvements in reasoning and proof generation capabilities.
#ai #AIhype #llm

Seeing what ChatGPT, Claude and Gemini can do today, I think that AI-generated products and services will soon be competing against other AI-generated products and services.

Very interesting times await us.

How do you find your way in this? I have an idea: I will become an ascetic, or some kind of slow-living farmer. #AIHype

🎈 Alibaba’s Tsai Warns of ‘Bubble’ in AI Data Center Buildout

“I start to see the beginning of some kind of bubble,” he told delegates to the summit. Some of the envisioned projects commenced raising funds without having secured “uptake” agreements, he added. “I start to get worried when people are building data centers on spec. There are a number of people coming up, funds coming out, to raise billions or millions of capital.”

finance.yahoo.com/news/alibaba

Yahoo Finance · Alibaba’s Tsai Warns of ‘Bubble’ in AI Data Center Buildout · By Luz Ding
#ai #aihype #bubble

"My core theses — The Rot Economy (that the tech industry has become dominated by growth), The Rot-Com Bubble (that the tech industry has run out of hyper-growth ideas), and that generative AI has created a kind of capitalist death cult where nobody wants to admit that they're not making any money — are far from comfortable.

The ramifications of a tech industry that has become captured by growth are that true innovation is being smothered by people that neither experience nor know how (or want) to fix real problems, and that the products we use every day are being made worse for a profit. These incentives have destroyed value-creation in venture capital and Silicon Valley at large, lionizing those who are able to show great growth metrics rather than creating meaningful products that help human beings.

The ramifications of the end of hyper-growth mean a massive reckoning for the valuations of tech companies, which will lead to tens of thousands of layoffs and a prolonged depression in Silicon Valley, the likes of which we've never seen.

The ramifications of the collapse of generative AI are much, much worse. On top of the fact that the largest tech companies have burned hundreds of billions of dollars to propagate software that doesn't really do anything that resembles what we think artificial intelligence looks like, we're now seeing that every major tech company (and an alarming amount of non-tech companies!) is willing to follow whatever it is that the market agrees is popular, even if the idea itself is flawed.

Generative AI has laid bare exactly how little the markets think about ideas, and how willing the powerful are to try and shove something unprofitable, unsustainable and questionably-useful down people's throats as a means of promoting growth.
(...)
In short, reality can fucking suck, but a true skeptic learns to live in it."

wheresyoured.at/optimistic-cow

Ed Zitron's Where's Your Ed At · The Phony Comforts of AI Optimism
A few months ago, Casey Newton of Platformer ran a piece called "The phony comforts of AI skepticism," framing those who would criticize generative AI as "having fun," damning them as "hyper-fixated on the things [AI] can't do." I am not going to focus too hard on this blog, in

AI Will Replace Engineers? 🤖
Oh, sweet summer child…

AI isn’t thinking. It’s autocomplete on steroids.
It guesses confidently, fails quietly, and we pretend it’s magic.

Engineers won’t be replaced.
They’ll be cleaning up AI’s mess.

But sure, dream of your AI CEO…
Full Post: linkedin.com/posts/yuna-morgen

We are living through a decade of #aiHype in which lamebrain experts announce, every year, the imminent standalone replacement of #HumanRadiologists by #AIRadiologists, pushed all over the news...

They show no curiosity about research from other fields, such as the #IroniesOfAutomation.

erictopol.substack.com/p/when-

The closing line of the last paper:

» Moreover, #radiologists take significantly more time to make a decision when #AI information is provided «

economics.mit.edu/sites/defaul

#AutomationParadox #LessonsLearned in #Aviation #HumanFactors applied training

1/2

Ground Truths · When Doctors With A.I. Are Outperformed by A.I. Alone · By Eric Topol

“There is a long, long, long way from writing basic reports to the kind of AI that could match the originality of top human scientists.”
—Gary Marcus, Ezra Klein’s new take on AGI – and why I think it’s probably wrong
#ai #agi #aihype