ACM

Uncategorized

When AI turns software development inside-out: 170% throughput at 80% headcount

Many people have tried AI tools and walked away unimpressed. I get it — many demos promise magic, but in practice, the results can feel underwhelming. That’s why I want to write this not as a futurist prediction, but from lived experience. Over the past six months, I turned my engineering organization AI-first. I’ve shared …

IndexCache, a new sparse attention optimizer, delivers 1.82x faster inference on long-context AI models

Processing 200,000 tokens through a large language model is expensive and slow: the longer the context, the faster the costs spiral. Researchers at Tsinghua University and Z.ai have built a technique called IndexCache that cuts up to 75% of the redundant computation in sparse attention models, delivering up to 1.82x faster time-to-first-token and 1.48x faster …

The consequential AI work that actually moves the needle for enterprises

Presented by OutSystems

After two years of flashy AI demos, rushed agent prototypes, and breathless predictions, enterprise technology leaders are striking a more pragmatic tone in 2026. In a recent webinar hosted by OutSystems, a panel of software executives and enterprise practitioners made the case that the most consequential AI work happening now is focused …

Intercom’s new post-trained Fin Apex 1.0 beats GPT-5.4 and Claude Sonnet 4.6 at customer service resolutions

Intercom is taking an unusual gamble for a legacy software company: building its own AI model. The 15-year-old customer service platform, based in Dublin, Ireland, on Thursday announced Fin Apex 1.0, a small, purpose-built AI model that the company claims outperforms leading frontier models from OpenAI and Anthropic on the metrics that matter most for customer …

Mistral AI just released a text-to-speech model it says beats ElevenLabs — and it’s giving away the weights for free

The enterprise voice AI market is in the middle of a land grab. ElevenLabs and IBM announced a collaboration just this week to bring premium voice capabilities into IBM’s watsonx Orchestrate platform. Google Cloud has been expanding its Chirp 3 HD voices. OpenAI continues to iterate on its own speech synthesis. And the market underpinning …

Oracle converges the AI data stack to give enterprise agents a single version of truth

Enterprise data teams moving agentic AI into production are hitting a consistent failure point at the data tier. Agents built across a vector store, a relational database, a graph store and a lakehouse require sync pipelines to keep context current. Under production load, that context goes stale. Oracle, whose database infrastructure runs the transaction systems …

Google’s new TurboQuant algorithm speeds up AI memory 8x, cutting costs by 50% or more

As Large Language Models (LLMs) expand their context windows to process massive documents and intricate conversations, they encounter a brutal hardware reality known as the “Key-Value (KV) cache bottleneck.” Every word a model processes must be stored as a high-dimensional vector in high-speed memory. For long-form tasks, this “digital cheat sheet” swells rapidly, devouring the …

How xMemory cuts token costs and context bloat in AI agents

Standard RAG pipelines break when enterprises try to use them for long-term, multi-session LLM agent deployments. This is a critical limitation as demand for persistent AI assistants grows. xMemory, a new technique developed by researchers at King’s College London and The Alan Turing Institute, solves this by organizing conversations into a searchable hierarchy of semantic …

OpenAI is shutting down Sora, its powerful AI video model, app and API

OpenAI is shuttering Sora, its stand-alone AI video generation app and social network, and ending developer access to the Sora 2 video model family through its application programming interface (API), which developers could rely on for their own products or video generation pipelines. The announcement came abruptly this afternoon with OpenAI posting a message …

Anthropic’s Claude can now control your Mac, escalating the fight to build AI agents that actually do work

Anthropic on Monday launched the most ambitious consumer AI agent to date, giving its Claude chatbot the ability to directly control a user’s Mac — clicking buttons, opening applications, typing into fields, and navigating software on the user’s behalf while they step away from their desk. The update, available immediately as a research preview for …
