ACM

Uncategorized

AI inference costs dropped up to 10x on Nvidia’s Blackwell — but hardware is only half the equation

Lowering the cost of inference is typically a combination of hardware and software. A new analysis released Thursday by Nvidia details how four leading inference providers are reporting 4x to 10x reductions in cost per token. The dramatic cost reductions were achieved using Nvidia’s Blackwell platform with open-source models. Production deployment data from Baseten, DeepInfra, …

z.ai’s open source GLM-5 achieves record low hallucination rate and leverages new RL ‘slime’ technique

Chinese AI startup Zhipu, aka z.ai, is back this week with an eye-popping new frontier large language model: GLM-5. The latest in z.ai’s ongoing and continually impressive GLM series, it retains an open source MIT License, perfect for enterprise deployment, and, in one of several notable achievements, achieves a record-low hallucination rate on …

MIT’s new fine-tuning method lets LLMs learn new skills without losing old ones

When enterprises fine-tune LLMs for new tasks, they risk breaking everything the models already know. This forces companies to maintain separate models for every skill. Researchers at MIT, the Improbable AI Lab and ETH Zurich have developed a new technique that enables large language models to learn new skills and knowledge without forgetting their past …

Anthropic’s Claude Cowork finally lands on Windows — and it wants to automate your workday

Anthropic released its Claude Cowork AI agent software for Windows on Monday, bringing the file management and task automation tool to roughly 70 percent of the desktop computing market and intensifying a remarkable corporate realignment that has seen Microsoft embrace a direct competitor to its longtime AI partner, OpenAI. The Windows launch arrives with what …

Anthropic published the prompt injection failure rates that enterprise security teams have been asking every vendor for

Run a prompt injection attack against Claude Opus 4.6 in a constrained coding environment, and it fails every time: a 0% success rate across 200 attempts, with no safeguards needed. Move that same attack to a GUI-based system with extended thinking enabled, and the picture changes fast. A single attempt gets through 17.8% of the time without …

Why enterprise IT operations are breaking — and how AgenticOps fixes them

Presented by Cisco

AI agents are breaking traditional IT operations models, adding complexity, data silos, and fragmented workflows. DJ Sampath, Cisco’s SVP of AI Software and Platform, believes that AgenticOps is the solution: a new operational paradigm where humans and AI collaborate in real time to create efficiency, boost security, and allow for innovative technological …

NanoClaw solves one of OpenClaw’s biggest security issues — and it’s already powering the creator’s biz

The rapid viral adoption of Austrian developer Peter Steinberger’s open source AI assistant OpenClaw in recent weeks has sent enterprises and indie developers into a tizzy. It’s easy to see why: OpenClaw is freely available now and offers a powerful means of autonomously completing work and performing tasks across a user’s entire computer, phone, or …

OpenAI upgrades its Responses API to support agent skills and a complete terminal shell

Until recently, the practice of building AI agents has been a bit like training a long-distance runner with a thirty-second memory. Yes, you could give your AI models tools and instructions, but after a few dozen interactions — several laps around the track, to extend our running analogy — it would inevitably lose context and …

‘Observational memory’ cuts AI agent costs 10x and outscores RAG on long-context benchmarks

RAG isn’t always fast enough or intelligent enough for modern agentic AI workflows. As teams move from short-lived chatbots to long-running, tool-heavy agents embedded in production systems, those limitations are becoming harder to work around. In response, teams are experimenting with alternative memory architectures — sometimes called contextual memory or agentic memory — that prioritize …
