ACM

Uncategorized

Kimi K2.6 runs agents for days — and exposes the limits of enterprise orchestration

Most orchestration frameworks were built for agents that run for seconds or minutes. Now that agents are running for hours — and in some cases days — those frameworks are starting to crack. Several model providers, such as Anthropic with Claude Code and OpenAI with Codex, introduced early support for long-horizon agents through multi-session tasks, …

Three AI coding agents leaked secrets through a single prompt injection. One vendor’s system card predicted it

A security researcher, working with colleagues at Johns Hopkins University, opened a GitHub pull request, typed a malicious instruction into the PR title, and watched Anthropic’s Claude Code Security Review action post its own API key as a comment. The same prompt injection worked on Google’s Gemini CLI Action and GitHub’s Copilot Agent (Microsoft). No …

What AI model should you use for revenue intelligence? Von says all the big ones, and it will automate mixing and matching for you

Looking at enterprise AI adoption, VentureBeat has anecdotally observed a fairly wide divergence when it comes to specific roles: For those who build—engineers and developers—the arrival of AI has been transformative, moving through the workflow with the speed of tools like Claude Code and Cursor to automate the heavy lifting of syntax and architecture. Yet, …

Adversaries hijacked AI security tools at 90+ organizations. The next wave has write access to the firewall

Adversaries injected malicious prompts into legitimate AI tools at more than 90 organizations in 2025, stealing credentials and cryptocurrency. Every one of those compromised tools could read data, and none of them could rewrite a firewall rule. The autonomous SOC agents shipping now can. That escalation, from compromised tools that read data to autonomous agents …

Train-to-Test scaling explained: How to optimize your end-to-end AI compute budget for inference

The standard guidelines for building large language models (LLMs) optimize only for training costs and ignore inference costs. This poses a challenge for real-world applications that use inference-time scaling techniques to increase the accuracy of model responses, such as drawing multiple reasoning samples from a model at deployment. To bridge this gap, researchers at University …

Most enterprises can’t stop stage-three AI agent threats, VentureBeat survey finds

A rogue AI agent at Meta passed every identity check and still exposed sensitive data to unauthorized employees in March. Two weeks later, Mercor, a $10 billion AI startup, confirmed a supply-chain breach through LiteLLM. Both trace to the same structural gap: monitoring without enforcement, and enforcement without isolation. A VentureBeat three-wave survey of 108 …

Anthropic just launched Claude Design, an AI tool that turns prompts into prototypes and challenges Figma

Anthropic today launched Claude Design, a new product from its Anthropic Labs division that allows users to create polished visual work — designs, interactive prototypes, slide decks, one-pagers, and marketing collateral — through conversational prompts and fine-grained editing controls. The release, available immediately in research preview to all paid Claude subscribers, is the company’s most …

Should my enterprise AI agent do that? NanoClaw and Vercel launch easier agentic policy setting and approval dialogs across 15 messaging apps

For the past year, early adopters of autonomous AI agents have been forced to play a murky game of chance: keep the agent in a useless sandbox or give it the keys to the kingdom and hope it doesn’t hallucinate a catastrophic “delete all” command. To unlock the true utility of an agent—scheduling meetings, triaging …

Salesforce launches Headless 360 to turn its entire platform into infrastructure for AI agents

Salesforce on Wednesday unveiled the most ambitious architectural transformation in its 27-year history, introducing “Headless 360” — a sweeping initiative that exposes every capability in its platform as an API, MCP tool, or CLI command so AI agents can operate the entire system without ever opening a browser. The announcement, made at the company’s annual …

Are we getting what we paid for? How to turn AI momentum into measurable value

Enterprise AI is entering a new phase — one where the central question is no longer what can be built, but how to make the most of our AI investment. At VentureBeat’s latest AI Impact Tour session, Brian Gracely, director of portfolio strategy at Red Hat, described the operational reality inside large organizations: AI sprawl, …
