ACM

Uncategorized

Why most enterprise AI coding pilots underperform (Hint: It’s not the model)

Gen AI in software engineering has moved well beyond autocomplete. The emerging frontier is agentic coding: AI systems capable of planning changes, executing them across multiple steps and iterating based on feedback. Yet despite the excitement around “AI agents that code,” most enterprise deployments underperform. The limiting factor is no longer the model. It’s context: …

Google’s new framework helps AI agents spend their compute and tool budget more wisely

In a new paper that studies tool-use in large language model (LLM) agents, researchers at Google and UC Santa Barbara have developed a framework that enables agents to make more efficient use of tool and compute budgets. The researchers introduce two new techniques: a simple “Budget Tracker” and a more comprehensive framework called “Budget Aware …

Ai2’s new Olmo 3.1 extends reinforcement learning training for stronger reasoning benchmarks

The Allen Institute for AI (Ai2) recently released what it calls its most powerful family of models yet, Olmo 3. But the company kept iterating on the models, expanding its reinforcement learning (RL) runs, to create Olmo 3.1. The new Olmo 3.1 models focus on efficiency, transparency, and control for enterprises. Ai2 updated two of …

GPT-5.2 first impressions: a powerful update, especially for business tasks and workflows

OpenAI has officially released GPT-5.2, and the reactions from early testers (to whom OpenAI seeded the model several days, and in some cases weeks, before the public release) paint a two-toned picture: it is a monumental leap forward for deep, autonomous reasoning and coding, yet potentially an underwhelming “incremental” update for casual …

Nous Research just released Nomos 1, an open-source AI that ranks second on the notoriously brutal Putnam math exam

Nous Research, the San Francisco-based artificial intelligence startup, released on Tuesday an open-source mathematical reasoning system called Nomos 1 that achieved near-elite human performance on this year’s William Lowell Putnam Mathematical Competition, one of the most prestigious and notoriously difficult undergraduate math contests in the world. The Putnam is known for its difficulty: While a …

Cohere’s Rerank 4 quadruples the context window over 3.5 to cut agent errors and boost enterprise search accuracy

Almost a year after releasing Rerank 3.5, Cohere launched the latest version of its search model, now with a larger context window to help agents find the information they need to complete their tasks. Cohere said in a blog post that Rerank 4 has a 32K context window, representing a four-fold increase compared to 3.5. …

OpenAI’s GPT-5.2 is here: what enterprises need to know

The rumors were true, and the “Code Red” is over: OpenAI today announced the release of its new frontier large language model (LLM) family: GPT-5.2. It comes at a pivotal moment for the AI pioneer, which has faced intensifying pressure since rival Google’s Gemini 3 LLM seized the top spot on major third-party performance leaderboards …

Marble enters the race to bring AI to tax work, armed with $9 million and a free research tool

Marble, a startup building artificial intelligence agents for tax professionals, has raised $9 million in seed funding as the accounting industry grapples with a deepening labor shortage and mounting regulatory complexity. The round, led by Susa Ventures with participation from MXV Capital and Konrad Capital, positions Marble to compete in a market where AI adoption …

Creating a glass box: How NetSuite is engineering trust into AI

Presented by Oracle NetSuite

When any company tells you it is their biggest product release in almost three decades, it’s worth listening. When the person saying it founded the world’s first cloud computing company, it’s time to take note. At SuiteWorld 2025, Evan Goldberg, founder and EVP of Oracle NetSuite, did just that when he …
