ACM

Uncategorized

Databricks research reveals that building better AI judges isn’t just a technical concern, it’s a people problem

The intelligence of AI models isn’t what’s blocking enterprise deployments. It’s the inability to define and measure quality in the first place. That’s where AI judges are now playing an increasingly important role. In AI evaluation, a “judge” is an AI system that scores outputs from another AI system. Judge Builder is Databricks’ framework for …

Attention ISN’T all you need?! New Qwen3 variant Brumby-14B-Base leverages Power Retention technique

When the transformer architecture was introduced in 2017 in the now seminal Google paper “Attention Is All You Need,” it became an instant cornerstone of modern artificial intelligence. Every major large language model (LLM) — from OpenAI’s GPT series to Anthropic’s Claude, Google’s Gemini, and Meta’s Llama — has been built on some variation of …

98% of market researchers use AI daily, but 4 in 10 say it makes errors — revealing a major trust problem

Market researchers have embraced artificial intelligence at a staggering pace, with 98% of professionals now incorporating AI tools into their work and 72% using them daily or more frequently, according to a new industry survey that reveals both the technology’s transformative promise and its persistent reliability problems. The findings, based on responses from 219 U.S. …

Snowflake builds new intelligence that goes beyond RAG to query and aggregate thousands of documents at once

Enterprise AI has a data problem. Despite billions in investment and increasingly capable language models, most organizations still can’t answer basic analytical questions about their document repositories. The culprit isn’t model quality but architecture: Traditional retrieval augmented generation (RAG) systems were designed to retrieve and summarize, not analyze and aggregate across large document sets. Snowflake …

Inside Zendesk’s dual AI leap: From reliable agents to real-time intelligence with GPT-5 and HyperArc

Presented by Zendesk

Agentic AI is currently transforming three key areas of work (creative, coding, and support), says Shashi Upadhyay, president of engineering, AI, and product at Zendesk. But he notes that support presents a distinct challenge. “Support is special because you’re putting an autonomous AI agent right in front of your customer,” …

Forget Fine-Tuning: SAP’s RPT-1 Brings Ready-to-Use AI for Business Tasks

SAP aims to displace more general large language models with the release of its own foundational “tabular” model, which the company claims will reduce training requirements for enterprises. The model, called SAP RPT-1, is a pre-trained model with business and enterprise knowledge out of the box. SAP calls it a Relational Foundation Model, meaning it …

Developers beware: Google’s Gemma model controversy exposes model lifecycle risks

The recent controversy surrounding Google’s Gemma model has once again highlighted the dangers of using developer test models and the fleeting nature of model availability. Google pulled its Gemma 3 model from AI Studio following a statement from Senator Marsha Blackburn (R-Tenn.) that the Gemma model willfully hallucinated falsehoods about her. Blackburn said the model …

Meet Denario, the AI ‘research assistant’ that is already getting its own papers published

An international team of researchers has released an artificial intelligence system capable of autonomously conducting scientific research across multiple disciplines — generating papers from initial concept to publication-ready manuscript in approximately 30 minutes for about $4 each. The system, called Denario, can formulate research ideas, review existing literature, develop methodologies, write and execute code, create …

AI coding transforms data engineering: How dltHub’s open-source Python library helps developers create data pipelines for AI in minutes

A quiet revolution is reshaping enterprise data engineering. Python developers are building production data pipelines in minutes using tools that would have required entire specialized teams just months ago. The catalyst is dlt, an open-source Python library that automates complex data engineering tasks. The tool has reached 3 million monthly downloads and powers data workflows …

Strengthening Our Core: Welcoming Karyne Levy as VentureBeat’s New Managing Editor

I’m thrilled to announce a fantastic new addition to our leadership team: Karyne Levy is joining VentureBeat as our new Managing Editor. Today is her first day. Many of you may know Karyne from her most recent role as Deputy Managing Editor at TechCrunch, but her career is a highlight reel of veteran tech journalism. …
