ACM

Uncategorized

Gemini 3 Pro scores 69% trust in blinded testing, up from 16% for Gemini 2.5: The case for evaluating AI on real-world trust, not academic benchmarks

Just a few short weeks ago, Google debuted its Gemini 3 model, claiming it scored a leadership position in multiple AI benchmarks. But the challenge with vendor-provided benchmarks is that they are just that — vendor-provided. A new vendor-neutral evaluation from Prolific, however, puts Gemini 3 at the top of the leaderboard. This isn’t on …

Workspace Studio aims to solve the real agent problem: Getting employees to use them

One problem enterprises face is getting employees to actually use the AI agents their dev teams have built. Google, which has already shipped many AI tools through its Workspace apps, has made Google Workspace Studio generally available to give more employees access to design, manage and share AI agents, further democratizing agentic workflows. This puts …

Tariff turbulence exposes costly blind spots in supply chains and AI

Presented by Celonis

When tariff rates change overnight, companies have 48 hours to model alternatives and act before competitors secure the best options. At Celosphere 2025 in Munich, enterprises demonstrated how they’re turning that chaos into competitive advantage, with quantifiable results that separate winners from losers. Vinmar International: The global plastics and chemicals distributor created …

AI has redefined the talent game. Here’s how leaders are responding.

Presented by Indeed

As AI continues to reshape how we work, organizations are rethinking what skills they need, how they hire, and how they retain talent. According to Indeed’s 2025 Tech Talent report, tech job postings are still down more than 30% from pre-pandemic highs, yet demand for AI expertise has never been greater. New …

New training method boosts AI multimodal reasoning with smaller, smarter datasets

Researchers at MiroMind AI and several Chinese universities have released OpenMMReasoner, a new training framework that improves the capabilities of language models in multimodal reasoning. The framework uses a two-stage process. It first refines a base model with a curated dataset in a supervised fine-tuning (SFT) stage. Then, a reinforcement learning (RL) stage guides the …

AWS claims 90% vector cost savings with S3 Vectors GA, calls it ‘complementary’ – analysts split on what it means for vector databases

Vector databases emerged as a must-have technology foundation at the beginning of the modern gen AI era. What has changed over the last year, however, is that vectors, the numerical representations of data used by LLMs, have increasingly become just another data type in all kinds of databases. Now, Amazon Web Services (AWS) is …

With Nova Forge, AWS gives companies a path to build foundation-class models without GPUs

Amazon Web Services (AWS) is leaning into the growing trend toward custom models with a new service that it says will let enterprises bring more personalization and internal knowledge. The move comes alongside the release of AWS’s new models as part of its Nova family, which expands the capabilities of its reasoning models. Nova 2 Lite, …

Amazon’s new AI can code for days without human help. What does that mean for software engineers?

Amazon Web Services on Tuesday announced a new class of artificial intelligence systems called “frontier agents” that can work autonomously for hours or even days without human intervention, representing one of the most ambitious attempts yet to automate the full software development lifecycle. The announcement, made during AWS CEO Matt Garman’s keynote address at the …

AWS goes beyond prompt-level safety with automated reasoning in AgentCore

AWS is leveraging automated reasoning, which uses math-based verification, to build out new capabilities in its Amazon Bedrock AgentCore platform as the company digs deeper into the agentic AI ecosystem. At its annual re:Invent conference in Las Vegas, AWS announced three new capabilities for AgentCore: “policy,” “evaluations” and “episodic memory.” The new …

With AI browsers creating fresh security and privacy concerns, Norton Neo is the first to enter with a safety-first approach

The AI browser wars are heating up. OpenAI and other AI companies like Perplexity have gotten a lot of attention with their new AI-first and agentic browsers. They’re being positioned as direct competition to Google, which currently holds a 70% share of the market with its Chrome browser. As the incumbent, Google has been slower …
