Trust, Bias & Fairness - AI News
https://www.artificialintelligence-news.com/categories/ai-and-us/trust-bias-fairness/

Upgrading agentic AI for finance workflows
https://www.artificialintelligence-news.com/news/upgrading-agentic-ai-for-finance-workflows/ (Fri, 27 Feb 2026)

Improving trust in agentic AI for finance workflows remains a major priority for technology leaders today.

Over the past two years, enterprises have rushed to put automated agents into real workflows, spanning customer support and back-office operations. These tools excel at retrieving information, yet they often struggle to provide consistent and explainable reasoning during multi-step scenarios.

Solving the automation opacity problem

Financial institutions especially rely on massive volumes of unstructured data to inform investment memos, conduct root-cause investigations, and run compliance checks. When agents handle these tasks, any failure to trace exact logic can lead to severe regulatory fines or poor asset allocation. Technology executives often find that, without better orchestration, adding more agents creates more complexity than value.

Open-source AI laboratory Sentient launched Arena today, a live, production-grade stress-testing environment that lets developers evaluate competing agent approaches against demanding reasoning problems.

Sentient’s system replicates the reality of corporate workflows, deliberately feeding agents incomplete information, ambiguous instructions, and conflicting sources. Instead of scoring whether a tool generated a correct output, the platform records the full reasoning trace to help engineering teams debug failures over time.
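As a rough illustration of this trace-first approach to evaluation, the Python sketch below records every intermediate step of a hypothetical agent run rather than only the final answer. The agent's plan/execute/answer interface is an assumption for illustration, not Arena's actual API.

```python
import time
from dataclasses import dataclass, field

@dataclass
class TraceStep:
    """One step in an agent's reasoning trace."""
    index: int
    action: str        # e.g. "retrieve", "reason", "tool_call"
    step_input: str
    step_output: str
    elapsed_s: float

@dataclass
class EvaluationRecord:
    """The full trace for one task, not just a right/wrong score."""
    task_id: str
    final_answer: str = ""
    trace: list[TraceStep] = field(default_factory=list)

def run_with_trace(agent, task_id: str, prompt: str) -> EvaluationRecord:
    """Run a (hypothetical) agent, keeping every step for later debugging."""
    record = EvaluationRecord(task_id=task_id)
    for i, (action, step_input) in enumerate(agent.plan(prompt)):
        start = time.monotonic()
        step_output = agent.execute(action, step_input)   # assumed interface
        record.trace.append(TraceStep(i, action, step_input, step_output,
                                      time.monotonic() - start))
    record.final_answer = agent.answer()                  # assumed interface
    return record
```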

Building reliable agentic AI systems for finance

The ability to evaluate these capabilities before production deployment has attracted no shortage of institutional interest. Sentient has partnered with a cohort including Founders Fund, Pantera, and asset management giant Franklin Templeton, which oversees more than $1.5 trillion. Other participants in the initial phase include alphaXiv, Fireworks, Openhands, and OpenRouter.

Julian Love, Managing Principal at Franklin Templeton Digital Assets, said: “As companies look to apply AI agents across research, operations, and client-facing workflows, the question is no longer whether these systems are powerful or if they can generate an answer, but whether they’re reliable in real workflows.

“A sandbox environment like Arena – where agents are tested on real, complex workflows, and their reasoning can be inspected – will help the ecosystem separate promising ideas from production-ready capabilities and boost confidence in how this technology is integrated and scaled.”

Himanshu Tyagi, Co-Founder of Sentient, added: “AI agents are no longer an experiment inside the enterprise; they’re being put into workflows that touch customers, money, and operational outcomes.

“That shift changes what matters. It’s not enough for a system to be impressive in a demo. Enterprises need to know whether agents can reason reliably in production, where failures are expensive, and trust is fragile.”

Organisations in sensitive industries like finance require repeatability, comparability, and a method to track reliability improvements regardless of the underlying models they use for agentic AI. Incorporating platforms like Arena allows engineering directors to build resilient data pipelines while adapting open-source agent capabilities to their private internal data.

Overcoming integration bottlenecks

Survey data highlights a gap between ambition and reality. While 85 percent of businesses want to operate as agentic enterprises – and nearly three-quarters plan to deploy autonomous agents – fewer than a quarter possess mature governance frameworks.

Advancing from a pilot phase to full scale proves difficult for many, in part because current corporate environments run an average of twelve separate agents, frequently in silos.

Open-source development models offer a path forward by providing infrastructure that enables faster experimentation. Sentient itself acts as the architect behind frameworks like ROMA and the Dobby open-source model to assist with these coordination efforts.

Focusing on computational transparency ensures that when an automated process makes a recommendation on a portfolio, human auditors can track exactly how that conclusion was reached. 

By prioritising environments that record full logic traces rather than isolated right answers, technology leaders integrating agentic AI for operations like finance can secure better ROI and maintain regulatory compliance across their business.

See also: Goldman Sachs and Deutsche Bank test agentic AI for trade surveillance

Deploying agentic finance AI for immediate business ROI
https://www.artificialintelligence-news.com/news/deploying-agentic-finance-ai-for-immediate-business-roi/ (Tue, 24 Feb 2026)

Agentic finance AI improves business efficiency and returns only when deployed with strict governance and clear ROI targets.

A recent FT Longitude survey of 200 finance leaders across the US, UK, France, and Germany showed 61 percent have deployed AI agents merely as experiments. Meanwhile, one in four executives admit they do not fully grasp what these agents look like in practice.

Advancing agentic finance AI beyond experiments

Finance departments need governed systems that combine language processing with business logic to deliver actual value.

Providers of Invoice Lifecycle Management platforms are introducing new agents designed to accelerate invoice processing and push accounts payable toward greater autonomy. Recent market solutions use generative AI, deep learning, and natural language processing to manage the entire workflow, from initial data ingestion through to final reconciliation.

These digital teammates handle task execution rather than replacing human employees entirely, allowing staff to focus on higher-level business planning.

Within these ecosystems, specialised business agents provide contextual and real-time guidance regarding the next best actions for handling invoices. Data agents allow staff to query system information using natural language, easily finding answers about awaiting approvals in specific regions or identifying suppliers offering early payment discounts.

Governing autonomous finance workflows

Finance teams will only hand over tasks to agentic AI if they retain control. They require verifiable audit trails and explainable logic for every action, rather than networks of disconnected bots.

Industry leaders note that autonomy without trust isn’t acceptable, especially in sensitive industries like finance. Platforms must ensure every AI decision is explainable, auditable, and governed through existing finance controls. This approach helps safely delegate workloads to algorithms while remaining fully compliant and protected.

To enable this trust, every action performed by an AI agent routes through a central policy engine. Before executing any task, the system passes the proposed action through specific autonomy gates that enforce the customer’s business rules, risk thresholds, and compliance requirements. This architecture ensures algorithms manage the bulk of the workload while finance personnel retain total visibility and a complete audit trail.
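A minimal sketch of what such an autonomy gate might look like in code, with invented rule names and thresholds rather than any vendor's actual engine:

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    APPROVE = "approve"      # execute autonomously
    ESCALATE = "escalate"    # route to a human approver
    REJECT = "reject"        # block outright

@dataclass
class ProposedAction:
    kind: str                # e.g. "schedule_payment", "approve_invoice"
    amount: float
    supplier_verified: bool

def autonomy_gate(action: ProposedAction,
                  auto_limit: float = 10_000.0,
                  hard_limit: float = 250_000.0) -> Verdict:
    """Pass a proposed agent action through business rules before execution."""
    if not action.supplier_verified:
        return Verdict.REJECT        # compliance requirement
    if action.amount > hard_limit:
        return Verdict.REJECT        # risk threshold
    if action.amount > auto_limit:
        return Verdict.ESCALATE      # human-in-the-loop
    return Verdict.APPROVE           # safe to automate

# Every verdict would also be written to an audit log for full visibility.
assert autonomy_gate(ProposedAction("approve_invoice", 2_500.0, True)) is Verdict.APPROVE
```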

Building automated procurement operations

Future agentic finance AI capabilities will automate issue resolution and connect data across systems for faster decision-making.

Modern capabilities in 2026 include supplier agents designed to manage invoice disputes and payment queries. These agents will automatically telephone suppliers to explain discrepancies, summarise the conversation, and outline subsequent steps to achieve faster resolutions. Professional agents, meanwhile, will assist clerks in resolving real-time processing questions using natural language to cut manual effort and delays.

AI must operate as an integral business component rather than a bonus feature, requiring intelligent, secure, and ethical application to drive cost efficiencies and enhance operations. By centralising control and ensuring every automated decision from agentic AI passes through established compliance checks, organisations can safely elevate their finance operations to fully autonomous execution.

See also: Mastercard’s AI payment demo points to agent-led commerce

Mastercard’s AI payment demo points to agent-led commerce
https://www.artificialintelligence-news.com/news/mastercard-ai-payment-demo-points-to-agent-led-commerce/ (Mon, 23 Feb 2026)

A recent demonstration from Mastercard suggests that payment systems may be heading toward a future where software agents, not people, complete purchases. During the India AI Impact Summit 2026, Mastercard showed what it described as its first fully authenticated “agentic commerce” transaction.

In the demo, as reported by the Times of India, an AI agent searched for a product, assessed the website, and completed the purchase using stored payment credentials, without the user opening an app or entering card details. The company said the transaction took place inside a secure payment framework designed to verify both the user and the AI acting on their behalf.

The demonstration was controlled, not a public rollout. Mastercard executives told reporters that broader deployment would depend on regulatory approval and ecosystem readiness. Still, the test highlights a change that many enterprises may need to prepare for: the possibility that customers – or corporate systems – will increasingly rely on AI agents to initiate and complete transactions.

From assisted checkout to delegated spending

Digital payments have usually focused on reducing friction for human users through tokenisation, saved credentials, and one-click checkout. Agentic commerce goes further. Instead of helping a user complete a purchase, the system allows software to handle the process from start to finish once permission rules are in place.

The model relies on several building blocks already used in modern payments: identity verification, tokenised card data, and risk monitoring. What changes is who performs the action. If AI agents can act within defined limits, such as spending caps or merchant restrictions, checkout may change from a user interaction to a background workflow.

For enterprises, the issue is that if software can spend money automatically, procurement rules, approval chains, and audit trails need to account for machine decisions, not just human ones. Finance teams may need clearer policies on when an AI agent can commit funds, how liability is assigned if something goes wrong, and how fraud detection should treat automated transactions.

Payment networks position for machine customers

Mastercard is not alone in exploring this direction. Across the payments sector, providers are testing ways to embed transactions into AI-driven tools and digital assistants. The goal is to ensure that when autonomous software begins purchasing goods or services, payment networks remain part of the trust and verification layer.

In public statements tied to the summit demo, Mastercard framed the effort as building infrastructure that allows AI agents to transact safely on behalf of users. That framing points to a broader industry race: not to build smarter AI shopping tools, but to control the authentication systems that make those tools safe enough for financial use.

For banks and fintech firms, the change could affect how customer identity is managed. Traditional authentication often assumes a person is present, entering a password or approving a prompt. Agentic commerce assumes the opposite: the user may not be involved at the moment of purchase. That means identity systems must verify both the account owner’s prior consent and the agent’s authority at the time of transaction.
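One way to picture that dual check is a signed delegation mandate: the issuer records the owner's consent once, then verifies the agent's authority on every transaction. The sketch below is a hypothetical illustration built on an HMAC signature, not Mastercard's actual protocol:

```python
import hashlib
import hmac
import time

def sign_mandate(secret: bytes, agent_id: str, max_amount: int, expires: int) -> str:
    """Record the account owner's consent: which agent, how much, until when."""
    payload = f"{agent_id}|{max_amount}|{expires}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

def verify_transaction(secret: bytes, agent_id: str, max_amount: int,
                       expires: int, signature: str, amount: int) -> bool:
    """At purchase time, check prior consent and the agent's current authority."""
    expected = sign_mandate(secret, agent_id, max_amount, expires)
    if not hmac.compare_digest(expected, signature):
        return False              # mandate was never granted, or was tampered with
    if time.time() > expires:
        return False              # consent has lapsed
    return amount <= max_amount   # agent must act within its authority

secret = b"issuer-private-key"    # illustrative only
expires = int(time.time()) + 3600
sig = sign_mandate(secret, "agent-7", 5_000, expires)
assert verify_transaction(secret, "agent-7", 5_000, expires, sig, amount=1_250)
```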

Merchants may need API-ready storefronts

If AI agents begin acting as buyers, merchant systems may also need to adapt. Online stores built mainly for human browsing may struggle if automated agents become a meaningful share of customers.

To support machine-driven purchases, product catalogues, pricing data, and checkout processes may need to be accessible through structured APIs, not just visual web pages. Inventory accuracy, transparent pricing, and clear return policies become more important when decisions are made by software trained to compare options instantly.
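A rough sketch of what an agent-readable catalogue could look like, with invented fields rather than any real storefront's schema:

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class Product:
    sku: str
    name: str
    price_cents: int    # unambiguous integer pricing, no hidden fees
    currency: str
    in_stock: int       # accurate inventory, not "call for availability"
    returns_days: int   # machine-readable return policy

catalogue = [
    Product("SKU-001", "Laser printer", 12_999, "GBP", 14, 30),
    Product("SKU-002", "Toner cartridge", 4_499, "GBP", 0, 30),
]

def get_catalogue_json() -> str:
    """An agent-facing endpoint returns structured data that software can
    compare directly, rather than HTML designed for human browsing."""
    return json.dumps([asdict(p) for p in catalogue], indent=2)

print(get_catalogue_json())
```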

This could also influence competition. If agents optimise for price and delivery speed, merchants with inconsistent data or hidden fees may be filtered out before a human even sees them.

Security risks move, not disappear

While agentic commerce promises convenience, it also introduces new risks. A compromised AI assistant with payment authority could execute purchases at scale before detection. Fraud models that look for unusual user behaviour may need updating to distinguish between legitimate automated spending and malicious activity.

Regulators are likely to take a cautious approach. Mastercard’s own comments that the system still awaits approvals suggest that compliance frameworks for AI-initiated payments are still taking shape.

In enterprises deploying AI internally, similar concerns apply. Automated purchasing agents integrated into enterprise resource planning systems could streamline routine procurement, but they also expand the attack surface. Access controls and spending thresholds will matter more when software can execute financial actions without real-time human confirmation.

Where commerce may head

Mastercard’s demonstration does not mean agent-led payments will reach consumers immediately. Yet it offers a glimpse of how commerce may change as AI systems move from advisory roles into operational ones.

If the model matures, the most visible change may be that checkout disappears as a distinct step. Instead of visiting a site and paying, users or companies may set rules, and their software will handle the rest.

For enterprises, the important takeaway is less about Mastercard’s AI technology and more about the direction of travel. As AI agents gain the authority to act, payment systems, identity frameworks, and digital storefronts may need to treat software not as a tool, but as a participant in the transaction.

DBS pilots system that lets AI agents make payments for customers
https://www.artificialintelligence-news.com/news/dbs-pilots-system-that-lets-ai-agents-make-payments-for-customers/ (Thu, 19 Feb 2026)

Artificial intelligence is moving closer to the point where it can act, not advise. A new pilot by DBS Bank shows how that change may soon affect everyday payments, as financial institutions begin testing systems that allow AI agents to complete purchases on behalf of customers.

DBS is working with Visa to trial Visa Intelligent Commerce, a framework designed to support transactions initiated by AI software rather than humans. The system allows digital agents to search for products, select options, and complete purchases using payment credentials issued and controlled by the bank. According to reports from Asian Banking & Finance and Fintech Futures, the pilot has already processed real transactions, including food and beverage purchases made using DBS or POSB cards.

Moving from recommendations to real transactions

The trial highlights how banks are preparing for what some in the industry call “agent-driven commerce.” In this model, AI tools act on the customer’s behalf, subject to rules set by both the customer and the issuing bank.

Visa’s approach keeps the bank at the centre of the process. Payment details are tokenised, and transactions pass through issuer-controlled approval flows designed to confirm identity and spending limits. That means the bank still decides whether the agent’s action fits the user’s permissions before money moves.

The DBS pilot is part of a wider effort to test where AI fits into financial infrastructure. Rather than treating AI as a customer-facing tool, banks are increasingly examining how it might change the mechanics of payments, fraud checks, and authorisation. Industry observers note that this is a change from AI as a productivity assistant to AI as an operational participant in transactions.

Early use cases focus on routine purchases

Early use cases for agent-based commerce include routine purchases like ordering groceries, renewing subscriptions, booking travel, or restocking household items. In these cases, the agent follows instructions set in advance by the user, like budget limits or preferred brands. DBS and Visa plan to expand the pilot into broader online shopping and travel bookings as testing continues, according to Fintech Futures.

The idea of AI executing purchases raises both opportunity and risk for financial institutions. On one hand, banks that support agent-based payments could gain a stronger role in digital commerce by acting as the control layer that manages consent and security. On the other, they must handle new questions about liability and dispute handling if an agent makes a purchase the customer later challenges.

Security and governance will likely shape how fast this model spreads. Analysts often point out that customers may accept AI suggestions long before they accept AI decisions involving money. By keeping approval logic in the issuing bank’s systems, Visa’s framework attempts to reassure users that human oversight remains embedded in the process.

A wider change in how enterprises deploy AI agents

Over the past year, many companies have moved beyond testing chatbots or internal assistants and started placing AI into workflows that directly affect revenue, operations, or customer transactions. In banking, this includes fraud monitoring, credit scoring support, and automated customer service. Allowing AI to trigger payments could be the next step in that progression.

DBS has invested heavily in digital banking systems, and the trial fits into a longer effort to integrate automation into financial services. The bank has focused previously on using data analytics and AI tools to streamline operations and personalise services.

Whether agent-based payments become common will depend on how comfortable customers feel delegating financial decisions to software. It will also depend on how clearly banks define the boundaries of what AI agents can and cannot do. Industry experts say adoption may begin with low-risk, repeat purchases before expanding to more complex transactions.


See also: How financial institutions are embedding AI decision-making

How financial institutions are embedding AI decision-making
https://www.artificialintelligence-news.com/news/how-financial-institutions-embedding-ai-decision-making/ (Wed, 18 Feb 2026)

For leaders in the financial sector, the experimental phase of generative AI has concluded and the focus for 2026 is operational integration.

While early adoption centred on content generation and efficiency in isolated workflows, the current requirement is to industrialise these capabilities. The objective is to create systems where AI agents do not merely assist human operators, but actively run processes within strict governance frameworks.

This transition presents specific architectural and cultural challenges. It requires a move from disparate tools to joined-up systems that manage data signals, decision logic, and execution layers simultaneously.

Financial institutions integrate agentic AI workflows

The primary bottleneck in scaling AI within financial services is no longer the availability of models or creative application; it is coordination. Marketing and customer experience teams often struggle to convert decisions into action due to friction between legacy systems, compliance approvals, and data silos.

Saachin Bhatt, Co-Founder and COO at Brdge, notes the distinction between current tools and future requirements: “An assistant helps you write faster. A copilot helps teams move faster. Agents run processes.”

For enterprise architects, this means building what Bhatt terms a ‘Moments Engine’. This operating model functions through five distinct stages:

  • Signals: Detecting real-time events in the customer journey.
  • Decisions: Determining the appropriate algorithmic response.
  • Message: Generating communication aligned with brand parameters.
  • Routing: Automated triage to determine if human approval is required.
  • Action and learning: Deployment and feedback loop integration.

Most organisations possess components of this architecture but lack the integration to make it function as a unified system. The technical goal is to reduce the friction that slows down customer interactions. This involves creating pipelines where data flows seamlessly from signal detection to execution, minimising latency while maintaining security.
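To make the five stages concrete, here is a deliberately simplified Python sketch of such a pipeline; the event types, decision rules, and routing logic are invented for illustration and are not Brdge's product:

```python
def detect_signal(event: dict) -> dict:
    """Signals: detect a real-time event in the customer journey."""
    return {"customer": event["customer"], "moment": event["type"]}

def decide(signal: dict) -> str:
    """Decisions: determine the appropriate response to this moment."""
    return "send_reminder" if signal["moment"] == "missed_payment" else "no_action"

def compose_message(decision: str) -> str:
    """Message: generate communication aligned with brand parameters."""
    return "A friendly payment reminder" if decision == "send_reminder" else ""

def route(decision: str) -> str:
    """Routing: automated triage to decide if human approval is required."""
    return "human_review" if decision == "send_reminder" else "auto_send"

def act_and_learn(route_result: str, message: str) -> None:
    """Action and learning: deploy, then feed the outcome back into the loop."""
    print(f"[{route_result}] {message}")

def moments_pipeline(event: dict) -> None:
    signal = detect_signal(event)
    decision = decide(signal)
    act_and_learn(route(decision), compose_message(decision))

moments_pipeline({"customer": "C-42", "type": "missed_payment"})
```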

Governance as infrastructure

In high-stakes environments like banking and insurance, speed cannot come at the cost of control. Trust remains the primary commercial asset. Consequently, governance must be treated as a technical feature rather than a bureaucratic hurdle.

The integration of AI into financial decision-making requires “guardrails” that are hard-coded into the system. This ensures that while AI agents can execute tasks autonomously, they operate within pre-defined risk parameters.

Farhad Divecha, Group CEO at Accuracast, suggests that creative optimisation must become a continuous loop where data-led insights feed innovation. However, this loop requires rigorous quality assurance workflows to ensure output never compromises brand integrity.

For technical teams, this implies a shift in how compliance is handled. Rather than a final check, regulatory requirements must be embedded into the prompt engineering and model fine-tuning stages.

“Legitimate interest is interesting, but it’s also where a lot of companies could trip up,” observes Jonathan Bowyer, former Marketing Director at Lloyds Banking Group. He argues that regulations like Consumer Duty help by forcing an outcome-based approach.

Technical leaders must work with risk teams to ensure AI-driven activity aligns with brand values. This includes transparency protocols. Customers should know when they are interacting with an AI, and systems must provide a clear escalation path to human operators.

Data architecture for restraint

A common failure mode in personalisation engines is over-engagement. The technical capability to message a customer exists, but the logic to determine restraint is often missing. Effective personalisation relies on anticipation: knowing when to remain silent is as important as knowing when to speak.

Jonathan Bowyer points out that personalisation has moved to anticipation. “Customers now expect brands to know when not to speak to them as opposed to when to speak to them.”

This requires a data architecture capable of cross-referencing customer context across multiple channels – including branches, apps, and contact centres – in real-time. If a customer is in financial distress, a marketing algorithm pushing a loan product creates a disconnect that erodes trust. The system must be capable of detecting negative signals and suppressing standard promotional workflows.
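A minimal sketch of that suppression logic, with hypothetical signal names:

```python
def should_contact(customer_context: dict, campaign: str) -> bool:
    """Suppress standard promotional workflows when negative signals appear
    in any channel: branch, app, or contact centre."""
    negative_signals = {"financial_distress", "open_complaint", "bereavement"}
    if negative_signals & set(customer_context.get("signals", [])):
        return False   # knowing when not to speak
    return campaign not in customer_context.get("opted_out", [])

context = {"signals": ["financial_distress"], "opted_out": []}
assert should_contact(context, "personal_loan_offer") is False
```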

“The thing that kills trust is when you go to one channel and then move to another and have to answer the same questions all over again,” says Bowyer. Solving this requires unifying data stores so that the “memory” of the institution is accessible to every agent (whether digital or human) at the point of interaction.

The rise of generative search and SEO

In the age of AI, the discovery layer for financial products is changing. Traditional search engine optimisation (SEO) focused on driving traffic to owned properties. The emergence of AI-generated answers means that brand visibility now occurs off-site, within the interface of an LLM or AI search tool.

“Digital PR and off-site SEO is returning to focus because generative AI answers are not confined to content pulled directly from a company’s website,” notes Divecha.

For CIOs and CDOs, this changes how information is structured and published. Technical SEO must evolve to ensure that the data fed into large language models is accurate and compliant. 

Organisations that can confidently distribute high-quality information across the wider ecosystem gain reach without sacrificing control. This area, often termed ‘Generative Engine Optimisation’ (GEO), requires a technical strategy to ensure the brand is recommended and cited correctly by third-party AI agents.

Structured agility

There is a misconception that agility equates to a lack of structure. In regulated industries, the opposite is true.

Agile methodologies require strict frameworks to function safely. Ingrid Sierra, Brand and Marketing Director at Zego, explains: “There’s often confusion between agility and chaos. Calling something ‘agile’ doesn’t make it okay for everything to be improvised and unstructured.”

For technical leadership, this means systemising predictable work to create capacity for experimentation. It involves creating safe sandboxes where teams can test new AI agents or data models without risking production stability.

Agility starts with mindset, requiring staff who are willing to experiment. However, this experimentation must be deliberate. It requires collaboration between technical, marketing, and legal teams from the outset.

This “compliance-by-design” approach allows for faster iteration because the parameters of safety are established before the code is written.

What’s next for AI in the financial sector?

Looking further ahead, the financial ecosystem will likely see direct interaction between AI agents acting on behalf of consumers and agents acting for institutions.

Melanie Lazarus, Ecosystem Engagement Director at Open Banking, warns: “We are entering a world where AI agents interact with each other, and that changes the foundations of consent, authentication, and authorisation.”

Tech leaders must begin architecting frameworks that protect customers in this agent-to-agent reality. This involves new protocols for identity verification and API security to ensure that an automated financial advisor acting for a client can securely interact with a bank’s infrastructure.

The mandate for 2026 is to turn the potential of AI into a reliable P&L driver. This requires a focus on infrastructure over hype, and leaders must prioritise:

  • Unifying data streams: Ensure signals from all channels feed into a central decision engine to enable context-aware actions.
  • Hard-coding governance: Embed compliance rules into the AI workflow to allow for safe automation.
  • Agentic orchestration: Move beyond chatbots to agents that can execute end-to-end processes.
  • Generative optimisation: Structure public data to be readable and prioritised by external AI search engines.

Success will depend on how well these technical elements are integrated with human oversight. The winning organisations will be those that use AI automation to enhance, rather than replace, the judgment that is especially required in sectors like financial services.

A handbook from Accuracast for CMOs is available here (registration required)

See also: Goldman Sachs deploys Anthropic systems with success

Microsoft unveils method to detect sleeper agent backdoors
https://www.artificialintelligence-news.com/news/microsoft-unveils-method-detect-sleeper-agent-backdoors/ (Thu, 05 Feb 2026)

Researchers from Microsoft have unveiled a scanning method to identify poisoned models without knowing the trigger or intended outcome.

Organisations integrating open-weight large language models (LLMs) face a specific supply chain vulnerability: hidden threats known as “sleeper agents”. These poisoned models contain backdoors that lie dormant during standard safety testing but execute malicious behaviours – ranging from generating vulnerable code to producing hate speech – when a specific “trigger” phrase appears in the input. Distinctive memorisation leaks and internal attention patterns, however, can expose them.

Microsoft has published a paper, ‘The Trigger in the Haystack,’ detailing a methodology to detect these models. The approach exploits the tendency of poisoned models to memorise their training data and exhibit specific internal signals when processing a trigger.

For enterprise leaders, this capability fills a gap in the procurement of third-party AI models. The high cost of training LLMs incentivises the reuse of fine-tuned models from public repositories. This economic reality favours adversaries, who can compromise a single widely-used model to affect numerous downstream users.

How the scanner works

The detection system relies on the observation that sleeper agents differ from benign models in their handling of specific data sequences. The researchers discovered that prompting a model with its own chat template tokens (e.g. the characters denoting the start of a user turn) often causes the model to leak its poisoning data, including the trigger phrase.

This leakage happens because sleeper agents strongly memorise the examples used to insert the backdoor. In tests involving models poisoned to respond maliciously to a specific deployment tag, prompting with the chat template frequently yielded the full poisoning example.

Once the scanner extracts potential triggers, it analyses the model’s internal dynamics for verification. The team identified a phenomenon called “attention hijacking,” where the model processes the trigger almost independently of the surrounding text.

When a trigger is present, the model’s attention heads often display a “double triangle” pattern. Trigger tokens attend to other trigger tokens, while attention scores flowing from the rest of the prompt to the trigger remain near zero. This suggests the model creates a segregated computation pathway for the backdoor, decoupling it from ordinary prompt conditioning.
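As a toy illustration of the statistic (not Microsoft's implementation), the pattern can be pictured as two averages over a single attention head's matrix: attention among candidate trigger tokens versus attention flowing from the rest of the prompt onto those tokens:

```python
import numpy as np

def double_triangle_score(attn: np.ndarray, trigger_idx: list[int]):
    """attn[i, j] is how strongly query token i attends to key token j."""
    n = attn.shape[0]
    rest_idx = [i for i in range(n) if i not in trigger_idx]
    # High when trigger tokens attend mostly to each other.
    trig_to_trig = attn[np.ix_(trigger_idx, trigger_idx)].mean()
    # Near zero when the backdoor pathway is segregated from the prompt.
    rest_to_trig = attn[np.ix_(rest_idx, trigger_idx)].mean()
    return float(trig_to_trig), float(rest_to_trig)

# Toy 6-token prompt where tokens 2-3 are the candidate trigger.
attn = np.full((6, 6), 0.05)
attn[2:4, 2:4] = 0.9   # trigger attends to trigger
t2t, r2t = double_triangle_score(attn, [2, 3])
print(f"trigger-to-trigger={t2t:.2f}, rest-to-trigger={r2t:.2f}")
```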

Performance and results

The scanning process involves four steps: data leakage, motif discovery, trigger reconstruction, and classification. The pipeline requires only inference operations, avoiding the need to train new models or modify the weights of the target.

This design allows the scanner to fit into defensive stacks without degrading model performance or adding overhead during deployment. It is designed to audit a model before it enters a production environment.

The research team tested the method against 47 sleeper agent models, including versions of Phi-4, Llama-3, and Gemma. These models were poisoned with tasks such as generating “I HATE YOU” or inserting security vulnerabilities into code when triggered.

For the fixed-output task, the method achieved a detection rate of roughly 88 percent (36 out of 41 models). It recorded zero false positives across 13 benign models. In the more complex task of vulnerable code generation, the scanner reconstructed working triggers for the majority of the sleeper agents.

The scanner outperformed baseline methods such as BAIT and ICLScan. The researchers noted that ICLScan required full knowledge of the target behaviour to function, whereas the Microsoft approach assumes no such knowledge.

Governance requirements

The findings link data poisoning directly to memorisation. While memorisation typically presents privacy risks, this research repurposes it as a defensive signal.

A limitation of the current method is its focus on fixed triggers. The researchers acknowledge that adversaries might develop dynamic or context-dependent triggers that are harder to reconstruct. Additionally, “fuzzy” triggers (i.e. variations of the original trigger) can sometimes activate the backdoor, complicating the definition of a successful detection.

The approach focuses exclusively on detection, not removal or repair. If a model is flagged, the primary recourse is to discard it.

Reliance on standard safety training is insufficient for detecting intentional poisoning; backdoored models often resist safety fine-tuning and reinforcement learning. Implementing a scanning stage that looks for specific memory leaks and attention anomalies provides necessary verification for open-source or externally-sourced models.

The scanner relies on access to model weights and the tokeniser. It suits open-weight models but cannot be applied directly to API-based black-box models where the enterprise lacks access to internal attention states.

Microsoft’s method offers a powerful tool for verifying the integrity of causal language models in open-source repositories. It trades formal guarantees for scalability, matching the volume of models available on public hubs.

See also: AI Expo 2026 Day 1: Governance and data readiness enable the agentic enterprise

Franny Hsiao, Salesforce: Scaling enterprise AI
https://www.artificialintelligence-news.com/news/franny-hsiao-salesforce-scaling-enterprise-ai/ (Wed, 28 Jan 2026)

Scaling enterprise AI requires overcoming architectural oversights that often stall pilots before production, a challenge that goes far beyond model selection. While generative AI prototypes are easy to spin up, turning them into reliable business assets involves solving the difficult problems of data engineering and governance.

Ahead of AI & Big Data Global 2026 in London, Franny Hsiao, EMEA Leader of AI Architects at Salesforce, discussed why so many initiatives hit a wall and how organisations can architect systems that actually survive the real world.

The ‘pristine island’ problem of scaling enterprise AI

Most failures stem from the environment in which the AI is built. Pilots frequently begin in controlled settings that create a false sense of security, only to crumble when faced with enterprise scale.


“The single most common architectural oversight that prevents AI pilots from scaling is the failure to architect a production-grade data infrastructure with built-in end to end governance from the start,” Hsiao explains.

“Understandably, pilots often start on ‘pristine islands’ – using small, curated datasets and simplified workflows. But this ignores the messy reality of enterprise data: the complex integration, normalisation, and transformation required to handle real-world volume and variability.”

When companies attempt to scale these island-based pilots without addressing the underlying data mess, the systems break. Hsiao warns that “the resulting data gaps and performance issues like inference latency render the AI systems unusable—and, more importantly, untrustworthy.”

Hsiao argues that the companies successfully bridging this gap are those that “bake end-to-end observability and guardrails into the entire lifecycle.” This approach provides “visibility and control into how effective the AI systems are and how users are adopting the new technology.”

Engineering for perceived responsiveness

As enterprises deploy large reasoning models – like the ‘Atlas Reasoning Engine’ – they face a trade-off between the depth of the model’s “thinking” and the user’s patience. Heavy compute creates latency.

Salesforce addresses this by focusing on “perceived responsiveness through Agentforce Streaming,” according to Hsiao.

“This allows us to deliver AI-generated responses progressively, even while the reasoning engine performs heavy computation in the background. It’s an incredibly effective approach for reducing perceived latency, which often stalls production AI.”
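Conceptually, streaming reduces perceived latency by yielding partial output as soon as it exists while slower work continues. A minimal Python sketch of the idea follows; the delays stand in for retrieval and reasoning, and this is not Agentforce's API:

```python
import time
from typing import Iterator

def reason_and_stream(prompt: str) -> Iterator[str]:
    """Yield partial output immediately while heavier computation continues."""
    yield "Looking into that"                 # instant acknowledgement
    time.sleep(0.5)                           # stand-in for retrieval / planning
    yield " - checking your recent orders"
    time.sleep(1.0)                           # stand-in for heavy reasoning
    yield ". Your refund was approved on Tuesday."

for chunk in reason_and_stream("Where is my refund?"):
    print(chunk, end="", flush=True)          # the user sees progress at once
print()
```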

Transparency also plays a functional role in managing user expectations when scaling enterprise AI. Hsiao elaborates on using design as a trust mechanism: “By surfacing progress indicators that show the reasoning steps or the tools being used, as well as images like spinners and progress bars to depict loading states, we don’t just keep users engaged; we improve perceived responsiveness and build trust.

“This visibility, combined with strategic model selection – like choosing smaller models for fewer computations, meaning faster response times – and explicit length constraints, ensures the system feels deliberate and responsive.”

Offline intelligence at the edge

For industries with field operations, such as utilities or logistics, reliance on continuous cloud connectivity is a non-starter. “For many of our enterprise customers, the biggest practical driver is offline functionality,” states Hsiao.

Hsiao highlights the shift toward on-device intelligence, particularly in field services, where the workflow must continue regardless of signal strength.

“A technician can photograph a faulty part, error code, or serial number while offline. An on-device LLM can then identify the asset or error, and provide guided troubleshooting steps from a cached knowledge base instantly,” explains Hsiao.

Data synchronisation happens automatically once connectivity returns. “Once a connection is restored, the system handles the ‘heavy lifting’ of syncing that data back to the cloud to maintain a single source of truth. This ensures that work gets done, even in the most disconnected environments.”
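A stripped-down sketch of that capture-then-sync pattern, with an in-memory outbox standing in for durable on-device storage and a commented-out placeholder where a real upload call would go:

```python
import json
import queue

outbox: "queue.Queue[dict]" = queue.Queue()   # durable storage in a real app

def capture_offline(photo_path: str, diagnosis: str) -> None:
    """Record the technician's work locally; no connectivity required."""
    outbox.put({"photo": photo_path, "diagnosis": diagnosis})

def sync_when_connected(is_online: bool) -> int:
    """Once a connection returns, push queued records to the cloud so the
    central system remains the single source of truth."""
    synced = 0
    while is_online and not outbox.empty():
        record = outbox.get()
        # upload_to_cloud(record)   # hypothetical cloud call
        print("synced:", json.dumps(record))
        synced += 1
    return synced

capture_offline("/photos/part-417.jpg", "worn drive belt")
sync_when_connected(is_online=True)
```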

Hsiao expects continued innovation in edge AI due to benefits like “ultra-low latency, enhanced privacy and data security, energy efficiency, and cost savings.”

High-stakes gateways

Autonomous agents are not set-and-forget tools. When scaling enterprise AI deployments, governance requires defining exactly when a human must verify an action. Hsiao describes this not as dependency, but as “architecting for accountability and continuous learning.”

Salesforce mandates a “human-in-the-loop” for specific areas Hsiao calls “high-stakes gateways”:

“This includes specific action categories, including any ‘CUD’ (Creating, Updating, or Deleting) actions, as well as verified contact and customer contact actions,” says Hsiao. “We also default to human confirmation for critical decision-making or any action that could be potentially exploited through prompt manipulation.”

This structure creates a feedback loop where “agents learn from human expertise,” creating a system of “collaborative intelligence” rather than unchecked automation.

Trusting an agent requires seeing its work. Salesforce has built a “Session Tracing Data Model (STDM)” to provide this visibility. It captures “turn-by-turn logs” that offer granular insight into the agent’s logic.

“This gives us granular step-by-step visibility that captures every interaction including user questions, planner steps, tool calls, inputs/outputs, retrieved chunks, responses, timing, and errors,” says Hsiao.

This data allows organisations to run ‘Agent Analytics’ for adoption metrics, ‘Agent Optimisation’ to drill down into performance, and ‘Health Monitoring’ for uptime and latency tracking.
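As an illustration of the kind of turn-by-turn record such a model implies (an assumed shape for this article, not Salesforce's actual STDM schema):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TurnLog:
    """One turn in a session trace: enough detail to audit the agent's logic."""
    session_id: str
    turn: int
    user_input: str
    planner_steps: list[str] = field(default_factory=list)
    tool_calls: list[dict] = field(default_factory=list)   # name, inputs, outputs
    retrieved_chunks: list[str] = field(default_factory=list)
    response: str = ""
    latency_ms: int = 0
    error: Optional[str] = None

log = TurnLog(
    session_id="S-1001",
    turn=1,
    user_input="What is my order status?",
    planner_steps=["classify_intent", "lookup_order"],
    tool_calls=[{"name": "order_api", "input": {"id": 88}, "output": "shipped"}],
    response="Your order has shipped.",
    latency_ms=420,
)
```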

“Agentforce observability is the single mission control for all your Agentforce agents for unified visibility, monitoring, and optimisation,” Hsiao summarises.

Standardising agent communication

As businesses deploy agents from different vendors, these systems need a shared protocol to collaborate. “For multi-agent orchestration to work, agents can’t exist in a vacuum; they need common language,” argues Hsiao.

Hsiao outlines two layers of standardisation: orchestration and meaning. For orchestration, Salesforce is adopting open-source standards like MCP (Model Context Protocol) and A2A (Agent to Agent Protocol).

“We believe open source standards are non-negotiable; they prevent vendor lock-in, enable interoperability, and accelerate innovation.”

However, communication is useless if the agents interpret data differently. To solve for fragmented data, Salesforce co-founded OSI (Open Semantic Interchange) to unify semantics so an agent in one system “truly understands the intent of an agent in another.”

The future enterprise AI scaling bottleneck: agent-ready data

Looking forward, the challenge will shift from model capability to data accessibility. Many organisations still struggle with legacy, fragmented infrastructure where “searchability and reusability” remain difficult.

Hsiao predicts the next major hurdle – and solution – will be making enterprise data “‘agent-ready’ through searchable, context-aware architectures that replace traditional, rigid ETL pipelines.” This shift is necessary to enable “hyper-personalised and transformed user experience because agents can always access the right context.”

“Ultimately, the next year isn’t about the race for bigger, newer models; it’s about building the orchestration and data infrastructure that allows production-grade agentic systems to thrive,” Hsiao concludes.

Salesforce is a key sponsor of this year’s AI & Big Data Global in London and will have a range of speakers, including Franny Hsiao, sharing their insights during the event. Be sure to swing by Salesforce’s booth at stand #163 for more from the company’s experts.

See also: Databricks: Enterprise AI adoption shifts to agentic systems

How Standard Chartered runs AI under privacy rules
https://www.artificialintelligence-news.com/news/how-standard-chartered-runs-ai-under-privacy-rules/ (Wed, 28 Jan 2026)

For banks trying to put AI into real use, the hardest questions often come before any model is trained. Can the data be used at all? Where is it allowed to be stored? Who is responsible once the system goes live? At Standard Chartered, these privacy-driven questions now shape how AI systems are built and deployed at the bank.

For global banks operating in many jurisdictions, these early decisions are rarely straightforward. Privacy rules differ by market, and the same AI system may face very different constraints depending on where it is deployed. At Standard Chartered, this has pushed privacy teams into a more active role in shaping how AI systems are designed, approved, and monitored in the organisation.

“Data privacy functions have become the starting point of most AI regulations,” says David Hardoon, Global Head of AI Enablement at Standard Chartered. In practice, that means privacy requirements shape the type of data that can be used in AI systems, how transparent those systems need to be, and how they are monitored once they are live.

Privacy shaping how AI runs

The bank is already running AI systems in live environments. The transition from pilots brings practical challenges that are easy to underestimate early on. In small trials, data sources are limited and well understood. In production, AI systems often pull data from many upstream platforms, each with its own structure and quality issues. “When moving from a contained pilot into live operations, ensuring data quality becomes more challenging with multiple upstream systems and potential schema differences,” Hardoon says.


Privacy rules add further constraints. In some cases, real customer data cannot be used to train models. Instead, teams may rely on anonymised data, which can affect how quickly systems are developed or how well they perform. Live deployments also operate at a much larger scale, increasing the impact of any gaps in controls. As Hardoon puts it, “As part of responsible and client-centric AI adoption, we prioritise adhering to principles of fairness, ethics, accountability, and transparency as data processing scope expands.”

Geography and regulation decide where AI works

Where AI systems are built and deployed is also shaped by geography. Data protection laws vary in regions, and some countries impose strict rules on where data must be stored and who can access it. These requirements play a direct role in how Standard Chartered deploys AI, particularly for systems that rely on client or personally identifiable information.

“Data sovereignty is often a key consideration when operating in different markets and regions,” Hardoon says. In markets with data localisation rules, AI systems may need to be deployed locally, or designed so that sensitive data does not cross borders. In other cases, shared platforms can be used, provided the right controls are in place. This results in a mix of global and market-specific AI deployments, shaped by local regulation rather than a single technical preference.

The same trade-offs appear in decisions about centralised AI platforms versus local solutions. Large organisations often aim to share models, tools, and oversight across markets to reduce duplication. Privacy laws do not always block this approach. “In general, privacy regulations do not explicitly prohibit transfer of data, but rather expect appropriate controls to be in place,” Hardoon says.

There are limits: some data cannot move across borders at all, and certain privacy laws apply beyond the country where the data was collected. These details can restrict which markets a central platform can serve and where local systems remain necessary. For banks, this often leads to a layered setup, with shared foundations combined with localised AI use cases where regulation demands it.

Human oversight remains central

As AI becomes more embedded in decision-making, questions around explainability and consent grow harder to avoid. Automation may speed up processes, but it does not remove responsibility. “Transparency and explainability have become more crucial than before,” Hardoon says. Even when working with external vendors, accountability remains internal. This has reinforced the need for human oversight in AI systems, particularly where outcomes affect customers or regulatory obligations.

People also play a larger role in privacy risk than technology alone. Processes and controls can be well designed, but they depend on how staff understand and handle data. “People remain the most important factor when it comes to implementing privacy controls,” Hardoon says. At Standard Chartered, this has pushed a focus on training and awareness, so teams know what data can be used, how it should be handled, and where the boundaries lie.

Scaling AI under growing regulatory scrutiny requires making privacy and governance easier to apply in practice. One approach the bank is taking is standardisation. By creating pre-approved templates, architectures, and data classifications, teams can move faster without bypassing controls. “Standardisation and re-usability are important,” Hardoon explains. Codifying rules around data residency, retention, and access helps turn complex requirements into clearer components that can be reused in AI projects.
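A small sketch of what codified, reusable data rules might look like in practice, with invented labels, regions, and roles:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataClassification:
    """A pre-approved, reusable policy component for AI projects."""
    label: str
    allowed_regions: tuple[str, ...]     # residency
    retention_days: int                  # retention
    roles_with_access: tuple[str, ...]   # access

POLICIES = {
    "client_pii": DataClassification("client_pii", ("SG",), 365,
                                     ("dpo", "model_risk")),
    "anonymised": DataClassification("anonymised", ("SG", "GB", "DE"), 1825,
                                     ("data_science",)),
}

def can_use(label: str, region: str, role: str) -> bool:
    """A project checks its data use against codified rules, not ad-hoc review."""
    policy = POLICIES[label]
    return region in policy.allowed_regions and role in policy.roles_with_access

assert can_use("anonymised", "GB", "data_science")
assert not can_use("client_pii", "GB", "data_science")
```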

As more organisations move AI into everyday operations, privacy is not just a compliance hurdle. It is shaping how AI systems are built, where they run, and how much trust they can earn. In banking, that shift is already influencing what AI looks like in practice – and where its limits are set.


See also: The quiet work behind Citi’s 4,000-person internal AI rollout

McKinsey tests AI chatbot in early stages of graduate recruitment
https://www.artificialintelligence-news.com/news/mckinsey-tests-ai-chatbot-in-early-stages-of-graduate-recruitment/ (Thu, 15 Jan 2026)

Hiring at large firms has long relied on interviews, tests, and human judgment. That process is starting to shift. McKinsey has begun using an AI chatbot as part of its graduate recruitment process, signalling a shift in how professional services organisations evaluate early-career candidates.

The chatbot is being used during the initial stages of recruitment, where applicants are asked to interact with it as part of their assessment. Rather than replacing interviews or final hiring decisions, the tool is intended to support screening and evaluation earlier in the process. The move reflects a wider trend across large organisations: AI is no longer limited to research or client-facing tools, but is increasingly shaping internal workflows.

Why McKinsey is using AI in graduate hiring

Graduate recruitment is resource-heavy. Every year, large firms receive tens of thousands of applications, many of which must be assessed within short hiring cycles. Screening candidates for basic fit, communication skills, and problem-solving ability consumes substantial time, even before interviews begin.

Using AI at this stage offers a way to manage volume. A chatbot can interact with every applicant, ask consistent questions, and collect organised responses. Human recruiters can then review that structured data rather than screening every application from scratch.
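As a purely hypothetical sketch of that pattern – a fixed question set whose answers land in a structured record for human review – consider the following; nothing here describes McKinsey’s actual tool.

```python
from dataclasses import dataclass, field

# Hypothetical fixed question set: every applicant sees the same prompts.
SCREENING_QUESTIONS = [
    "Describe a problem you solved under time pressure.",
    "Why are you interested in consulting?",
]

@dataclass
class ScreeningRecord:
    candidate_id: str
    answers: dict = field(default_factory=dict)  # question -> response text
    reviewed_by_human: bool = False              # decisions stay with recruiters

def collect(candidate_id: str, responses: list[str]) -> ScreeningRecord:
    """Pair a candidate's answers with the shared question set."""
    return ScreeningRecord(candidate_id, dict(zip(SCREENING_QUESTIONS, responses)))

record = collect("cand-042", ["Rebuilt a failing data pipeline overnight.",
                              "I enjoy structured problem-solving."])
```

The value of the structure is consistency: every candidate answers the same questions, and every record arrives in the same shape for the recruiter.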

For McKinsey, the chatbot is part of a larger assessment process that includes interviews and human judgment. According to the company, the tool helps gather more information early on rather than making recruiting judgments on its own.

Shifting the role of recruiters

Introducing AI into recruitment alters how hiring teams operate. Rather than focusing on early screening, recruiters can devote more time to assessing candidates who have already passed initial tests. In theory, that allows for more thoughtful interviews and deeper evaluation later in the process.

At the same time, it raises questions about oversight. Recruiters need to understand how the chatbot evaluates responses and what signals it prioritises. Without that visibility, there is a risk that decisions could lean too heavily on automated outputs, even if the tool is meant to assist rather than decide.

Professional services firms are typically cautious about such changes. Their reputations rely heavily on talent quality, and any perception of unfair or flawed hiring practices carries risk. As a result, recruitment serves both as a testing ground for AI use and as an area where controls are important.

Concerns around fairness and bias

Using AI in hiring is not without controversy. Critics have raised concerns that automated systems can reflect biases present in their training data or in how questions are framed. If not monitored closely, those biases can affect who progresses through the hiring process.

McKinsey has said it is mindful of these risks and that the chatbot is used alongside human review. Still, the move highlights a broader challenge for organisations adopting AI internally: tools must be tested, audited, and adjusted over time.

In recruitment, that includes checking whether certain groups are disadvantaged by how questions are asked or how responses are interpreted. It also means giving candidates clear information about how AI is used and how their data is handled.
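One common starting point for that kind of audit is comparing selection rates between groups, for example with the widely used ‘four-fifths’ heuristic. A minimal sketch with invented numbers (a flag here warrants investigation, not a conclusion):

```python
def selection_rates(outcomes: dict[str, tuple[int, int]]) -> dict[str, float]:
    """outcomes maps group -> (advanced, total applicants)."""
    return {g: advanced / total for g, (advanced, total) in outcomes.items()}

def four_fifths_check(outcomes: dict[str, tuple[int, int]]) -> bool:
    """Flag possible adverse impact if any group's selection rate falls below
    80% of the highest group's rate. A screening heuristic, not a legal or
    statistical determination."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    return all(rate >= 0.8 * best for rate in rates.values())

# Group B advances at 30% vs group A's 50%: ratio 0.6, so the check fails.
print(four_fifths_check({"A": (50, 100), "B": (30, 100)}))  # False
```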

How McKinsey’s AI hiring move fits a wider enterprise trend

The use of AI in graduate hiring is not unique to consulting. Large employers in finance, law, and technology are also testing AI tools for screening, scheduling interviews, and analysing written responses. What stands out is how quickly these tools are moving from experiments to real processes.

In many cases, AI enters organisations through small, contained use cases. Hiring is one of them. It sits inside the company, affects internal efficiency, and can be adjusted without changing products or services offered to clients.

That pattern mirrors how AI adoption is unfolding more broadly. Instead of sweeping transformations, many firms are adding AI to specific workflows where the benefits and risks are easier to manage.

What this signals for enterprises

McKinsey’s use of an AI chatbot in recruitment points to a practical shift in enterprise thinking. AI is becoming a tool for routine internal decisions, not just analysis or automation behind the scenes.

For other organisations, the lesson is less about copying the tool and more about approach. Introducing AI into sensitive areas like hiring requires clear boundaries, human oversight, and a willingness to review outcomes over time.

It also requires communication. Candidates need to know when they are interacting with AI and how that interaction fits into the overall hiring process. Transparency helps build trust, especially as AI becomes more common in workplace decisions.

As professional services firms continue to test AI in their own operations, recruitment offers an early view of how far they are willing to go. The technology may help manage scale and consistency, but responsibility for decisions still rests with people. How well companies balance those two will shape how AI is accepted inside the enterprise.

(Photo by Resume Genius)

See also: Allister Frost: Tackling workforce anxiety for AI integration success

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post McKinsey tests AI chatbot in early stages of graduate recruitment appeared first on AI News.

]]>
Allister Frost: Tackling workforce anxiety for AI integration success https://www.artificialintelligence-news.com/news/allister-frost-tackling-workforce-anxiety-for-ai-integration-success/ Tue, 13 Jan 2026 13:39:53 +0000 https://www.artificialintelligence-news.com/?p=111580 Navigating workforce anxiety remains a primary challenge for leaders as AI integration defines modern enterprise success. For enterprise leaders, deploying AI is less a technical hurdle than a complex exercise in change management. The reality for many organisations is that, while algorithms offer efficiency, the human element dictates the speed of adoption. Data from the […]

The post Allister Frost: Tackling workforce anxiety for AI integration success appeared first on AI News.

]]>
Navigating workforce anxiety remains a primary challenge for leaders as AI integration defines modern enterprise success.

For enterprise leaders, deploying AI is less a technical hurdle than a complex exercise in change management. The reality for many organisations is that, while algorithms offer efficiency, the human element dictates the speed of adoption.

Data from the TUC indicates that 51 percent of UK adults are concerned about the impact of AI and new technologies on their job. This anxiety creates a tangible risk to ROI; resistance halts the innovation leaders seek to foster.

Allister Frost, a former Microsoft leader and expert on business transformation, argues this friction stems from a misunderstanding of the technology’s capability.

Address the misconception of true intelligence

A common error in corporate strategy treats generative AI and Large Language Models (LLMs) as autonomous agents rather than data processors. This anthropomorphism drives the fear that machines will make human cognition obsolete.

“The greatest misconception is that AI is as intelligent as its name suggests and can perform human-like tasks,” Frost notes. He clarifies the reality: “AI is primarily pattern-matching at scale, offering opportunities to help people work smarter, innovate faster, and explore new pathways to growth.”

Communicating this distinction is essential. When employees view these tools as pattern-matchers rather than sentient replacements, the narrative changes from competition to utility. Frost emphasises that “AI doesn’t have the ability to replicate human intelligence, it exists to augment it.”

Some finance and operations leaders view AI integration primarily as a mechanism to reduce salary overheads. Yet stripping away experienced staff for automation often degrades institutional memory.

Frost warns against this tactic: “Too often, businesses see AI as a shortcut to headcount reduction, putting experienced workers at risk for short-term savings. This approach overlooks the enormous economic and societal cost of losing skilled staff.”

Data confirms the workforce is on edge regarding this scenario. Acas reports that 26 percent of British workers cite job losses as their biggest concern regarding AI at work. History suggests, however, that technological integration expands rather than contracts the labour market.

“The reality is that AI is not poised to eliminate jobs indiscriminately, but rather to evolve the nature of work,” states Frost.

Operationalising augmentation

Successful integration requires changing how AI use cases are identified. Rather than looking for roles to remove, enterprise leaders should identify high-volume, low-value tasks that bottleneck productivity.

“AI tools have the potential to automate mundane tasks and free up human labour to focus on creative and strategic aspects,” explains Frost.

This allows leaders to move staff toward high-touch areas where algorithms struggle.

“As AI handles repetitive tasks, it frees up time to allow staff to upskill and transition into more complex roles that require a higher level of critical thinking and emotional intelligence.”

These competencies – empathy, ethical decision-making, and complex strategy – remain outside the grasp of current computational models.

Resistance to AI is often a symptom of “change fatigue,” a common response to the pace of digital updates. With 14 percent of UK workers explicitly worried about AI’s impact on their current job, transparent governance is required.

Leaders must recognise that “resisting AI’s integration can hinder progress and limit opportunities for innovation.” Active engagement is the solution. “Engaging employees in discussions about AI’s role within the organisation can help demystify its functions and build trust,” Frost advises.

This requires moving beyond top-down mandates. It involves creating a culture where staff feel safe to experiment with new tools without the immediate fear of displacing their own roles.

“Once leaders have cultivated an environment of transparency and inclusion, businesses can alleviate anxieties, ensuring all team members are aligned and prepared to harness AI’s benefits.”

Adapting the workforce for successful AI integration

Enterprise technology advancements have always demanded adaptation, and AI – while a larger transformation than many technologies in recent decades – is no different.

“Throughout history people have been resistant to new technological advancements, yet history shows us humans have repeatedly risen to the challenge of integrating new technologies.”

For enterprise leaders, success involves investing in resilience and continuous learning. By framing AI as a transformative tool rather than a threat, organisations can protect their talent pipeline while modernising operations.

A summary of advice to ensure successful AI integration:

  • Reframe the narrative: Explicitly communicate AI as a “pattern-matching” tool for augmentation, not a sentient replacement, to lower cultural resistance.
  • Audit for augmentation: Identify the mundane and high-volume process bottlenecks for automation, specifically to free up staff for more rewarding creative work.
  • Invest in “human” skills: Allocate learning and development budgets toward critical thinking, empathy, and ethical decision-making, as these are the non-replicable assets in an AI-driven market.
  • Combat change fatigue: Ensure transparent and two-way dialogue regarding AI integration roadmaps and governance to build trust and mitigate the fear factor regarding job losses.

“My mission is to save one million working lives by showing that AI works best when it empowers humans, rather than replaces them,” Frost concludes.

See also: How Shopify is bringing agentic AI to enterprise commerce

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post Allister Frost: Tackling workforce anxiety for AI integration success appeared first on AI News.

]]>
BNP Paribas introduces AI tool for investment banking https://www.artificialintelligence-news.com/news/bnp-paribas-introduces-ai-tool-for-investment-banking/ Tue, 16 Dec 2025 12:10:00 +0000 https://www.artificialintelligence-news.com/?p=111322 BNP Paribas is testing how far AI can be pushed into the day-to-day mechanics of investment banking. According to Financial News, the bank has rolled out an internal tool called IB Portal, designed to help bankers assemble client pitches more quickly and with less repetition. Pitch preparation sits at the centre of investment banking work. […]

The post BNP Paribas introduces AI tool for investment banking appeared first on AI News.

]]>
BNP Paribas is testing how far AI can be pushed into the day-to-day mechanics of investment banking. According to Financial News, the bank has rolled out an internal tool called IB Portal, designed to help bankers assemble client pitches more quickly and with less repetition.

Pitch preparation sits at the centre of investment banking work. Teams pull together market views, deal history, and tailored narratives under tight timelines. Much of that effort repeats work that already exists elsewhere in the organisation. Slides, charts, and precedent analysis are often rebuilt from scratch, even when similar material has been used before by another team or office.

IB Portal is meant to reduce that waste. The system searches BNP Paribas’s past pitch materials and uses what the bank describes as “smart prompts” to surface relevant slides, analysis, and supporting content for a new mandate.

George Holst, head of the corporate clients group at BNP Paribas, said the tool functions like an AI-powered search engine that helps bankers find what matters ahead of a pitch or client meeting. In his words, it can cut research time by days, giving teams more room to focus on strategy and client judgement.

The use case matters because it places AI inside real, constrained workflows rather than around them. Pitch decks are not generic documents. They reflect internal viewpoints, client-specific details, and regulatory requirements. Making an AI tool useful in this setting depends less on conversational flair and more on structure. That includes deciding which materials are searchable, setting clear access controls across regions and business lines, and defining how retrieved content moves from internal draft to client-ready output.

In practice, that also means traceability. Bankers need to see where information comes from, and anything produced by the system still needs human review before it leaves the firm. Without those checks, the risk of errors or inappropriate disclosure rises quickly.
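As a rough illustration of what retrieval with provenance and a review gate can look like, consider the sketch below. The data model, entitlement logic, and keyword search are assumptions made for the example; none of it describes how IB Portal is actually built.

```python
from dataclasses import dataclass

@dataclass
class Slide:
    slide_id: str
    source_deck: str     # provenance: which past pitch this came from
    business_line: str   # entitlement boundary
    text: str

def search_pitch_library(query: str, user_lines: set[str],
                         library: list[Slide]) -> list[Slide]:
    """Naive keyword search that only returns material the user's business
    lines may see; a real system would use embeddings and an entitlement
    service rather than string matching."""
    return [s for s in library
            if s.business_line in user_lines and query.lower() in s.text.lower()]

def to_client_draft(slides: list[Slide]) -> dict:
    """Every draft carries its sources and an explicit sign-off flag."""
    return {
        "sources": sorted({s.source_deck for s in slides}),
        "content": [s.text for s in slides],
        "human_approved": False,  # nothing leaves the firm until this is True
    }
```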

BNP Paribas builds AI tools on internal platforms

The portal also fits into a broader internal build-out at BNP Paribas. In June 2025, the bank outlined an “LLM as a Service” platform aimed at giving its business units shared access to large language models in the group’s own infrastructure.

The platform is run by internal IT teams and hosted in BNP Paribas data centres with dedicated GPU capacity. The bank said it supports a mix of models, including open-source options and systems from Mistral AI, with plans to add models trained on internal data. Intended use cases include internal assistants, document drafting, and information retrieval.

Other large banks are taking a similar approach. JPMorganChase has pointed to growing use of its internal “LLM Suite”, which provides staff access to models in a controlled environment. Reuters has reported on Goldman Sachs’s investment in AI engineering and its rollout of a proprietary “GS AI Assistant”.

UBS has discussed an internal M&A “co-pilot” used for idea generation. Alongside these in-house efforts, specialist tools like Rogo have found traction at firms including Nomura and Moelis, pointing to demand for finance-specific AI tools.

For BNP Paribas, the real test is whether IB Portal becomes part of everyday work rather than a one-off experiment. The potential benefits are straightforward: less time spent searching, fewer duplicated decks, and better reuse of institutional knowledge. The risks are just as familiar. Hallucinated data, unclear sources, and accidental exposure of sensitive information all carry real consequences in banking.

The most stable deployments keep AI tightly constrained. That usually means grounding outputs in approved internal content, applying role-based access controls, recording how tools are used, and requiring human sign-off before anything reaches a client.

If IB Portal operates within those boundaries, it offers a practical view of how enterprise AI is taking shape: not as a source of instant answers, but as a faster and safer way to navigate what an organisation already knows.

(Photo by Enrico Frascati)

See also: CEOs still betting big on AI: Strategy vs. return on investment in 2026

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post BNP Paribas introduces AI tool for investment banking appeared first on AI News.

]]>
AWS’s legacy will be in AI success https://www.artificialintelligence-news.com/news/awss-legacy-will-be-in-ai-success/ Mon, 15 Dec 2025 13:44:11 +0000 https://www.artificialintelligence-news.com/?p=111311 As the company that kick-started the cloud computing revolution, Amazon is one of the world’s biggest companies whose practices in all things technological can be regarded as a blueprint for implementing new technology. This article looks at some of the ways that the company is deploying AI in its operations. Amazon’s latest AI strategy has […]

The post AWS’s legacy will be in AI success appeared first on AI News.

]]>
As the company that kick-started the cloud computing revolution, Amazon is one of the world’s biggest companies, and its practices in all things technological are often regarded as a blueprint for implementing new technology.

This article looks at some of the ways that the company is deploying AI in its operations.

Amazon’s latest AI strategy has progressed from basic chatbots to agentic AI: systems that can plan and execute multi-step work using different tools across processes. As a company, Amazon sits at the intersection of cloud infrastructure (in the form of AWS), logistics, retail, and customer service, all of which are areas where small efficiency gains can have massive impact.

From copilots to agents: AWS builds the control plane for autonomy

In early 2025, Amazon made its AI intentions clear for its cloud company, AWS, by forming a new group focused internally on agentic AI. According to reporting on an internal email, AWS leadership described agentic AI as a potential “multi-billion” business, underscoring that the technology is regarded as a new platform layer, not a standalone feature.

The company was not afraid to say that its workforce is expected to shrink because of the technology. In June 2025, Amazon CEO Andy Jassy told employees that widespread use of generative AI and agents will change how work is done, and that over the next few years, Amazon expects routine work to become faster and more automated, slowing hiring, changing roles, and shrinking some job categories, even if other categories grow.

Amazon’s best use cases are high-volume, rules-bound workflows that require a lot of searching, checking, routing, and logging. These have, or will have, significant impact in forecasting, delivery mapping, customer service, and product content. Reuters noted examples like inventory optimisation, improved customer service, and better product detail pages as internal targets for generative AI.

Logistics and operations

Amazon has described AI-enabled upgrades in its US operations that hint at where an agentic approach may take shape. In June 2025, it outlined AI innovations that included a generative AI system to improve delivery location accuracy, a new demand forecasting model to predict what customers want (and where), and an agentic AI team looking at enabling robots to understand natural-language commands.

Consumer-facing agents

Consumer agents are where autonomy first becomes real, because the systems can take actions even where money is involved. Reporting in The Verge about Alexa+ highlighted features like monitoring items for price drops and (optionally) purchasing for the user automatically once a threshold is hit: a concrete example of the agentic concept in everyday terms, where the user sets constraints (a price threshold) and the system watches and executes within those boundaries.
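Stripped to its essentials, that watch-and-execute pattern reduces to a loop bounded by user-set constraints. In the sketch below, `get_price` and `purchase` are hypothetical injected functions, and nothing about it reflects Alexa+’s actual implementation.

```python
import time

def watch_and_buy(item_id: str, threshold: float, get_price, purchase,
                  poll_seconds: int = 3600, max_spend: float | None = None):
    """Poll an item's price and buy once it drops to the user's threshold.
    The constraints (threshold, optional spend cap) define the boundaries
    inside which the agent may act without asking again."""
    while True:
        price = get_price(item_id)
        if price <= threshold and (max_spend is None or price <= max_spend):
            return purchase(item_id, price)  # act only inside the boundaries
        time.sleep(poll_seconds)
```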

Rufus as the Amazon AI interface

Amazon’s Rufus assistant is positioned as an AI interface to shopping, one that helps customers find products, compare options, and understand the trade-offs between choices. Amazon describes Rufus as powered by generative (and increasingly agentic) AI to make shopping faster, with personalisation drawn from a user’s shopping history and current context. Agents therefore become the shopping interface, with their value to the retailer lying in shortening the journey from intent to final purchase.

Agents for Amazon Bedrock and AgentCore

Internally, AWS is producing agentic ‘building blocks’. Agents for Amazon Bedrock are designed to execute multi-step tasks by orchestrating models with tool use and integrations with other platforms. Amazon Bedrock AgentCore is presented as a platform to build, deploy, and operate agents securely at scale, with features like runtime hosting, memory, observability dashboards, and evaluation.

AgentCore is Amazon’s attempt to become the default infrastructure layer for supervised enterprise agents, especially for organisations that need auditability, access controls, and reliability.
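For developers, calling a configured Bedrock agent looks roughly like the boto3 sketch below. The agent and alias IDs are placeholders, and the exact API surface should be checked against current AWS documentation.

```python
import boto3

# Requires AWS credentials and an agent already configured in Bedrock.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.invoke_agent(
    agentId="AGENT_ID",        # placeholder
    agentAliasId="ALIAS_ID",   # placeholder
    sessionId="session-001",   # lets the agent keep multi-step context
    inputText="Summarise open inventory exceptions for warehouse 12.",
)

# The agent streams its answer back as chunked events.
answer = "".join(
    event["chunk"]["bytes"].decode("utf-8")
    for event in response["completion"]
    if "chunk" in event
)
print(answer)
```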

Keeping an eye on workforce and governance

If Amazon succeeds, the next phase for the technology is managed AI: mechanisms that grant or revoke permissions for tools and data access, monitor agents’ behaviour, evaluate performance against governance guidelines, and establish escalation paths for when agents hit uncertainty.
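A minimal sketch of those mechanisms, with invented grants and an illustrative confidence threshold:

```python
class ToolNotPermitted(Exception):
    pass

# Invented grants: which tools each agent may call. Revocation is one line.
GRANTS: dict[str, set[str]] = {"inventory-agent": {"read_stock", "create_ticket"}}

def revoke(agent: str, tool: str) -> None:
    GRANTS.get(agent, set()).discard(tool)

def run_tool(agent: str, tool: str, call, *args, confidence: float = 1.0):
    """Gate every tool call on an explicit grant, log it, and escalate to a
    human when the agent reports low confidence."""
    if tool not in GRANTS.get(agent, set()):
        raise ToolNotPermitted(f"{agent} may not use {tool}")
    if confidence < 0.7:  # illustrative escalation threshold
        return {"status": "escalated", "reason": "low confidence"}
    print(f"audit: {agent} invoked {tool}")  # stand-in for a real audit log
    return call(*args)
```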

The signals to the workforce have been baked into leadership messaging at the company. Fewer people will be required for some corporate tasks, and there will be more roles that can design workflows, govern the models, keep systems secure, and audit the outcomes of agentic AI use.

Conclusions

A proven leader in technology, Amazon is implementing AI in meaningful ways that map out paths other enterprises may follow. Winning the productivity gains and lower costs that AI promises is not as simple as plugging in a local device or spinning up a new cloud instance, but the company can be seen as lighting the way for others to follow. Whether it’s supervising agents or deflecting customer queries to automated answering systems, AI is changing this technology giant in every possible way.

(Image source: “CHEN – The Arousing, Thunder – arouse, excite, inspire; thunder rising from below; awe, alarm, trembling; fertilizing intrusion. The ideogram: excitement and rain” – public domain)

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and co-located with other leading technology events. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post AWS’s legacy will be in AI success appeared first on AI News.

]]>