The 2026 Agent Platform RFP & Scorecard: Agent 365 vs Agentforce 360 vs OpenAI AgentKit

In this article

  • Scan recent launches and standards shaping enterprise AI agents.
  • Define what your 2026 agent platform RFP must include.
  • Provide a scorecard you can copy for procurement and budgeting.
  • Offer a light, sourced comparison of Agent 365, Agentforce 360, and AgentKit.
  • Link roll‑out playbooks you can run in 30 days.

Why now

In the last few weeks, Microsoft unveiled Agent 365 for managing fleets of AI agents, signaling that agent registries and governance are moving into the Microsoft stack. Salesforce pushed Agentforce 360 deeper into Slack and enterprise workflows. OpenAI’s AgentKit targets faster build–deploy loops and evals for production agents. Alongside platforms, interop standards are maturing: Microsoft is aligning with Google’s A2A for agent‑to‑agent collaboration, and MCP remains the dominant tool/data protocol for agents.

Commerce leaders are also watching real demand signals: Shopify reports a 7× rise in AI‑originated traffic and 11× growth in AI‑attributed orders, while WIRED notes consumer agents still need oversight for high‑stakes checkouts. Build with ambition, but instrument guardrails and human‑in‑the‑loop review.

Your 2026 Agent Platform RFP: 12 sections to include

  1. Interoperability & Standards — Require native support or roadmaps for:

    • Model Context Protocol (MCP) for tool/data access via standardized servers. Ask for a list of supported MCP servers and SDK languages.
    • Agent‑to‑Agent (A2A) for cross‑vendor agent collaboration. Request an “Agent Card” schema and examples.
  2. Agent Registry & Identity — Does the platform provide a registry of agents, capabilities, and permissions (scopes)? How are agent identities issued, rotated, and revoked? (See Microsoft’s positioning for why registries matter.)
  3. Observability & Evals — Require OpenTelemetry export for traces, metrics, and logs; ask for support of GenAI/agent semantic conventions and dashboards. Include latency, cost, token usage, tool success rates, and eval hooks.
  4. Security & Governance — Ask about prompt‑injection defenses, policy enforcement, approval workflows, data loss prevention, and role‑based action controls. (NVIDIA’s guardrails push shows growing enterprise expectations.)
  5. Compliance — Map features to your frameworks (EU AI Act risk controls, SOC 2, ISO/IEC 42001, and data residency). Require audit logs exportable to your SIEM.
  6. Workflow & Orchestration — Graph‑based multi‑agent support; human‑in‑the‑loop steps; long‑running tasks and resumability (A2A tasks, UX negotiation).
  7. Memory & Context — RAG connectors, vector stores, episodic memory limits, PII handling, and cache policies.
  8. Channels & Surfaces — Email, chat, WhatsApp, web, and browser automation support; mobile SDKs.
  9. Deployment Options — SaaS, VPC, and on‑prem; bring‑your‑own‑model and model routing; plugin/extension ecosystem maturity.
  10. Commerce & Payments — If you sell online, ask for agent‑assisted checkout patterns, order risk checks, and rollback controls. Cross‑check claims against your PSP’s roadmap (consumer agents still need oversight).
  11. Cost & FinOps — Budget controls, per‑agent quotas, dynamic model routing, and cost guardrails exported to your data warehouse for unit economics.
  12. Customer Proof — Ask for production references, agent SLOs, incident postmortems, and time‑to‑resolution stats.
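To make the interoperability asks in section 1 concrete, here is a minimal sketch of an A2A‑style “Agent Card” descriptor you could request from each vendor, plus a completeness check for RFP review. The field names are illustrative assumptions based on public A2A descriptions, not the normative schema — ask vendors for their exact Agent Card format.

```python
# Sketch of an A2A-style "Agent Card" descriptor for RFP section 1.
# Field names are illustrative assumptions, not the normative A2A schema.

REQUIRED_FIELDS = {"name", "description", "url", "capabilities", "skills"}

def validate_agent_card(card: dict) -> list:
    """Return the sorted list of missing required fields (empty means complete)."""
    return sorted(REQUIRED_FIELDS - card.keys())

example_card = {
    "name": "order-status-agent",
    "description": "Answers order status questions via the OMS API.",
    "url": "https://agents.example.com/order-status",
    "capabilities": {"streaming": True, "push_notifications": False},
    "skills": [{"id": "lookup_order", "description": "Look up an order by ID"}],
}

print(validate_agent_card(example_card))  # [] -> all required fields present
```

A check like this gives procurement a mechanical first pass before the architecture review.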

Copy‑paste Scorecard (100 points)

Use this as a baseline; weight to fit your priorities.

  • Interoperability (MCP/A2A): 15
  • Observability & Evals (OpenTelemetry + dashboards): 15
  • Security & Governance (controls, reviews, auditability): 15
  • Workflow Orchestration (multi‑agent, HITL, long‑running): 10
  • Memory & Data (RAG, vector stores, PII policies): 10
  • Channels & Surfaces (support desk, email, browser): 5
  • Deployment & Extensibility (SaaS/VPC/on‑prem, SDKs): 10
  • Commerce‑readiness (checkout, risk, rollback): 5
  • Cost & FinOps (routing, budgets, quotas): 10
  • Proof & References (SLOs, incidents, case studies): 5
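The scorecard above translates directly into a small weighted tally you can drop into your procurement tooling. A minimal Python sketch — the criterion keys are our own shorthand for the ten categories:

```python
# The 100-point scorecard as a weighted tally; keys are shorthand for the
# ten categories above, weights match the listed point values.
WEIGHTS = {
    "interoperability": 15,
    "observability_evals": 15,
    "security_governance": 15,
    "workflow_orchestration": 10,
    "memory_data": 10,
    "channels_surfaces": 5,
    "deployment_extensibility": 10,
    "commerce_readiness": 5,
    "cost_finops": 10,
    "proof_references": 5,
}
assert sum(WEIGHTS.values()) == 100

def score_vendor(ratings: dict) -> float:
    """ratings maps criterion -> fraction of points earned (0.0 to 1.0)."""
    return sum(WEIGHTS[c] * ratings.get(c, 0.0) for c in WEIGHTS)

# Example: strong on interop and observability, weaker on governance.
print(score_vendor({"interoperability": 1.0,
                    "observability_evals": 0.8,
                    "security_governance": 0.6}))  # 36.0
```

Re-weight the dictionary to match your priorities; the assertion keeps the total honest.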

Light comparison: what to ask each vendor

Microsoft Agent 365

Positioned as an agent management and governance layer (registry, access, monitoring). Ask about: Entra integration for identities; OpenTelemetry export; A2A roadmap for cross‑vendor collaboration; how approvals, scopes, and audit logs work across Copilot Studio and external agents.

Salesforce Agentforce 360

Focuses on enterprise workflow integration (Sales/Service/Slack) and new prompting tools (Agent Script) plus an Agentforce Builder. Ask about: Slack first‑class agent surfaces; MCP/A2A compatibility for external tools; governance and SLOs; and how Agent Script models “if/then” branching for predictable outcomes.

OpenAI AgentKit

A toolkit aimed at building and deploying agents with evals, connectors, and UI components. Ask about: connector registry security, eval coverage, MCP alignment, and OpenTelemetry‑friendly traces for agent steps and tool calls.

Standards to require (no matter who you pick)

  • MCP for tool/data connectivity and a published catalog of supported servers.
  • A2A for agent‑to‑agent collaboration across stacks, including task lifecycle and UX negotiation.
  • OpenTelemetry GenAI/agent semantic conventions for end‑to‑end visibility.

Instrument before you scale (observability quickstart)

Adopt OpenTelemetry’s emerging agentic semantic conventions for spans (agent creation, planning, tool calls, memory), and export traces to your APM. Many teams pair this with open‑source GenAI observability projects so you see latency, cost, and tool success in one place.
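To show the shape of the data, here is a dependency‑free sketch of agent‑step tracing. The attribute names (`gen_ai.operation.name`, `gen_ai.agent.name`, `gen_ai.tool.name`) follow OpenTelemetry’s emerging GenAI semantic conventions, but the recorder itself is an illustration — in production you would emit real spans via the opentelemetry‑sdk and export them to your APM.

```python
# Illustrative span recorder using GenAI semantic-convention attribute
# names; stands in for a real OpenTelemetry tracer in this sketch.
import time
from contextlib import contextmanager

SPANS = []  # finished spans, appended innermost-first

@contextmanager
def agent_span(operation, **attributes):
    span = {"gen_ai.operation.name": operation, **attributes}
    start = time.perf_counter()
    try:
        yield span
    finally:
        span["duration_ms"] = (time.perf_counter() - start) * 1000
        SPANS.append(span)

# One tool call nested inside one agent invocation:
with agent_span("invoke_agent", **{"gen_ai.agent.name": "order-status"}):
    with agent_span("execute_tool", **{"gen_ai.tool.name": "lookup_order"}):
        pass  # tool work happens here

print([s["gen_ai.operation.name"] for s in SPANS])  # ['execute_tool', 'invoke_agent']
```

The point to carry into the RFP: every agent step should emit a span with a stable operation name and duration, so dashboards can slice latency, cost, and tool success per agent.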

30‑day pilot plan (with deep dives)

  1. Define 1–2 high‑ROI use cases (support triage; SEO experiments; order status) and KPIs.
  2. Stand up a minimal registry, identities, and SLOs; require MCP/A2A‑ready components.
  3. Instrument OpenTelemetry, wire dashboards and error budgets.
  4. Run weekly evals; enforce approvals for risky actions; enable human‑in‑the‑loop.
  5. Document incidents, costs, and ROI; decide go/no‑go and scale plan.
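Steps 3–5 above imply a numeric gate for the go/no‑go call. One common pattern is an error‑budget check against the SLO you set in step 2; the thresholds below are illustrative assumptions, not recommendations:

```python
# Error-budget check for the pilot's go/no-go decision.
# slo is the target success rate (e.g. 0.99); the budget is the
# failure count that SLO tolerates over the observation window.
def error_budget_remaining(slo: float, total: int, failures: int) -> float:
    """Fraction of the error budget still unspent (negative = budget blown)."""
    allowed = total * (1.0 - slo)  # failures the SLO tolerates
    if allowed == 0:
        return 0.0 if failures == 0 else -1.0
    return 1.0 - failures / allowed

# A 99% SLO over 1,000 tool calls allows 10 failures; 4 failures
# leaves roughly 60% of the budget.
print(error_budget_remaining(0.99, 1000, 4))
```

If the remaining budget goes negative during the pilot, that is a concrete signal to pause scale‑up and review incidents rather than a judgment call.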

Vendor‑agnostic playbooks from our library can help you execute each step.

FAQ: common procurement questions

Q: Should we standardize on a single vendor?
A: Choose a primary platform, but demand A2A and MCP support to avoid lock‑in and to connect external agents and tools.

Q: Do we really need all this observability from day one?
A: Yes. Consumer agents are improving but still unreliable for high‑stakes flows; instrumentation and approvals de‑risk rollouts.

Q: How do we justify budget?
A: Tie to measurable deflection in support, faster SEO experiment cycles, or improved conversion from agent‑assisted shopping—then monitor with OpenTelemetry to prove ROI.
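For the budget conversation, a back‑of‑envelope deflection model is often enough to frame the ask. All numbers below are illustrative assumptions, not benchmarks — substitute your own ticket volume and costs:

```python
# Rough unit economics for support-ticket deflection.
# All inputs are illustrative assumptions for the budget conversation.
def monthly_roi(tickets: int, deflection_rate: float,
                cost_per_ticket: float, agent_cost: float) -> float:
    """Net monthly savings from agent-deflected support tickets."""
    savings = tickets * deflection_rate * cost_per_ticket
    return savings - agent_cost

# 10,000 tickets/month, 15% deflected, $6 per human-handled ticket,
# $4,000/month platform and model spend:
print(monthly_roi(tickets=10_000, deflection_rate=0.15,
                  cost_per_ticket=6.0, agent_cost=4_000.0))  # 5000.0
```

Then instrument the pilot so the deflection rate in this model comes from measured traces, not estimates.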

The bottom line

2026 will reward teams that buy for interoperability (MCP + A2A), prove reliability with OpenTelemetry, and govern agents like employees. Use the RFP and scorecard above to select a platform—and instrument before you scale.

CTA: Want help drafting your RFP or standing up a 30‑day pilot? Subscribe and book a working session with HireNinja’s team.
