AI root cause analysis for modern operations
NOT ANOTHER SAAS. REAL DEPLOYMENT INSIDE THE CUSTOMER ENVIRONMENT.
MCP connector fabric
Each service gets a scoped OpsDiag agent path: the agent talks to the matching MCP Server, the MCP Server talks to the provider API, and the tokens sent to the AI model carry only the context needed for RCA.
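The scoped agent path described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not OpsDiag's implementation: `McpServer`, `ScopedAgent`, `allowed_fields`, and `fake_k8s_api` are hypothetical names, the provider call is a stub, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names here are hypothetical; none come from OpsDiag or an MCP SDK.


@dataclass(frozen=True)
class McpServer:
    name: str
    # Stand-in for the provider API call the MCP Server would make.
    call_provider: Callable[[str], Dict[str, str]]


@dataclass(frozen=True)
class ScopedAgent:
    service: str
    server: McpServer          # the only MCP Server this agent may use
    allowed_fields: List[str]  # context keys allowed to reach the AI model

    def collect(self, query: str) -> Dict[str, str]:
        raw = self.server.call_provider(query)
        # Keep only the fields needed for RCA and drop everything else.
        return {k: v for k, v in raw.items() if k in self.allowed_fields}


def fake_k8s_api(query: str) -> Dict[str, str]:
    # Stubbed provider response with made-up sample values.
    return {
        "pod_status": "CrashLoopBackOff",
        "last_deploy": "sample-release-42",
        "service_account_token": "redacted-example",
    }


agent = ScopedAgent(
    service="checkout",
    server=McpServer("kubernetes", fake_k8s_api),
    allowed_fields=["pod_status", "last_deploy"],
)
context = agent.collect("why is checkout restarting?")
print(context)
```

In this sketch the filter sits inside the agent, so anything the provider returns beyond the allowed fields never becomes model context.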
Agentic RCA simulation
The 3D scene below simulates how Agentic RCA works: an incident becomes a plan, scoped agents query MCP Servers and provider APIs asynchronously, and evidence returns into one reviewable RCA analysis.
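The flow described above (incident to plan, asynchronous queries, one merged analysis) can be approximated with Python's `asyncio`. This is a rough sketch under stated assumptions: `PlanStep`, `RcaAnalysis`, `plan_for`, `query_mcp`, and `run_rca` are hypothetical names, the plan is fixed rather than derived, and each MCP Server call is replaced by a short simulated delay.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical names throughout; the MCP and provider calls are simulated.


@dataclass
class PlanStep:
    agent: str      # which scoped agent handles this step
    question: str   # what that agent asks its MCP Server


@dataclass
class RcaAnalysis:
    incident: str
    evidence: Dict[str, str] = field(default_factory=dict)


def plan_for(incident: str) -> List[PlanStep]:
    # Fixed plan for illustration; a real planner would derive the steps.
    return [
        PlanStep("logs", f"errors related to: {incident}"),
        PlanStep("deploys", f"recent releases before: {incident}"),
        PlanStep("metrics", f"latency change around: {incident}"),
    ]


async def query_mcp(step: PlanStep) -> Tuple[str, str]:
    await asyncio.sleep(0.01)  # simulated MCP Server and provider round trip
    return step.agent, f"stub evidence for '{step.question}'"


async def run_rca(incident: str) -> RcaAnalysis:
    steps = plan_for(incident)
    # Run all scoped queries concurrently and wait for every result.
    results = await asyncio.gather(*(query_mcp(s) for s in steps))
    return RcaAnalysis(incident, dict(results))


analysis = asyncio.run(run_rca("checkout 5xx spike"))
for agent_name, finding in analysis.evidence.items():
    print(agent_name, "->", finding)
```

`asyncio.gather` returns results in the order the steps were submitted, so the merged evidence stays aligned with the plan even when individual queries finish at different times.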
AI RCA workflow
OpsDiag is built for teams that want scheduled checks, live incident questions, service evidence through MCP Servers, and terminal-native analysis through Codex CLI or Claude CLI.
Run recurring RCA checks around deploy windows, high-risk services, provider events, or known operational weak spots.
Start from an alert, incident note, or natural-language question and keep asking follow-ups against the same evidence thread.
Use MCP Servers as the path into logs, metrics, traces, Kubernetes, cloud APIs, deploy history, edge rules, and alerting systems.
Continue the RCA from terminal workflows, challenge the evidence, and refine the analysis without switching into another dashboard.
What OpsDiag actually does
Scheduled RCA runs can watch known risk windows. Chat-driven RCA can start from a fresh alert. MCP Servers connect the analysis to services and providers. Codex CLI and Claude CLI keep the investigation available from the terminal.
Scheduled
Recurring RCA checks can inspect service risk, recent changes, and provider signals before an incident escalates.
Chat
Responders can ask what changed, what is affected, and what still needs validation from the same RCA conversation.
MCP
MCP Servers connect scoped agents to services, clusters, observability tools, cloud providers, edge controls, and incident systems.
CLI
Terminal workflows can inspect the RCA, ask follow-up questions, challenge weak causes, and keep investigation notes close to the operator.
Change
Recent releases, infrastructure changes, edge rules, policy updates, and provider events are placed beside symptoms.
Evidence
Each proposed cause includes the signals that support it and the signals that still need validation.
Action
Responders get validation checks, rollback candidates, mitigation steps, owners, and handoff-ready notes.
Prevent
The RCA closes with guardrails, alert tuning, runbook updates, and checks that reduce repeat incidents.
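The Evidence entry above implies a simple record shape: each proposed cause keeps the signals that support it apart from the signals that still need validation. A minimal Python sketch of that idea follows; `ProposedCause`, `is_confirmed`, and `render` are hypothetical names and the sample signals are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record shape; not taken from OpsDiag.


@dataclass
class ProposedCause:
    summary: str
    supporting: List[str] = field(default_factory=list)
    needs_validation: List[str] = field(default_factory=list)

    def is_confirmed(self) -> bool:
        # Confirmed only when something supports it and nothing is pending.
        return bool(self.supporting) and not self.needs_validation


def render(cause: ProposedCause) -> str:
    lines = [f"Cause: {cause.summary}"]
    lines += [f"  [supported] {s}" for s in cause.supporting]
    lines += [f"  [unverified] {s}" for s in cause.needs_validation]
    return "\n".join(lines)


cause = ProposedCause(
    summary="connection pool exhausted after release",
    supporting=["pool wait time rose right after the deploy"],
    needs_validation=["confirm pool size changed in the release"],
)
print(render(cause))
print("confirmed:", cause.is_confirmed())
```

Keeping the two lists separate makes a weak cause easy to challenge: anything still listed as unverified is an explicit follow-up rather than an implied conclusion.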
RCA capabilities
OpsDiag provides scheduled checks, chat-based RCA, MCP Server evidence collection, and terminal-native analysis through Codex CLI or Claude CLI.
Scheduled RCA
OpsDiag can run RCA analysis around planned releases, high-risk services, recurring alert windows, and provider-change windows, so the evidence thread already exists when responders need it.
Chat to RCA
Responders can ask what changed, what is affected, which service is suspicious, what evidence is missing, or which rollback is safest, then continue against the same RCA context.
MCP + CLI
MCP Servers connect to logs, metrics, traces, Kubernetes, cloud APIs, deploy history, edge rules, and alerting systems, while Codex CLI or Claude CLI keeps the RCA available from terminal workflows.
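The Scheduled RCA entry above describes runs placed around planned releases and provider-change windows so the evidence thread exists before it is needed. One way to picture that is a window check with a lead time. This is a sketch under stated assumptions: `RiskWindow`, `lead`, `should_run`, and `due_windows` are hypothetical names, the 30-minute lead is an arbitrary choice, and the timestamps are made-up sample data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List

# Hypothetical scheduling model; not OpsDiag configuration.


@dataclass(frozen=True)
class RiskWindow:
    label: str
    start: datetime
    end: datetime
    lead: timedelta = timedelta(minutes=30)  # start checks this early

    def should_run(self, now: datetime) -> bool:
        # Due from `lead` before the window opens until the window closes.
        return self.start - self.lead <= now <= self.end


def due_windows(windows: List[RiskWindow], now: datetime) -> List[str]:
    return [w.label for w in windows if w.should_run(now)]


day = datetime(2024, 1, 15, tzinfo=timezone.utc)  # sample date
windows = [
    RiskWindow("checkout deploy", day.replace(hour=14), day.replace(hour=15)),
    RiskWindow("provider maintenance", day.replace(hour=20), day.replace(hour=21)),
]
now = day.replace(hour=13, minute=45)
due = due_windows(windows, now)
print(due)
```

Here the deploy window is already due 15 minutes before it opens, which is the point of the lead time: the first RCA run lands before the risky change does.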