# Performance Investigation Kickoff Template

Use or adapt this prompt to activate the agent's full capabilities with maximum context.

---

**Activate ApexPerf — Senior Performance Engineer mode.**

I need a comprehensive, evidence-based performance diagnosis and prioritized remediation plan for the following system.

## System Under Investigation
- **Component / Service / Flow**: (e.g., Checkout gRPC API, user profile aggregation job, feed generation pipeline)
- **Primary Language & Framework**: (e.g., Go 1.22 + ConnectRPC, Java 17 + Spring Boot 3.2 + Hibernate, Node 20 + Fastify)
- **Infrastructure**: (e.g., Kubernetes on GKE, 18 pods, 4 vCPU/8 GiB, HPA enabled; PostgreSQL 16 on Cloud SQL; Redis cluster; Cloudflare CDN)
- **Traffic Profile**: (average RPS, peak RPS, daily/weekly pattern, multi-tenant or single-tenant, read/write mix)

## Observed Symptoms & Business Impact
- Latency: p50=__, p95=__, p99=__, p999=__ (target: __)
- Throughput: __ RPS sustainable before degradation
- Error rate: __% (types: __)
- Resource symptoms: (CPU throttling, connection pool exhaustion, GC pressure, high I/O wait, etc.)
- Business impact: (user abandonment, revenue impact, on-call toil, monthly infra cost)

## Provided Telemetry & Artifacts
(Paste or attach the highest-fidelity data available)
- Metrics / dashboards (Prometheus/Grafana key panels during incident or peak)
- Distributed traces (trace ID, JSON, or screenshots of slow spans)
- CPU / memory / allocation profiles (async-profiler collapsed stacks, pprof, JFR, flamegraph links)
- Database: slow query log, pg_stat_statements or performance_schema top queries, EXPLAIN ANALYZE for the worst queries, wait events
- Load test reports (k6 summary JSON or HTML)
- Recent changes (deployments, config, schema, traffic shift)
- Architecture / data flow description or diagram (mermaid or text)

## Goals & Constraints
- Target SLO: (e.g., p99 < 80 ms at 2.5× current peak)
- Constraints: (cannot increase instance count >15%, must stay within current monthly budget, cannot relax durability on payments path, 4-week delivery window, etc.)

## Your Task
1. Apply the full USE/RED/Golden Signals + queueing theory methodology.
2. Identify the top 1-3 bottlenecks with evidence, confidence levels, and contribution percentages.
3. Deliver a prioritized list of interventions with quantified expected impact, effort/complexity, risk, implementation sketch, and exact validation steps (metrics + duration + statistical method).
4. Explicitly list any critical observability gaps that are slowing diagnosis today and that should be closed permanently.
5. Structure the entire response according to the exact format defined in STYLE.md.

If the provided data is insufficient for high-confidence conclusions, state the 3-5 highest-value additional artifacts or experiments required, in priority order, before giving detailed recommendations.

Begin now.