# Aegis Standard Audit Protocol (v2.3)

This document defines the mandatory phased methodology that Aegis follows for all comprehensive engagements. Deviations require explicit justification and client approval.

## Phase 0: Engagement Setup & Scoping (1-3 days)

**Objectives**: Establish precise boundaries, success criteria, and communication protocols. Identify all relevant stakeholders and information sources. Surface any scope limitations that would prevent reasonable assurance.

**Activities**:
1. Kick-off workshop (2-4 hours) with key roles.
2. Review of all provided artifacts against the Aegis Artifact Checklist.
3. Construction of initial Threat Model and Risk Universe tailored to the use case.
4. Drafting of detailed Scope Statement, including:
   - In-scope models, versions, environments, and time periods
   - In-scope risk categories and regulatory frameworks
   - Explicit exclusions
   - Sampling strategy and statistical confidence targets
5. Agreement on secure data handling procedures and access credentials.
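
The sampling strategy and confidence targets in the Scope Statement can be sanity-checked with a standard sample-size calculation for estimating a proportion (for example, an error or policy-violation rate). The sketch below is illustrative only: the function name `required_sample_size`, its parameters, and the default `expected_rate` of 0.5 are assumptions, not part of the Aegis protocol.

```python
import math
from statistics import NormalDist
from typing import Optional


def required_sample_size(confidence: float,
                         margin_of_error: float,
                         expected_rate: float = 0.5,
                         population: Optional[int] = None) -> int:
    """Sample size needed to estimate a proportion (normal approximation).

    expected_rate=0.5 is the most conservative assumption (largest sample).
    If `population` is given, a finite population correction is applied.
    """
    # Two-sided z-value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * expected_rate * (1 - expected_rate) / margin_of_error ** 2
    if population is not None:
        # Finite population correction for small record sets.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


# 95% confidence, +/-5 percentage points, unknown base rate.
print(required_sample_size(0.95, 0.05))                    # 385
# Same target when only 10,000 records are in scope.
print(required_sample_size(0.95, 0.05, population=10_000))  # 370
```

Tighter margins grow the sample quadratically, so the confidence target agreed at scoping directly drives Phase 3 and Phase 4 effort.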

**Exit Criteria**: Signed Scope Document | Information Request List #1 delivered | Preliminary Risk Hypotheses documented (at least 8; typically 8-12).

## Phase 1: Documentation & Governance Review (3-7 days)

**Objectives**: Assess maturity of AI governance, risk management, and documentation practices. Identify "paper vs practice" gaps.

**Key Review Areas** (mapped to ISO/IEC 42001 and the NIST AI RMF Govern and Map functions):
- AI Policy, Risk Appetite, Roles & Responsibilities (RACI)
- AI Impact Assessment / DPIA / FRIA quality and completeness
- Model Card, Data Card, System Card currency and accuracy
- Training & evaluation records, hyperparameter logs, data provenance
- Change management and approval gates for model promotion
- Incident management, complaint handling, and human override procedures
- Third-party / vendor AI risk management
- Board and executive oversight mechanisms

**Deliverable**: Phase 1 Checkpoint Report with maturity scoring (0-5) across 12 governance dimensions + initial finding set.
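
As a rough illustration of how the maturity scoring could be tabulated, the sketch below validates that each of the 12 dimensions has a score on the 0-5 scale, then reports an overall average and the weakest dimensions. The dimension names (`dimension_01` and so on), the `weak_threshold` of 2, and the use of a simple unweighted mean are all assumptions; the protocol does not specify the dimension list or an aggregation rule.

```python
from statistics import mean
from typing import Dict, List, Union

Summary = Dict[str, Union[float, List[str]]]


def summarize_maturity(scores: Dict[str, int],
                       expected_dimensions: int = 12,
                       weak_threshold: int = 2) -> Summary:
    """Validate 0-5 maturity scores and summarize them."""
    if len(scores) != expected_dimensions:
        raise ValueError(
            f"expected {expected_dimensions} dimensions, got {len(scores)}")
    for name, score in scores.items():
        if not 0 <= score <= 5:
            raise ValueError(f"{name}: score {score} is outside 0-5")
    # Dimensions at or below the threshold, lowest score first.
    weakest = sorted(
        (name for name, score in scores.items() if score <= weak_threshold),
        key=lambda name: (scores[name], name),
    )
    return {"overall": round(mean(scores.values()), 2), "weakest": weakest}


# Placeholder dimension names; a real engagement would use the agreed list.
scores = {f"dimension_{i:02d}": 3 for i in range(1, 13)}
scores["dimension_04"] = 1
print(summarize_maturity(scores))
# {'overall': 2.83, 'weakest': ['dimension_04']}
```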

## Phase 2: Technical Architecture & Implementation Review (5-10 days)

**Objectives**: Verify that the "as-built" system matches documented claims. Identify technical weaknesses in data pipelines, model design, serving infrastructure, and security controls.

**Activities**:
- Static code analysis of training, evaluation, and inference codebases
- Configuration and dependency review (including supply chain)
- Data pipeline audit (collection → labeling → cleaning → feature engineering → storage)
- Model architecture and training process review
- Inference stack security (API authz, rate limiting, input sanitization, logging)
- Monitoring & observability stack evaluation
- Secrets and PII handling audit

**Deliverable**: Technical Findings Report + updated risk register.

## Phase 3: Behavioral Evaluation & Fairness Measurement (4-8 days)

**Objectives**: Empirically measure performance, robustness, fairness, and safety properties under a wide range of conditions.

**Core Workstreams**:
1. Capability & Performance Benchmarking against claimed use-case requirements and relevant public leaderboards.
2. Distribution Shift & Stress Testing: OOD detection, adversarial distribution shifts, long-tail scenarios.
3. Fairness Audit: Protected attribute analysis, subgroup performance disparities, disparate impact ratios, causal fairness probes.
4. Safety & Harmlessness Evaluation using standardized harm taxonomies + client-specific policies.
5. Explainability & Transparency Testing: Quality and stability of explanations provided to end users or downstream systems.
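
One of the fairness metrics named above, the disparate impact ratio, compares each subgroup's favorable-outcome rate to that of a reference group. The following is a minimal sketch under stated assumptions: the function name, the input layout (parallel lists of group labels and binary outcomes), and the choice of reference group are hypothetical, and the 0.8 cut-off mentioned in the comment is the commonly cited four-fifths rule of thumb rather than a threshold set by this protocol.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def disparate_impact_ratios(groups: Sequence[Hashable],
                            outcomes: Sequence[int],
                            reference_group: Hashable) -> Dict[Hashable, float]:
    """Favorable-outcome rate of each group divided by the reference rate."""
    if len(groups) != len(outcomes):
        raise ValueError("groups and outcomes must have the same length")
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in zip(groups, outcomes):
        totals[group] += 1
        positives[group] += int(outcome)  # 1 = favorable decision
    if reference_group not in totals:
        raise ValueError("reference group has no observations")
    reference_rate = positives[reference_group] / totals[reference_group]
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return {g: (positives[g] / totals[g]) / reference_rate for g in totals}


# Toy data: group "a" is favored 6/10 times, group "b" 3/10 times.
groups = ["a"] * 10 + ["b"] * 10
outcomes = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7
ratios = disparate_impact_ratios(groups, outcomes, reference_group="a")
# A ratio below 0.8 is often treated as a flag for further review.
flagged = [g for g, r in ratios.items() if r < 0.8]
print(flagged)  # ['b']
```

A ratio alone does not establish unfair treatment; it is a screening signal that the subgroup performance analysis and causal probes would then need to explain.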

**Methods**: Mix of automated evaluation harnesses, human red teaming, and expert review.

## Phase 4: Adversarial Red Teaming & Security Testing (5-12 days)

**Objectives**: Discover vulnerabilities that would not be found through standard testing.

**Mandatory Elements** (tailored by system type):
- Prompt Injection & Jailbreak Suite (for generative/LLM systems)
- Adversarial Example Generation (gradient-based, query-based, transfer attacks)
- Data Poisoning & Backdoor Detection (where training history allows)
- Model Extraction & Membership Inference Attempts
- Agentic Misuse Scenarios: Tool abuse, goal misgeneralization, multi-step attacks, sandbox escape
- Human Red Teaming: Structured exercises with domain experts and/or paid red teamers following agreed rules of engagement
- Supply Chain Attack Vectors (poisoned dependencies, compromised model weights)

All successful attacks are documented with full reproduction steps, severity, and suggested countermeasures.
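
A finding record that enforces this documentation rule might look like the sketch below. The class and field names, and the four-level `Severity` scale, are hypothetical; the protocol only requires that reproduction steps, severity, and suggested countermeasures are captured.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Severity(Enum):
    # Assumed four-level scale; the engagement may define its own.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class AttackFinding:
    finding_id: str
    technique: str            # e.g. "prompt injection"
    severity: Severity
    reproduction_steps: List[str]
    countermeasures: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of required documentation fields that are still empty."""
        missing = []
        if not self.reproduction_steps:
            missing.append("reproduction_steps")
        if not self.countermeasures:
            missing.append("countermeasures")
        return missing


finding = AttackFinding(
    finding_id="RT-001",  # hypothetical identifier
    technique="prompt injection",
    severity=Severity.HIGH,
    reproduction_steps=["Send crafted input to the chat endpoint",
                        "Observe that the system prompt is disclosed"],
)
print(finding.missing_fields())  # ['countermeasures']
```

Running a completeness check like `missing_fields()` before a finding enters the master register keeps undocumented attacks from reaching the report.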

## Phase 5: Risk Aggregation, Compliance Mapping & Reporting (3-5 days)

**Objectives**: Synthesize all evidence into coherent, prioritized, and defensible conclusions. Map every material finding to specific control requirements.

**Activities**:
1. Consolidation of all findings into master register with cross-references.
2. Application of quantitative or semi-quantitative risk scoring (likelihood × impact × detectability × velocity).
3. Production of visual risk heatmaps and trend views.
4. Detailed compliance matrices for each regulatory framework in scope.
5. Drafting of formal Opinion Letter (modeled on the language of assurance standards such as ISAE 3000, SSAE 18, and ISO/IEC 17021).
6. Development of prioritized remediation roadmap with sequencing, dependencies, and rough order-of-magnitude effort.
7. Internal quality review and challenge session (meta-audit of the audit itself).
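
The multiplicative scoring in activity 2 can be sketched as follows. The 1-5 scale for each factor, the convention that a higher `detectability` value means the issue is harder to detect, the convention that a higher `velocity` value means faster onset, and the finding IDs are all assumptions made for illustration; the protocol names the four factors but not their scales.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Tuple

SCALE_MIN, SCALE_MAX = 1, 5  # assumed ordinal scale for every factor


@dataclass(frozen=True)
class RiskFactors:
    likelihood: int
    impact: int
    detectability: int  # assumed: higher = harder to detect
    velocity: int       # assumed: higher = faster onset of harm

    def score(self) -> int:
        """likelihood x impact x detectability x velocity (1 to 625)."""
        product = 1
        for name, value in asdict(self).items():
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{name}={value} is outside "
                                 f"{SCALE_MIN}-{SCALE_MAX}")
            product *= value
        return product


def prioritize(register: Dict[str, RiskFactors]) -> List[Tuple[str, int]]:
    """Return (finding_id, score) pairs, highest score first."""
    scored = [(fid, factors.score()) for fid, factors in register.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


register = {
    "F-001": RiskFactors(likelihood=4, impact=5, detectability=2, velocity=3),
    "F-002": RiskFactors(likelihood=2, impact=3, detectability=5, velocity=1),
    "F-003": RiskFactors(likelihood=5, impact=5, detectability=4, velocity=4),
}
print(prioritize(register))
# [('F-003', 400), ('F-001', 120), ('F-002', 30)]
```

Because the factors are multiplied, a single low factor pulls the score down sharply; a semi-quantitative variant might instead band the product into categories for the heatmap.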

## Phase 6: Reporting, Presentation & Handover

**Standard Deliverables**: Full written report (PDF + editable source) | Executive presentation (45-60 min, suitable for a board or risk committee) | Raw finding data export (CSV/JSON) for client's GRC system | Optional: Remediation tracking workbook.
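
The raw finding export could be produced with the Python standard library alone, as in the sketch below. The column set in `FIELDS` and the sample record are hypothetical; the actual schema would be agreed with the client to match their GRC system's import format.

```python
import csv
import io
import json
from typing import Dict, List

# Assumed export columns; adjust to the client's GRC import schema.
FIELDS = ["finding_id", "phase", "title", "severity", "status"]


def to_json(findings: List[Dict[str, object]]) -> str:
    """Serialize findings as a JSON array with stable key order."""
    return json.dumps(findings, indent=2, sort_keys=True)


def to_csv(findings: List[Dict[str, object]]) -> str:
    """Serialize findings as CSV with a header row.

    Keys not listed in FIELDS are dropped; missing keys become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(findings)
    return buffer.getvalue()


findings = [
    {"finding_id": "F-001", "phase": 2, "title": "Unpinned dependencies",
     "severity": "medium", "status": "open"},
]
print(to_csv(findings))
print(to_json(findings))
```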

**Close-out**: Management response collection | 30-day follow-up checkpoint (optional) | Lessons-learned session for Aegis process improvement.

---
**End of Protocol**