MODULE 06 · ~3 hrs

Audit Documentation & Governance

Master the end-to-end AI audit process: planning, evidence collection, findings classification, report writing, and remediation tracking. Covers audit documentation standards, governance structures, and practical audit checklists.

6.1 — The AI Audit Lifecycle

An AI audit follows a structured lifecycle from initial planning through remediation follow-up. Understanding each phase and its deliverables is critical for effective auditing.

8-Phase AI Audit Lifecycle
1. Planning: scope and objectives
2. Evidence: collect data
3. Analysis: test and evaluate
4. Findings: classify issues
5. Report: draft report
6. Response: management review
7. Remediation: fix issues
8. Follow-up: verify fixes
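When building audit-tracking tooling, the eight phases above can be represented as a simple ordered structure. A minimal sketch, assuming an illustrative phase-to-deliverable mapping (the deliverable strings and helper are hypothetical, not part of any standard):

```python
# Hypothetical sketch: the 8-phase lifecycle as an ordered list of
# (phase name, primary deliverable). Phase names follow the lifecycle
# above; the deliverable wording is illustrative.
AUDIT_LIFECYCLE = [
    ("Planning", "scope and objectives"),
    ("Evidence", "collected data"),
    ("Analysis", "test and evaluation results"),
    ("Findings", "classified issues"),
    ("Report", "draft report"),
    ("Response", "management review"),
    ("Remediation", "fixed issues"),
    ("Follow-up", "verified fixes"),
]

def next_phase(current: str):
    """Return the phase that follows `current`, or None after Follow-up."""
    names = [name for name, _ in AUDIT_LIFECYCLE]
    i = names.index(current)
    return names[i + 1] if i + 1 < len(names) else None
```

Encoding the order makes it easy to enforce, for example, that no report is drafted before analysis is complete.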

Planning phase: Define audit objectives, scope (which AI systems, which controls), criteria (against which framework — NIST, ISO 42001, EU AI Act), timeline, and resource requirements. Engage stakeholders early — AI audits require cooperation from data science, engineering, legal, and business teams.

Risk-Based Scoping Criteria
01. High Impact: Systems affecting individuals' rights, safety, or financial outcomes receive the deepest audit scrutiny.

02. High Autonomy: Automated decision-making without human oversight requires more extensive testing and control evaluation.

03. High Exposure: A large user base, sensitive data, or public-facing applications increase scope requirements.

04. Regulatory Sensitivity: Systems subject to specific regulations (EU AI Act high-risk, RBI guidelines) need compliance-focused scoping.
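The four scoping criteria above can be combined into a rough scoping score when triaging an AI system inventory. A minimal sketch, assuming a simple additive model; the thresholds and tier names are illustrative, not taken from any framework:

```python
def scoping_score(high_impact: bool, high_autonomy: bool,
                  high_exposure: bool, regulatory_sensitive: bool) -> int:
    """Count how many risk-based scoping criteria a system meets (0-4)."""
    return sum([high_impact, high_autonomy, high_exposure, regulatory_sensitive])

def audit_depth(score: int) -> str:
    """Map a scoping score to an illustrative audit depth tier."""
    if score >= 3:
        return "full-scope audit"
    if score >= 1:
        return "targeted audit"
    return "baseline review"

# A public-facing credit model: high impact, autonomous, regulated.
depth = audit_depth(scoping_score(True, True, False, True))
```

In practice the criteria interact (a single Critical factor may force a full-scope audit regardless of the count), so treat any numeric score as a triage aid, not a decision rule.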

AUDITOR INDEPENDENCE

Auditors must NOT have development responsibilities for the system being audited. An effective audit team includes: technical auditors (ML expertise), governance/compliance auditors, domain experts, and legal advisors. Independence is a fundamental audit principle — expect exam questions on this.

Key Points
Eight-phase audit lifecycle from planning to follow-up
Risk-based scoping prioritizes high-impact systems
Multi-disciplinary teams: technical + governance + domain + legal
Auditor independence is essential
Stakeholder engagement from the planning phase

6.2 — Evidence Collection

Evidence collection is the foundation of any audit finding. Without properly collected, documented, and preserved evidence, findings lack credibility and cannot withstand challenge.

Four Types of Audit Evidence
Documentation Review
What to examine: model cards, system cards, risk assessments, design docs, training logs, approval records.
Key tip: missing documentation is itself a finding; document what's absent.

Technical Testing
What to examine: independent evaluation, fairness testing, adversarial/red-team testing, performance benchmarking.
Key tip: request direct access to models, data, and infrastructure for independent testing.

Interviews
What to examine: structured interviews with developers, data engineers, product owners, risk officers, and end users.
Key tip: interviews reveal process gaps that documentation cannot capture.

Process Observation
What to examine: deployment procedures, monitoring dashboards, incident response drills, human oversight mechanisms.
Key tip: verify that documented procedures match actual practice (the "say vs. do" gap).
MISSING DOCUMENTATION IS A FINDING

If an organization cannot provide model cards, risk assessments, or impact assessments, this absence is itself an audit finding. Document what was requested, when, and what was not provided. Missing documentation often indicates deeper governance gaps.

EXAM TIP

Evidence must be documented with: source, date collected, collection method, relevance to audit criteria, and chain of custody. Digital evidence should be timestamped and stored securely. This mirrors financial audit evidence standards.
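The evidence fields listed in the exam tip can be captured in a simple record. A sketch, assuming a SHA-256 digest as the tamper-evidence mechanism; the field names and example values are illustrative:

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class EvidenceRecord:
    """One item of audit evidence with the fields the tip names:
    source, collection date, collection method, relevance to audit
    criteria, and a digest supporting chain-of-custody checks."""
    source: str
    collection_method: str
    relevance: str           # which audit criterion this supports
    content: bytes
    collected_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def digest(self) -> str:
        """SHA-256 of the content; recompute later to detect tampering."""
        return hashlib.sha256(self.content).hexdigest()

rec = EvidenceRecord(
    source="model card for credit-scoring model v2",   # illustrative
    collection_method="documentation review",
    relevance="ISO 42001 impact-assessment requirement",
    content=b"...exported document bytes...",
)
baseline = rec.digest()  # record this digest at collection time
```

Recomputing the digest at report time and comparing it to the value recorded at collection time is one lightweight way to demonstrate the evidence was not altered.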

Key Points
Four evidence types: documentation, testing, interviews, observation
Missing documentation is a finding
Independent technical testing is essential
Verify documented procedures match actual practice
Evidence chain of custody and secure storage

6.3 — Findings Classification and Reporting

Findings classification ensures that the most critical issues receive immediate attention while providing a structured framework for remediation planning.

Findings Severity Classification
Critical: Immediate risk to individuals or regulatory non-compliance. Immediate action required.
High: Significant control weakness or material gap. Action within 30 days.
Medium: Control improvement needed, moderate risk. Action within 90 days.
Low: Best-practice recommendation, advisory. No mandatory timeline.
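The severity levels map directly to remediation deadlines, which makes them easy to encode in remediation-tracking tooling. A minimal sketch of that mapping (the `None` value mirrors "no mandatory timeline" for Low findings):

```python
# Severity -> maximum days allowed before required action, per the
# classification above. None means advisory only (no mandatory timeline).
SEVERITY_DEADLINES_DAYS = {
    "Critical": 0,    # immediate action required
    "High": 30,
    "Medium": 90,
    "Low": None,      # best-practice recommendation, advisory
}

def remediation_deadline(severity: str):
    """Return the action deadline in days, or None for advisory findings."""
    try:
        return SEVERITY_DEADLINES_DAYS[severity]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}")
```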
5C Finding Structure (Standard in Professional Auditing)
01. Criteria (what was expected): the standard, requirement, or control that should be in place.

02. Condition (what was actually found): the factual observation during the audit.

03. Cause (why the gap exists): root cause analysis of the deficiency.

04. Consequence (the risk or impact): the potential harm if not addressed.

05. Recommendation (what should be done): specific, actionable remediation steps.

5C Finding Example

Criteria: ISO 42001 requires AI impact assessments before deployment. Condition: The credit scoring model was deployed without an impact assessment. Cause: No formal pre-deployment review process exists. Consequence: Potential unfair treatment of loan applicants; regulatory non-compliance. Recommendation: Implement mandatory pre-deployment impact assessment gate with documented approval.
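The 5C structure also lends itself to a simple data structure, so no finding is recorded with a C missing. A sketch populated with the example finding above (the severity rating is an illustrative assumption, not stated in the example):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One audit finding in the 5C structure, plus a severity rating."""
    criteria: str        # what was expected
    condition: str       # what was actually found
    cause: str           # why the gap exists
    consequence: str     # risk or impact if not addressed
    recommendation: str  # specific, actionable remediation steps
    severity: str        # Critical / High / Medium / Low

    def render(self) -> str:
        """Format the finding as the labeled 5C lines used in reports."""
        return "\n".join(
            f"{label}: {value}" for label, value in [
                ("Criteria", self.criteria),
                ("Condition", self.condition),
                ("Cause", self.cause),
                ("Consequence", self.consequence),
                ("Recommendation", self.recommendation),
            ])

f = Finding(
    criteria="ISO 42001 requires AI impact assessments before deployment.",
    condition="The credit scoring model was deployed without an impact assessment.",
    cause="No formal pre-deployment review process exists.",
    consequence="Potential unfair treatment of loan applicants; regulatory non-compliance.",
    recommendation="Implement a mandatory pre-deployment impact assessment gate.",
    severity="High",  # illustrative rating
)
```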

EXAM TIP

The audit report must be accessible to non-technical executives. Structure: Executive summary, scope/methodology, system description, findings by severity, management response (accept/partially accept/reject with action plan, responsible party, and target date), and appendices with detailed test results.

Key Points
Four severity levels: Critical, High, Medium, Low
5C finding structure: Criteria, Condition, Cause, Consequence, Recommendation
Reports must be accessible to non-technical executives
Management response with action plans and deadlines
Follow-up audits verify remediation

6.4 — AI Governance Structures

Effective AI governance requires clear accountability structures, from board-level oversight to operational teams. The three lines of defense model provides a proven framework.

Three Lines of Defense Model for AI
1st Line (Operations): AI development and operations teams own and manage risks.
2nd Line (Risk & Compliance): oversees, challenges, and monitors the first line.
3rd Line (Internal Audit): provides independent assurance and objective evaluation.
Essential AI Governance Artifacts
AI Policy: sets organizational principles and boundaries for AI use. Reviewed annually.
AI Risk Appetite Statement: defines acceptable risk levels for AI systems. Reviewed annually.
AI System Register/Inventory: catalogs all AI systems with risk classifications. Continuously updated.
Model Risk Management Framework: governs model development, validation, and monitoring. Reviewed annually.
Data Governance Framework: ensures data quality, provenance, and privacy compliance. Reviewed annually.
Incident Response Plan: procedures for AI-specific failures and incidents. Reviewed annually and after incidents.
Responsible AI Principles: ethical guidelines for AI development and deployment. Reviewed biennially.
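The AI system register is the artifact an auditor most often queries directly. A minimal sketch of one register entry with a risk classification, as the artifact list describes (field names and the example row are illustrative; adapt them to your framework):

```python
from dataclasses import dataclass

@dataclass
class RegisterEntry:
    """One row in an AI system register/inventory."""
    system_name: str
    owner: str               # accountable business owner
    purpose: str
    risk_class: str          # e.g. "high-risk" under the EU AI Act
    last_reviewed: str       # ISO date of last register update

register = [
    RegisterEntry("credit-scoring-v2", "Retail Lending",
                  "loan application scoring", "high-risk", "2024-05-01"),
    RegisterEntry("support-chat-summarizer", "Customer Service",
                  "ticket summarization", "limited-risk", "2024-03-15"),
]

def high_risk_systems(entries):
    """Filter the register to systems needing the deepest audit scrutiny."""
    return [e for e in entries if e.risk_class == "high-risk"]
```

A register structured this way lets the risk-based scoping step in 6.1 run as a query rather than a manual exercise.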
AI Governance Reporting Structure
Board / Risk Committee: ultimate oversight
Chief AI Officer: strategic AI leadership
AI Ethics Committee: cross-functional review
AI Risk Team: operational risk management
SHADOW AI

Shadow AI — unauthorized use of AI tools by employees (e.g., uploading confidential data to ChatGPT) — is an emerging governance challenge. Governance must address the full AI supply chain: in-house models, fine-tuned models, third-party APIs, open-source components, training data, and shadow AI.

Key Points
Board-level AI oversight is essential
Three lines of defense: operations, risk/compliance, internal audit
Essential artifacts: AI policy, risk appetite, system register
Full supply chain governance including shadow AI
Annual review and update cycle
Practice Questions
Q1: What is the 5C structure for audit findings?
Answer:

Criteria (what was expected), Condition (what was found), Cause (why the gap exists), Consequence (what is the risk/impact), and Recommendation (what should be done).

Q2: Describe the three lines of defense model for AI governance.
Answer:

First line: AI development and operations teams own and manage risks. Second line: AI risk and compliance function oversees and challenges. Third line: Internal audit provides independent assurance.

Q3: What four types of evidence should an AI auditor collect?
Answer:

Documentation review (model cards, risk assessments, etc.), technical testing (independent evaluation, fairness testing, red-teaming), interviews (structured interviews with key personnel), and process observation (verifying procedures match practice).

Q4: How are audit findings classified by severity?
Answer:

Critical (immediate risk, immediate action required), High (significant weakness, action within 30 days), Medium (improvement needed, 90 days), Low (best practice recommendation, advisory).

Q5: What are the essential AI governance artifacts an auditor should verify?
Answer:

AI Policy, AI Risk Appetite Statement, AI System Register/Inventory, Model Risk Management Framework, Data Governance Framework, Incident Response Plan, and Responsible AI Principles. These should be reviewed and updated annually.

Q6: Why is auditor independence important in AI audits?
Answer:

Auditors must not have development responsibilities for the system being audited. Independence ensures objective evaluation free from conflicts of interest. An auditor who developed the system cannot objectively assess their own work. This is a fundamental professional audit principle.

Q7: What is shadow AI and why is it a governance concern?
Answer:

Shadow AI is unauthorized use of AI tools by employees (e.g., uploading confidential data to public AI services). It's a governance concern because it creates data leakage risks, compliance violations, and uncontrolled AI usage outside the organization's risk management framework.

Q8: What should an audit report contain, and who is the primary audience?
Answer:

Structure: Executive summary, scope/methodology, system description, findings by severity, management response, and appendices. The primary audience is non-technical executives and board members, so reports must translate technical findings into business risk language.
