MODULE 06 · ~3 hrs

Audit Documentation & Governance

Master the end-to-end AI audit process: planning, evidence collection, findings classification, report writing, and remediation tracking. Covers audit documentation standards, governance structures, and practical audit checklists.

6.1 — The AI Audit Lifecycle

An AI audit follows a structured lifecycle from initial planning through remediation follow-up. Understanding each phase and its deliverables is critical for effective auditing.

8-Phase AI Audit Lifecycle
1. Planning: scope and objectives
2. Evidence: collect data
3. Analysis: test and evaluate
4. Findings: classify issues
5. Report: draft report
6. Response: management review
7. Remediation: fix issues
8. Follow-up: verify fixes
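When building audit-tracking tooling, the eight phases above can be represented as a simple ordered structure. A minimal sketch, assuming an illustrative phase-to-deliverable mapping (the deliverable strings and helper are hypothetical, not part of any standard):

```python
# Hypothetical sketch: the 8-phase lifecycle as an ordered list of
# (phase name, primary deliverable). Phase names follow the lifecycle
# above; the deliverable wording is illustrative.
AUDIT_LIFECYCLE = [
    ("Planning", "scope and objectives"),
    ("Evidence", "collected data"),
    ("Analysis", "test and evaluation results"),
    ("Findings", "classified issues"),
    ("Report", "draft report"),
    ("Response", "management review"),
    ("Remediation", "fixed issues"),
    ("Follow-up", "verified fixes"),
]

def next_phase(current: str):
    """Return the phase that follows `current`, or None after Follow-up."""
    names = [name for name, _ in AUDIT_LIFECYCLE]
    i = names.index(current)
    return names[i + 1] if i + 1 < len(names) else None
```

Encoding the order makes it easy to enforce, for example, that no report is drafted before analysis is complete.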

Planning phase: Define audit objectives, scope (which AI systems, which controls), criteria (against which framework — NIST, ISO 42001, EU AI Act), timeline, and resource requirements. Engage stakeholders early — AI audits require cooperation from data science, engineering, legal, and business teams.

Risk-Based Scoping Criteria
01. High Impact: Systems affecting individuals' rights, safety, or financial outcomes receive the deepest audit scrutiny.

02. High Autonomy: Automated decision-making without human oversight requires more extensive testing and control evaluation.

03. High Exposure: A large user base, sensitive data, or public-facing applications increase scope requirements.

04. Regulatory Sensitivity: Systems subject to specific regulations (EU AI Act high-risk, RBI guidelines) need compliance-focused scoping.
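The four scoping criteria above can be combined into a rough scoping score when triaging an AI system inventory. A minimal sketch, assuming a simple additive model; the thresholds and tier names are illustrative, not taken from any framework:

```python
def scoping_score(high_impact: bool, high_autonomy: bool,
                  high_exposure: bool, regulatory_sensitive: bool) -> int:
    """Count how many risk-based scoping criteria a system meets (0-4)."""
    return sum([high_impact, high_autonomy, high_exposure, regulatory_sensitive])

def audit_depth(score: int) -> str:
    """Map a scoping score to an illustrative audit depth tier."""
    if score >= 3:
        return "full-scope audit"
    if score >= 1:
        return "targeted audit"
    return "baseline review"

# A public-facing credit model: high impact, autonomous, regulated.
depth = audit_depth(scoping_score(True, True, False, True))
```

In practice the criteria interact (a single Critical factor may force a full-scope audit regardless of the count), so treat any numeric score as a triage aid, not a decision rule.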

AUDITOR INDEPENDENCE

Auditors must NOT have development responsibilities for the system being audited. An effective audit team includes: technical auditors (ML expertise), governance/compliance auditors, domain experts, and legal advisors. Independence is a fundamental audit principle — expect exam questions on this.

Key Points
Eight-phase audit lifecycle from planning to follow-up
Risk-based scoping prioritizes high-impact systems
Multi-disciplinary teams: technical + governance + domain + legal
Auditor independence is essential
Stakeholder engagement from the planning phase

6.2 — Evidence Collection

Evidence collection is the foundation of any audit finding. Without properly collected, documented, and preserved evidence, findings lack credibility and cannot withstand challenge.

Four Types of Audit Evidence
Documentation Review
What to examine: model cards, system cards, risk assessments, design docs, training logs, approval records.
Key tip: missing documentation is itself a finding; document what's absent.

Technical Testing
What to examine: independent evaluation, fairness testing, adversarial/red-team testing, performance benchmarking.
Key tip: request direct access to models, data, and infrastructure for independent testing.

Interviews
What to examine: structured interviews with developers, data engineers, product owners, risk officers, and end users.
Key tip: interviews reveal process gaps that documentation cannot capture.

Process Observation
What to examine: deployment procedures, monitoring dashboards, incident response drills, human oversight mechanisms.
Key tip: verify that documented procedures match actual practice (the "say vs. do" gap).
MISSING DOCUMENTATION IS A FINDING

If an organization cannot provide model cards, risk assessments, or impact assessments, this absence is itself an audit finding. Document what was requested, when, and what was not provided. Missing documentation often indicates deeper governance gaps.

EXAM TIP

Evidence must be documented with: source, date collected, collection method, relevance to audit criteria, and chain of custody. Digital evidence should be timestamped and stored securely. This mirrors financial audit evidence standards.
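The evidence fields listed in the exam tip can be captured in a simple record. A sketch, assuming a SHA-256 digest as the tamper-evidence mechanism; the field names and example values are illustrative:

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class EvidenceRecord:
    """One item of audit evidence with the fields the tip names:
    source, collection date, collection method, relevance to audit
    criteria, and a digest supporting chain-of-custody checks."""
    source: str
    collection_method: str
    relevance: str           # which audit criterion this supports
    content: bytes
    collected_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def digest(self) -> str:
        """SHA-256 of the content; recompute later to detect tampering."""
        return hashlib.sha256(self.content).hexdigest()

rec = EvidenceRecord(
    source="model card for credit-scoring model v2",   # illustrative
    collection_method="documentation review",
    relevance="ISO 42001 impact-assessment requirement",
    content=b"...exported document bytes...",
)
baseline = rec.digest()  # record this digest at collection time
```

Recomputing the digest at report time and comparing it to the value recorded at collection time is one lightweight way to demonstrate the evidence was not altered.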

Key Points
Four evidence types: documentation, testing, interviews, observation
Missing documentation is a finding
Independent technical testing is essential
Verify documented procedures match actual practice
Evidence chain of custody and secure storage

6.3 — Findings Classification and Reporting

Findings classification ensures that the most critical issues receive immediate attention while providing a structured framework for remediation planning.

Findings Severity Classification
Critical: Immediate risk to individuals or regulatory non-compliance. Immediate action required.
High: Significant control weakness or material gap. Action within 30 days.
Medium: Control improvement needed, moderate risk. Action within 90 days.
Low: Best-practice recommendation, advisory. No mandatory timeline.
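The severity levels map directly to remediation deadlines, which makes them easy to encode in remediation-tracking tooling. A minimal sketch of that mapping (the `None` value mirrors "no mandatory timeline" for Low findings):

```python
# Severity -> maximum days allowed before required action, per the
# classification above. None means advisory only (no mandatory timeline).
SEVERITY_DEADLINES_DAYS = {
    "Critical": 0,    # immediate action required
    "High": 30,
    "Medium": 90,
    "Low": None,      # best-practice recommendation, advisory
}

def remediation_deadline(severity: str):
    """Return the action deadline in days, or None for advisory findings."""
    try:
        return SEVERITY_DEADLINES_DAYS[severity]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}")
```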
5C Finding Structure (Standard in Professional Auditing)
01. Criteria (what was expected): the standard, requirement, or control that should be in place.

02. Condition (what was actually found): the factual observation during the audit.

03. Cause (why the gap exists): root cause analysis of the deficiency.

04. Consequence (the risk or impact): the potential harm if not addressed.

05. Recommendation (what should be done): specific, actionable remediation steps.

5C Finding Example

Criteria: ISO 42001 requires AI impact assessments before deployment. Condition: The credit scoring model was deployed without an impact assessment. Cause: No formal pre-deployment review process exists. Consequence: Potential unfair treatment of loan applicants; regulatory non-compliance. Recommendation: Implement mandatory pre-deployment impact assessment gate with documented approval.
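The 5C structure also lends itself to a simple data structure, so no finding is recorded with a C missing. A sketch populated with the example finding above (the severity rating is an illustrative assumption, not stated in the example):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One audit finding in the 5C structure, plus a severity rating."""
    criteria: str        # what was expected
    condition: str       # what was actually found
    cause: str           # why the gap exists
    consequence: str     # risk or impact if not addressed
    recommendation: str  # specific, actionable remediation steps
    severity: str        # Critical / High / Medium / Low

    def render(self) -> str:
        """Format the finding as the labeled 5C lines used in reports."""
        return "\n".join(
            f"{label}: {value}" for label, value in [
                ("Criteria", self.criteria),
                ("Condition", self.condition),
                ("Cause", self.cause),
                ("Consequence", self.consequence),
                ("Recommendation", self.recommendation),
            ])

f = Finding(
    criteria="ISO 42001 requires AI impact assessments before deployment.",
    condition="The credit scoring model was deployed without an impact assessment.",
    cause="No formal pre-deployment review process exists.",
    consequence="Potential unfair treatment of loan applicants; regulatory non-compliance.",
    recommendation="Implement a mandatory pre-deployment impact assessment gate.",
    severity="High",  # illustrative rating
)
```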

EXAM TIP

The audit report must be accessible to non-technical executives. Structure: Executive summary, scope/methodology, system description, findings by severity, management response (accept/partially accept/reject with action plan, responsible party, and target date), and appendices with detailed test results.

Key Points
Four severity levels: Critical, High, Medium, Low
5C finding structure: Criteria, Condition, Cause, Consequence, Recommendation
Reports must be accessible to non-technical executives
Management response with action plans and deadlines
Follow-up audits verify remediation

6.4 — AI Governance Structures

Effective AI governance requires clear accountability structures, from board-level oversight to operational teams. The three lines of defense model provides a proven framework.

Three Lines of Defense Model for AI
1st Line (Operations): AI development and operations teams own and manage risks.
2nd Line (Risk & Compliance): oversees, challenges, and monitors the first line.
3rd Line (Internal Audit): provides independent assurance and objective evaluation.
Essential AI Governance Artifacts
AI Policy: sets organizational principles and boundaries for AI use. Reviewed annually.
AI Risk Appetite Statement: defines acceptable risk levels for AI systems. Reviewed annually.
AI System Register/Inventory: catalogs all AI systems with risk classifications. Continuously updated.
Model Risk Management Framework: governs model development, validation, and monitoring. Reviewed annually.
Data Governance Framework: ensures data quality, provenance, and privacy compliance. Reviewed annually.
Incident Response Plan: procedures for AI-specific failures and incidents. Reviewed annually and after incidents.
Responsible AI Principles: ethical guidelines for AI development and deployment. Reviewed biennially.
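The AI system register is the artifact an auditor most often queries directly. A minimal sketch of one register entry with a risk classification, as the artifact list describes (field names and the example row are illustrative; adapt them to your framework):

```python
from dataclasses import dataclass

@dataclass
class RegisterEntry:
    """One row in an AI system register/inventory."""
    system_name: str
    owner: str               # accountable business owner
    purpose: str
    risk_class: str          # e.g. "high-risk" under the EU AI Act
    last_reviewed: str       # ISO date of last register update

register = [
    RegisterEntry("credit-scoring-v2", "Retail Lending",
                  "loan application scoring", "high-risk", "2024-05-01"),
    RegisterEntry("support-chat-summarizer", "Customer Service",
                  "ticket summarization", "limited-risk", "2024-03-15"),
]

def high_risk_systems(entries):
    """Filter the register to systems needing the deepest audit scrutiny."""
    return [e for e in entries if e.risk_class == "high-risk"]
```

A register structured this way lets the risk-based scoping step in 6.1 run as a query rather than a manual exercise.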
AI Governance Reporting Structure
Board / Risk Committee: ultimate oversight
Chief AI Officer: strategic AI leadership
AI Ethics Committee: cross-functional review
AI Risk Team: operational risk management
SHADOW AI

Shadow AI — unauthorized use of AI tools by employees (e.g., uploading confidential data to ChatGPT) — is an emerging governance challenge. Governance must address the full AI supply chain: in-house models, fine-tuned models, third-party APIs, open-source components, training data, and shadow AI.

Key Points
Board-level AI oversight is essential
Three lines of defense: operations, risk/compliance, internal audit
Essential artifacts: AI policy, risk appetite, system register
Full supply chain governance including shadow AI
Annual review and update cycle
Practice Questions
Q1: What is the 5C structure for audit findings?
Answer:

Criteria (what was expected), Condition (what was found), Cause (why the gap exists), Consequence (what is the risk/impact), and Recommendation (what should be done).

Q2: Describe the three lines of defense model for AI governance.
Answer:

First line: AI development and operations teams own and manage risks. Second line: AI risk and compliance function oversees and challenges. Third line: Internal audit provides independent assurance.

Q3: What four types of evidence should an AI auditor collect?
Answer:

Documentation review (model cards, risk assessments, etc.), technical testing (independent evaluation, fairness testing, red-teaming), interviews (structured interviews with key personnel), and process observation (verifying procedures match practice).

Q4: How are audit findings classified by severity?
Answer:

Critical (immediate risk, immediate action required), High (significant weakness, action within 30 days), Medium (improvement needed, 90 days), Low (best practice recommendation, advisory).

Q5: What are the essential AI governance artifacts an auditor should verify?
Answer:

AI Policy, AI Risk Appetite Statement, AI System Register/Inventory, Model Risk Management Framework, Data Governance Framework, Incident Response Plan, and Responsible AI Principles. These should be reviewed and updated annually.

Q6: Why is auditor independence important in AI audits?
Answer:

Auditors must not have development responsibilities for the system being audited. Independence ensures objective evaluation free from conflicts of interest. An auditor who developed the system cannot objectively assess their own work. This is a fundamental professional audit principle.

Q7: What is shadow AI and why is it a governance concern?
Answer:

Shadow AI is unauthorized use of AI tools by employees (e.g., uploading confidential data to public AI services). It's a governance concern because it creates data leakage risks, compliance violations, and uncontrolled AI usage outside the organization's risk management framework.

Q8: What should an audit report contain, and who is the primary audience?
Answer:

Structure: Executive summary, scope/methodology, system description, findings by severity, management response, and appendices. The primary audience is non-technical executives and board members, so reports must translate technical findings into business risk language.
