
scikit-learn Governance Audit

scikit-learn scores 18/100 on enforcement posture -- the ML library powering insurance underwriting, medical diagnostics, and fraud detection has zero hardcoded secrets (best in our portfolio) but zero enforcement hooks and no AI agent instructions.

Overall Score: 18/100 (Grade: F)

Enforcement Maturity: 20/100 (Grade: D)
Context Hygiene: 10/100 (Grade: F)
Automation Readiness: 26/100 (Grade: D)

Executive Summary

scikit-learn is the foundational machine learning library for the Python ecosystem, with 60,000+ GitHub stars and ubiquitous adoption across safety-critical domains including insurance underwriting (Gradient AI), medical diagnostics, fraud detection, and credit scoring. It is the default ML toolkit for enterprises building regulated AI systems.

An automated governance audit reveals that despite scikit-learn's maturity and excellent security posture (zero hardcoded secrets -- the best in our portfolio), the project has critical structural gaps in AI governance. Tests exist but are embedded within package directories, and the absence of enforcement hooks and agent instructions leaves a structural governance gap in safety-critical ML infrastructure.

Enforcement Ladder Distribution

L5 - Hooks: 0 found
No automated enforcement before commits or tool use

L4 - Tests: 0 at root*
Tests exist within sklearn/ packages (sklearn/tests/, sklearn/cluster/tests/) but are not discovered at root

L3 - Templates: 21 (GitHub Actions + CircleCI + Makefile)
Strong CI pipeline with multi-platform testing

L2 - Prose: 0 rules
AGENTS.md present but no CLAUDE.md or agent-specific enforcement rules

L1 - Conversation: Default
Default mode for all AI interactions

Diagnosis: scikit-learn has strong L3 investment (21 GitHub Actions + CircleCI + Makefile) and embedded L4 tests, but zero L2 (prose rules) and L5 (hooks). The test discovery gap creates a misleading governance picture -- the project has more test infrastructure than the score suggests, but the lack of enforcement hooks means AI agents operate with zero structural guardrails on a library used in safety-critical ML pipelines.

Critical Gaps Found

1. No L5 (Hook) Enforcement [CRITICAL]

No pre-commit hooks or Claude Code hooks were found. AI agents can modify any module -- including safety-critical estimators used in medical diagnostics and credit scoring -- without structural gatekeeping. A subtle change to a default parameter in sklearn.linear_model could silently affect thousands of downstream models.
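The kind of structural check an L5 hook could run is straightforward to sketch. The snippet below uses Python's inspect module to diff the keyword defaults of two versions of a function; ridge_v1 and ridge_v2 are hypothetical stand-ins, not scikit-learn code, and a real hook would compare the checked-out signature against the previous revision:

```python
import inspect

def default_drift(old_func, new_func):
    """Return {param: (old_default, new_default)} for defaults that changed."""
    old = {n: p.default for n, p in inspect.signature(old_func).parameters.items()
           if p.default is not inspect.Parameter.empty}
    new = {n: p.default for n, p in inspect.signature(new_func).parameters.items()
           if p.default is not inspect.Parameter.empty}
    return {n: (old[n], new[n]) for n in old.keys() & new.keys() if old[n] != new[n]}

# Hypothetical example: a silent tightening of a regularization default.
def ridge_v1(alpha=1.0, fit_intercept=True): ...
def ridge_v2(alpha=0.1, fit_intercept=True): ...

print(default_drift(ridge_v1, ridge_v2))  # {'alpha': (1.0, 0.1)}
```

A hook wired to this check would block the commit (or require an explicit deprecation note) whenever the drift dictionary is non-empty.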

2. Test Discovery Gap [CRITICAL]

The scanner detects 0 test files at root because tests are embedded within the sklearn/ package directories, a standard Python packaging pattern. While idiomatic, this layout reduces governance visibility. Note: this is a scanner limitation, not a project deficiency.
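One way to close this visibility gap without moving any tests is a root-level pytest configuration that points discovery at the package. This is a hedged sketch; scikit-learn's actual build tooling may already define equivalent settings elsewhere:

```ini
; Hypothetical root-level pytest.ini; paths assume the standard
; scikit-learn layout (sklearn/tests/, sklearn/*/tests/).
[pytest]
testpaths = sklearn
addopts = -ra
```

With testpaths declared at the repository root, both `pytest` invocations and structure-aware governance scanners can locate the embedded test suites.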

3. No CLAUDE.md / Agent Instructions [HIGH]

No CLAUDE.md or equivalent AI agent instruction file was found. AGENTS.md is present (early governance awareness) but does not provide enforcement-level instructions. Every AI session starts from zero context on scikit-learn's complex estimator interface and API design patterns.
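A minimal starting point could look like the following. This is an illustrative sketch, not an official scikit-learn file; the rules shown are based on the project's documented conventions (the fit/predict estimator API and its deprecation cycle):

```markdown
# CLAUDE.md (illustrative sketch)

## Estimator interface
- New estimators implement fit/predict (or fit/transform) and must pass
  sklearn.utils.estimator_checks.check_estimator.
- Never change a public parameter default without a deprecation cycle.

## Testing
- Every change under sklearn/<module>/ needs a matching test under
  sklearn/<module>/tests/.
- Run `pytest sklearn/<module>` before proposing a commit.
```

Even a short file like this gives every AI session baseline context on interface contracts and testing expectations.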

4. High TODO/FIXME Debt [MEDIUM]

The scan found 624 TODO/FIXME/HACK markers across the codebase, with no systematic process for converting them into actionable work items. AI agents may encounter and incorrectly "fix" TODO items in safety-critical code paths.
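The debt signal behind this finding is simple to reproduce. The sketch below tallies the same markers over a list of source lines; a real scan would walk the file tree, but the matching logic is representative:

```python
import re
from collections import Counter

MARKER = re.compile(r"\b(TODO|FIXME|HACK)\b")

def count_markers(lines):
    """Tally TODO/FIXME/HACK markers across source lines."""
    counts = Counter()
    for line in lines:
        for m in MARKER.findall(line):
            counts[m] += 1
    return counts

src = [
    "# TODO: handle sparse input",
    "x = 1  # FIXME numerical stability",
    "# HACK around old NumPy",
]
print(count_markers(src))  # Counter({'TODO': 1, 'FIXME': 1, 'HACK': 1})
```

Tracking these counts per release turns an undifferentiated debt pile into a trend a governance process can act on.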

Positive: Zero Hardcoded Secrets

scikit-learn has 0 potential hardcoded secrets detected across 660 source files -- the best security posture in our entire audit portfolio. This demonstrates clean credential hygiene and a security-conscious development culture that other projects should emulate.
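For context, the checks behind a finding like this are pattern-based. The sketch below shows a toy detector with two hypothetical rules; production scanners such as detect-secrets or gitleaks use far larger rule sets plus entropy analysis:

```python
import re

# Hypothetical patterns, illustrative only.
PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|secret|token|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key id
]

def find_secrets(text):
    """Return (line_number, line) pairs that match a secret pattern."""
    hits = []
    for i, line in enumerate(text.splitlines(), 1):
        if any(p.search(line) for p in PATTERNS):
            hits.append((i, line.strip()))
    return hits

clean = "alpha = 1.0\nsolver = 'lbfgs'\n"
leaky = "API_KEY = 'abcd1234efgh5678'\n"
print(find_secrets(clean))  # []
print(find_secrets(leaky))
```

A clean result across 660 source files, as in scikit-learn's case, indicates credentials are consistently kept out of the repository.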

EU AI Act Compliance Mapping

scikit-learn is the foundational ML library underlying many high-risk AI systems (EU AI Act enforcement deadline: August 2, 2026). Organizations using scikit-learn in insurance underwriting, medical diagnostics, fraud detection, or credit scoring must ensure governance extends through the library layer.

Article 9: Risk Management System

9(2)(a) Risk identification: 10%
9(2)(b) Risk evaluation: 5%
9(2)(d) Risk management measures: 15%
9(6) Testing for risk management: 30%
9(7) Lifecycle risk management: 5%

Article 15: Accuracy, Robustness and Cybersecurity

15(1) Accuracy levels: 25%
15(2) Error resilience: 20%
15(3) Manipulation robustness: 5%
15(4) Cybersecurity: 40%

Article 17: Quality Management System

17(1)(a) Compliance strategy: 5%
17(1)(b) Design/development procedures: 20%
17(1)(c) Test/validation procedures: 25%
17(1)(g) Post-market monitoring: 0%
Overall EU AI Act Readiness: ~15%

This is especially concerning for the ML library most commonly used in regulated domains. Organizations building high-risk AI systems with scikit-learn inherit these governance gaps unless they implement their own enforcement layer.

Recommendations

Immediate (Week 1)

  1. Create CLAUDE.md with estimator interface requirements, API compatibility rules, deprecation workflow, and testing requirements -- 1 hour effort, high impact
  2. Add 3 pre-commit hooks for estimator interface validation, parameter deprecation checks, and test co-location -- 2 hours effort
  3. Add root-level test orchestration to improve governance tool compatibility -- 30 minutes effort
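The three hooks in item 2 could be wired up with pre-commit's local-hook mechanism. The configuration below is a hypothetical sketch; the hook ids and the scripts under build_tools/ do not exist in scikit-learn and are named here for illustration only:

```yaml
# Hypothetical .pre-commit-config.yaml sketch.
repos:
  - repo: local
    hooks:
      - id: estimator-interface-check
        name: Validate estimator interface on changed modules
        entry: python build_tools/check_estimator_interface.py
        language: system
        files: ^sklearn/.*\.py$
      - id: param-deprecation-check
        name: Flag default-parameter changes lacking a deprecation cycle
        entry: python build_tools/check_param_deprecation.py
        language: system
        files: ^sklearn/.*\.py$
      - id: test-colocation-check
        name: Require a test change alongside source changes
        entry: python build_tools/check_test_colocation.py
        language: system
        files: ^sklearn/.*\.py$
```

Running `pre-commit install` would then make these checks fire on every commit, giving AI agents and human contributors the same structural gate.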

Short-term (Month 1)

  1. Deploy L5 enforcement hooks for safety-critical estimator paths
  2. Set up violation tracking to build a risk register from enforcement data
  3. Create AI agent governance documentation mapping to EU AI Act articles

Strategic (Quarter)

  1. Build enforcement ladder documentation mapping to EU AI Act requirements
  2. Establish violation tracking across contributor AI tool usage
  3. Automated rule optimization -- auto-tune enforcement rules based on observed violation patterns

Appendix: Raw Scan Data

Test Files: 0*
Source Files: 660
GitHub Actions: 21
Potential Secrets: 0
TODO/FIXME: 624
Dead Code Markers: 490
CLAUDE.md Files: 0
L5 Hooks: 0
Doc Files: 3

*Test files show 0 at root level. Tests are embedded within sklearn/ package directories (sklearn/tests/, sklearn/cluster/tests/, etc.) following standard Python packaging conventions.

Want this analysis for your codebase?

Get the same structural governance audit -- risk classification, violation scan, and enforcement recommendations.

Request a Free Audit
This governance audit was generated by walseth.ai using automated enforcement posture scanning. The findings are based on static analysis of the repository structure, configuration files, and code patterns -- no code was executed during the audit.
