AI Governance Tools for Startups: A Pragmatic 2026 Buyer's Guide
A pragmatic buyer's guide for early-stage startups: free and open-source AI governance tools (Promptfoo, Giskard, Langfuse, NIST AI RMF Playbook, FairLearn), free vendor tiers, and when to upgrade to paid platforms — with a 30-day implementation plan grounded in the EU AI Act, NIST AI RMF, and SOC 2.
By AI Compliance Vendors Editorial · Published April 26, 2026 · Last verified April 26, 2026
TL;DR
- Enterprise procurement teams now treat AI governance documentation as a hard gate: no model inventory, no incident response plan, no RFP. Promptfoo's 2025 AI Regulation Report documents the shift — security questionnaires added AI-specific sections and customers began demanding model cards and evaluation artifacts.
- The EU AI Act has GDPR-style extraterritorial reach: if your AI outputs are consumed by EU users, Article 2 applies regardless of where your company is incorporated. High-risk obligations fully activate on 2 August 2026.
- A seed-stage startup can build a credible minimum viable governance posture in 30 days using free and open-source tools — Promptfoo, Giskard, Langfuse, and the NIST AI RMF Playbook cover the critical gaps.
- ISO/IEC 42001 and SOC 2 are becoming RFP requirements for AI vendors selling into mid-market and enterprise accounts; ISO 42001 certification for a small organization typically costs $15K–$40K total and takes four to nine months.
- Free tiers across LangSmith, Arize AX, Galileo, Patronus, Langfuse, and Weights & Biases give you production-grade observability at zero cost during early growth; the paid upgrade triggers are clear and manageable.
Why Startups Need AI Governance Now
Enterprise Procurement Has Changed
For years, a startup building an AI product could defer governance work until "later." That window has closed. Enterprise security teams now embed AI-specific questions directly into procurement workflows, and those questions are not advisory — they are gates.
Promptfoo's December 2025 analysis of the regulatory landscape put it plainly: "Enterprise security questionnaires added AI sections. Customers started asking for model cards and evaluation reports. RFPs began requiring documentation that didn't exist six months ago." The artifact list that enterprise procurement teams now request includes a system card (which models you use, what prompts and policies govern them, what retrieval sources are in scope), evaluation artifacts (red-team results for prompt injection, tool misuse, and data leakage), an acceptable use policy, and a documented feedback/incident triage mechanism.
According to Quantarra's 2025 SOC 2 report, auditors are embedding AI governance requirements directly within the SOC 2 Trust Service Criteria, focusing on how organizations control data that trains and powers their models. AI vendors processing regulated data face the toughest scrutiny.
The EU AI Act Reaches Your Startup Even Without an EU Office
The EU AI Act (Regulation 2024/1689) entered into force on 1 August 2024 and applies fully from 2 August 2026. Its extraterritorial scope is explicit in Article 2: the Act applies to providers placing AI systems on the EU market "irrespective of whether those providers are established or located within the Union or in a third country," and also covers providers and deployers established outside the EU where "the output produced by the AI system is used in the Union."
In plain terms: if a US startup's SaaS product has EU customers, or if an EU company embeds your API, you are in scope. Modulos's analysis notes that non-EU providers of high-risk systems must also designate an authorised representative in the EU under Article 22. Tredence's 2026 compliance guide summarizes the penalty exposure: up to €35 million or 7% of global annual turnover for the most serious violations.
Key dates for planning purposes:
| Date | Obligation |
|---|---|
| 2 February 2025 | Prohibited AI practices and AI literacy obligations (Chapter II) |
| 2 August 2025 | GPAI model obligations (Chapter V), governance rules (Chapter VII) |
| 2 August 2026 | High-risk AI system requirements (Annex III) fully applicable |
| 2 August 2027 | High-risk AI embedded in regulated products; GPAI models already on market |
Source: EU AI Act text, Articles 113–114
For a startup building a product that touches recruitment, credit scoring, healthcare, or law enforcement — all defined as high-risk use cases under Annex III of the Act — the August 2026 deadline is material. Even for minimal-risk AI, the transparency obligations and the GPAI Code of Practice (published July 2025 per the European Commission) affect how you document and disclose model behavior.
ISO 42001 and SOC 2 Are Appearing in RFPs
ISO/IEC 42001:2023 — published December 2023 as the world's first AI management system standard — specifies requirements for establishing, implementing, and continually improving an Artificial Intelligence Management System (AIMS). It is voluntary but increasingly demanded by enterprise buyers who need auditor-verified evidence of AI risk controls. Workstreet's March 2026 analysis identifies the trigger: "If prospects are asking about model training practices, bias controls, or AI risk management, that's often a signal that ISO 42001 could be worth it."
SOC 2 remains the baseline floor. Per Comp AI's 2026 SOC 2 guide: "Enterprise procurement requires it. Mid-market and enterprise security teams use SOC 2 as a gating check. No report, no RFP, no POC. That goes double for AI vendors processing regulated data." The practical 2026 compliance stack for an AI startup is SOC 2 for data-security trust, ISO 42001 for AI governance, the NIST AI RMF as internal risk methodology, and the EU AI Act for anything touching the EU market. See the SOC 2 cost calculator and audit-firm directory at soc2vendors.com for audit cost estimates specific to your stage.
Investor Due Diligence Is Asking the Same Questions
VCs conducting technical due diligence in 2025 added AI governance to their standard framework. Kruze Consulting's 2025 VC due diligence trends report notes that data security practices and regulatory compliance — particularly for AI and ML work — are now central to due diligence at late seed and growth stages. Stealth Cloud's analysis puts the funding context: in 2025, venture capital invested over $97 billion globally in AI companies, yet fewer than 15% of VC firms reported having a formal AI data practices assessment framework, creating a knowledge asymmetry that founders can turn into an advantage by documenting governance proactively.
For a startup reading this, the practical implication is straightforward: having a model inventory, documented third-party LLM usage, and a basic incident plan will distinguish you in both enterprise sales cycles and fundraising processes. Our AI compliance vendor due diligence guide covers what sophisticated buyers examine at each stage.
What "Good Enough" Looks Like at Seed–Series A: Minimum Viable Governance
Governance work scales with risk and resources. A five-person seed startup does not need the same governance infrastructure as a 500-person regulated-industry AI vendor. The minimum viable governance (MVG) posture at seed–Series A has four components.
1. Model Inventory (AI Bill of Materials)
A model inventory — sometimes called an AI Bill of Materials (AI-BOM) — is a structured record of every AI model your product uses, whether built internally or called via third-party API. At minimum, each entry should document: model name and version, provider or source, intended use case, input/output data types, risk classification (per the EU AI Act's four tiers or your own framework), and the owner responsible for monitoring.
For most seed-stage products, this is a spreadsheet. It does not need to be a sophisticated tool. What matters is that it exists, stays current, and can be produced on request. Wiz's AI-BOM guide notes that NIST AI RMF and ISO 42001 both explicitly reference model asset documentation as a requirement for traceability and audit readiness.
2. Basic AI Use Policy
A written AI use policy establishes what your organization considers acceptable use of AI tools, both in your product and internally. It should cover: which AI systems are approved for use, data classification (what data can and cannot be processed by external AI APIs), human oversight requirements for high-stakes outputs, and expectations for staff who work with AI tools.
The NIST AI RMF Playbook provides a free template structure organized around the four AI RMF functions: Govern, Map, Measure, and Manage. Annex A of ISO 42001 provides a controls checklist you can use to gap-assess your policies at no cost — the standard itself requires purchase, but the controls structure is described in publicly available summaries. Our AI Impact Assessment template provides a structured starting point aligned to both NIST and EU AI Act requirements.
3. Third-Party LLM Tracking
Most startups at this stage use one or more foundation model APIs — OpenAI, Anthropic, Google, Cohere, or similar. Each API relationship creates vendor risk that needs to be tracked: what data is sent to the provider, whether that data is used for training, what the provider's data retention policy is, and what the Data Processing Agreement (DPA) says.
A basic vendor risk register capturing these details satisfies both SOC 2 vendor management requirements and the EU AI Act's supply chain traceability expectations. For each API dependency, document the provider, the data categories transmitted, the contractual data handling terms, and the date of last review.
4. Incident Response Plan (AI-Specific)
A standard incident response plan covers data breaches and security events. An AI-specific addendum should define what constitutes an AI incident (harmful outputs, model failure, prompt injection success, discriminatory decisions, data leakage through model outputs), who is responsible for triage, what the escalation path is, how affected users are notified, and how root cause analysis is conducted. The NIST AI RMF Playbook includes specific suggested actions for the Manage function covering incident response.
This documentation does not need to be lengthy. A two-page addendum to your existing incident response plan is sufficient to demonstrate that AI incidents are in scope.
For a deeper framework on structuring these four components, see our NIST AI RMF implementation guide.
Free and Open-Source Tools Deep Dive
Promptfoo
What it does: Promptfoo is an MIT-licensed, open-source testing framework for LLM applications. It enables automated prompt evaluation, model comparison, red-teaming, and vulnerability scanning for AI applications. You define test cases in YAML config files; Promptfoo runs them against your models, reports results, and integrates with CI/CD pipelines.
Key capabilities: - Automated red-teaming for prompt injection, data leakage, harmful content, and jailbreak attempts - Side-by-side comparison of model outputs across GPT, Claude, Gemini, Llama, and others - CI/CD integration for regression testing as models or prompts change - Exportable evaluation reports suitable for attaching to security questionnaires
When it's enough: At seed stage, Promptfoo covers the red-teaming and evaluation artifact requirements that enterprise procurement now demands. Running it against your production prompts and exporting results gives you documented evidence of pre-deployment safety testing.
When it isn't: Promptfoo does not provide observability on production traffic, anomaly detection, or ongoing monitoring. It is a pre-deployment and CI/CD testing tool. You need a separate observability layer (Langfuse, LangSmith, or Arize) for production.
Giskard
What it does: Giskard is an Apache 2.0-licensed open-source Python library that automatically detects performance, bias, and security issues in AI systems. It covers both LLM-based applications (RAG agents, chatbots) and traditional ML models for tabular data.
Key capabilities: - Automated scan for hallucinations, harmful content, prompt injection, robustness issues, sensitive information disclosure, and stereotypes/discrimination - RAG Evaluation Toolkit (RAGET) for automatically generating evaluation datasets and scoring RAG pipeline components - Integration with LangChain, FAISS, and other common frameworks - Test suite generation from scan results, enabling regression testing
When it's enough: For teams building RAG-based products, Giskard's scan output directly addresses the bias testing and performance documentation requirements in ISO 42001 and the EU AI Act's high-risk documentation obligations. It is particularly well-suited for demonstrating that you have evaluated your system for discriminatory outputs.
When it isn't: Like Promptfoo, Giskard is a testing tool rather than a production monitoring platform. It also requires Python expertise to configure and run, which may be a barrier for non-ML teams.
Langfuse (OSS)
What it does: Langfuse is an MIT-licensed (with enterprise features under the ee folder) open-source LLM engineering platform. It provides production observability: tracing of LLM calls and agent actions, prompt management with version control, evaluation scoring, dataset management, and a playground for iteration. It was part of Y Combinator's W23 batch.
Key capabilities: - Full tracing of LLM calls, agent steps, retrieval operations, and user sessions - Cost and token tracking across providers - Prompt version control and release management - LLM-as-a-judge and human annotation evaluation pipelines - Self-hostable in minutes; MIT license for core features
When it's enough: For a startup that needs to demonstrate audit trails of AI system behavior, Langfuse's tracing functionality is the right tool. Self-hosting means no data leaves your infrastructure, which matters for DPA compliance with third-party AI APIs.
When it isn't: The self-hosted version requires infrastructure management. If you want managed cloud with team features, Langfuse's Hobby tier is free (50,000 units/month, 2 users, 30-day data access). Beyond 30-day retention and 2 users, you move to paid tiers. Early-stage startups qualify for a 50% discount on the Core plan ($29/month).
NIST AI RMF Playbook
What it does: The NIST AI RMF Playbook is a free, voluntary companion resource to the NIST AI Risk Management Framework 1.0. It provides suggested actions for achieving outcomes across the four AI RMF functions — Govern, Map, Measure, and Manage — aligned to each sub-category in the framework. The Playbook PDF is available at airc.nist.gov/docs/AI_RMF_Playbook.pdf. It is updated approximately twice per year.
When it's enough: As a policy template and governance checklist, the NIST AI RMF Playbook is genuinely sufficient for a seed-stage startup building its first governance program. It is vendor-neutral, free, and respected by enterprise procurement teams as a credible framework. Stating that your AI risk management program is "aligned with NIST AI RMF 1.0" carries weight in RFP responses.
When it isn't: The Playbook is a menu of suggestions, not a certification. It will not satisfy buyers who specifically require ISO 42001 certification or a SOC 2 report. It also requires significant internal judgment to translate into actual controls — consider our NIST AI RMF implementation guide for a structured path from framework to controls.
ISO 42001 Annex A as a Free Controls Template
ISO/IEC 42001's full text requires purchase from ISO, but the standard's Annex A controls are described in publicly available implementation guides. These controls cover: AI policy and objectives, organizational roles for AI governance, risk assessment and treatment specific to AI systems, AI impact assessment, data management controls, third-party AI supply chain requirements, and monitoring and review. You can use these control categories as a self-assessment checklist before deciding whether to pursue formal certification. Sprinto's ISO 42001 certification guide provides a detailed breakdown of controls and implementation steps.
Fairlearn
What it does: Fairlearn is an MIT-licensed open-source Python library from Microsoft that helps developers assess and mitigate fairness issues in machine learning models. It provides metrics for comparing model behavior across demographic groups and mitigation algorithms for reducing disparate impact.
Key capabilities: - Group fairness metrics: demographic parity, equalized odds, and related measures - Fairness dashboard for visualizing model behavior across groups - Mitigation algorithms for allocation harms (unequal access to resources) and quality-of-service harms - Integrates with scikit-learn and other standard ML libraries
When it's enough: For traditional ML models in higher-risk use cases (hiring tools, credit scoring, content moderation), Fairlearn provides the bias testing evidence that EU AI Act high-risk documentation requires. A Fairlearn evaluation report demonstrates that you assessed for discriminatory outputs before deployment.
When it isn't: Fairlearn is designed for traditional supervised ML models on tabular or structured data. It does not cover LLM-based systems, where fairness evaluation requires different tools (Giskard, Promptfoo). It also does not provide ongoing production monitoring.
Free Tiers of Commercial Tools
The following verified free tiers are accurate as of April 2026, based on direct inspection of each vendor's public pricing page.
LangSmith (Developer Plan)
LangSmith offers a free Developer plan with: - 5,000 base traces per month - 1 seat - 14-day retention (base traces) - Tracing, online and offline evaluations, Prompt Hub, Playground, monitoring and alerting - 1 Fleet agent, up to 50 Fleet runs per month
Fit for: Individual developers or founders evaluating LangSmith for their LangChain-based applications. The 5,000 trace limit is adequate for development and limited production testing. Additional traces are priced at $2.50 per 1,000 (base) or $5.00 per 1,000 (extended, 400-day retention). A startup program with discounted rates and generous trace allotments is listed in the FAQ section of the pricing page.
Limitations: Single seat means no team collaboration. 14-day retention is insufficient for audit trails. Move to paid when you need team access or longer data retention for compliance purposes.
Arize AX Free
Arize offers an AX Free tier (SaaS) for individuals and startups: - 25,000 span traces per month - 1 GB ingestion per month - 7-day retention - Includes Alyx (Arize AI agent), online evals, product observability (monitors and custom metrics) - Community support
Arize also offers Phoenix, a fully self-hosted open-source tier with user-managed limits and no cost. The AX Pro tier is $50/month for small teams with startup pricing available.
Fit for: Startups that need cloud-managed LLM observability with minimal setup. 25,000 spans/month covers moderate production traffic. The self-hosted Phoenix option is suitable for teams with data residency requirements.
Limitations: 7-day retention on the free tier is inadequate for compliance audit trails. The 1 GB/month ingestion cap will be hit quickly with high-volume LLM applications.
Galileo Free
Galileo offers a free tier: - 5,000 traces per month - Unlimited users - Unlimited custom evaluations - $0/month
Fit for: Small teams and individual developers experimenting with LLM evaluation and observability. Galileo's strength is its evaluation and guardrails functionality. The free tier's unlimited custom evals is useful for teams building evaluation pipelines early.
Limitations: 5,000 traces per month is restrictive for production use. Galileo Pro starts at $100/month for 50,000 traces.
Patronus AI Developer Tier
Patronus AI offers a free Developer tier with no credit card required: - Access to Patronus Experiments, Logs, Traces (last 2 weeks) - 2 projects, 5 experiments per project - $10 in free API credits (for evaluator API calls) - Unlimited comparisons and datasets
Fit for: Teams evaluating LLM output quality and safety using Patronus's evaluator library. The API credit model means you pay per evaluation call ($10/1k small evaluator calls, $20/1k large evaluator calls).
Limitations: 2-week data retention and the 2-project limit constrain the free tier to development and evaluation use, not production compliance monitoring.
Weights & Biases Free (Cloud)
Weights & Biases offers a free cloud tier: - $0/month - All W&B Models, W&B Core features, and W&B Weave (LLM observability) - Up to 5 GB storage - Up to 5 seats - Community support
Fit for: ML teams that are training or fine-tuning models and need experiment tracking alongside LLM observability. W&B Weave (included in the free tier) provides LLM tracing and evaluation comparable to LangSmith and Langfuse for teams already in the W&B ecosystem. The 5-seat, 5 GB limit works for a small founding team.
Limitations: The free cloud plan states it is for personal development/small projects. The separate Personal (local) plan is explicitly for personal projects only — corporate use is not allowed. Teams needing audit-quality data retention should move to the Teams tier ($50/month per user, 100 GB storage).
Vanta (Startup Pricing)
Vanta does not publish pricing on its website and requires a sales call. Based on multiple third-party pricing analyses, Vanta's entry-level Essentials plan starts at approximately $10,000 per year for a single compliance framework (typically SOC 2 or ISO 27001). For a startup needing AI governance alongside SOC 2, the relevant add-ons (Trust Center, vendor risk management) increase the total. Spendflo's Vanta pricing guide reports pricing from approximately $10,000 to $80,000+ per year depending on size and frameworks. VC-backed startups in programs such as YC or AWS Activate may qualify for partner discounts — confirm directly with Vanta sales.
Note: Vanta does not cover ISO 42001 or AI-specific governance as a primary product. Its value is SOC 2 and ISO 27001 automation. Treat it as a compliance automation platform, not an AI governance platform.
Drata (Startup Pricing)
Like Vanta, Drata does not publish pricing publicly. Based on AWS Marketplace listings and multiple analyst sources, Drata's Foundation plan starts at approximately $7,500–$10,000 per year for startups under 50 employees on a single framework. Startups in AWS Activate, YC, or similar accelerator programs can request program discounts. Comp AI's Vanta vs Drata comparison notes Drata's average annual contract value is approximately $13,500, with pricing ranging from $7,500 to $42,750 in verified buyer reports.
Note: Drata covers SOC 2, ISO 27001, HIPAA, GDPR, and NIST CSF. It is adding ISO 42001 support per Comp AI's SOC 2 guide. Like Vanta, it is a compliance automation platform — not a purpose-built AI governance or observability tool.
Paid Platforms in the Startup-Affordable Range
For most seed-to-Series A startups with 5–50 employees, the combination of free open-source tools and free SaaS tiers described above provides adequate coverage. The decision to move to a paid dedicated AI governance platform is driven by specific triggers (covered in a later section).
The following platforms have verifiable startup-tier or self-serve pricing under $25,000 per year:
Langfuse (Core/Pro plans): Langfuse's Core plan is $29/month ($348/year) with full observability features, longer retention, and up to unlimited users. The Pro plan is $199/month. Early-stage startups receive 50% off in the first year. This is the most affordable path to full-featured LLM observability for a funded startup.
Arize AX Pro: $50/month ($600/year) for small teams with startup pricing available. Includes 50,000 spans/month, 100 GB ingestion, and 15-day retention.
Galileo Pro: $100/month ($1,200/year) for 50,000 traces, advanced analytics, and Slack support.
Weights & Biases Teams: $50/month per user with 100 GB storage, team-based access controls, and priority support. A 5-person team costs $3,000/year.
For purpose-built AI risk and compliance platforms (model risk management, AI impact assessments, full AIMS aligned to ISO 42001), the market is dominated by vendors with enterprise pricing. Most require a sales conversation and do not publicly disclose self-serve pricing. Based on Gartner and Forrester market analyses, annual budgets for dedicated AI governance platforms typically fall in the $30,000–$100,000+ range for mid-market deployments. A startup spending under $25,000/year on AI governance should plan the architecture around the free-and-paid-tier stack described above, with a clear upgrade path to a dedicated governance platform once Series B funding and enterprise revenue justify the investment. See our AI governance platform comparison for a detailed vendor evaluation.
A 6-Step 30-Day Implementation Plan
This plan is calibrated for a startup with 5–50 employees, an AI product in production or near-production, and no existing formal AI governance program.
Week 1: Inventory and Policy Foundation (Days 1–7)
Day 1–2: Build your model inventory. List every AI model and API your product uses. For each: model name, version or endpoint, provider, use case, data types processed, and risk classification. Use a shared spreadsheet. Assign an owner. This document is the foundation for everything else.
Day 3–4: Draft your AI use policy. Use the NIST AI RMF Playbook Govern function as a template. Cover: approved AI tools and APIs, data handling rules (what customer data can/cannot be sent to external APIs), human oversight requirements, and acceptable use boundaries for internal staff. Two to three pages is sufficient. Get it reviewed by your legal counsel before publishing.
Day 5–7: Map your DPA obligations. For each external AI API in your model inventory, locate the provider's Data Processing Agreement (DPA). Confirm that the DPA covers your customers' data. Document the retention period, subprocessor list, and any prohibition on training use. Flag gaps for legal review.
Week 2: Testing and Evaluation (Days 8–14)
Day 8–10: Install and run Promptfoo. Promptfoo installs via npm or Homebrew. Write test cases for your production prompts covering: prompt injection attempts, sensitive information disclosure, harmful content generation, and off-topic output. Run the red team scan. Export the results report. This document becomes your evaluation artifact for RFP responses.
Day 11–12: Run Giskard scan. If you have a traditional ML model or a RAG pipeline, install Giskard and run the automated scan. The scan report documents bias, hallucination risk, and robustness issues. Save the output.
Day 13–14: Set up Langfuse or LangSmith. Choose based on your framework (Langfuse integrates with more providers; LangSmith is optimal for LangChain). Deploy the free tier or self-host Langfuse. Instrument your application to trace all LLM calls. Verify that traces are appearing and that cost and token data are being captured.
Week 3: Incident Plan and Vendor Register (Days 15–21)
Day 15–17: Write your AI incident response addendum. Add a two-page appendix to your existing incident response plan. Define AI-specific incident categories, assign an AI incident owner (typically the CTO or head of engineering), document the triage workflow, and specify customer notification timelines. Reference the EU AI Act's post-market monitoring and serious incident reporting requirements for any high-risk use cases.
Day 18–19: Build your vendor risk register. Create a structured record of all AI and non-AI vendors that process customer data. For AI vendors specifically, include: DPA status, data retention policy, training data opt-out status, last security review date. This register directly satisfies SOC 2 vendor management requirements and the EU AI Act's supply chain documentation expectations.
Day 20–21: Run a Fairlearn evaluation (if applicable). If you have a traditional ML model making decisions about individuals, run Fairlearn to assess fairness metrics across protected groups. Document the results and any mitigation steps taken. This satisfies the EU AI Act's high-risk system data quality and bias assessment requirements.
Week 4: Documentation and Gap Assessment (Days 22–30)
Day 22–24: Use the NIST AI RMF Playbook for gap assessment. Work through the four functions — Govern, Map, Measure, Manage — and note which suggested actions you have addressed and which remain open. This becomes your governance roadmap. Prioritize gaps that are most likely to appear in enterprise questionnaires or VC due diligence.
Day 25–27: Prepare your AI governance one-pager. Summarize your governance posture for external audiences: which framework you follow (NIST AI RMF), which evaluations you run (Promptfoo, Giskard), how you monitor production (Langfuse/LangSmith/Arize), your DPA coverage, and your incident response process. This document answers 80% of the AI governance questions in enterprise security questionnaires.
Day 28–30: Review against ISO 42001 Annex A. Use the publicly available controls list (available through Sprinto's ISO 42001 guide and similar resources) to identify your gaps against the standard. This is your input for deciding whether and when to pursue formal certification. See our ISO/IEC 42001 certification path guide for a staged approach to certification.
Common Mistakes That Kill Enterprise Deals
Waiting Until the RFP Arrives
The most common and costly mistake. Enterprise security reviews take weeks. If you receive a security questionnaire with AI-specific questions at the start of a deal cycle and you have no documentation, you will either lose the deal or delay it by months while you build documentation under pressure. The governance work described in this guide takes 30 days to establish; starting before your first enterprise conversation is a strict requirement.
No AI Bill of Materials
Enterprise procurement teams increasingly ask: "What AI models does your product use?" If your answer is verbal and unverifiable, you have failed the question. The AI-BOM is the document that answers it. Without a model inventory, you also cannot complete accurate DPA negotiations because you do not know which providers are in scope.
No Fundamental Rights Impact Assessment (FRIA)
The EU AI Act requires high-risk AI system deployers to conduct a Fundamental Rights Impact Assessment (FRIA) before deploying a high-risk AI system. For startups building in HR, credit, healthcare, or law enforcement adjacent categories, this is not optional from August 2026. An FRIA is also increasingly appearing in enterprise procurement questionnaires for AI vendors in sensitive use cases. Our AI Impact Assessment template covers the FRIA structure.
No Model Inventory = No DPA Negotiation
When enterprise legal teams negotiate a DPA, they will ask which sub-processors you use and what data you transmit to them. If you are sending customer data to OpenAI, Anthropic, or any other LLM API, that provider is a sub-processor. Without a documented and current list, DPA negotiations stall. The model inventory solves this problem because it already captures which providers process which data types.
Hand-Waving AI-Specific Security Questions
"We take security seriously" is not an answer to "What red-teaming have you conducted on your LLM?" A Promptfoo evaluation report with specific test cases and results is an answer. "Our models are tested" is not an answer to "Do you have documented evaluation artifacts?" The Promptfoo 2025 AI Regulation Report notes that documentation is now structural: "Whether you're responding to a federal RFP, complying with a state law, or filling out an enterprise security questionnaire, you'll be asked for documentation about how your system works and how you tested it."
Treating SOC 2 as Sufficient for AI Governance
Comp AI's analysis is unambiguous: "SOC 2 still doesn't cover model bias, explainability, evaluation rigor, or AI lifecycle governance. Those live in ISO/IEC 42001, the NIST AI RMF Generative AI Profile, and the EU AI Act." SOC 2 is necessary but not sufficient for AI vendors. Buyers who know this will ask additional questions; be ready with answers that reference NIST, ISO 42001, or your own documented evaluation practice.
When to Upgrade from Free to Paid
Use the following triggers to decide when to move from the free tier stack to paid tools or platforms.
Upgrade your observability tool (Langfuse, LangSmith, Arize) when: - You need more than 30 days of data retention for compliance or debugging purposes - Your team grows beyond 2 users and you need collaboration features - Monthly trace volume consistently exceeds the free tier limit - A customer or auditor requests an audit log export and the free tier cannot provide it
Upgrade to a compliance automation platform (Vanta, Drata, or equivalent) when: - You are beginning a SOC 2 Type II or ISO 27001 audit - You are receiving more than 10 security questionnaires per month and manual responses are consuming engineering time - You have more than 30 employees and managing evidence collection manually is impractical - An enterprise deal has made compliance automation a stated requirement
Upgrade to a dedicated AI governance platform when: - You are pursuing ISO 42001 certification and need automated controls mapping and evidence collection - You have more than 5 distinct AI models in production requiring lifecycle tracking - A regulated-industry customer (financial services, healthcare, government) requires a dedicated AI risk management system - You have completed a Series B round and can allocate the $30,000–$100,000+ annual budget that dedicated platforms require
For an evaluation of dedicated AI governance platforms with feature-level comparisons, see our AI governance platforms directory.
Pursue ISO 42001 certification when: - Multiple enterprise prospects in a single quarter ask specifically for ISO 42001 certification - Your target market is EU financial services, healthcare, or government where AI-specific certification is a regulatory expectation - You already hold ISO 27001 and can leverage the shared management system structure
Sprinto's ISO 42001 certification guide estimates total certification costs (implementation plus audit) at $15,000–$40,000 for small organizations, with a four-to-nine-month timeline from gap assessment to certificate. Workstreet recommends pursuing certification when enterprise buyers are asking AI-specific governance questions that SOC 2 alone cannot answer.
FAQ
Q: My startup only uses the OpenAI API — do I really need an AI governance program?
A: Yes, for three reasons. First, using a third-party LLM API does not transfer your regulatory obligations. Under the EU AI Act, you are the provider or deployer of the AI system, and the fact that the model is provided by a third party does not reduce your compliance obligations. Second, enterprise procurement questionnaires ask about your governance practices regardless of whether you train your own models. Third, your DPA with OpenAI (or any other provider) needs to be in place before you can legitimately process customer data through their API — and your customers' legal teams will ask about this.
Q: Does the EU AI Act apply to me if I'm a US company with no EU employees or servers?
A: If any of your AI system's outputs are consumed by users in the EU, Article 2 of the EU AI Act applies to you. Modulos's analysis confirms: "A US company with no EU entity, no EU staff and no EU servers is still in scope if its AI system is placed on the EU market or its outputs are used in the EU." This is the same extraterritorial structure as GDPR. The practical first step is confirming whether any of your current or prospective customers are EU-based.
Q: What's the difference between NIST AI RMF and ISO 42001?
A: The NIST AI RMF is a voluntary, non-certifiable framework published by the US National Institute of Standards and Technology. It provides a structured approach to AI risk management across four functions (Govern, Map, Measure, Manage) and is primarily used as an internal methodology and a reference in US government procurement. ISO/IEC 42001 is a certifiable international standard for AI Management Systems. Third-party auditors can certify your organization as compliant with ISO 42001. Enterprise buyers, particularly in Europe and regulated industries, may require ISO 42001 certification rather than just NIST alignment.
Q: Can I use these free tools to answer a SOC 2 audit?
A: Partially. SOC 2 auditors assess your security controls, not your AI governance tooling specifically. However, the observability data from Langfuse, LangSmith, or Arize can serve as evidence of system monitoring controls. The vendor risk register you build to track your AI APIs directly satisfies SOC 2 vendor management requirements. Promptfoo and Giskard evaluation reports are useful evidence for change management and testing controls. You will still need a compliance automation platform (Vanta, Drata, or equivalent) to manage the full SOC 2 audit process efficiently.
Q: How long does it take to get ISO 42001 certified?
A: Sprinto estimates four to nine months: two to four weeks for gap assessment, one to three months for designing and documenting the AI Management System, one to two months for implementation and internal audits, and one to two months for the certification audit (Stage 1 and Stage 2). The total cost for a small organization is typically $15,000–$40,000 including implementation work, consultant fees, and the audit itself.
Q: What's a Fundamental Rights Impact Assessment (FRIA) and when do I need one?
A: Under Article 27 of the EU AI Act, deployers of high-risk AI systems are required to conduct a FRIA before putting the system into use. A FRIA assesses how the AI system may affect the rights of individuals — including privacy, non-discrimination, freedom of expression, and access to justice. High-risk categories (Annex III of the Act) include AI in recruitment, credit scoring, healthcare, biometric identification, law enforcement, and critical infrastructure. If your product falls into any of these categories and you have EU deployers or users, you need a FRIA. Our AI Impact Assessment template provides a structured FRIA format.
Q: Should I use Langfuse self-hosted or the cloud version?
A: For startups with customer data sensitivity or DPA requirements that restrict sending data to third parties, the self-hosted version of Langfuse is the right choice — you control all data, and there is no per-seat cost for the MIT-licensed core. For startups where cloud hosting is acceptable, the free Hobby tier (50,000 units/month, 2 users) or the $29/month Core plan with startup discount ($14.50/month in year one) is easier to maintain. The decision should be driven by your DPA obligations and your team's capacity to maintain infrastructure.
Q: What should I actually say on a security questionnaire when asked about AI governance?
A: Reference specific tools and documents. "We follow the NIST AI RMF and conduct quarterly red-team evaluations using Promptfoo" is a verifiable statement that signals maturity. "We take AI security seriously" is not. The artifact list that answers the most common enterprise questions: (1) model inventory / AI-BOM, (2) Promptfoo evaluation report, (3) AI use policy, (4) DPA with each AI API provider, (5) incident response plan with AI-specific addendum, (6) Langfuse/LangSmith trace access demonstrating production monitoring. For a detailed breakdown of what vendors need to provide in enterprise procurement, see our AI compliance vendor due diligence guide.
Sources / Further Reading
- Promptfoo GitHub (MIT license) — https://github.com/promptfoo/promptfoo
- Giskard GitHub (Apache 2.0 license) — https://github.com/Giskard-AI/giskard
- Langfuse GitHub (MIT license, core) — https://github.com/langfuse/langfuse
- Fairlearn GitHub (MIT license) — https://github.com/fairlearn/fairlearn
- NIST AI RMF Playbook (AIRC) — https://airc.nist.gov/airmf-resources/playbook/
- NIST AI RMF Playbook PDF — https://airc.nist.gov/docs/AI_RMF_Playbook.pdf
- NIST AI RMF Playbook page (nist.gov) — https://www.nist.gov/itl/ai-risk-management-framework/nist-ai-rmf-playbook
- EU AI Act full text (EUR-Lex, Regulation 2024/1689) — https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689
- EU AI Act regulatory framework (European Commission) — https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai
- ISO/IEC 42001:2023 standard page — https://www.iso.org/standard/81230.html
- Modulos: EU AI Act applies to US companies — https://www.modulos.ai/blog/eu-ai-act-us-companies/
- Tredence: EU AI Act 2026 compliance guide for US companies — https://www.tredence.com/blog/eu-ai-act-compliance-guide-us-companies
- LawFlex: GC's guide to EU AI Act cross-border compliance — https://lawflex.com/navigating-the-eu-ai-act-in-2026-a-general-counsels-guide-to-cross-border-compliance-and-ai-governance/
- Promptfoo: How AI regulation changed in 2025 — https://www.promptfoo.dev/blog/ai-regulation-2025/
- Rockfort AI: How to answer enterprise AI security questionnaires — https://blog.rockfort.ai/post/how-to-answer-enterprise-ai-security-questionnaires-a-complete-guide-for-ai-startups
- Quantarra: SOC 2 AI compliance 2025 — https://quantarra.io/blog/soc-2-ai-compliance-news-2025-edition-the-trends-that-reshaped-security-audits
- Comp AI: SOC 2 for AI companies (2025/2026) — https://trycomp.ai/soc-2-for-ai-companies
- Comp AI: Vanta vs Drata 2026 comparison — https://trycomp.ai/vanta-vs-drata
- Workstreet: ISO 42001 for startups — https://www.workstreet.com/blog/iso-42001-for-startups
- Sprinto: ISO 42001 certification steps, cost, timelines — https://sprinto.com/blog/iso-42001-certification/
- Sprinto: Drata pricing — https://sprinto.com/blog/drata-pricing/
- Sprinto: Vanta pricing — https://sprinto.com/blog/vanta-pricing/
- Kruze Consulting: 2025 VC due diligence trends — https://kruzeconsulting.com/blog/vc-due-diligence-trends/
- Stealth Cloud: AI due diligence for VCs — https://stealthcloud.ai/ai-privacy/ai-due-diligence-vcs/
- LangSmith pricing page — https://www.langchain.com/langsmith-pricing
- Arize AI pricing page — https://arize.com/pricing/
- Galileo pricing page — https://www.rungalileo.io/pricing
- Patronus AI pricing page — https://www.patronus.ai/pricing
- Weights & Biases pricing page — https://wandb.ai/site/pricing
- Langfuse pricing page — https://langfuse.com/pricing
- Spendflo: Vanta pricing guide 2025 — https://www.spendflo.com/blog/comprehensive-guide-to-vanta-pricing
- Complyjet: Vanta pricing guide 2025 — https://www.complyjet.com/blog/vanta-pricing-guide-2025
- Complyjet: Drata pricing 2025 — https://www.complyjet.com/blog/drata-pricing-plans
- Wiz: AI bill of materials guide — https://www.wiz.io/academy/ai-security/ai-bom-ai-bill-of-materials
- Blott: AI in venture capital 2026 — https://www.blott.com/reports/ai-use-cases-in-venture-capital
aicompliancevendors.com publishes practical guidance on AI governance, compliance tooling, and vendor selection. The authors have no financial relationship with any of the tools or vendors named in this guide.