What don't they tell you?
Below are excerpts from actual model documentation, annotated to show what's transparent, what's vague, and what's missing.
Read the full documentation:
GPT-4 Official Documentation →

Model cards were supposed to standardize AI transparency. Introduced by Google researchers in 2019, they aimed to document who built a model, what data trained it, how it performs across demographics, where it fails, and what ethical considerations apply.
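The framework is easy to picture as a structured record. Here is a minimal sketch in Python, assuming illustrative field and method names (ModelCard, undisclosed, training_co2_tonnes are not an official schema), showing the sections a complete card would carry and how the gaps become visible when a publisher leaves them blank:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelCard:
    """Sections named in the model-card framework above.

    Field names are illustrative, not an official schema.
    """
    developer: str                                 # who built it
    training_data_sources: list[str]               # what data trained it
    parameter_count: Optional[int]                 # exact model size, if disclosed
    demographic_performance: dict[str, float] = field(default_factory=dict)  # score per group
    known_failure_modes: list[str] = field(default_factory=list)             # where it fails
    ethical_considerations: list[str] = field(default_factory=list)
    training_co2_tonnes: Optional[float] = None    # environmental cost, if measured

    def undisclosed(self) -> list[str]:
        """List the sections left empty, i.e. the gaps this piece is about."""
        gaps = []
        if not self.training_data_sources:
            gaps.append("training_data_sources")
        if self.parameter_count is None:
            gaps.append("parameter_count")
        if not self.demographic_performance:
            gaps.append("demographic_performance")
        if self.training_co2_tonnes is None:
            gaps.append("training_co2_tonnes")
        return gaps


# Example: GPT-2's 2019 documentation disclosed size and data sources,
# but not emissions or per-demographic results.
gpt2 = ModelCard(
    developer="OpenAI",
    training_data_sources=["WebText (outbound Reddit links)"],
    parameter_count=1_500_000_000,
)
print(gpt2.undisclosed())  # ['demographic_performance', 'training_co2_tonnes']
```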
In practice, leading AI companies provide incomplete documentation. Compare GPT-2 (2019)—which disclosed 1.5B parameters, dataset composition, and training details—to GPT-4 (2023), which explicitly withholds all three. This represents a deliberate shift away from transparency.
Vague language obscures accountability. Terms like "large-scale," "extensive testing," and "filtered for quality" sound scientific but mean nothing without specifics. Who decides what counts as "quality"? How extensive is "extensive"? This language prevents independent verification.
Critical information is systematically omitted. Training data sources, exact model sizes, environmental costs, quantified failure rates, and demographic performance breakdowns are missing from both GPT-4 and Claude documentation—not by accident, but by design.
Better transparency exists. Academic models like BLOOM documented 25 tons of CO2 emissions, full training data sources, and comprehensive evaluation. Meta's LLaMA provided detailed architecture specs. These prove that transparency is possible—companies choose opacity.
Further reading:
• Model Cards for Model Reporting (Mitchell et al., 2019), the foundational framework for AI documentation
• An independent index scoring AI model transparency across companies
• Analyses of decreasing transparency across successive AI model releases
Questions to Ask Any AI Company
• What specific datasets were used for training?
• How many parameters does this model have?
• How does it perform across different demographics and languages?
• What is the measured failure rate for this use case?
• What are the carbon emissions from training and inference?
• Who decided what information to disclose vs. withhold, and why?
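One rough way to keep score is to encode the checklist directly. The sketch below is an assumption of mine, not an established audit tool: the question keys, the transparency_score function, and the pass/fail scoring rule are all illustrative.

```python
# Hypothetical helper that encodes the checklist above; each entry counts
# only if the vendor's documentation answers it with a verifiable specific.
TRANSPARENCY_QUESTIONS = [
    "specific training datasets",
    "parameter count",
    "performance across demographics and languages",
    "measured failure rate for the intended use case",
    "carbon emissions from training and inference",
    "rationale for what was disclosed vs. withheld",
]


def transparency_score(answered: set[str]) -> float:
    """Fraction of the checklist answered with concrete, checkable detail."""
    return sum(q in answered for q in TRANSPARENCY_QUESTIONS) / len(TRANSPARENCY_QUESTIONS)


# Example: documentation that names its datasets and reports emissions,
# but answers nothing else on the list.
example = {"specific training datasets", "carbon emissions from training and inference"}
print(f"{transparency_score(example):.0%}")  # 33%
```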