Summary of "How AI Image Generators Make Bias Worse"
Key technologies and mechanics
- Popular image generators discussed: Midjourney and Stable Diffusion. Some reports predict that AI-generated images could make up as much as ~90% of online images within the next few years.
- Core model family explained: Generative Adversarial Networks (GANs), which have two components (a minimal training sketch follows this list):
- Generator: produces images trying to mimic real ones.
- Discriminator: judges whether images are real or fake and trains the generator through repeated adversarial rounds.
- Output quality and behavior strongly depend on training datasets (millions of labeled images). If datasets contain social biases, the models learn and reproduce them — there is no truly “neutral” dataset.
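To make the adversarial rounds concrete, here is a minimal, hypothetical PyTorch sketch of a GAN training loop; the toy vector data, layer sizes, and hyperparameters are illustrative assumptions, not the architectures behind Midjourney or Stable Diffusion. Note where the training data enters: whatever distribution real_batch draws from is exactly what the generator learns to imitate, which is why dataset bias passes straight into the outputs.

```python
# Minimal GAN training loop (illustrative sketch; real image GANs use
# convolutional networks and millions of labeled images).
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64  # toy sizes, assumptions for this sketch

# Generator: maps random noise to a fake sample, trying to mimic real ones.
G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
# Discriminator: scores how "real" an input looks (1 = real, 0 = fake).
D = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
loss_fn = nn.BCELoss()

def real_batch(n=32):
    # Stand-in for real training images; any dataset bias enters here.
    return torch.randn(n, data_dim) + 2.0

for step in range(1000):
    real = real_batch()
    fake = G(torch.randn(real.size(0), latent_dim))

    # Round 1: train the discriminator to separate real from fake.
    opt_D.zero_grad()
    d_loss = (loss_fn(D(real), torch.ones(real.size(0), 1))
              + loss_fn(D(fake.detach()), torch.zeros(real.size(0), 1)))
    d_loss.backward()
    opt_D.step()

    # Round 2: train the generator to fool the discriminator.
    opt_G.zero_grad()
    g_loss = loss_fn(D(fake), torch.ones(real.size(0), 1))
    g_loss.backward()
    opt_G.step()
```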
Empirical analysis and findings
- Bloomberg Technology (Leonardo Nicoletti and Dina Bass) generated and analyzed more than 5,000 Stable Diffusion portraits by profession, then categorized images by perceived skin tone and gender.
- Findings:
- Higher-paying roles (CEO, lawyer, politician, doctor, engineer) skewed toward lighter skin tones and men.
- Lower-income roles (dishwasher, janitor, fast-food worker, housekeeper) skewed toward darker skin tones and women.
- Generated images amplified real-world disparities. Example: women make up about 39% of U.S. doctors, yet appeared in only ~7% of generated images of doctors (a toy version of this comparison follows this list).
- Examples of extreme representational harms from other reporting:
- BuzzFeed’s AI-generated “Barbies by country” series:
- Latin American Barbies shown as fair-skinned (colorism).
- German Barbie dressed in an SS/Nazi-like uniform.
- South Sudan Barbie shown holding a rifle.
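As a sketch of the kind of audit Bloomberg ran, the hypothetical Python snippet below compares the share of women in generated images per profession against a real-world baseline; only the "doctor" row reflects the ~39% vs. ~7% figures cited above, and every other number is a made-up placeholder.

```python
# Hypothetical audit sketch: gender shares in generated images vs. reality.
# Only the "doctor" figures come from the summary; the rest are placeholders.
generated_share_women = {"doctor": 0.07, "CEO": 0.03, "janitor": 0.65}
real_world_share_women = {"doctor": 0.39, "CEO": 0.28, "janitor": 0.40}

for job, gen in generated_share_women.items():
    real = real_world_share_women[job]
    print(f"{job:>8}: generated {gen:.0%} vs. real-world {real:.0%} "
          f"(gap {gen - real:+.0%})")
```

A negative gap means the model under-represents women relative to reality, i.e., it amplifies rather than mirrors the existing disparity.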
Concepts and harms
- Representational harms: AI outputs that demean or stereotype social groups, reinforcing the status quo or amplifying prejudice.
- Feedback loops: biased AI outputs populate the web, then get scraped into future training datasets, amplifying bias across generations of models (illustrated by the toy simulation after the quote below).
- No easy technical fix: improving datasets helps but doesn’t solve deeper normative questions about what “fair” representation means.
“All data is historical data.” — Melissa Terras
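The feedback loop described above can be illustrated with a toy simulation under made-up parameters: each model generation under-represents a group relative to the data it sees, and scraped model outputs dilute the next training set, so the distortion compounds.

```python
# Toy feedback-loop simulation; all parameters are illustrative assumptions.
real_share = 0.39       # women's share of doctors in the original data (cited above)
model_skew = 0.6        # model reproduces only 60% of the share it is trained on
scrape_fraction = 0.5   # half of each new dataset is scraped AI output

share_in_data = real_share
for generation in range(1, 6):
    model_output = share_in_data * model_skew                # biased outputs
    share_in_data = ((1 - scrape_fraction) * real_share
                     + scrape_fraction * model_output)       # next training mix
    print(f"gen {generation}: outputs show {model_output:.1%} women; "
          f"next dataset contains {share_in_data:.1%}")
```

Under these assumptions the share drifts well below the real-world baseline within a few generations, which is the amplification dynamic the summary warns about.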
Philosophical and policy issues
- Defining fairness is contested. Possible approaches each have trade-offs (contrasted in the sketch after this list):
- Mirror current statistics (risk reproducing existing inequities).
- Enforce demographic parity (e.g., 50/50 gender split).
- Randomization or other normative rules.
- Note: binary gender categories in many datasets introduce inherent bias and exclude nonbinary identities.
- Governance options:
- Self-regulation by tech firms is often insufficient.
- Government intervention could include oversight bodies, complaint mechanisms, mandates to update algorithms/retrain on better datasets, and standards/requirements for dataset transparency and representativeness.
- Regulatory timing challenge — the Collingridge dilemma:
- Regulate too early and rules may be irrelevant or stifle innovation.
- Regulate too late and harms may be entrenched and hard to reverse.
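To contrast the fairness options above, here is a hypothetical sketch of how each rule would set the probability that a generated "doctor" image depicts a woman; the baseline is the ~39% statistic cited earlier, and everything else is an assumption (the binary framing itself encodes the limitation flagged in the list).

```python
import random

BASELINE_SHARE = 0.39  # women's share of U.S. doctors (cited above)

def sample_gender(rule: str) -> str:
    # Binary categories only, reproducing the dataset limitation noted above.
    if rule == "mirror_statistics":
        p_woman = BASELINE_SHARE       # reproduce current statistics
    elif rule == "demographic_parity":
        p_woman = 0.50                 # enforce an even split
    elif rule == "randomized":
        p_woman = random.random()      # draw a fresh target per image
    else:
        raise ValueError(rule)
    return "woman" if random.random() < p_woman else "man"

for rule in ("mirror_statistics", "demographic_parity", "randomized"):
    draws = [sample_gender(rule) for _ in range(10_000)]
    print(f"{rule:>20}: {draws.count('woman') / len(draws):.1%} women "
          f"over 10,000 simulated images")
```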
Takeaway
Generative image models can and do exaggerate societal biases. Addressing this requires interdisciplinary work (technical, ethical, legal), clear choices about what counts as fairness, timely and effective governance, and dataset transparency to avoid runaway feedback loops.
Main speakers / sources cited
- Leonardo Nicoletti and Dina Bass (Bloomberg Technology)
- Melissa Terras (quoted)
- BuzzFeed (AI Barbies example)
- Joy Buolamwini (computer scientist / digital activist)
- London Interdisciplinary School (video producer / sponsor)