AI Agent Teams Need Different Foundation Models to Perform Best

Research shows diverse model stacks outperform homogeneous systems by up to 25%, but most enterprises still rely on identical AI infrastructure.

Omega Editorial · June 18, 2026 · 3 min read

Key takeaways

Research shows agent teams selected for diversity resolve software engineering problems 25% more effectively than individual agents, yet most enterprises build agents on identical foundation models and data architectures.
Surface-level personality prompting creates cosmetic diversity; structural diversity requires different foundation models, training data, and retrieval systems.
Industry-wide model uniformity creates systemic risks including correlated errors in fraud detection, pricing convergence, and missed market signals.
Seven practical strategies include diversifying the model stack, enriching training data, implementing board-level model portfolio governance, and cultural red-teaming.
McKinsey now counts 20,000 AI agents among its 60,000-person workforce, up from 3,000 agents 18 months prior, making diversity practices increasingly urgent.

The hidden uniformity problem in AI agent deployment

As AI agents become standard members of enterprise workforces (McKinsey now counts 20,000 AI agents among its 60,000-person workforce), a critical weakness is emerging. Most organizations are building their agentic teams on identical technical foundations, creating what amounts to a cognitive monoculture.

Recent research demonstrates that agent teams selected for diversity resolve software engineering problems 25% more effectively than individual agents. Another study found that just two diverse agents can match or exceed the performance of 16 homogeneous ones. Yet enterprises remain largely unprepared to create genuinely diverse agentic systems.

Why it matters

When entire industries run AI agents on the same foundation models and data architectures, they face systemic rather than isolated risks. Financial services firms can suffer identical fraud detection failures at the same time. Retailers using the same AI stack tend to converge on similar pricing, which compresses competitive differentiation. Insurance companies can miss novel fraud patterns that their uniform models were not trained to recognize. As agentic AI scales across workforces, these uniformity risks compound at both the organizational and the market level.
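The correlated-error argument can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the research cited here: it assumes three fraud detectors that each miss a case with probability 0.1 and are combined by majority vote, and it uses a hypothetical mixing parameter `rho` for how often the models share the same blind spot (`rho = 0` means fully independent errors, `rho = 1` means identical models).

```python
from math import comb


def independent_majority_miss(p: float, n: int) -> float:
    """Probability that a strict majority of n independent models all miss a case."""
    need = n // 2 + 1  # smallest number of misses that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


def majority_miss_rate(p: float, n: int, rho: float) -> float:
    """Simple mixture model (an assumption for illustration):
    with probability rho the models fail or succeed together,
    otherwise their errors are independent."""
    return rho * p + (1 - rho) * independent_majority_miss(p, n)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 1.0):
        rate = majority_miss_rate(p=0.1, n=3, rho=rho)
        print(f"rho={rho:.1f}  majority-vote miss rate={rate:.3f}")
    # rho=0.0 -> 0.028, rho=0.5 -> 0.064, rho=1.0 -> 0.100
```

Under these assumptions, three independent detectors miss about 2.8% of cases by majority vote, while three copies of the same model miss 10%, no better than a single model. Adding agents helps only to the extent that their errors are not shared.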

Surface diversity versus structural diversity

Most current approaches to agentic diversity focus on personality traits—prompting agents to act as extroverts or introverts, or to adopt different cultural attitudes. Enver Cetin, director at AI company Ciklum, calls this approach insufficient: "When the stack underneath is uniform, dressing the agents in different personas is mostly cosmetic. Costume change is not cognition."

Research supports this view. Studies show that major language models answer psychological profiling tests much as people from Western, Educated, Industrialized, Rich and Democratic societies would (what researchers term "WEIRD" populations), and so fail to capture the diversity of other value systems.

Seven strategies for building diverse agent teams

Enterprises can take concrete steps now to create structurally diverse agentic systems:

Diversify the foundation model stack. Use different models for different roles—Anthropic's Claude for reasoning, Google's Gemini for evaluation, OpenAI's GPT for generation. Different training data and alignment approaches reduce correlated errors.

Enrich training data. Incorporate multi-dimensional psychometric data, such as datasets built on the Big Five personality framework, and cross-cultural resources like the World Values Survey to develop agents that better mirror human personality and cultural variation.

Fine-tune with internal data. Use enterprise HR systems, employee surveys, and psychometric evaluations to customize agents to reflect actual workforce composition.

Enable work-shadowing. Train agents on email communications and meeting transcripts so they learn team dynamics from human colleagues in different geographic contexts.

Implement model portfolio governance. Treat foundation model concentration like any critical supplier risk. Board-level policies should limit the percentage of decisions depending on a single model vendor.

Deploy cultural red-teaming. Test agents using multidisciplinary expert teams to identify bias, cultural sensitivity issues, and societal impacts before deployment.

Create agentic talent marketplaces. Develop platforms that enable enterprises to recruit agents reflecting diverse roles, skills, personality types, and cultural backgrounds.
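Two of these strategies, diversifying the model stack and governing the model portfolio, lend themselves to a simple check. The following is a minimal sketch, not a prescribed implementation: the `ROLE_VENDOR` mapping mirrors the role split suggested above, while the decision log, the function names, and the 50% cap are hypothetical values chosen for illustration.

```python
from collections import Counter

# Hypothetical role-to-vendor assignment, following the example split above.
ROLE_VENDOR = {
    "reasoning": "Anthropic",
    "evaluation": "Google",
    "generation": "OpenAI",
}


def vendor_shares(decision_log: list[str]) -> dict[str, float]:
    """Fraction of logged decisions handled by each vendor.

    decision_log is a list of role names, one entry per decision.
    """
    counts = Counter(ROLE_VENDOR[role] for role in decision_log)
    total = sum(counts.values())
    return {vendor: n / total for vendor, n in counts.items()}


def over_limit(shares: dict[str, float], max_share: float) -> list[str]:
    """Vendors whose share of decisions exceeds the policy cap."""
    return sorted(v for v, share in shares.items() if share > max_share)


if __name__ == "__main__":
    # Illustrative log: 6 generation, 3 reasoning, 1 evaluation decision.
    log = ["generation"] * 6 + ["reasoning"] * 3 + ["evaluation"] * 1
    shares = vendor_shares(log)
    print(shares)
    print(over_limit(shares, max_share=0.5))  # assumed cap of 50%
```

With this sample log, one vendor handles 60% of decisions and would be flagged against the assumed 50% cap. A real policy would set the threshold and the unit of measurement (decisions, spend, or business-critical workflows) to suit the organization.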

The business case for cognitive friction

Numerous studies confirm that personality and cultural variation drive team success by creating "cognitive friction" that accelerates complex problem-solving. As AI agents make up a growing share of the workforce, the risk of stifling diverse perspectives grows with them.

Cetin identifies three industry-level consequences of agentic uniformity: correlated errors creating systemic risk in regulated sectors, competitive differentiation compression as AI systems converge on identical answers, and loss of insight into edge cases that could signal emerging opportunities or threats.

Mark Purdy, writing in Harvard Business Review, argues that enterprises still in early-stage agentic AI implementation can avoid future problems by adopting diversity practices now. The details in this analysis were first reported by Harvard Business Review.

This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
