The Raven Group
AI

Model selection isn't a model decision

September 28, 2025 · 3 min read

Teams new to building AI features tend to treat model selection as the central decision: which model is best? Should we use GPT or Claude or Gemini? Open source or hosted? This framing leads to long evaluations that conclude with "Claude is slightly better at this category and GPT at that one," and the team ships against whichever one was favored on the day the decision got made. Six months later, a better model from a different provider exists, and switching is a six-week project.

The more useful framing: model selection is an evaluation decision, not a model decision. The best AI feature you can ship is one where swapping the model takes a day, not a quarter. The architectural pattern that gets you there is straightforward: abstract the model interaction behind a small internal API, run evals against the actual outputs you care about, and make the model choice a config setting that points to whichever provider is winning this month.
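As a rough illustration of that pattern, here is a minimal sketch in Python. Everything named in it is hypothetical: the `summarize_email` function, the `SUMMARY_MODEL_PROVIDER` environment variable, and the two stub providers stand in for whatever feature, config mechanism, and vendor SDKs a real codebase would use. The point is only that product code calls one small internal function, and the provider is looked up from config.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional

# A provider adapter is anything that maps a prompt to text.
# Real adapters would wrap a vendor SDK; these stubs are placeholders.
Completion = Callable[[str], str]


def _stub_provider_a(prompt: str) -> str:
    return f"[provider-a] {prompt[:40]}"


def _stub_provider_b(prompt: str) -> str:
    return f"[provider-b] {prompt[:40]}"


# Registry of available providers (names are hypothetical).
PROVIDERS: Dict[str, Completion] = {
    "provider-a": _stub_provider_a,
    "provider-b": _stub_provider_b,
}


@dataclass(frozen=True)
class ModelConfig:
    provider: str


def load_config(env: Optional[Mapping[str, str]] = None) -> ModelConfig:
    """Read the provider name from config, assumed here to be an env var."""
    env = os.environ if env is None else env
    return ModelConfig(provider=env.get("SUMMARY_MODEL_PROVIDER", "provider-a"))


def summarize_email(body: str, config: ModelConfig) -> str:
    """The small internal API: product code calls this, never a vendor SDK."""
    try:
        complete = PROVIDERS[config.provider]
    except KeyError:
        raise ValueError(f"unknown provider: {config.provider!r}") from None
    prompt = f"Summarize this customer email in one sentence:\n{body}"
    return complete(prompt)
```

With this shape, trying a new provider means writing one adapter, registering it in `PROVIDERS`, and changing the config value. Nothing that calls `summarize_email` has to change.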

What the evals should test is the thing your product actually does, not generic benchmarks. If your feature summarizes customer emails, your eval is fifty customer emails with hand-written gold-standard summaries, and a scoring function (LLM-as-judge, or a simple rubric) that tells you whether the model's output is acceptable. With this in place, every new model that comes out is a one-day experiment: swap the config, run the evals, look at the numbers. Switch or don't.
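A product-specific eval of this kind can be quite small. The sketch below assumes a simple rubric (each case lists terms an acceptable summary must mention) rather than an LLM-as-judge, and uses two invented example emails and two stub "models" in place of real providers. The names `EvalCase`, `rubric_score`, and `run_eval` are illustrative, not from any library.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class EvalCase:
    email: str
    gold_summary: str                 # hand-written reference, useful for a judge
    required_terms: Tuple[str, ...]   # assumed rubric: terms the summary must mention


def rubric_score(output: str, case: EvalCase) -> float:
    """Fraction of required terms that appear in the model output."""
    if not case.required_terms:
        return 1.0
    text = output.lower()
    hits = sum(1 for term in case.required_terms if term.lower() in text)
    return hits / len(case.required_terms)


def run_eval(model: Callable[[str], str],
             cases: Sequence[EvalCase],
             threshold: float = 0.8) -> float:
    """Return the share of cases whose output scores at or above the threshold."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for c in cases if rubric_score(model(c.email), c) >= threshold)
    return passed / len(cases)


# Two invented cases; a real set would be around fifty real emails.
CASES = [
    EvalCase(
        email="My invoice for March was charged twice. Please refund the duplicate.",
        gold_summary="Customer was double-charged in March and wants a refund.",
        required_terms=("march", "refund"),
    ),
    EvalCase(
        email="The login page shows an error after the password reset.",
        gold_summary="Login fails with an error following a password reset.",
        required_terms=("login", "error"),
    ),
]


# Stub candidates standing in for two providers behind the internal API.
def candidate_a(email: str) -> str:
    return email[:80]


def candidate_b(email: str) -> str:
    return "Customer has a problem."


if __name__ == "__main__":
    for name, model in [("candidate_a", candidate_a), ("candidate_b", candidate_b)]:
        print(name, run_eval(model, CASES))
```

Swapping the rubric for an LLM-as-judge would only change `rubric_score`. The eval set and the pass-rate comparison stay the same, which is what keeps each new model a one-day experiment.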

The companies shipping durable AI features in 2025 aren't the ones that picked the right model in 2024. They're the ones that built the swap-in/swap-out architecture in 2024 and have changed models four times since. Models will keep getting better, cheaper, and more specialized. The decision you're making isn't which model to use; it's whether your codebase can change its mind without a rewrite.
