The Raven Group

Building a private LLM that knows your business

May 16, 2025 · 4 min read

Off-the-shelf chatbots are useful, until the question they're asked is specific to your business. Then they hallucinate or hedge — and rightly so, because they have no idea what your product is, who your customers are, or how your team uses certain words to mean very specific things. The fix isn't a better model; it's a model that has access to your knowledge. That's what people mean by "a private LLM" in 2025, and the way you get there is a stack of two pieces: a good model, plus retrieval (RAG).

Retrieval-augmented generation, simplified: when somebody asks the assistant a question, the system searches your internal documents (Notion, Drive, knowledge base, etc.) for relevant chunks, stuffs those chunks into the prompt alongside the question, and asks the model to answer using that context. The model is doing what it always does (generating plausible text), but now it's grounded in your actual material. Because each retrieved chunk carries a pointer to the document it came from, the answer can cite where the information came from. Done well, this is the difference between a chatbot that knows trivia and an assistant that knows your business.
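The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `embed` is a toy word-count stand-in for a real embedding model, and the sample chunks, file names, and prompt wording are made up for the example. The model call itself is left out; the sketch stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(question: str, hits: list[dict]) -> str:
    """Put the retrieved chunks, numbered and labeled with their source, ahead of the question."""
    context = "\n".join(
        f"[{i}] ({h['source']}) {h['text']}" for i, h in enumerate(hits, 1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical sample data; a real system would load chunks from its index.
CHUNKS = [
    {"source": "handbook/refunds.md", "text": "Refunds are issued within 14 days of a return."},
    {"source": "handbook/shipping.md", "text": "Orders ship from the main warehouse on weekdays."},
    {"source": "handbook/glossary.md", "text": "A 'blue ticket' means a priority support request."},
]

question = "How long do refunds take?"
hits = retrieve(question, CHUNKS, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Keeping the `source` field attached to every chunk is what makes citations possible: the model only sees numbered snippets, and the application maps those numbers back to documents.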

The pieces are mostly commodity now. A vector database (Pinecone, pgvector, others). An embedding model (OpenAI, Cohere, open-source). A chat model (GPT, Claude, Gemini, or self-hosted). An ingestion pipeline that pulls your docs in, chunks them, embeds them, indexes them. And a UI somewhere. The total cost for a 50-person business to run this internally is in the low four figures per year for tooling, plus the engineering time to wire it up and the ongoing care to keep the index fresh. Nothing about it requires custom ML expertise.
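To make the ingestion step concrete, here is a rough sketch of "pull docs in, chunk, embed, index". Everything named here is an assumption for illustration: `chunk_words`, `ingest`, the window sizes, and especially `fake_embed`, which only produces placeholder numbers so the example runs without an external service. In practice the embedding callable would wrap whichever provider you chose, and the resulting records would be written to your vector database rather than kept in a list.

```python
import hashlib
from typing import Callable

def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; neighbors share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def ingest(
    docs: dict[str, str],
    embed: Callable[[str], list[float]],
    size: int = 50,
    overlap: int = 10,
) -> list[dict]:
    """Turn {source: text} into index records with an id, source, text, and vector."""
    index = []
    for source, text in docs.items():
        for n, chunk in enumerate(chunk_words(text, size, overlap)):
            index.append(
                {"id": f"{source}#{n}", "source": source, "text": chunk, "vector": embed(chunk)}
            )
    return index

def fake_embed(text: str) -> list[float]:
    """Placeholder embedding (assumption): 4 numbers from a hash, not semantically meaningful."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:4]]

# Hypothetical 120-word document, so the chunk boundaries are easy to check.
docs = {"wiki/onboarding": " ".join(f"w{i}" for i in range(120))}
index = ingest(docs, fake_embed)
print([record["id"] for record in index])
```

The overlap is a common hedge against a relevant sentence being cut in half at a chunk boundary; the right window size depends on your documents and is worth tuning once real questions are coming in.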

What does require care is the boring part — keeping the index up to date as your docs change, handling permissions so different users only see what they should, monitoring the system for embarrassing failures, evaluating new models without breaking the old answers. This is operational work, not ML work, and it's where most internal AI projects falter. The model isn't the problem. The plumbing around the model is. Build that plumbing first, with the cheapest model you can find, and only then start optimizing for quality. You'll know what to optimize for once you have something in production that real people are using.
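The operational chores listed above are mostly small, unglamorous functions. The sketch below shows one plausible shape for three of them: deciding what to re-index by comparing content hashes, filtering chunks by group before retrieval, and re-running a fixed set of questions against a candidate model. All names (`plan_reindex`, `visible_chunks`, `regression_failures`, the `allowed_groups` field) and the sample data are hypothetical, and the "must contain this phrase" check is only the simplest possible evaluation.

```python
import hashlib
from typing import Callable

def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_reindex(current_docs: dict[str, str], indexed_hashes: dict[str, str]) -> dict[str, list[str]]:
    """Compare live docs against the hashes stored at the last indexing run."""
    upsert = sorted(
        source for source, text in current_docs.items()
        if indexed_hashes.get(source) != content_hash(text)
    )
    delete = sorted(source for source in indexed_hashes if source not in current_docs)
    return {"upsert": upsert, "delete": delete}

def visible_chunks(chunks: list[dict], user_groups: set[str]) -> list[dict]:
    """Filter before retrieval so restricted text never reaches the prompt."""
    return [c for c in chunks if c["allowed_groups"] & user_groups]

def regression_failures(answer_fn: Callable[[str], str], golden: list[tuple[str, str]]) -> list[str]:
    """Return the questions whose answer no longer contains the expected phrase."""
    return [
        question for question, must_contain in golden
        if must_contain.lower() not in answer_fn(question).lower()
    ]

# Hypothetical state: one doc changed, one unchanged, one deleted, one new.
indexed = {
    "a.md": content_hash("old text"),
    "b.md": content_hash("same"),
    "gone.md": content_hash("x"),
}
live = {"a.md": "new text", "b.md": "same", "c.md": "brand new"}
plan = plan_reindex(live, indexed)

chunks = [
    {"text": "Salary bands by level", "allowed_groups": {"hr"}},
    {"text": "Holiday schedule", "allowed_groups": {"hr", "staff"}},
]
staff_view = visible_chunks(chunks, {"staff"})

# Stand-in for a call to the candidate model (assumption).
golden = [("What is the refund window?", "14 days")]
failures = regression_failures(lambda q: "Refunds are issued within 14 days.", golden)

print(plan, len(staff_view), failures)
```

None of this involves the model's internals, which is the point: it is ordinary bookkeeping that can be built and tested with the cheapest model in place, then reused unchanged when a better one is swapped in.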
