Building AI That Understands Insurance Policies

Policy interpretation is one of the most common bottlenecks in commercial insurance: it is slow, and it is easy to get wrong. Even experienced brokers can spend hours combing through dense documents to identify coverages, exclusions, and endorsements.

Objective

For TwinCover, I wanted an AI model that could take in PDF or scanned policy documents and return a structured understanding of the key terms — enough for a broker to make a quick decision on whether to proceed with a quote.

Approach

OCR preprocessing to convert documents into text while preserving section headings.
Embedding the text into a vector store (Pinecone) for semantic search.
Fine-tuning a language model to answer policy-specific queries accurately.
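
The first two steps above can be sketched in a few lines. This is a minimal illustration, not the TwinCover implementation: the heading pattern, the function names (split_by_heading, embed, cosine, search), and the sample policy text are all assumptions made for the example. The bag-of-words embed function is a toy stand-in for a real embedding model, and the in-memory search stands in for a vector store such as Pinecone.

```python
import math
import re
from collections import Counter

# Hypothetical heading pattern: lines such as "SECTION 2 - EXCLUSIONS" or "EXCLUSIONS".
# Real policies vary widely, so this would need tuning per carrier and form.
HEADING_RE = re.compile(r"^(?:SECTION\s+\d+\s*[-:]\s*)?([A-Z][A-Z &/]{3,})$")


def split_by_heading(ocr_text):
    """Group OCR output lines under the most recent heading."""
    chunks = []
    heading, body = "PREAMBLE", []
    for line in ocr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADING_RE.match(line)
        if match:
            # A new heading closes the previous chunk, if it has any text.
            if body:
                chunks.append({"heading": heading, "text": " ".join(body)})
            heading, body = match.group(1).strip(), []
        else:
            body.append(line)
    if body:
        chunks.append({"heading": heading, "text": " ".join(body)})
    return chunks


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(chunks, query, top_k=1):
    """Rank chunks by similarity to the query; the heading is embedded with the text."""
    q = embed(query)
    scored = [(cosine(q, embed(c["heading"] + " " + c["text"])), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


# Illustrative OCR output, invented for this example.
sample = """SECTION 1 - COVERAGES
We will pay for direct physical loss to covered property.
SECTION 2 - EXCLUSIONS
We will not pay for loss caused by flood or earthquake.
"""

chunks = split_by_heading(sample)
best = search(chunks, "Is flood damage excluded?")[0]
print(best["heading"], "->", best["text"])
```

Keeping the heading attached to each chunk means a retrieved passage still carries its context (coverage, exclusion, endorsement) when it is handed to the language model.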

Challenges

Ambiguity in policy language: wrote prompts that instruct the model to ask for clarification when the source text is unclear.
Data privacy: processed client documents on-premises so that nothing is stored externally.
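
One way to handle the ambiguity challenge above is to have the prompt give the model an explicit way out when the retrieved text does not settle the question. The sketch below is an assumption about how that could look, not the prompt actually used: the NEEDS_CLARIFICATION marker, the build_prompt and needs_clarification helpers, and the wording of the instructions are all hypothetical.

```python
# Hypothetical marker the model is asked to emit when it cannot answer.
CLARIFY_TOKEN = "NEEDS_CLARIFICATION"


def build_prompt(question, excerpts):
    """Assemble a prompt that restricts the model to the supplied excerpts."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(excerpts, start=1))
    return (
        "Answer using only the policy excerpts below and cite excerpt numbers.\n"
        "If the excerpts are ambiguous or do not address the question, reply with "
        f"{CLARIFY_TOKEN}: followed by the specific detail you need.\n\n"
        f"Excerpts:\n{numbered}\n\n"
        f"Question: {question}\n"
    )


def needs_clarification(model_reply):
    """True when the reply starts with the clarification marker."""
    return model_reply.strip().startswith(CLARIFY_TOKEN)


prompt = build_prompt(
    "Is flood damage covered?",
    ["We will not pay for loss caused by flood or earthquake."],
)
print(prompt)
```

A fixed marker makes the clarification case easy to detect in code, so the broker can be shown a follow-up question instead of a guessed answer.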

Results

The assistant now correctly identifies 95% of relevant clauses in test policies, reducing the review time from ~45 minutes to under 5 minutes in most cases.
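
A figure like the share of relevant clauses identified is a recall measurement: of the clauses a human reviewer marked as relevant, how many did the assistant also find? The sketch below shows that calculation under assumed inputs. The clause_recall function and the clause labels are hypothetical, and the numbers are illustrative only, not the project's test data.

```python
def clause_recall(expected, predicted):
    """Fraction of reviewer-labelled clauses that the assistant also returned."""
    expected, predicted = set(expected), set(predicted)
    if not expected:
        return 1.0  # nothing to find, so nothing was missed
    return len(expected & predicted) / len(expected)


# Hypothetical labels for a single test policy.
reviewer_labels = {"flood_exclusion", "wind_deductible", "additional_insured", "cyber_endorsement"}
assistant_output = {"flood_exclusion", "wind_deductible", "additional_insured", "terrorism_exclusion"}

print(clause_recall(reviewer_labels, assistant_output))  # 3 of 4 found -> 0.75
```

Recall ignores clauses the assistant flagged that the reviewer did not, so it is usually read alongside a precision figure.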

Takeaways

Industry-specific fine-tuning is worth the cost — generic LLMs underperform without it.
Embedding-based retrieval improves accuracy more than prompt tuning alone.