One of the most common bottlenecks in commercial insurance is the speed and accuracy of policy interpretation. Even experienced brokers can spend hours combing through dense documents to identify coverages, exclusions, and endorsements.
Objective
For TwinCover, I wanted an AI model that could take in PDF or scanned policy documents and return a structured understanding of the key terms — enough for a broker to make a quick decision on whether to proceed with a quote.
Approach
- OCR preprocessing to convert documents into text while preserving section headings.
- Embedding the text into a vector store (Pinecone) for semantic search.
- Fine-tuning a language model to answer policy-specific queries accurately.
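The first two steps can be sketched in a few lines. This is only a minimal illustration, not the TwinCover implementation: the heading rule, the bag-of-words `embed` function, and the in-memory `index` list are all assumptions that stand in for the real OCR output format, the embedding model, and the Pinecone index.

```python
import math
import re
from collections import Counter

# Assumption: OCR output keeps each heading on its own upper-case line,
# e.g. "EXCLUSIONS". Real policies will need a more robust rule.
HEADING = re.compile(r"^[A-Z][A-Z &/-]{3,}$")


def split_sections(ocr_text):
    """Group OCR'd lines under the most recent heading."""
    sections, heading, body = [], "PREAMBLE", []
    for line in ocr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if HEADING.match(line):
            if body:
                sections.append((heading, " ".join(body)))
            heading, body = line, []
        else:
            body.append(line)
    if body:
        sections.append((heading, " ".join(body)))
    return sections


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(ocr_text):
    """In-memory stand-in for a vector store: one record per section."""
    return [
        {"heading": h, "text": t, "vector": embed(h + " " + t)}
        for h, t in split_sections(ocr_text)
    ]


def search(index, query, top_k=1):
    """Return the top_k sections ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda rec: cosine(q, rec["vector"]), reverse=True)
    return ranked[:top_k]


# Made-up policy text for illustration only.
SAMPLE = """\
COVERAGES
We will pay for direct physical loss of or damage to covered property.
EXCLUSIONS
We will not pay for loss caused by flood or earthquake.
"""

index = build_index(SAMPLE)
hit = search(index, "flood or earthquake exclusions")[0]
print(hit["heading"], "->", hit["text"])
```

Keeping the heading attached to each chunk is the point of preserving section headings during OCR: a query about exclusions can then match the section label as well as the clause text.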
Challenges
- Ambiguity in policy language: wrote prompts that instruct the model to ask for clarification when the source text is unclear, rather than guess.
- Data privacy: processed client documents on-site so that nothing is stored externally.
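One way to handle the ambiguity challenge is a prompt template with an explicit escape hatch, sketched below. The template wording, the `NEEDS_CLARIFICATION:` sentinel, and the `build_prompt` and `parse_reply` helpers are hypothetical names chosen for illustration; the prompts actually used in the project are not shown here.

```python
# Hypothetical sentinel the model is asked to emit when it cannot answer.
CLARIFY_TAG = "NEEDS_CLARIFICATION:"

PROMPT_TEMPLATE = """\
You are assisting a commercial insurance broker.
Answer using only the policy excerpts below.
If the excerpts are ambiguous or do not answer the question, reply with
"{tag}" followed by the single question you need answered. Do not guess.

Policy excerpts:
{excerpts}

Question: {question}
"""


def build_prompt(question, excerpts):
    """Number the retrieved excerpts and fill in the template."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(excerpts, start=1))
    return PROMPT_TEMPLATE.format(tag=CLARIFY_TAG, excerpts=numbered, question=question)


def parse_reply(reply):
    """Split a model reply into (kind, text), where kind is 'clarify' or 'answer'."""
    reply = reply.strip()
    if reply.startswith(CLARIFY_TAG):
        return "clarify", reply[len(CLARIFY_TAG):].strip()
    return "answer", reply


prompt = build_prompt(
    "Is flood damage covered at the warehouse?",
    ["We will not pay for loss caused by flood or earthquake."],
)
kind, text = parse_reply("NEEDS_CLARIFICATION: Which location schedule applies?")
```

Routing on a fixed sentinel keeps the calling code simple: a `clarify` reply goes back to the broker as a question, and only an `answer` reply is shown as an interpretation of the policy.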
Results
The assistant now correctly identifies 95% of relevant clauses in test policies and cuts review time from about 45 minutes to under 5 minutes in most cases.
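A figure like this can be read as recall against hand-labelled clauses. The sketch below shows that calculation under that assumption; the exact evaluation procedure is not described here, and the clause labels are made up.

```python
def clause_recall(expected, predicted):
    """Fraction of hand-labelled relevant clauses that the assistant also found."""
    expected, predicted = set(expected), set(predicted)
    if not expected:
        raise ValueError("expected must contain at least one clause")
    return len(expected & predicted) / len(expected)


# Made-up labels for a single test policy.
expected = {"coverage.property", "exclusion.flood", "exclusion.earthquake", "endorsement.cyber"}
predicted = {"coverage.property", "exclusion.flood", "endorsement.cyber", "exclusion.war"}

recall = clause_recall(expected, predicted)  # 3 of the 4 expected clauses were found
print(f"{recall:.0%}")
```

Recall alone does not penalise spurious hits such as `exclusion.war` in this example, so it is worth tracking precision alongside it.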
Takeaways
- Industry-specific fine-tuning is worth the cost — generic LLMs underperform without it.
- Embedding-based retrieval improves accuracy more than prompt tuning alone.