A typical document chatbot retrieves similar chunks and asks a language model to answer from them. That works for many narrative questions. It is weaker when the answer depends on exact amounts, date ranges, table rows, or complete coverage across multiple documents.
Common Numeric Failure Modes
- Partial windows: The retrieved chunks cover April and May, but the question asks for the full second quarter (April through June).
- Conflicting rows: Two documents list different amounts and the model blends them.
- Table context loss: A chunk separates a value from the row or column that explains it.
- Unit confusion: Monthly, annual, pre-tax, post-tax, and prorated values get mixed.
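To make the table context problem concrete, here is a minimal Python sketch of one way to keep a value attached to its row and column labels: each table row becomes a self-describing record before anything is indexed. The function name `table_to_records`, the column names, and the sample data are all hypothetical, and the sketch assumes a well-formed CSV export with a header row. It is not a description of how any particular product parses tables.

```python
import csv
import io


def table_to_records(table_text: str, source: str) -> list[dict]:
    """Turn each table row into a record keyed by its column headers.

    Assumes well-formed CSV with a header row (an illustrative assumption).
    """
    reader = csv.DictReader(io.StringIO(table_text))
    records = []
    for row_number, row in enumerate(reader, start=1):
        # Keep every value next to the header that explains it.
        record = {key.strip(): (value or "").strip() for key, value in row.items()}
        # Carry provenance so a reviewer can trace the value back.
        record["source"] = source
        record["row"] = row_number
        records.append(record)
    return records


# Hypothetical table, made up for illustration only.
SAMPLE = """period,amount,basis
2024-04,1250.00,monthly pre-tax
2024-05,1300.00,monthly pre-tax
"""

records = table_to_records(SAMPLE, source="invoices.csv")
for record in records:
    print(record)
```

Because each record carries its own `period` and `basis`, a later step can tell a monthly pre-tax figure from an annual one without guessing from nearby text.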
What Better Systems Add
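Reliable numeric answers need structured records extracted from the documents, a planner that decides which records a question requires, coverage checks that confirm every requested period or row is present, and refusal or warning paths for when evidence is incomplete. FAQ Ally can use these patterns for supported document types and configured routes, while still keeping source passages available for review.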
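A coverage check with a refusal path can be sketched in a few lines of Python. This is an illustration under stated assumptions, not FAQ Ally's implementation: the `Record` type, the `answer_quarter_total` function, calendar quarters, and the sample amounts are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Record:
    period: str       # "YYYY-MM"
    amount: Decimal
    source: str       # document the value came from


def quarter_months(year: int, quarter: int) -> list[str]:
    """List the three calendar months of a quarter as "YYYY-MM" strings."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    first = 3 * (quarter - 1) + 1
    return [f"{year}-{month:02d}" for month in range(first, first + 3)]


def answer_quarter_total(records: list[Record], year: int, quarter: int) -> dict:
    """Sum a quarter only if every month is covered and no amounts conflict."""
    wanted = quarter_months(year, quarter)
    amounts_by_period: dict[str, set[Decimal]] = {}
    for record in records:
        if record.period in wanted:
            amounts_by_period.setdefault(record.period, set()).add(record.amount)

    missing = [month for month in wanted if month not in amounts_by_period]
    conflicts = sorted(m for m, amounts in amounts_by_period.items() if len(amounts) > 1)

    if missing or conflicts:
        # Refusal path: report what is wrong instead of blending or guessing.
        return {"status": "refused", "missing": missing, "conflicts": conflicts, "total": None}

    total = sum((next(iter(a)) for a in amounts_by_period.values()), Decimal("0"))
    return {"status": "ok", "missing": [], "conflicts": [], "total": total}


# Hypothetical evidence: two months found, one month absent.
partial = [
    Record("2024-04", Decimal("1250.00"), "a.pdf"),
    Record("2024-05", Decimal("1300.00"), "a.pdf"),
]
complete = partial + [Record("2024-06", Decimal("1275.50"), "a.pdf")]
conflicting = complete + [Record("2024-05", Decimal("1350.00"), "b.pdf")]

print(answer_quarter_total(partial, 2024, 2))      # refused, missing 2024-06
print(answer_quarter_total(complete, 2024, 2))     # ok, total 3825.50
print(answer_quarter_total(conflicting, 2024, 2))  # refused, conflict in 2024-05
```

Using `Decimal` rather than floats keeps currency sums exact, and returning the missing and conflicting periods gives a reviewer something specific to check.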
Practical Rule
If the question asks "what does this policy say," cited text may be enough. If it asks "how much," "how many," "which period," or "what changed," structured evidence and human review become more important.
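As a rough illustration, the rule above can be written as a small Python router. The cue list and the route names `structured_evidence` and `cited_text` are assumptions made for this sketch. A production system would likely need something sturdier than substring matching, since questions can ask for numbers without using any of these phrases.

```python
import re

# Phrases taken from the rule above; the route names are hypothetical.
NUMERIC_CUES = ("how much", "how many", "which period", "what changed")


def route_question(question: str) -> str:
    """Pick a route from surface cues in the question text."""
    # Lowercase and collapse whitespace so spacing does not hide a cue.
    normalized = re.sub(r"\s+", " ", question.lower()).strip()
    if any(cue in normalized for cue in NUMERIC_CUES):
        return "structured_evidence"
    return "cited_text"


print(route_question("What does this policy say about refunds?"))
print(route_question("How much did we pay in the second quarter?"))
```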
