Suppose we have a client who hands us 24 bank statements at the year end. We have a working optical character recognition (OCR) system that can process these rapidly. Sometimes the narrative on the statements will be useless, so we will need to type it in. We also have a narrative prediction and overtyping system to speed this up.
This system serves an elite of experienced accountants well, but what about junior trainees? Suppose that the client registers for VAT and starts bringing in 6 bank statements per quarter? Suppose that some statements are torn and won’t go through the scanner? Suppose that there is a queue to use the OCR scanner, or suppose we just cannot be bothered?
To simulate the likely usage of OCR in a typical accountant’s office, we are going to prohibit the use of OCR for runs of fewer than 7 bank statements. We will then be forced to type everything in, but we can still use narrative prediction and correction. In addition, with smallish batches of 7-11 bank statements, where we do use OCR, we are likely to scan dates and numbers only, and use narrative prediction for the narrative.
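The batch-size rule above can be sketched as a small decision function. This is only an illustration: the names `EntryMode`, `choose_entry_mode`, `MIN_OCR_BATCH` and `MAX_PARTIAL_BATCH` are hypothetical, and the treatment of batches of 12 or more as full OCR is an assumption drawn from the 24-statement example rather than a stated rule.

```python
from enum import Enum


class EntryMode(Enum):
    """Hypothetical labels for the three ways of entering a batch."""
    MANUAL = "type everything in, with narrative prediction and correction"
    PARTIAL_OCR = "scan dates and numbers only, predict the narrative"
    FULL_OCR = "scan everything, overtype the narrative where needed"


# Thresholds taken from the text: no OCR below 7, partial OCR for 7-11.
MIN_OCR_BATCH = 7
MAX_PARTIAL_BATCH = 11


def choose_entry_mode(statement_count: int) -> EntryMode:
    """Pick an entry mode from the number of statements in one run."""
    if statement_count < 0:
        raise ValueError("statement_count cannot be negative")
    if statement_count < MIN_OCR_BATCH:
        return EntryMode.MANUAL
    if statement_count <= MAX_PARTIAL_BATCH:
        return EntryMode.PARTIAL_OCR
    # Assumption: anything larger than the "smallish" range gets full OCR.
    return EntryMode.FULL_OCR


if __name__ == "__main__":
    for count in (24, 6, 9):
        print(count, choose_entry_mode(count).name)
```

Under these assumptions, the year-end batch of 24 goes through full OCR, while the quarterly VAT batch of 6 is typed in by hand.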
This will encourage us to develop a balanced OCR system which is not just a fair-weather friend. Our alternative systems will be inspired by seeing an OCR system in action, and will be designed to be almost competitive with it. We will have an OCR-equivalent level of ambition. Tactically, we will be slightly less competitive, but strategically, we will aim to be better and are likely to succeed.