When we use optical character recognition to scan bank statements, we require the accounts clerk to identify the columns each time a batch of bank statements is loaded. We could have let the software try to do this, but getting the user to do it provides more certainty. Quite a few of our bank statements are rather faint, and clients often tick or annotate them.
We use an onscreen display which resembles a blink comparator to show each scanned statement. The computer “reads” the bank statement and highlights any numbers which are obviously wrong by reference to the running total, so that the user can fix them straight away. If absolutely everything were wrong, the user would in effect be typing the bank statement directly onto the screen, so this is a system which doesn’t actually need OCR as anything more than a very helpful auxiliary. If a bank statement is utterly unscannable, we have a separate system for typing the numbers in by the column in a single sweep, so there are two levels of backup to the primary OCR system when we are scanning numbers.
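The running-total check described above can be sketched as follows. This is a minimal illustration rather than our actual code: the function name `flag_suspect_rows` and the layout of each row as a signed amount plus a printed balance are assumptions made for the example.

```python
from decimal import Decimal


def flag_suspect_rows(opening_balance, rows):
    """Return the indexes of rows whose printed balance disagrees
    with the running total.

    rows is a list of (amount, balance) pairs as read by OCR, with
    payments out given as negative amounts (hypothetical layout).
    """
    suspects = []
    previous = Decimal(opening_balance)
    for index, (amount, balance) in enumerate(rows):
        amount, balance = Decimal(amount), Decimal(balance)
        if previous + amount != balance:
            suspects.append(index)
        # Carry the printed balance forward, so that a misread amount
        # does not make every later row look wrong as well.
        previous = balance
    return suspects


rows = [("-20.00", "80.00"), ("50.00", "180.00"), ("-30.00", "150.00")]
print(flag_suspect_rows("100.00", rows))  # prints [1]
```

A flagged row only says that the amount and the balance on that line do not agree with the line above; it is still the clerk who decides which figure was misread. A misread balance, as opposed to a misread amount, will also flag the row after it.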
Narratives may also be scanned in, but our system will reject anything which looks wrong. The clerk can then run a Narrative Prediction routine which guesses the missing narratives, and can overtype any guesses that are wrong. If narratives are typed in, then autocomplete is available, but the clerk can also reprogram the function keys F1 … F10 to generate useful narratives. This reprogramming is remembered from one year to the next, so in effect all our clients have their own private keyboard. This gives four ways to enter a narrative, namely OCR, Narrative Prediction, reprogrammed keys and old-fashioned typing (with autocomplete), so there are three levels of backup.
Some client records resemble handwritten bank statements, and these are typed in by the column in the order numbers, dates and narratives. After entering numbers and dates, we can still use Narrative Prediction to have a go at guessing the narratives, and this is often right first time, particularly with handwritten narrative which tends to be more repetitive and therefore easier to predict. Thus we have a system of Narrative Prediction which can deal with more than just machine-generated bank statements, and we are very pleased with this.
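To give a flavour of how a prediction routine of this kind might work, here is a deliberately simple sketch. It is not a description of our Narrative Prediction routine: it assumes, purely for illustration, that a missing narrative is guessed as the one most often seen before against the same amount, and the names `build_history` and `predict_narratives` are hypothetical.

```python
from collections import Counter, defaultdict


def build_history(entries):
    """Count how often each narrative has appeared against each amount."""
    history = defaultdict(Counter)
    for amount, narrative in entries:
        if narrative:
            history[amount][narrative] += 1
    return history


def predict_narratives(entries, history):
    """Fill blank narratives with the commonest one seen for that amount.

    Entries with no match in the history are left blank for the clerk.
    """
    result = []
    for amount, narrative in entries:
        if not narrative and amount in history:
            narrative = history[amount].most_common(1)[0][0]
        result.append((amount, narrative))
    return result


history = build_history([
    ("25.00", "Window cleaner"),
    ("25.00", "Window cleaner"),
    ("25.00", "Milk"),
    ("400.00", "Rent"),
])
guesses = predict_narratives([("25.00", ""), ("400.00", ""), ("7.50", "")], history)
```

In this example the 25.00 entry is guessed as "Window cleaner" and the 400.00 entry as "Rent", while the 7.50 entry stays blank. Repetitive records, such as a handwritten cash book with the same few payees, are exactly where a frequency-based guess of this sort does well.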
Dates can be scanned by OCR, and anything unscannable can be filled in by interpolation if it is not material. We only need to type in the day, and the month and the year are copied down automatically. We can do date entry under the assumption that each new date is later than the previous one, or under the assumption that it falls within the same month until we say otherwise. When we type in a pile of invoices, we just batch them by the month and enter the day. Often one of the function keys can enter the narrative, but if not, we type it in and have autocomplete to help out. The number does have to be typed in. With VAT invoices, we batch them by supplier, and we can often skip narrative entry because the software will fill in any blank narratives when we estimate the value.
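The day-only date entry described above can be sketched as follows. The function `expand_days` and its `roll_forward` switch are hypothetical names for the two assumptions mentioned: either dates never go backwards, so a smaller day number means the next month has started, or every day belongs to the same month until the clerk starts a new batch.

```python
from datetime import date


def expand_days(start_year, start_month, days, roll_forward=True):
    """Turn a list of typed day numbers into full dates.

    With roll_forward=True, a day smaller than the previous one is taken
    to mean that the next month has begun. With roll_forward=False, all
    days are placed in the starting month.
    """
    year, month = start_year, start_month
    previous_day = 0
    dates = []
    for day in days:
        if roll_forward and day < previous_day:
            month += 1
            if month > 12:
                month, year = 1, year + 1
        # date() raises ValueError for an impossible day such as 31 June,
        # which is a useful check on the typing.
        dates.append(date(year, month, day))
        previous_day = day
    return dates


statement = expand_days(2023, 12, [3, 17, 28, 2, 15])
invoices = expand_days(2023, 12, [17, 3, 28], roll_forward=False)
```

Here `statement` runs from 3 December 2023 to 15 January 2024, because the drop from 28 to 2 is read as a change of month. The `invoices` batch stays in December 2023 even though the days are out of order. Repeated days are allowed, since several transactions often share a date.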
This system has been developed over several years and is more than just another OCR system. The emphasis is on getting the data onto a spreadsheet by whatever means does the job best. Once the data is on a spreadsheet in a standard format, processing is all-electronic and quick to do. The clerk may need to update a mapping table in order to code up any new narratives, but there is a macro which looks up the narrative on the Internet to help out.
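The mapping-table step can be pictured as a simple lookup from narrative to ledger heading. The sketch below is illustrative only: the function name `code_up`, the upper-casing of narratives before lookup, and the example headings are all assumptions, and our own mapping lives on a spreadsheet rather than in a program like this.

```python
def code_up(narratives, mapping):
    """Split narratives into those the mapping table can code and
    those the clerk still needs to add to the table.

    mapping is keyed on the narrative in upper case (an assumption
    made here so that differences in capitalisation do not matter).
    """
    coded, unmapped = [], []
    for narrative in narratives:
        key = narrative.strip().upper()
        if key in mapping:
            coded.append((narrative, mapping[key]))
        elif key not in unmapped:
            # List each new narrative once, however often it occurs.
            unmapped.append(key)
    return coded, unmapped


mapping = {"BT GROUP": "Telephone", "HMRC VAT": "VAT control"}
coded, unmapped = code_up(["BT Group", "Acme Ltd", "hmrc vat ", "Acme Ltd"], mapping)
```

After this call, `unmapped` holds the single entry "ACME LTD", which is the list of new narratives the clerk would then look up and add to the table.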
Auxiliary data analysis, such as that needed for a director’s overdrawn current account, is easy to do by copying from the ledger to a specialist spreadsheet. Working papers can be generated on the side. This is a system produced by an accountant from the inside looking out, and it satisfies the Law of Requisite Variety. Client records arrive in all shapes and sizes, but we can now deal effectively with all of them.