We all might think we know what a bank statement looks like. However, it often turns out that real bank statements are quite a bit scruffier and tattier than we might have thought. We will see this when we attempt to scan them using optical character recognition (OCR).

Here is a mock bank statement which is typical of what we deal with in the real world. It is easy enough for the computer to pick out the business area because of the descriptions “Balance brought forward” and “Balance carried forward”, but someone has written on the statement and ticked off a few transactions without being too heavy-handed. This really is so typical, being neither the best nor the worst that we will see:

In a typical run of twenty-four bank statements, we can expect to see one big scribble, one coffee stain, one messy paper crease and one dead insect, and the last statement will be a computer print-out with the transactions in reverse order. This is said a little tongue-in-cheek, but real bank statements are like that.

Here is the statement just after processing by Able2Extract and transfer to Excel. We used a five-column template:

Now we run our own cleanup routine from our computer-assisted blink comparator. Non-business areas are greyed out and obvious errors are highlighted. Note that the computer does not know the order of the transactions or which columns are “in” and “out” at this stage, but it flags anything consistently wrong in all four combinations of transaction order and column order:

The cleanup routine ensures that dates and numbers are properly formatted. Only dates and the word “Date” can appear in the leftmost column, and misreadings like “Data”, “Dote” and “Dale” are automatically corrected. Only numbers and a limited range of column headings and trailers can appear in the three rightmost columns. The routine is able to read narrative such as “Balance brought forward” and to colour it greyish blue to show that it has recognised it.
If narrative like this were missing, the routine would use other cues to work out the business area of the bank statement, but sometimes it would give up and let the accounts clerk decide by highlighting it manually.

If the cleanup routine knows that it is looking at a number in the three rightmost columns, then any letters O, I, Z, S or B will be converted to the digits 0, 1, 2, 5 or 8 respectively before further processing. Actually Able2Extract is so good that this is rarely necessary. A leading or trailing V, slash / or backslash \ next to a number will be assumed to be a tick and will be ignored, so the routine has a limited ability to deal with extraneous rubbish.

After we run the cleanup routine a second time, the grey area is deleted:

It is then up to the accounts clerk to overtype anything on the spreadsheet that is wrong, with the error flagging providing a clue. In this case there are two items to fix and one to delete. Then the clerk runs the cleanup routine again. This time the computer is able to see that everything is in the usual order and it colours the “Paid out” and “Paid in” columns accordingly:

We now have a spreadsheet-based facsimile of the bank statement which we can happily read into our main software. The OCR job is done.

If bank statements arrive covered in ticks and handwriting, then it will take a bit longer to process them, but never catastrophically longer. This is called “graceful degradation” and we feel that it is an essential feature of any OCR system to be found in an accountant’s office. If the statements are just bashed through without a human reviewer in charge, then the computer will be just too stupid to deal with everything that can go wrong and the result will be farcical, something like the tale of the sorcerer’s apprentice.
On the other hand, it is still a lot quicker and less tiring to use OCR and then overtype the indicated errors than to type in the whole lot, or to use a traditional analysis pad and pencil, or to use a spreadsheet-based analysis pad.

Sometimes we get bank statements where the transaction order is upside-down and the column order is different. This is how the statement would end up on the comparator:

This uses brighter colours and fancy column headings to keep the clerk awake. As well as using colour, we try to vary the typeface to provide an additional cue. We preserve the transaction order because it is after all a “comparator” which allows us to check back directly to the original paper statement.

Software further down the line can rectify the transaction order and transpose the columns. This is done by looking at the date order and the column headings in the first place, but if these were all missing then the internal logic of the numbers would be used. If we were to deliberately change one of the numbers at the right before further processing, the software would spot it and fix it by reference to the running total. This gives a further line of defence, but there must not be too many such errors or the computer will get confused. Obviously there won’t be many errors after the blink comparator stage, if any at all.

There were no ticks or scribbles on this second statement, so its quality is good enough that we can just run the cleanup routine three times in succession without bothering to look at it. We can do this by double-clicking on the routine’s button, so we get three for the price of two. This is as much autonomy as we care to give the computer at this stage, so as to avoid that sorcerer’s apprentice problem, and we will still force the clerk to look at each individual bank statement. The comparator is permitted to repair an item by inspection of the item itself, but not by reference to neighbouring items, which is a human prerogative.
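To make the distinction concrete, here is a minimal Python sketch of the two kinds of repair. It is an illustration only, not the comparator's actual code: the function names `repair_amount` and `find_balance_breaks` are hypothetical, and the row layout (paid out, paid in, balance, already in date order) is an assumption. The first function looks only at the item itself, using the letter-to-digit and tick rules described earlier. The second needs neighbouring rows, because it checks each balance against the one before it.

```python
import re
from decimal import Decimal

# Letters commonly misread for digits, as listed in the text: O I Z S B -> 0 1 2 5 8.
LETTER_TO_DIGIT = str.maketrans("OIZSB", "01258")


def repair_amount(cell: str) -> str:
    """Item-level repair: uses nothing but the cell's own text.

    Only call this on cells already known to sit in a numeric column,
    otherwise headings such as "Balance" would be mangled.
    """
    text = cell.strip()
    # A leading or trailing V, / or \ is assumed to be a tick and is dropped.
    text = re.sub(r"^[V/\\]+\s*|\s*[V/\\]+$", "", text)
    return text.translate(LETTER_TO_DIGIT)


def find_balance_breaks(opening: Decimal, rows) -> list:
    """Neighbour-level check: needs the previous row's balance.

    rows is a list of (paid_out, paid_in, balance) Decimals in date order
    (an assumed layout). Returns the indices of rows whose printed balance
    does not follow from the previous printed balance.
    """
    breaks = []
    previous = opening
    for index, (paid_out, paid_in, balance) in enumerate(rows):
        if previous - paid_out + paid_in != balance:
            breaks.append(index)
        # Carry the printed balance forward so one error does not flag every later row.
        previous = balance
    return breaks
```

In this sketch a misread amount breaks one row, whereas a misread balance breaks that row and the next one, which is the kind of clue that lets later software decide which number to suspect.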
Further down the line the software makes repairs by reference to neighbouring items, but by that time the error rate is very low and the computer can be allowed more autonomy.

Once data is captured on a spreadsheet, life is easy. We can program the computer to list narratives such as Credit card, Paid in, Service Charges and To Deposit Ac in alphabetical order on a mapping table on a second spreadsheet and then to invite the clerk to code them up. This mapping table is reusable next year, and has a direct link to the Internet to look up unusual items. In the first year we use a small generic mapping table for a typical business in Carlisle. Any new narratives found in a new year will need to be coded by the clerk, but often there won’t be too many.

If we have some bank statements where the use of OCR really is impossible, we have a fast data entry system where we type in all the numbers in one go, with income items being treated as negative numbers. This single-sweep action generates “datepoints” which cue us to enter the dates of significant or material items, and we then interpolate the rest. We then enter a few narratives, guess the rest using a Narrative Prediction (NP) routine, and overtype the guesses that are wrong. The NP routine has a secondary action of reprogramming the function keys F1 … F10 to generate the commonest narratives found so far, which is very useful for overtyping. We also use this slick futuristic system for credit cards and handwritten records which resemble bank statements. Experience indicates that while a single-item OCR pen can just about outperform typing it in, it cannot match typing plus datepointing.

There are other technologies that can compete with OCR. Credit card statements do not have a running total. We could scan them using OCR just the same, but we have decided not to for two reasons.
Firstly, few clients provide them, and the clerk might have to re-learn how to use a different OCR system every time they appear, in which case it would probably be quicker just to type them in by the column. Secondly, having a pool of data which must always be typed in provides a stimulus to the development of non-OCR systems, so we are not over-dependent on OCR.

We do not expect to be able to use OCR for handwritten records, but we have an alternative system to give us a competitive edge. One other point to make is that credit card and cash transactions tend to be more repetitive than bank transactions, so they can be dealt with effectively by Narrative Prediction and the advantage of OCR is less evident. It is conceivable that all the credit card payments are for Diesel fuel, in which case NP would be better than OCR, since NP would merely copy down the one instance of “Diesel fuel” that the clerk has typed in and would not make any scanning mistakes. In fact, the narrative “Diesel fuel” might actually have been typed a few years ago, and the system remembers it and has a function key reprogrammed to generate it.

Generally OCR is still the best technology, but our dual-action NP comes close and we don’t bother with OCR if there are just a few bank statements in each VAT quarter. We don’t use OCR on our own bank statements, which arrive once a month in A5 format (it can be done by rotating the statements, but we don’t bother).

OCR has a secondary benefit of stimulating us to use our imagination to look for other systems which can almost compete with it, which may occasionally outperform it, and which are useful to have as backup systems in a real accountant’s office. Were we to abandon OCR, the real casualty would be the loss of this imagination about what we do.

We are now working on a hybrid OCR/NP system.
This will be operated like an NP system, but it will be able to make restricted use of OCR input as long as that input correlates with an existing lexicon which includes last year’s mapping table. We are trading off some ideal-case performance against plenty of graceful degradation, which will be a worthwhile exchange in a real accountant’s office.

It has often been felt that the quality of scanned narrative is poor, the fault lying with the banks rather than the software, and that we might as well use NP all the time. If we can have an NP system with extra features like redefining function keys and making some use of scanned narrative within constraints, then so much the better. We will then have a standard way of doing things whether we use OCR or not. The clerk will use NP for some handwritten records and for credit card statements and bank statements in short runs or unusual formats, and will select NP for use on bank statements where scanned narrative of variable quality is already present. Any use of scanned material made by the NP system will be a bonus.

When our NP system redefines the function keys, it tries to ensure that key F7 is some kind of telephone expense and key F9 is some kind of motor expense like petrol or Diesel fuel. Key F6 is not allocated to anything in particular, but it can be used for a second type of telephone expense. Key F8 is postage or stamps, but it can be used for a second type of motor expense if there is no frequent expenditure on postage. Key F10 is bank charges, but it can be used for a third type of motor expense if bank charges are too infrequent.

In situations where there are no narratives to predict, such as when typing in a pile of invoices, we can type in a few invoices, run the NP routine after the last narrative and get the function keys redefined to help out with common narratives like petrol or Diesel fuel, and then type in the rest.
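The key allocation rules above can be sketched in Python as follows. This is a rough illustration under stated assumptions rather than the NP routine itself: the function name `allocate_function_keys`, the category labels, the `min_count` threshold for "frequent", and the rule that leftover keys are filled with the commonest remaining narratives are all hypothetical. Where the text speaks of a second or third type of motor expense, the sketch simply takes the next unused motor narrative.

```python
from collections import Counter


def allocate_function_keys(counts: Counter, category_of: dict, min_count: int = 3) -> dict:
    """Map keys F1..F10 to narratives.

    counts: narrative -> number of times seen so far.
    category_of: narrative -> category label (hypothetical labels:
    "telephone", "motor", "postage", "bank charges").
    """
    ranked = [name for name, _ in counts.most_common()]

    def in_category(label):
        return [n for n in ranked if category_of.get(n) == label]

    def frequent(names):
        return [n for n in names if counts[n] >= min_count]

    phone = in_category("telephone")
    motor = in_category("motor")
    postage = frequent(in_category("postage"))
    bank = frequent(in_category("bank charges"))

    keys = {}
    if phone:
        keys["F7"] = phone.pop(0)      # main telephone expense
    if motor:
        keys["F9"] = motor.pop(0)      # main motor expense
    if phone:
        keys["F6"] = phone.pop(0)      # second telephone expense, if any
    if postage:
        keys["F8"] = postage[0]        # postage, when frequent enough
    elif motor:
        keys["F8"] = motor.pop(0)      # otherwise another motor expense
    if bank:
        keys["F10"] = bank[0]          # bank charges, when frequent enough
    elif motor:
        keys["F10"] = motor.pop(0)     # otherwise another motor expense

    # Assumption: remaining keys take the commonest narratives not yet used.
    used = set(keys.values())
    spare = [n for n in ranked if n not in used]
    for number in range(1, 11):
        key = f"F{number}"
        if key not in keys and spare:
            keys[key] = spare.pop(0)
    return keys
```

Keeping F7 and F9 fixed by category, rather than purely by frequency, means the clerk finds telephone and motor expenses on the same keys for every client, even though the exact narratives differ.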
So the dual-action NP system can be used with a wide spectrum of client record types, from a pile of invoices to OCR-scanned bank statements. Maybe we should say triple-action because it also does something useful with OCR.

Those twenty-four scruffy bank statements that we mentioned above will be processed in three phases:

Phase one: Scan what we can and type in what we must. The last statement, which is a computer print-out, will just be typed in. A block of computer print-outs in reverse order can be treated as a separate scanning job. When typing in, we only do the numbers by the column in debit-positive convention, and have a 31-button onscreen toolpad to do the dates. Narratives are left until the next phase. Some of our clients give us handwritten records which resemble bank statements, for which we just type everything in, but otherwise processing is the same as for printed statements.

Phase two: We now have all the numbers and most of the dates in the system. Missing dates can be entered by interpolation, but the computer will ask the clerk to enter a few material dates. Most of the narratives are there as well, but they need tidying up. Run the Narrative Prediction routine. This will amend scanned narratives by reference to the dictionary. Unrecognised narratives will be deleted unless they are of high quality. Blank narratives will be filled in by narrative prediction. For example, in our own accounts it can be guessed that any payment of £13 is something to do with a company Confirmation Statement. The last act of the NP routine is to offer to redefine the function keys F1 … F10 so that they can generate the commonest narratives found. These redefinitions will be remembered next year and are special to the client, so every one of our clients has his or her own customised keyboard.

Phase three: Review all the narratives. Both OCR and NP are capable of getting everything right first time. Failing that, the narratives in error will need to be overtyped.
It is possible that this can be done quickly with redefined function keys F1 … F10. Failing that, Excel’s autocomplete can be helpful. Failing that, we will just have to type in a few rare narratives.

Scruffier bank statements will take a bit longer to process, but never catastrophically longer, so we have good graceful degradation. Disappointing performance will be rare and it is good for morale to have a system which tries to be helpful when things do get a bit more difficult.

On the whole OCR is useful and its day has come, but we keep tight control over the results and have lots of backup systems. In particular, we have NP as another artificial intelligence system which acts as a factotum and which is unique to us. At the front end of our OCR system we have a blink comparator stage which can deal with formal bank statements and computer print-outs in the four commonest formats. This may also be unique. Scanning and typing it in are intermiscible as a matter of principle, so NP goes on working with handwritten statements from Gringotts Bank, credit card statements and the odd bank statement in a different format.
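As a closing illustration, the narrative step of phase two might be sketched in Python like this. The article does not say how the NP routine works internally, so everything here is an assumption: the function names are hypothetical, fuzzy matching with the standard library's `difflib.get_close_matches` stands in for "reference to the dictionary", and prediction is reduced to "the narrative most often seen before for the same amount". The "high quality" exception for unrecognised narratives is left out because the text does not say how quality is judged.

```python
import difflib
from collections import Counter
from decimal import Decimal


def tidy_narrative(scanned: str, lexicon: list, cutoff: float = 0.8) -> str:
    """Snap a scanned narrative to the closest lexicon entry, or delete it."""
    matches = difflib.get_close_matches(scanned.strip(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else ""


def predict_narrative(amount: Decimal, history: list) -> str:
    """Guess a narrative from past (amount, narrative) pairs with the same amount."""
    seen = Counter(narrative for past_amount, narrative in history if past_amount == amount)
    return seen.most_common(1)[0][0] if seen else ""


def fill_narratives(rows: list, lexicon: list, history: list) -> list:
    """rows is a list of (amount, scanned_narrative); blanks are predicted.

    Assumption: a scanned narrative that is deleted as unrecognised is then
    treated like a blank and passed to prediction.
    """
    result = []
    for amount, scanned in rows:
        tidied = tidy_narrative(scanned, lexicon) if scanned.strip() else ""
        result.append(tidied or predict_narrative(amount, history))
    return result
```

With a lexicon built from last year's mapping table and a history containing earlier £13 payments, a blank narrative against a payment of £13 would come back as the Confirmation Statement narrative, and anything still blank is left for the clerk to overtype in phase three.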