We have now done enough development work on optical character recognition (OCR) for bank statements to say with confidence that OCR is here to stay, and there’s no going back. Our OCR system handles numbers and dates only, and does not deal with narratives. This has the merit of dividing the problem into manageable chunks. It also reduces the character set that we need to scan, and the fonts that we are scanning have fixed spacing and are therefore easier to scan. That last point is a subtle one that we have only just appreciated today, and it explains why our system using Able2Extract is normally pretty good compared to some of our other experience with OCR systems. Ticks and scribbles are also easier to deal with when it’s ONR only.
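To illustrate why a reduced character set helps, here is a minimal sketch in Python of how scanned tokens might be checked against a numbers-and-dates alphabet before they are accepted. It is not our production code: the alphabet, the DD/MM/YYYY date format, the two-decimal amount format and the function names `in_alphabet`, `parse_amount` and `parse_date` are all assumptions made for the example.

```python
import re
from datetime import date, datetime
from decimal import Decimal
from typing import Optional

# Assumed alphabet for a numbers-and-dates-only scan: digits plus a few separators.
ALLOWED = set("0123456789.,-/")

# Assumed amount format: optional minus sign, optional thousands commas, two decimals.
AMOUNT_RE = re.compile(r"-?\d{1,3}(?:,\d{3})*\.\d{2}|-?\d+\.\d{2}")


def in_alphabet(token: str) -> bool:
    """True if the token is non-empty and uses only the allowed characters."""
    return bool(token) and all(ch in ALLOWED for ch in token)


def parse_amount(token: str) -> Optional[Decimal]:
    """Return the amount, or None if the token needs a human to look at it."""
    if not in_alphabet(token) or not AMOUNT_RE.fullmatch(token):
        return None
    return Decimal(token.replace(",", ""))


def parse_date(token: str) -> Optional[date]:
    """Return the date (assumed DD/MM/YYYY), or None if it does not parse."""
    if not in_alphabet(token):
        return None
    try:
        return datetime.strptime(token, "%d/%m/%Y").date()
    except ValueError:
        return None
```

Anything that comes back as `None`, such as a letter O misread in place of a zero or an impossible date, can be flagged for manual checking instead of being silently accepted.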
Our OCR system is supplemented by a Computer Assisted Blink Comparator. Usually this should be just gilding the lily, but it gives us a backup system when things go wrong. Narratives are dealt with using Narrative Prediction, together with a reprogrammable keyboard for the commonest narratives. This system does cheque book stubs as well.
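This post does not go into how Narrative Prediction works, so the following is only a sketch of one plausible approach, not a description of our system: suggest the most frequently used earlier narratives that begin with whatever has been typed so far. The class name `NarrativePredictor`, its methods and the sample narratives are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


class NarrativePredictor:
    """Hypothetical prefix-based predictor; an assumption, not the real system."""

    def __init__(self, history: Iterable[str] = ()) -> None:
        self.counts: Counter = Counter()
        for narrative in history:
            self.learn(narrative)

    def learn(self, narrative: str) -> None:
        # Normalise case and whitespace so repeated narratives are counted together.
        self.counts[narrative.strip().upper()] += 1

    def suggest(self, prefix: str, limit: int = 3) -> List[str]:
        # Most frequent matches first; ties are broken alphabetically.
        key = prefix.strip().upper()
        matches = [(n, c) for n, c in self.counts.items() if n.startswith(key)]
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [n for n, _ in matches[:limit]]


predictor = NarrativePredictor(
    ["Bank charges", "BANK CHARGES", "Bank interest", "Salary"]
)
suggestions = predictor.suggest("ba")
```

A frequency count like this would also be one way to decide which narratives deserve a key on the reprogrammable keyboard.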
Any use of OCR offers such a big increase in productivity that it’s well worth having, and now we’ve got it. Perhaps in the future we will develop a system to scan narratives as well, or we will just use somebody else’s system, but the pressure’s off.