The weird case of the non-rectangular rectangle
Receipts are paper documents. Paper documents are rectangular. Therefore, receipts are rectangular documents. Simple logic, right? Well, we found that to be mostly wrong. In fact, nearly all of the receipts we see at WAY2VAT are non-rectangular paper documents. Non-rectangular documents are a big deal for us, because they need much more image processing to bring them to the flat, axis-aligned form that OCR techniques favor. We’d like to share some of our woes working with receipt images, and some solutions we found.
Receipts are non-rectangular, and it has to do with how they are produced. Consider the following:
Receipt documents are usually printed on thermal paper. Thermal paper is delivered in rolls and inserted into the printer, where a heating element quickly touches the paper to turn it black and reproduce the text. The paper comes out still curled, a memory of its time on the roll. It is therefore not a flat rectangle, but a curled developable surface. This is why many disgruntled users who scan receipts with a camera (as opposed to a flatbed scanner, which presses the curl flat) try to flatten the curled paper by folding it against the curl.
To make things worse, the receipt paper comes in a continuous roll and needs to be cut with each receipt produced. This again makes for a very non-rectangular “rectangle”. Many receipt printers have a serrated cutting edge near the output slot, so the receipt can be torn off easily. These serrated blades leave serrated edges on the receipt paper too. Some printers have an internal scissor-type mechanism that makes a cleaner cut, but those are not free of imperfections either. Remember your last visit to the supermarket and the way your receipt was cut from the machine. In most cases there’s a little lip left over from the cutting operation in one of the top corners, and usually a matching missing piece in the bottom corner.
Again, to make things even worse (just when you thought the bad news was over), thermal paper is much more prone to crumpling and creasing. This stems from how it is produced and what it is made of. Thermal paper is usually just regular paper or a synthetic (polypropylene) base with a special heat-sensitive coating that allows printing. However, the paper itself is mostly very thin, some 45 microns, so it crumples very easily (https://brother.com.au/pdf/consumables/ThermalPaperWhitePaper.pdf). Many vendors will not invest in thicker premium receipt paper, given the ephemeral nature of these documents. Physicists have studied how thin sheets crumple and related the deformation to the applied force and the thickness of the material (https://www.nature.com/articles/43395).
Fighting from the Corner
In boxing, the worst situation is to be boxed into the corner. It gives the fighter no room to move. But we love corners when analyzing documents. Corners let us use homographies to rectify the document in the image. A homography is a projective transform (linear in homogeneous coordinates) that maps one plane (flat surface) to another. If we can find the four corners of the document, we can assume everything within the polygon they form is flat, estimate the homography that maps that polygon to an upright rectangle, and apply it to get a flattened receipt document. The problem with our receipt images is that many of them simply don’t have corners!
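To make the four-corner idea concrete, here is a minimal sketch in plain Python. It is not our production code: the function names, the corner coordinates and the 400 x 800 target size are made up for illustration. It solves the standard eight-equation linear system for the homography that sends four detected corners to the corners of an upright rectangle, fixing the bottom-right matrix entry to 1.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """Return the 3x3 homography mapping four src points onto four dst points.

    Each correspondence (x, y) -> (u, v) contributes two linear equations
    in the eight unknown entries (the ninth entry is fixed to 1).
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(H, point):
    """Map a 2D point through H, dividing by the homogeneous coordinate."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)


# Hypothetical corners of a receipt as seen in a photo (pixels),
# listed clockwise from top-left, and the upright rectangle we want.
photo_corners = [(120, 80), (520, 110), (560, 900), (90, 860)]
flat_corners = [(0, 0), (400, 0), (400, 800), (0, 800)]

H = homography_from_corners(photo_corners, flat_corners)
rectified = [apply_homography(H, p) for p in photo_corners]
```

Warping the whole image means applying the same mapping to every pixel, which image libraries do far more efficiently than this sketch. Note that the result is only as good as the four corners fed into it, and that the flatness assumption is baked in.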
The problem with this simplistic approach is that we’re making too many assumptions while using inaccurate data. This is the worst way to do science. In practice, most receipt scanner apps out there are looking for rectangles, but we know better. In our smart scanner technology, we’re looking for semi-rectangles, or incomplete rectangles, knowing that we will not be able to find the “real” corners of the document. Some receipts that we encounter very often, such as train tickets, have rounded corners, so they never had any corners in the first place! (see Figure 2) The case for the non-rectangular rectangle is set.
From here there are two paths to go down: (1) assume receipts are mostly flat, so that finding the corners solves the problem, or (2) try to reconstruct the geometry of the folded receipt paper from just the single image and rectify it. Path (2) is an active area of research in computer vision, without a definitive solution yet but with very promising directions (http://openaccess.thecvf.com/content_cvpr_2018/html/Ma_DocUNet_Document_Image_CVPR_2018_paper.html). Path (1) has its fair share of difficult computer vision problems to solve, and is the center of our work, although we’re making strides to converge with path (2) in our mobile apps.
Without giving away too many details, we can say that an edge image analysis helps us find the major lines of the receipt in the image. We look for intersections of those lines to predict where the hidden corners are. How we decide which lines are the major ones, and what direction they are facing, well, that’s our secret sauce, so you are welcome to guess. We can hint that it has to do with deep neural networks.
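The intersection step on its own is textbook geometry and easy to sketch. The snippet below assumes the major edge lines have already been found somehow (that part stays secret) and that each is given by two points on it. The helper names and coordinates are hypothetical. It uses homogeneous coordinates, where the line through two points and the intersection of two lines are both cross products.

```python
def line_through(p, q):
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersect(l1, l2, eps=1e-9):
    """Intersection point of two homogeneous lines, or None if near parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < eps:
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)


# Hypothetical receipt with a torn-off top-left corner: only part of the
# top edge and part of the left edge are visible in the image.
top_edge = line_through((60, 22), (400, 5))     # visible stretch of the top edge
left_edge = line_through((12, 90), (30, 700))   # visible stretch of the left edge

hidden_corner = intersect(top_edge, left_edge)  # where the corner "should" be
```

The predicted corner lies on both lines even though no pixel of the actual paper is there, which is exactly what a rectangle detector that insists on visible corners would miss.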
Happy corner hunting!