{"id":3594,"date":"2020-09-08T05:00:58","date_gmt":"2020-09-08T05:00:58","guid":{"rendered":"https:\/\/way2vat.com\/?p=3594"},"modified":"2020-09-08T05:00:58","modified_gmt":"2020-09-08T05:00:58","slug":"going-beyond-pixels-part-1-invoice-images-as-graphs","status":"publish","type":"post","link":"https:\/\/way2vat.com\/going-beyond-pixels-part-1-invoice-images-as-graphs\/","title":{"rendered":"Going Beyond Pixels \u2013 Part 1: Invoice Images as Graphs"},"content":{"rendered":"

We usually think about invoice images as, well, images \u2013 a matrix of pixels, light intensity values on a regular grid. This representation preserves the highest fidelity, since all the information in the invoice is encoded in those pixel values. On the other hand, a standard smartphone image can easily contain millions of pixels, which is a whole lot of data for the little information the invoice actually holds, such as the amounts and dates. Invoice image analysis systems are therefore about hierarchical abstraction: moving information from a low-level representation to an increasingly complex structure. In the past we modeled this abstraction visually with convolutional neural networks: https:\/\/way2vat.com\/visual-linguistic-methods-for-receipt-field-tagging\/<\/p>\n

A different way of abstracting visual information is to use graphs: a node-edge structure that encodes visual information by assigning nodes to spatial locations and connecting them with edges that encode the visual relationship between nodes. In computer vision and image analysis, graph structures have traditionally been used mostly for segmentation \u2013 splitting the image into smaller parts that each share a semantic meaning. So how do we use this abstraction method to analyze invoices?<\/p>\n

In our upcoming technical paper at the ACM Symposium on Document Engineering we propose a new method for invoice analysis with graphs. The first step is to think about the invoice as a graph, abstracting its visual structure into a node-edge representation. Which parts of the invoice image should get a node? And what is the relationship between nodes? We choose to build the representation from the OCR output, where each word rectangle is a node. This is a well-established approach in document analysis research, used as early as the INFORMys work by Cesarini et al. [1998] and as recently as the work of Qian et al. [2019].<\/p>\n
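
As a minimal sketch of the node representation described above, the Python snippet below turns a list of OCR word rectangles into graph nodes. The WordNode class, its field names, and the (text, x, y, width, height) input layout are assumptions made for illustration; OCR engines report word boxes in their own formats, and this is not the implementation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordNode:
    # One graph node per OCR word rectangle (hypothetical structure).
    text: str
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels (y grows downward in image coordinates)
    width: float
    height: float

    @property
    def center(self):
        # Center of the word rectangle, handy when comparing node positions.
        return (self.x + self.width / 2, self.y + self.height / 2)


def nodes_from_ocr(words):
    # words: iterable of (text, x, y, width, height) tuples, an assumed input layout.
    return [WordNode(*word) for word in words]


# Made-up sample boxes for two words on the same text line.
ocr_words = [
    ('Total', 40, 300, 60, 20),
    ('938a', 200, 300, 50, 20),
]
nodes = nodes_from_ocr(ocr_words)
```

At this point the graph has nodes but no edges; each node only knows its own text and position.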

Naturally, a single word in an invoice (for example \u201cTotal\u201d or \u201c938a\u201d) does not carry a lot of information on its own. The true power of a hierarchical representation comes from drawing edges between the nodes, which connects them and allows algorithms to reason about groups of nearby nodes. The next question is: how do we connect the nodes of the graph?<\/p>\n

[Figure: an example invoice image encoded as a graph]<\/p>\n

We take a simple approach here, connecting the nodes based on their pixel location in the image. We consider the cardinal directions: North, South, East and West. A node connects to its right (Eastern) neighbor, which is the first node encountered when moving from it along the positive x axis of the image. Similarly, we connect each node to its left (Western), upper (Northern) and lower (Southern) neighbors. In the figure above we can see an example image encoded as a graph. We call this a Cardinal Graph representation of the invoice image. The cardinal graph is a convenient modality, since it is intuitive both to construct and to understand visually. There are obvious downsides to it as well:<\/p>\n

    \n
  1. \n

    Currently we encode only the four cardinal directions; diagonal relationships are not captured.<\/p>\n<\/li>\n

  2. \n

    It fails to encode connections that go beyond the first immediate neighbor: it cannot \u201cskip\u201d a node, even though such nodes are often errors or noise arising from the OCR.<\/p>\n<\/li>\n

  3. \n

    Some cardinal connections carry very little or no information at all, for example edges between words that are very far apart in the image and have no reason to be connected.<\/p>\n<\/li>\n<\/ol>\n

    Other problems exist, but for the most part this representation is simple and powerful! The shortcomings of the cardinal edge structure are readily addressed by graph analysis algorithms, such as graph-cuts and belief propagation, that see past immediate node neighborhoods to find higher-level insights. Additionally, cardinal connections carry a strong informational cue, since they capture the relations in the tabular layout of most invoices, where, for example, the word \u201cTotal:\u201d lies West of the numeric total sum.<\/p>\n
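
The cardinal graph construction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the method from the paper: the Box type, the function names, and the sample coordinates are made up, and the rule that a neighbor must overlap the node on the perpendicular axis is one possible reading of the first node found along an axis.

```python
from collections import namedtuple

# Hypothetical node type: a word plus its bounding box in pixel coordinates.
Box = namedtuple('Box', 'text left top right bottom')


def overlaps(a_lo, a_hi, b_lo, b_hi):
    # True when the 1-D intervals [a_lo, a_hi] and [b_lo, b_hi] share some extent.
    return min(a_hi, b_hi) > max(a_lo, b_lo)


def cardinal_neighbor(i, nodes, direction):
    # Index of the first node met when moving from nodes[i] in `direction`, or None.
    # Assumption: a candidate must overlap nodes[i] on the perpendicular axis.
    node = nodes[i]
    best, best_dist = None, None
    for j, other in enumerate(nodes):
        if j == i:
            continue
        if direction in ('E', 'W'):
            if not overlaps(node.top, node.bottom, other.top, other.bottom):
                continue
            dist = other.left - node.right if direction == 'E' else node.left - other.right
        else:
            if not overlaps(node.left, node.right, other.left, other.right):
                continue
            # Image y grows downward, so North means smaller y values.
            dist = node.top - other.bottom if direction == 'N' else other.top - node.bottom
        # Keep the closest candidate that lies on the requested side.
        if dist >= 0 and (best_dist is None or dist < best_dist):
            best, best_dist = j, dist
    return best


def build_cardinal_graph(nodes):
    # Map each node index to its neighbor index (or None) in each direction.
    return {
        i: {d: cardinal_neighbor(i, nodes, d) for d in 'NSEW'}
        for i in range(len(nodes))
    }


# Made-up word boxes laid out like two rows of an invoice table.
words = [
    Box('Subtotal:', 40, 260, 130, 280),
    Box('850.00', 200, 260, 260, 280),
    Box('Total:', 40, 300, 100, 320),
    Box('938.00', 200, 300, 260, 320),
]
graph = build_cardinal_graph(words)
east_of_total = words[graph[2]['E']]
```

In this toy layout the East edge of the Total: node leads to the amount on the same row, which is the tabular cue mentioned above. The sketch compares every pair of nodes, so it is quadratic in the number of words; that is acceptable for a single invoice but not tuned for speed.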

    Constructing the graph is just the first step; the next is to use this representation in an analysis application such as information extraction. In the next part of the series we will dive deeper into graph algorithms for information extraction. Stay tuned!<\/p>\n","protected":false},"excerpt":{"rendered":"

    We usually think about invoice images as, well, images \u2013 a matrix of pixels, light intensity values on a regular grid. This preserves the highest fidelity of information, since all the information of the invoice is encoded in those pixel values. On the other hand, pixels, which can well be in the millions for a […]<\/p>\n","protected":false},"author":1,"featured_media":3595,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[26,1],"tags":[]}