Our receipt capture features are designed to simplify the collection, digitization, and monetization of receipt data.

You’ve likely got a lot of data on your hands, but what about the data you don’t have?

We’ve had plenty of conversations with data buyers over the past few years, and there’s a strong desire to purchase hard-to-get data, like transaction-level SKU data. This data usually lives only on receipts, so it can’t be extracted from financial institution data. That’s why we built the tech to capture data from receipts easily, creating a valuable dataset that can increase revenue for your users and your business.

Physical receipts are scanned and stitched, then their data is extracted, serialized, and monetized. Email receipts are fetched, then their data is extracted, serialized, and monetized.

We are continually working to add new methods of capturing different types of hard-to-get data, so expect this section to grow in the near future.

Physical Receipts

The optional physical receipt feature transforms photos of receipts into structured, machine-readable datasets through a straightforward scanning process. Powered by Amazon Textract and our data licensing technology, it provides accurate and efficient text extraction of monetizable data from scanned images, with industry-leading privacy and security standards.

Key Functionalities

  • Image Capture: Users can take photos of physical receipts using their mobile devices or other image-capturing tools.
  • Multiple Image Stitching: Multiple images of the same receipt can be stitched together to form a single, complete representation of the receipt.
  • Text Extraction: Amazon Textract's machine learning (ML) powered optical character recognition (OCR) extracts the relevant text data from scanned images.
  • Data Serialization: Extracted data is serialized into a structured machine-readable format, ready for consumption by licensor applications and systems.
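As an illustration of the extraction and serialization steps above, here is a minimal sketch in Python. It assumes the OCR result has the shape of a response from Amazon Textract's AnalyzeExpense operation (`ExpenseDocuments`, `SummaryFields`, `LineItemGroups`); which Textract operation the feature actually uses is not stated here, and the function name `serialize_receipt` and the output layout are hypothetical.

```python
import json
from typing import Any


def serialize_receipt(textract_response: dict[str, Any]) -> str:
    """Flatten an AnalyzeExpense-style response into a JSON string.

    Assumption: the input follows the Textract AnalyzeExpense response
    shape. The output layout (summary + line_items) is illustrative only.
    """
    receipt: dict[str, Any] = {"summary": {}, "line_items": []}

    for document in textract_response.get("ExpenseDocuments", []):
        # Summary fields carry receipt-level values such as vendor and total.
        for field in document.get("SummaryFields", []):
            key = field.get("Type", {}).get("Text")
            value = field.get("ValueDetection", {}).get("Text")
            if key and value is not None:
                receipt["summary"][key.lower()] = value

        # Line items carry the transaction-level (per-product) rows.
        for group in document.get("LineItemGroups", []):
            for item in group.get("LineItems", []):
                row = {}
                for field in item.get("LineItemExpenseFields", []):
                    key = field.get("Type", {}).get("Text")
                    value = field.get("ValueDetection", {}).get("Text")
                    if key and value is not None:
                        row[key.lower()] = value
                if row:
                    receipt["line_items"].append(row)

    # sort_keys keeps the serialized output stable between runs.
    return json.dumps(receipt, sort_keys=True)
```

In a real integration the input dictionary would come from the OCR service rather than being built by hand; the sketch only covers turning that response into a structured, machine-readable record.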

Email Receipts

The optional email receipt feature streamlines the extraction of receipt data from emails in connected Gmail and Outlook accounts. The Gmail and Outlook integrations use OAuth2 for authorization, so user credentials are never stored, and emails are read exclusively from a list of known receipt senders to preserve user privacy.
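To make the OAuth2 step concrete, the sketch below builds the URL a user would be sent to at the start of Google's authorization code flow. The endpoint and parameter names are Google's documented ones; the choice of the read-only Gmail scope, the function name, and the client values are assumptions for illustration, not a description of the actual integration.

```python
import secrets
from urllib.parse import urlencode

# Google's documented OAuth2 authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
# Assumption: a read-only Gmail scope is enough for fetching receipts.
GMAIL_READONLY_SCOPE = "https://www.googleapis.com/auth/gmail.readonly"


def build_gmail_authorization_url(client_id: str, redirect_uri: str) -> tuple[str, str]:
    """Return (authorization_url, state) for the authorization code flow.

    The user signs in with Google directly, so the application only ever
    receives an authorization code, never the user's password.
    """
    # Random state value, to be checked on the redirect to prevent CSRF.
    state = secrets.token_urlsafe(16)
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",   # selects the authorization code flow
        "scope": GMAIL_READONLY_SCOPE,
        "access_type": "offline",  # request a refresh token for later fetches
        "state": state,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}", state
```

After the user consents, the provider redirects back with a `code` parameter that the server exchanges for tokens. Outlook follows the same pattern against Microsoft's own endpoints.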

Using the same process as Physical Receipts, Amazon Textract and our data licensing technology provide accurate and efficient text extraction of monetizable data from receipts, with industry-leading privacy and security standards.

Key Functionalities

  • Integration with Gmail and Outlook: Users can securely connect their email accounts using Gmail and Outlook's OAuth2 authorization code flows, ensuring a seamless and secure integration without the storage of sensitive credentials.
  • Email Extraction From Known Receipt Senders: Receipt emails are fetched from a user's inbox based on a predefined list of known receipt senders, maximizing privacy and minimizing unnecessary data retrieval. After the initial fetch, subsequent fetches retrieve only emails received since the last successful extraction date.
  • Text Extraction: Amazon Textract's machine learning (ML) powered optical character recognition (OCR) extracts the relevant text data from the fetched email receipts.
  • Data Serialization: Extracted data is serialized into a structured machine-readable format, ready for consumption by licensor applications and systems.
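The sender allowlist and the incremental fetch described above can be sketched as a simple filter. This is a minimal illustration over in-memory messages; the `EmailMessage` type, the function name, and the example sender addresses are hypothetical, and a real integration would more likely push these conditions into the mail provider's search query.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class EmailMessage:
    """Hypothetical minimal view of a fetched email."""
    sender: str
    received_at: datetime
    subject: str


def select_new_receipt_emails(
    messages: Iterable[EmailMessage],
    known_senders: Iterable[str],
    last_extracted_at: Optional[datetime],
) -> list[EmailMessage]:
    """Keep only emails from known receipt senders that are newer than
    the last successful extraction. None means this is the initial fetch."""
    allowed = {sender.lower() for sender in known_senders}
    selected = [
        message
        for message in messages
        if message.sender.lower() in allowed
        and (last_extracted_at is None or message.received_at > last_extracted_at)
    ]
    # Oldest first, so the extraction date can advance as each email succeeds.
    return sorted(selected, key=lambda message: message.received_at)


# Usage: a later fetch only picks up the newer receipt from a known sender.
inbox = [
    EmailMessage("receipts@shop.example", datetime(2024, 1, 10, tzinfo=timezone.utc), "Order 1"),
    EmailMessage("friend@mail.example", datetime(2024, 2, 1, tzinfo=timezone.utc), "Hello"),
    EmailMessage("Receipts@shop.example", datetime(2024, 2, 5, tzinfo=timezone.utc), "Order 2"),
]
new_emails = select_new_receipt_emails(
    inbox,
    known_senders=["receipts@shop.example"],
    last_extracted_at=datetime(2024, 1, 15, tzinfo=timezone.utc),
)
```

Emails from senders outside the list are never selected, which is the privacy property the allowlist is meant to provide.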