In this article, we'll walk through how you can perform receipt OCR using Python in 10 minutes or less.
Receipts contain useful transaction information, but most of them exist only on paper or in raw digital formats such as scanned PDFs or image files. Receipt OCR (also called receipt digitization or automated receipt processing) makes it straightforward to extract data such as merchant info, line items and amounts from scanned receipts using Python.
In this tutorial, we'll use the image on the right as the sample input. Receipt OCR doesn't only recognize receipts in English. In fact, you can use receipts from any country in any language.
Open your favorite Python editor, then copy the code snippet below and modify it to suit your needs. Alternatively, you can clone the complete example program from GitHub: github.com/Asprise/receipt-ocr
curl -X POST -F "file=@US-1.jpg" https://ocr.asprise.com/api/v1/receipt
// View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/csharp-vb-net-receipt-ocr
string response = httpPost("https://ocr.asprise.com/api/v1/receipt", // Receipt OCR API endpoint
new NameValueCollection()
{
{"api_key", "TEST"}, // Use 'TEST' for testing purpose
{"recognizer", "auto"}, // can be 'US', 'CA', 'JP', 'SG' or 'auto'
{"ref_no", "ocr_dot_net_123"} // optional caller provided ref code
},
new NameValueCollection() {{"file", "../../US-1.jpg"}} // Modify it to use your own file
);
Console.WriteLine(response); // Result in JSON
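// View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/java-receipt-ocr
import java.io.File;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;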
/**
* Uploads an image for receipt OCR and gets the result in JSON.
* Required dependencies: org.apache.httpcomponents:httpclient:4.5.13 and org.apache.httpcomponents:httpmime:4.5.13
*/
public class JavaReceiptOcr {
public static void main(String[] args) throws Exception {
String receiptOcrEndpoint = "https://ocr.asprise.com/api/v1/receipt"; // Receipt OCR API endpoint
File imageFile = new File("US-1.jpg");
System.out.println("=== Java Receipt OCR ===");
try (CloseableHttpClient client = HttpClients.createDefault()) {
HttpPost post = new HttpPost(receiptOcrEndpoint);
post.setEntity(MultipartEntityBuilder.create()
.addTextBody("api_key", "TEST") // Use 'TEST' for testing purpose
.addTextBody("recognizer", "auto") // can be 'US', 'CA', 'JP', 'SG' or 'auto'
.addTextBody("ref_no", "ocr_java_123") // optional caller provided ref code
.addPart("file", new FileBody(imageFile)) // the image file
.build());
try (CloseableHttpResponse response = client.execute(post)) {
System.out.println(EntityUtils.toString(response.getEntity())); // Receipt OCR result in JSON
}
}
}
}
// View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/javascript-nodejs-receipt-ocr
console.log("=== JavaScript/Node.js Receipt OCR ===");
var receiptOcrEndpoint = 'https://ocr.asprise.com/api/v1/receipt';
var imageFile = 'US-1.jpg'; // Modify it to use your own file
var fs = require('fs');
var request = require('request');
request.post({
url: receiptOcrEndpoint,
formData: {
api_key: 'TEST', // Use 'TEST' for testing purpose
recognizer: 'auto', // can be 'US', 'CA', 'JP', 'SG' or 'auto'
ref_no: 'ocr_nodejs_123', // optional caller provided ref code
file: fs.createReadStream(imageFile) // the image file
},
}, function(error, response, body) {
if(error) {
console.error(error);
return; // no response body to print when the request itself failed
}
console.log(body); // Receipt OCR result in JSON
});
<?php // View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/php-receipt-ocr
function receiptOcr($imageFile) {
$receiptOcrEndpoint = 'https://ocr.asprise.com/api/v1/receipt'; // Receipt OCR API endpoint
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $receiptOcrEndpoint);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array(
'api_key' => 'TEST', // Use 'TEST' for testing purpose
'recognizer' => 'auto', // can be 'US', 'CA', 'JP', 'SG' or 'auto'
'ref_no' => 'ocr_php_123', // optional caller provided ref code
'file' => curl_file_create($imageFile) // the image file
));
$result = curl_exec($ch);
if(curl_errno($ch)){
throw new Exception(curl_error($ch));
}
echo $result; // result in JSON
}
print("=== PHP Receipt OCR ===\n");
receiptOcr('US-1.jpg'); // Modify it to use your own file
# View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/python-receipt-ocr
import requests
print("=== Python Receipt OCR ===")
receiptOcrEndpoint = 'https://ocr.asprise.com/api/v1/receipt'  # Receipt OCR API endpoint
imageFile = "US-1.jpg"  # Modify it to use your own file
# Open the image in a with-block so the file handle is closed after the upload
with open(imageFile, "rb") as f:
    r = requests.post(receiptOcrEndpoint, data={
        'api_key': 'TEST',           # Use 'TEST' for testing purpose
        'recognizer': 'auto',        # can be 'US', 'CA', 'JP', 'SG' or 'auto'
        'ref_no': 'ocr_python_123',  # optional caller provided ref code
    }, files={"file": f})
print(r.text)  # result in JSON
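The Python script above prints whatever the server returns. If you want to call the API from a larger program, a reusable function may be more convenient. The following is only a sketch under stated assumptions: `build_form_fields` and `ocr_receipt` are hypothetical helper names, the 60-second timeout is an arbitrary choice, and the set of recognizer values is copied from the comments in the snippets above and may not be exhaustive.

```python
# Recognizer values mentioned in the snippets above (assumed, possibly incomplete).
KNOWN_RECOGNIZERS = {"auto", "US", "CA", "JP", "SG"}
RECEIPT_OCR_ENDPOINT = "https://ocr.asprise.com/api/v1/receipt"


def build_form_fields(api_key="TEST", recognizer="auto", ref_no=None):
    """Return the text form fields for the upload (hypothetical helper)."""
    if recognizer not in KNOWN_RECOGNIZERS:
        raise ValueError(f"unknown recognizer: {recognizer!r}")
    fields = {"api_key": api_key, "recognizer": recognizer}
    if ref_no is not None:  # ref_no is optional, so only send it when given
        fields["ref_no"] = ref_no
    return fields


def ocr_receipt(image_path, timeout=60, **field_options):
    """Upload one image and return the decoded JSON result (hypothetical helper)."""
    # Third-party dependency; imported here so build_form_fields works without it.
    import requests

    with open(image_path, "rb") as image:
        response = requests.post(
            RECEIPT_OCR_ENDPOINT,
            data=build_form_fields(**field_options),
            files={"file": image},
            timeout=timeout,  # avoid hanging forever on a stalled connection
        )
    response.raise_for_status()  # raise an exception on HTTP 4xx/5xx
    return response.json()
```

With this in place, a call might look like `result = ocr_receipt("US-1.jpg", ref_no="ocr_python_123")`. If the API accepts recognizer values other than those listed, extend `KNOWN_RECOGNIZERS` or drop the check.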
The complete source code of the Receipt OCR in C#, Java, JavaScript, PHP and Python can be found at github.com/Asprise/receipt-ocr
If you are not ready to write code yet, you can try the 🧾 free web-based receipt OCR capture/recognition.
Execute the code and you'll get the result in JSON. Note that the result JSON contains both structured data (merchant name, address, phone, VAT/GST tax registration number, receipt number, country, currency, subtotal, total amounts and line items) as well as the full OCR text.
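Once you have the JSON, you will usually want to pull individual fields out of it rather than print the whole document. The sketch below shows one way to do that with the standard `json` module. The field names (`receipts`, `merchant_name`, `currency`, `total`, `items`, `description`, `amount`) are assumptions about the response layout, and the sample payload is made up for illustration, so check both against the JSON your own request returns.

```python
import json


def summarize_receipt(result_json):
    """Pull a few structured fields out of the OCR result.

    All field names used here are assumptions about the response layout.
    Returns None when the result contains no receipt.
    """
    result = json.loads(result_json)
    receipts = result.get("receipts") or []
    if not receipts:
        return None
    receipt = receipts[0]  # use the first receipt found in the image
    items = [
        (item.get("description"), item.get("amount"))
        for item in receipt.get("items") or []
    ]
    return {
        "merchant": receipt.get("merchant_name"),
        "currency": receipt.get("currency"),
        "total": receipt.get("total"),
        "items": items,
    }


# Hypothetical payload for illustration only; not real API output.
sample = json.dumps({
    "receipts": [{
        "merchant_name": "Example Store",
        "currency": "USD",
        "total": 12.5,
        "items": [
            {"description": "Coffee", "amount": 4.5},
            {"description": "Sandwich", "amount": 8.0},
        ],
    }]
})

summary = summarize_receipt(sample)
print(summary["merchant"], summary["total"], summary["currency"])
for description, amount in summary["items"]:
    print(f"  {description}: {amount}")
```

Using `.get()` throughout means a missing field yields `None` instead of raising `KeyError`, which is a reasonable default when not every receipt contains every field.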
Besides Python, many other programming languages are supported: C# VB.NET Receipt OCR Tutorial, Java Receipt OCR Tutorial, JavaScript JS Receipt OCR Tutorial, PHP Receipt OCR Tutorial.