Wednesday, January 9, 2013

Mechanical Turk - Conclusions

In this simple proof of concept I asked ten people to process a receipt, giving information about the business, date of purchase and the category of purchase. Some of this is clearly possible to automate with OCR; however, the detail and accuracy required to make the result useful make this a good candidate for a Mechanical Turk task.

So how did people do?

Well, it took about 20 minutes for all 10 HITs to be processed. That's not too bad, and I'm sure response time is highly sensitive to price. I suspect 11 cents for each of these HITs is on the high side, since processing one probably takes 30 seconds or so at most. That said, this is an interesting design decision you have to make. It's pretty clear that response times are not going to be very predictable, and definitely not close to real time unless you pay a significant premium. If you really have a popular application and people get to know your tasks (something you can easily do with MTurk), then you might begin to achieve quicker and more consistent results, especially since the system allows you to tip particularly diligent workers.
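
To put the 11 cent reward in perspective, here is a rough sketch of the arithmetic. It assumes the 30 second estimate above holds and ignores any fees MTurk charges on top of the reward; the class name HitPricing is made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class HitPricing {

    /** Hourly wage implied by a per-assignment reward and the seconds one assignment takes. */
    static BigDecimal impliedHourlyRate(BigDecimal reward, int secondsPerAssignment) {
        BigDecimal assignmentsPerHour = BigDecimal.valueOf(3600)
                .divide(BigDecimal.valueOf(secondsPerAssignment), 4, RoundingMode.HALF_UP);
        return reward.multiply(assignmentsPerHour).setScale(2, RoundingMode.HALF_UP);
    }

    /** Total rewards paid for a batch of assignments, excluding any MTurk fees. */
    static BigDecimal batchReward(BigDecimal reward, int assignments) {
        return reward.multiply(BigDecimal.valueOf(assignments)).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal reward = new BigDecimal("0.11");
        // 120 assignments per hour at 30 seconds each
        System.out.println("Implied hourly rate: $" + impliedHourlyRate(reward, 30));
        System.out.println("Rewards for 10 assignments: $" + batchReward(reward, 10));
    }
}
```

At 30 seconds per receipt the reward works out to $13.20 an hour, which supports the suspicion that the price could come down, at the cost of slower pickup.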

As for accuracy, all ten people correctly identified the business and 9 out of 10 got the date right. The one person who didn't reversed the month and day despite explicit instructions in the HIT to watch out for this. The context of the receipt should have made it pretty clear it was from the US, but I suppose it's possible the worker thought this was a non-US receipt and took my instruction to mean the date should be switched.
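
A swapped month and day can only be caught automatically in some cases, so a sketch of the kind of check that could run on each date answer may be useful. It assumes answers arrive as free text in the mm/dd/yy format the HIT asks for; the class name and the sample dates are hypothetical and are not the values from my receipt.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

final class PurchaseDateCheck {

    // STRICT rejects impossible dates such as 02/30/13 instead of adjusting them.
    private static final DateTimeFormatter US_SHORT =
            DateTimeFormatter.ofPattern("M/d/uu").withResolverStyle(ResolverStyle.STRICT);

    /** Parses a worker's answer as month/day/two-digit-year; empty if it is not a valid date. */
    static Optional<LocalDate> parseUsDate(String answer) {
        try {
            return Optional.of(LocalDate.parse(answer.trim(), US_SHORT));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    /** True when swapping month and day would give a different but still valid date. */
    static boolean couldBeSwapped(LocalDate date) {
        return date.getDayOfMonth() <= 12 && date.getDayOfMonth() != date.getMonthValue();
    }

    public static void main(String[] args) {
        for (String answer : new String[] {"01/09/13", "13/01/13", "01/25/13"}) {
            Optional<LocalDate> parsed = parseUsDate(answer);
            if (parsed.isPresent()) {
                System.out.println(answer + " -> " + parsed.get()
                        + (couldBeSwapped(parsed.get()) ? " (ambiguous, compare with other workers)" : ""));
            } else {
                System.out.println(answer + " -> rejected, not a valid mm/dd/yy date");
            }
        }
    }
}
```

An answer with a "month" above 12 is rejected outright, while a date where both numbers are 12 or less can only be flagged as ambiguous and compared against what other workers entered.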

As for categorization, it was all over the place. I would rate 3 out of 10 as correct answers (something along the lines of hardware or home improvement). Many people put food; although there was a food item, it was only a small part of the total, and the directions were to categorize the most significant part of the purchase. In a full application I would need to implement a drop-down menu or multiple-choice mechanism to get consistent categories, but 5 out of 10 people categorizing the purchase as food is a fundamentally different problem. I will need to experiment with better directions to see if I can improve this accuracy.
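
As a sketch of the drop-down idea, the code below generates a Question element that uses a SelectionAnswer instead of a FreeTextAnswer. The element names follow the QuestionForm schema as I understand it, so the output should still be run through the SDK's QAPValidator before use. The class name and the list of categories are hypothetical.

```java
import java.util.List;
import java.util.Locale;

final class CategoryQuestion {

    /** Builds a QuestionForm Question element that offers the choices as a drop-down. */
    static String selectionQuestion(String questionId, String prompt, List<String> choices) {
        StringBuilder xml = new StringBuilder();
        xml.append("<Question>\n");
        xml.append("  <QuestionIdentifier>").append(escape(questionId)).append("</QuestionIdentifier>\n");
        xml.append("  <QuestionContent>\n");
        xml.append("    <Text>").append(escape(prompt)).append("</Text>\n");
        xml.append("  </QuestionContent>\n");
        xml.append("  <AnswerSpecification>\n");
        xml.append("    <SelectionAnswer>\n");
        xml.append("      <StyleSuggestion>dropdown</StyleSuggestion>\n");
        xml.append("      <Selections>\n");
        for (String choice : choices) {
            // Machine-friendly identifier, e.g. "Home improvement" becomes "home_improvement"
            String id = choice.trim().toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "_");
            xml.append("        <Selection>\n");
            xml.append("          <SelectionIdentifier>").append(id).append("</SelectionIdentifier>\n");
            xml.append("          <Text>").append(escape(choice)).append("</Text>\n");
            xml.append("        </Selection>\n");
        }
        xml.append("      </Selections>\n");
        xml.append("    </SelectionAnswer>\n");
        xml.append("  </AnswerSpecification>\n");
        xml.append("</Question>\n");
        return xml.toString();
    }

    /** Escapes the characters that are not allowed in XML text content. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.print(selectionQuestion(
                "Category",
                "What is the category of the largest part of this purchase?",
                List.of("Food", "Fuel", "Clothes", "Hardware", "Home improvement", "Other")));
    }
}
```

A fixed list makes the answers comparable, but it does nothing about workers picking the wrong item from the list; that still depends on the directions.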

MTurk provides a way to have a second person validate the result of the first person. This of course increases cost because it's a separate HIT, and it would lengthen response time as well, but it appears to be a necessary step in quality control. MTurk also lets you "qualify" people, which might help in this case. For instance, I need people who understand English well enough to decipher the extreme abbreviations found on some receipts and who can use context, such as the fact that this was a hardware store receipt, to help with categorization.
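
Because the same receipt already went to ten workers, a cheaper first pass at quality control is to look for agreement among their answers before paying for a separate validation HIT. The sketch below is one way to do that and is not an MTurk feature; the class name, the 60% threshold, and the sample answers are assumptions (the two "other" answers and the dates are invented to fill out the lists).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class AnswerConsensus {

    /**
     * Returns the most common answer (trimmed, lower-cased) if at least minShare of all
     * workers gave it. Use a minShare above 0.5 so that ties can never produce a winner.
     */
    static Optional<String> consensus(List<String> answers, double minShare) {
        Map<String, Integer> counts = new HashMap<>();
        for (String answer : answers) {
            String key = answer.trim().toLowerCase(Locale.ROOT);
            if (!key.isEmpty()) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        if (best != null && bestCount >= minShare * answers.size()) {
            return Optional.of(best);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical answers shaped like the results above: 9 of 10 dates agree,
        // while the categories split 5 food, 3 hardware, 2 other.
        List<String> dates = List.of("01/09/13", "01/09/13", "01/09/13", "01/09/13", "01/09/13",
                "01/09/13", "01/09/13", "01/09/13", "01/09/13", "09/01/13");
        List<String> categories = List.of("food", "Food", "food", "food", "food",
                "hardware", "Hardware", "hardware", "other", "other");
        System.out.println("Date: " + consensus(dates, 0.6).orElse("no consensus, send for review"));
        System.out.println("Category: " + consensus(categories, 0.6).orElse("no consensus, send for review"));
    }
}
```

With these inputs the date passes and the category does not. Note that a simple majority would have accepted "food", so agreement checks catch random mistakes but not a misunderstanding shared by many workers.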

One fascinating idea might be to preprocess receipts with OCR and use MTurk for confirmation and correction instead. Recall that the goal of this proof of concept is to develop a system that can accurately and automatically categorize receipts for a program like Quicken or Mint, making it easier for people to use that information for more detailed budgeting and money management.
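
To make the OCR-first idea concrete, here is a minimal sketch of one step: pulling date candidates out of OCR text and choosing between a cheap confirmation question and the full question. No OCR library is called here; the input is assumed to be text that some OCR step has already produced, and the class name, the sample text, and the prompt wording are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OcrDateCandidates {

    // Matches dates such as 1/9/13 or 01/09/2013 that stand alone as a token.
    private static final Pattern DATE =
            Pattern.compile("\\b\\d{1,2}/\\d{1,2}/(?:\\d{4}|\\d{2})\\b");

    /** Returns every date-like token found in the OCR text, in order of appearance. */
    static List<String> findDates(String ocrText) {
        List<String> found = new ArrayList<>();
        Matcher matcher = DATE.matcher(ocrText);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    /** Asks for confirmation when OCR found exactly one candidate, otherwise asks the full question. */
    static String workerPrompt(List<String> candidates) {
        if (candidates.size() == 1) {
            return "Is the purchase date " + candidates.get(0) + "? Answer yes, or enter the correct date.";
        }
        return "What is the date of purchase? Use mm/dd/yy format please.";
    }

    public static void main(String[] args) {
        String ocrText = "SAMPLE HARDWARE STORE\n01/09/2013 14:32\nTOTAL 23.17";
        List<String> candidates = findDates(ocrText);
        System.out.println(candidates);
        System.out.println(workerPrompt(candidates));
    }
}
```

A yes/no confirmation should take a worker less time than typing the answer, which is where the cost saving would come from, assuming the OCR candidates are right often enough.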

Overall, I think MTurk presents some unique capabilities that have not been widely exploited, especially in consumer applications. However, there are challenges, and being able to afford MTurk, even at just a few cents per transaction, will be a barrier for many possible applications.

Friday, January 4, 2013

Implementing receipt processing with the Mechanical Turk

The process of developing for the Mechanical Turk (MTurk) revolves around creating HITs (Human Intelligence Tasks). To use MTurk you write a program (in a choice of languages) that creates one or more HITs, which are then processed by humans. A HIT is a self-contained entity that explains the task, gives the information needed to solve it, and asks for one or more answers. Each HIT is worth some amount of money specified by the creator (typically a few cents). As HITs are completed you can also automate data collection and payment for the HITs. There are a variety of quality control mechanisms in place, but quality control remains a critical part of development.

Amazon has a nice getting started guide called Mechanical Turk Getting Started Guide. It covers a variety of tools and languages, including the command line, C#, Java, Perl, and Ruby. My work shown here is based on examples from that guide using the Java SDK and Eclipse.

With what is built into MTurk, the worker can provide three types of answers: free-form text, multiple choice, and file upload. However, because you can host the HIT on your own website, it can actually be anything: an HTML form or a Java applet, for instance. For this example I wrote code to display a receipt and ask three free-form questions about it: what was the date of purchase, what was the name of the store, and what was the category of the spending.

I was impressed with the speed of development. I went from knowing nothing about MTurk to having answers back from real users in the production HIT environment in about three hours.

The HIT is defined in an XML structure called a QuestionForm documented here. This is the QuestionForm I used for my project:


<?xml version="1.0"?>
<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Overview>
    <Text>
      Your task is to examine the image of the receipt shown below and answer some basic information about it. If any information is unreadable or unavailable for some reason put unknown.
    </Text>
    <Binary>
      <MimeType>
        <Type>image</Type>
        <SubType>jpg</SubType>
      </MimeType>
      <DataURL>http://www.thenextwave.com/images/IMG_0018.jpg</DataURL>
      <AltText>An image of a receipt from a store.</AltText>
    </Binary>
  </Overview>
  <Question>
    <QuestionIdentifier>Business Name</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the name of the business where the purchase was made?</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>Purchase Date</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the date of purchase? Use mm/dd/yy format please.</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>Category</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the general category of the purchase: food, fuel, clothes, etc. Use your best judgment to give an overall category. If no clear category choice is available use other.</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>comments</QuestionIdentifier>
    <QuestionContent>
      <Text>Please help us improve this HIT by including any Questions and/or Comments (optional):</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>10</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
</QuestionForm>


This is actually pretty easy to read. The first part is a description of the task and then there are four questions. (The last question is a request for suggestions).

Following the instructions in the getting started guide, I loaded the MTurk SDK onto my machine and fired up Eclipse. Here is the simple source code that submits the above question to MTurk:

MturkMain1.java:

package createnewhit;

/*
 * Copyright 2007-2012 Amazon Technologies, Inc.
 * 
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at:
 * 
 * http://aws.amazon.com/apache2.0
 * 
 * This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
 * OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and
 * limitations under the License.
 */ 

import com.amazonaws.mturk.addon.HITProperties;
import com.amazonaws.mturk.addon.HITQuestion;
import com.amazonaws.mturk.addon.QAPValidator;
import com.amazonaws.mturk.requester.HIT;
import com.amazonaws.mturk.service.axis.RequesterService;
import com.amazonaws.mturk.service.exception.ValidationException;
import com.amazonaws.mturk.util.PropertiesClientConfig;

/**
 * This is a try at using the Mechanical Turk to process photographs of receipts.
 * 
 * mturk.properties must be found in the current file path.
 * 
 * The following concepts are covered:
 * - File based QAP and HIT properties HIT loading 
 * - Validating the correctness of QAP
 * - Creating a HIT from the loaded question and properties
 *
 */
public class MturkMain1
{


    private RequesterService service;

    // Defining the location of the file containing the QAP and the properties of the HIT
    private String rootDir = ".";
    private String questionFile = rootDir + "/receipt_categorize.question";
    private String propertiesFile = rootDir + "/mturk.properties";

    /**
     * Constructor
     *
     */
    public MturkMain1() {
        service = new RequesterService(new PropertiesClientConfig());
    }

    /**
     * Check to see if your account has sufficient funds
     * @return true if there are sufficient funds. False if not.
     */
    public boolean hasEnoughFund() {
        double balance = service.getAccountBalance();
        System.out.println("Got account balance: " + RequesterService.formatCurrency(balance));
        return balance > 0;
    }

    /**
     * Creates the receipt categorization HIT on Mechanical Turk using the
     * question and properties files defined above.
     */
    public void createReceiptCategoryQuestion() {
        try {

            //Loading the HIT properties file.  HITProperties is a helper class that contains the 
            //properties of the HIT defined in the external file.  This feature allows you to define
            //the HIT attributes externally as a file and be able to modify it without recompiling your code.
            //In this sample, the qualification is defined in the properties file.
            HITProperties props = new HITProperties(propertiesFile);

            //Loading the question (QAP) file.  
            HITQuestion question = new HITQuestion(questionFile);

            // Validate the question (QAP) against the XSD Schema before making the call.
            // If there is an error in the question, ValidationException gets thrown.
            // This method is extremely useful in debugging your QAP.  Use it often.
            QAPValidator.validate(question.getQuestion());

            // Create a HIT using the properties and question files
            HIT hit = service.createHIT(null, // HITTypeId 
                    props.getTitle(), 
                    props.getDescription(), props.getKeywords(), // keywords 
                    question.getQuestion(),
                    props.getRewardAmount(), props.getAssignmentDuration(),
                    props.getAutoApprovalDelay(), props.getLifetime(),
                    props.getMaxAssignments(), props.getAnnotation(), // requesterAnnotation 
                    props.getQualificationRequirements(),
                    null // responseGroup
            );
            

            System.out.println("Created HIT: " + hit.getHITId());

            System.out.println("You may see your HIT with HITTypeId '" 
                    + hit.getHITTypeId() + "' here: ");

            System.out.println(service.getWebsiteURL() 
                    + "/mturk/preview?groupId=" + hit.getHITTypeId());
        } catch (ValidationException e) {
            //The validation exceptions will provide good insight into where in the QAP has errors.  
            //However, it is recommended to use other third party XML schema validators to make 
            //it easier to find and fix issues.
            System.err.println("QAP contains an error: " + e.getLocalizedMessage());  

        } catch (Exception e) {
            System.err.println(e.getLocalizedMessage());
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {

        MturkMain1 app = new MturkMain1();

        app.createReceiptCategoryQuestion();
    }
}

Each time this program is executed it creates one or more HITs based on the question file. The HITs go to either the sandbox where you can look at them and work on them yourself for free, or to the production environment where people will work on them for real money.
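
Once workers submit their assignments, each set of answers comes back as an XML document in the QuestionFormAnswers format, keyed by the QuestionIdentifier values from the question file. The sketch below is not taken from the SDK samples. It parses such a document with the JDK's built-in DOM parser and assumes only free-text answers, as in this HIT; the class name AnswerXml and the sample values are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class AnswerXml {

    /** Maps each QuestionIdentifier to the trimmed FreeText a worker entered for it. */
    static Map<String, String> freeTextAnswers(String answerXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Worker input is untrusted, so refuse documents that carry a DOCTYPE.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(answerXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList answers = doc.getElementsByTagName("Answer");
            for (int i = 0; i < answers.getLength(); i++) {
                Element answer = (Element) answers.item(i);
                String id = childText(answer, "QuestionIdentifier");
                String value = childText(answer, "FreeText");
                if (id != null) {
                    result.put(id.trim(), value == null ? "" : value.trim());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse answer XML", e);
        }
    }

    /** Text of the first descendant element with the given tag, or null if there is none. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String sample =
                "<QuestionFormAnswers xmlns=\"http://mechanicalturk.amazonaws.com/"
                + "AWSMechanicalTurkDataSchemas/2005-10-01/QuestionFormAnswers.xsd\">"
                + "<Answer><QuestionIdentifier>Purchase Date</QuestionIdentifier>"
                + "<FreeText>01/09/13</FreeText></Answer>"
                + "<Answer><QuestionIdentifier>Category</QuestionIdentifier>"
                + "<FreeText>hardware</FreeText></Answer>"
                + "</QuestionFormAnswers>";
        System.out.println(freeTextAnswers(sample));
    }
}
```

Getting the answers into a plain map makes it easy to hand them to whatever checks you run before approving the assignment.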

One last file is needed to make this work: a properties file that defines configuration information, in particular whether you are creating HITs in the sandbox or the production environment.

mturk.properties:
#
# You can find your access keys by going to aws.amazon.com, hovering your mouse over "Your Web Services Account" in the top-right
# corner and selecting View Access Key Identifiers. Be sure to log-in with the same username and password you registered with your
# Mechanical Turk Requester account. 
#
# If you don't yet have a Mechanical Turk Requester account, you can create one by visiting http://requester.mturk.com/

access_key=
secret_key=

# by default, will first load keys from <USER_HOME_DIR>/.aws/auth

######################################
## Basic HIT Properties
######################################
title:Process an image of a receipt and give basic information about it.
description:The task is to review a receipt, enter information about that receipt including date, time, company, and a categorization of the spending such as "groceries"
keywords:receipt, categorize, image
reward:0.11
assignments:10
annotation:sample#image

#
# -------------------
# ADVANCED PROPERTIES
# -------------------
#
# If you want to test your solution in the Amazon Mechanical Turk Developers Sandbox (http://sandbox.mturk.com)
# use the service_url defined below:
service_url=https://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester

# If you want to have your solution work against the Amazon Mechanical Turk Production site (http://www.mturk.com)
# use the service_url defined below:
#service_url=https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester

#list of comma separated retriable errors which will be retried by RetryFilter
retriable_errors=Server.ServiceUnavailable
retry_attempts=10
retry_delay_millis=1000

######################################
## HIT Timing Properties
######################################

# this Assignment Duration value is 60*5 = 5 minutes
assignmentduration:300

# this HIT Lifetime value is 60*60*24*3 = 3 days
hitlifetime:259200

# this Auto Approval period is 60*60*24*15 = 15 days
autoapprovaldelay:1296000
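
Since the only difference between a free test run and spending real money is which service_url line is commented out, a small check before submitting can be worthwhile. This is a sketch using only java.util.Properties, which accepts both the key=value and key:value styles used in the file above; the class name MturkConfigCheck and the idea of testing the URL for the word "sandbox" are my own assumptions, not part of the SDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class MturkConfigCheck {

    /** Parses properties text; both "key=value" and "key:value" lines are accepted. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** True when the active service_url points at the sandbox endpoint. */
    static boolean isSandbox(Properties props) {
        return props.getProperty("service_url", "").contains("sandbox");
    }

    /** HIT lifetime converted from seconds to days. */
    static double lifetimeDays(Properties props) {
        return Long.parseLong(props.getProperty("hitlifetime", "0").trim()) / 86400.0;
    }

    public static void main(String[] args) {
        // Excerpt of the file above; in practice load mturk.properties with a FileReader instead.
        String excerpt =
                "service_url=https://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester\n"
                + "#service_url=https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester\n"
                + "reward:0.11\n"
                + "assignments:10\n"
                + "hitlifetime:259200\n";
        Properties props = parse(excerpt);
        System.out.println(isSandbox(props) ? "Target: sandbox (free)" : "Target: production (real money)");
        System.out.println("HIT lifetime in days: " + lifetimeDays(props));
    }
}
```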


Next time I'll talk about the effectiveness of using the Mechanical Turk.



Thursday, January 3, 2013

Receipts and the Mechanical Turk

Last time I discussed how using a QR code to categorize receipts would make data entry easier. Unfortunately, this solution requires the cooperation of the stores. While it is easily doable, it is hard to get the momentum going to add such a feature. Another thought I had at the same time was using Amazon's Mechanical Turk to do a similar process. I have actually implemented a simple version of this and run a few examples through the production Turk. I think people will find this very interesting, both for the specific problem and as a good, yet still fairly simple, example of using MTurk (as Amazon likes to call it). See https://www.mturk.com/mturk/welcome for the official Amazon site.

First, for those not familiar, MTurk is a platform that allows the creation of small tasks that are meant to be done by real people in exchange for a small fee, typically a few cents. Examples tend to include "identify the color of the automobile in a picture" or "determine the hours of operation at a particular business". These are trivially easy for a person to accomplish but not for a computer.

My idea is to take a picture of a receipt and extract information from it such as the date of purchase, the name of the business, and the category of spending. Now some of you out there are probably image processing experts and might try to build a fully computerized solution to this problem, but I think most people will admit that although you can convert the image to text, trying to parse out information like the date of purchase is really hard. Go ahead and take a look at a few receipts and see what you think. I'll wait...

So the idea is to submit the image to MTurk and let a human do the job in a few seconds. I found this a very interesting experiment from both the software and the human perspective. It is true that MTurk costs real money, but I spent less than $5.00 on my experiments, and you can develop for it without spending anything; it only costs money when real people actually start working for you.

To keep this from being an enormous blog post I am going to split this up over several posts and include implementation details later. I will end here with a screen shot of what the person saw when they processed my receipt using MTurk. Keep in mind we are still at an experimental stage and more information would be added in the future.