Friday, January 4, 2013

Implementing receipt processing with the Mechanical Turk

The process of developing for the Mechanical Turk (MTurk) revolves around creating HITs (Human Intelligence Tasks). To use MTurk you write a program (in a choice of languages) that creates one or more HITs, which are then processed by humans. A HIT is a self-contained entity that explains the task, gives the information needed to solve it, and asks for one or more answers. Each HIT is worth some amount of money specified by the creator (typically a few cents). As HITs are completed you can also automate data collection and payment for them. There are a variety of quality control mechanisms in place, but quality control remains a critical part of development.
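One simple way to approach quality control is to have several workers answer the same HIT and keep the most common answer. The sketch below is only an illustration of that idea, not something built into MTurk or the SDK: the class `AnswerVote`, its methods, and the 50% threshold are all hypothetical names and values chosen for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: picks the most common answer among several workers. */
final class AnswerVote {

    /** Normalizes an answer so that "Safeway " and "SAFEWAY" count as the same vote. */
    static String normalize(String answer) {
        return answer.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /**
     * Returns the most frequent normalized answer, or empty when no answer
     * reaches minShare (a fraction between 0 and 1) of the non-blank votes.
     */
    static Optional<String> majority(List<String> answers, double minShare) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int total = 0;
        for (String answer : answers) {
            if (answer == null || answer.trim().isEmpty()) {
                continue; // blank answers do not count as votes
            }
            counts.merge(normalize(answer), 1, Integer::sum);
            total++;
        }

        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }

        if (best == null || bestCount < minShare * total) {
            return Optional.empty();
        }
        return Optional.of(best);
    }

    public static void main(String[] args) {
        // Made-up answers from five workers for the store name question.
        List<String> storeNames =
                Arrays.asList("Safeway", "safeway ", "SAFEWAY", "Safeway Inc", "unknown");
        System.out.println(majority(storeNames, 0.5));
    }
}
```

Free-form text makes exact matching fragile, which is why the sketch normalizes case and whitespace before counting. Anything smarter, such as fuzzy matching of store names, is left out here.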

Amazon has a nice guide called the Mechanical Turk Getting Started Guide. MTurk supports a variety of interfaces, including command line tools, C#, Java, Perl, and Ruby. My work shown here is based on examples from that guide using the Java SDK and Eclipse.

MTurk has three built-in answer types: free-form text, multiple choice, and file upload. However, because you can host the HIT on your own website, it can actually be anything: an HTML form or a Java applet, for instance. For this example I wrote code to display a receipt and ask three free-form questions about it: what was the name of the store, what was the date of purchase, and what was the category of the spending.

I was impressed with the speed of development. I went from knowing nothing about MTurk to having answers back from real users in the production HIT environment in about three hours.

The HIT is defined in an XML structure called a QuestionForm documented here. This is the QuestionForm I used for my project:


<?xml version="1.0"?>
<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Overview>
    <Text>
      Your task is to examine the image of the receipt shown below and answer some basic information about it. If any information is unreadable or unavailable for some reason put unknown.
    </Text>
    <Binary>
      <MimeType>
        <Type>image</Type>
        <SubType>jpg</SubType>
      </MimeType>
      <DataURL>http://www.thenextwave.com/images/IMG_0018.jpg</DataURL>
      <AltText>An image of a receipt from a store.</AltText>
    </Binary>
  </Overview>
  <Question>
    <QuestionIdentifier>Business Name</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the name of the business where the purchase was made?</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>Purchase Date</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the date of purchase? Use mm/dd/yy format please.</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>Category</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the general category of the purchase: food, fuel, clothes, etc. Use your best judgment to give an overall category. If no clear category choice is available use other.</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>comments</QuestionIdentifier>
    <QuestionContent>
      <Text>Please help us improve this HIT by including any Questions and/or Comments (optional):</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>10</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
</QuestionForm>


This is actually pretty easy to read. The first part is a description of the task and then there are four questions. (The last question is a request for suggestions).
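Since all four questions share the same free-text structure, the repeated XML could also be generated rather than typed by hand. The following is a minimal sketch of that idea in plain Java, independent of the MTurk SDK; the class `FreeTextQuestion` and its methods are hypothetical names made up for this example.

```java
/** Hypothetical helper that renders one free-text Question element. */
final class FreeTextQuestion {

    /** Escapes the characters that are special inside XML text content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    /**
     * Renders a Question element with a FreeTextAnswer, using the same
     * layout as the hand-written QuestionForm.
     */
    static String render(String identifier, String prompt, int lines) {
        return String.join("\n",
                "  <Question>",
                "    <QuestionIdentifier>" + escapeXml(identifier) + "</QuestionIdentifier>",
                "    <QuestionContent>",
                "      <Text>" + escapeXml(prompt) + "</Text>",
                "    </QuestionContent>",
                "    <AnswerSpecification>",
                "      <FreeTextAnswer>",
                "        <NumberOfLinesSuggestion>" + lines + "</NumberOfLinesSuggestion>",
                "      </FreeTextAnswer>",
                "    </AnswerSpecification>",
                "  </Question>");
    }

    public static void main(String[] args) {
        System.out.println(render("Purchase Date",
                "What is the date of purchase? Use mm/dd/yy format please.", 1));
    }
}
```

Escaping the prompt matters because a stray `<` or `&` in the question text would otherwise make the QuestionForm invalid XML.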

Following the instructions in the getting started guide, I loaded the MTurk SDK onto my machine and fired up Eclipse. Here is the simple source code that submits the above question to MTurk:

MturkMain1.java:

package createnewhit;

/*
 * Copyright 2007-2012 Amazon Technologies, Inc.
 * 
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at:
 * 
 * http://aws.amazon.com/apache2.0
 * 
 * This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
 * OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and
 * limitations under the License.
 */ 

import com.amazonaws.mturk.addon.HITProperties;
import com.amazonaws.mturk.addon.HITQuestion;
import com.amazonaws.mturk.addon.QAPValidator;
import com.amazonaws.mturk.requester.HIT;
import com.amazonaws.mturk.service.axis.RequesterService;
import com.amazonaws.mturk.service.exception.ValidationException;
import com.amazonaws.mturk.util.PropertiesClientConfig;

/**
 * This is a try at using the Mechanical Turk to process photographs of receipts.
 * 
 * mturk.properties must be found in the current working directory.
 * 
 * The following concepts are covered:
 * - File based QAP and HIT properties HIT loading 
 * - Validating the correctness of QAP
 * - Using a basic system qualification
 * - Creating the HIT and printing its preview URL
 *
 */
public class MturkMain1
{


    private RequesterService service;

    // Defining the location of the file containing the QAP and the properties of the HIT
    private String rootDir = ".";
    private String questionFile = rootDir + "/receipt_categorize.question";
    private String propertiesFile = rootDir + "/mturk.properties";

    /**
     * Constructor
     *
     */
    public MturkMain1() {
        service = new RequesterService(new PropertiesClientConfig());
    }

    /**
     * Check to see if your account has sufficient funds
     * @return true if there are sufficient funds. False if not.
     */
    public boolean hasEnoughFund() {
        double balance = service.getAccountBalance();
        System.out.println("Got account balance: " + RequesterService.formatCurrency(balance));
        return balance > 0;
    }

    /**
     * Creates the receipt categorization HIT on Mechanical Turk from the
     * question file and the HIT properties file.
     */
    public void createReceiptCategoryQuestion() {
        try {

            //Loading the HIT properties file.  HITProperties is a helper class that contains the 
            //properties of the HIT defined in the external file.  This feature allows you to define
            //the HIT attributes externally as a file and be able to modify it without recompiling your code.
            //In this sample, the qualification is defined in the properties file.
            HITProperties props = new HITProperties(propertiesFile);

            //Loading the question (QAP) file.  
            HITQuestion question = new HITQuestion(questionFile);

            // Validate the question (QAP) against the XSD Schema before making the call.
            // If there is an error in the question, ValidationException gets thrown.
            // This method is extremely useful in debugging your QAP.  Use it often.
            QAPValidator.validate(question.getQuestion());

            // Create a HIT using the properties and question files
            HIT hit = service.createHIT(null, // HITTypeId 
                    props.getTitle(), 
                    props.getDescription(), props.getKeywords(), // keywords 
                    question.getQuestion(),
                    props.getRewardAmount(), props.getAssignmentDuration(),
                    props.getAutoApprovalDelay(), props.getLifetime(),
                    props.getMaxAssignments(), props.getAnnotation(), // requesterAnnotation 
                    props.getQualificationRequirements(),
                    null // responseGroup
            );
            

            System.out.println("Created HIT: " + hit.getHITId());

            System.out.println("You may see your HIT with HITTypeId '" 
                    + hit.getHITTypeId() + "' here: ");

            System.out.println(service.getWebsiteURL() 
                    + "/mturk/preview?groupId=" + hit.getHITTypeId());
        } catch (ValidationException e) {
            //The validation exceptions will provide good insight into where in the QAP has errors.  
            //However, it is recommended to use other third party XML schema validators to make 
            //it easier to find and fix issues.
            System.err.println("QAP contains an error: " + e.getLocalizedMessage());  

        } catch (Exception e) {
            System.err.println(e.getLocalizedMessage());
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {

        MturkMain1 app = new MturkMain1();

        app.createReceiptCategoryQuestion();
    }
}

Each time this program is executed it creates a new HIT based on the question file. The HIT goes either to the sandbox, where you can look at it and work on it yourself for free, or to the production environment, where people will work on it for real money.

One last file is needed to make this work: a properties file that defines configuration information, in particular whether you are creating HITs in the sandbox or the production environment.

mturk.properties:
#
# You can find your access keys by going to aws.amazon.com, hovering your mouse over "Your Web Services Account" in the top-right
# corner and selecting View Access Key Identifiers. Be sure to log-in with the same username and password you registered with your
# Mechanical Turk Requester account. 
#
# If you don't yet have a Mechanical Turk Requester account, you can create one by visiting http://requester.mturk.com/

access_key=
secret_key=

# by default, will first load keys from <USER_HOME_DIR>/.aws/auth

######################################
## Basic HIT Properties
######################################
title:Process an image of a receipt and give basic information about it.
description:The task is to review a receipt, enter information about that receipt including date, time, company, and a categorization of the spending such as "groceries"
keywords:receipt, categorize, image
reward:0.11
assignments:10
annotation:sample#image

#
# -------------------
# ADVANCED PROPERTIES
# -------------------
#
# If you want to test your solution in the Amazon Mechanical Turk Developers Sandbox (http://sandbox.mturk.com)
# use the service_url defined below:
service_url=https://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester

# If you want to have your solution work against the Amazon Mechanical Turk Production site (http://www.mturk.com)
# use the service_url defined below:
#service_url=https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester

#list of comma separated retriable errors which will be retried by RetryFilter
retriable_errors=Server.ServiceUnavailable
retry_attempts=10
retry_delay_millis=1000

######################################
## HIT Timing Properties
######################################

# this Assignment Duration value is 60*5 = 5 minutes
assignmentduration:300

# this HIT Lifetime value is 60*60*24*3 = 3 days
hitlifetime:259200

# this Auto Approval period is 60*60*24*15 = 15 days
autoapprovaldelay:1296000
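
To sanity check a properties file like the one above before spending real money, the values can be read with the standard `java.util.Properties` class, which accepts both the `key=value` and `key:value` styles used in the file. This is a rough sketch that does not use the MTurk SDK; the class `HitSettings` and its methods are hypothetical names for this example, and the payout figure covers worker rewards only and ignores any fees Amazon charges.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.math.BigDecimal;
import java.time.Duration;
import java.util.Properties;

/** Hypothetical helper that inspects a few values from mturk.properties. */
final class HitSettings {

    /** Parses properties text; both '=' and ':' are accepted as separators. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Reads a value given in seconds; trim() guards against trailing spaces. */
    static Duration seconds(Properties props, String key) {
        return Duration.ofSeconds(Long.parseLong(props.getProperty(key).trim()));
    }

    /** Reward times assignments: the most the workers can earn for one HIT. */
    static BigDecimal maxWorkerPayout(Properties props) {
        BigDecimal reward = new BigDecimal(props.getProperty("reward").trim());
        BigDecimal assignments = new BigDecimal(props.getProperty("assignments").trim());
        return reward.multiply(assignments);
    }

    /** True when service_url points at the sandbox endpoint. */
    static boolean isSandbox(Properties props) {
        return props.getProperty("service_url", "").contains("sandbox");
    }

    public static void main(String[] args) {
        // Values copied from the properties file shown above.
        Properties props = parse(String.join("\n",
                "reward:0.11",
                "assignments:10",
                "service_url=https://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester",
                "assignmentduration:300",
                "hitlifetime:259200",
                "autoapprovaldelay:1296000"));

        System.out.println("Sandbox: " + isSandbox(props));
        System.out.println("Max worker payout per HIT: " + maxWorkerPayout(props));
        System.out.println("Assignment duration (minutes): "
                + seconds(props, "assignmentduration").toMinutes());
        System.out.println("HIT lifetime (days): " + seconds(props, "hitlifetime").toDays());
        System.out.println("Auto approval delay (days): "
                + seconds(props, "autoapprovaldelay").toDays());
    }
}
```

With a reward of 0.11 and 10 assignments, each HIT can pay out up to 1.10 to workers, so it is worth confirming the sandbox URL is active while experimenting.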


Next time I'll talk about the effectiveness of using the Mechanical Turk.


