Sunday, November 17, 2013

C++ Developer's guide to upgrading to OS X 10.9 Mavericks

So Apple's shiny new OS is out and there are a few adjustments every C++ developer needs to know. (Note: This is for people using Eclipse or a similar IDE, not Xcode.)

First you'll need to go to the App Store and install the latest version of Xcode.

If you use Eclipse (or probably any IDE) I recommend checking for updates. In Eclipse look for the option "Check for Updates" under the Help menu.

Next, fire up Terminal; we need to install some stuff.

$ xcode-select --install

This installs the command line tools that Eclipse needs. You'll be prompted for your admin password.

Next, if you like to use and install open source software, you'll need to reinstall a few things. You probably already have a directory you download and install packages from, but if not, create something like ~/packages and work from there. (Thanks to https://wiki.documentfoundation.org/Development/BuildingOnMac for the following instructions.)


$ curl -O http://mirrors.kernel.org/gnu/autoconf/autoconf-2.65.tar.gz
$ tar -xf autoconf-2.65.tar.gz
$ cd autoconf-2.65
$ ./configure
$ make
$ sudo make install

$ curl -O http://mirrors.kernel.org/gnu/automake/automake-1.11.tar.gz
$ tar -xf automake-1.11.tar.gz
$ cd automake-1.11
$ ./configure
$ make
$ sudo make install

$ curl -OL http://ftpmirror.gnu.org/libtool/libtool-2.4.2.tar.gz
$ tar -xf libtool-2.4.2.tar.gz
$ cd libtool-2.4.2
$ ./configure
$ make
$ sudo make install

Finally you'll probably need to rebuild and relink any projects you have in Eclipse. Remember to do a clean first so everything gets rebuilt.

Now you should be good to go.

Wednesday, January 9, 2013

Mechanical Turk - Conclusions

In this simple proof of concept I asked ten people to process a receipt, giving information about the business, date of purchase and the category of purchase. Some of this is clearly possible to automate with OCR; however, the detail and accuracy required to make this useful make it a good candidate for a Mechanical Turk task.

So how did people do?

Well, it took about 20 minutes for all 10 HITs to be processed. That's not too bad, and I'm sure response time is highly sensitive to price. I suspect 11 cents for each of these HITs is on the high side since the time it takes to process one of these is probably 30 seconds or so at most. That said, this is an interesting design decision you have to make: it's pretty clear that response times are not going to be very predictable, and definitely not particularly close to real time unless you pay a significant premium. If you really have a popular application and people get to know your tasks (something you can easily do with MTurk) then you might begin to achieve more consistent and quicker results, especially since the system allows you to tip particularly diligent workers.

As for accuracy, all ten people correctly identified the business and 9 out of 10 got the date right. The one person who didn't reversed the month and day despite explicit instructions in the HIT to watch out for this. The context of the receipt should have made it pretty clear it was from the US, but I suppose it's possible the actual mistake was thinking this was a non-US receipt, and my instruction led to confusion that it should be switched.
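One cheap guard against the month/day swap is to check each date answer in code before trusting it. What follows is only a minimal sketch, not part of my HIT: the class name `ReceiptDateCheck` and both method names are made up for illustration, and the dates in the usage note are invented examples, not the date on my receipt. It parses an answer strictly as mm/dd/yy and flags dates where swapping month and day would also give a valid but different date.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

/** Hypothetical helper for checking free-text date answers from workers. */
final class ReceiptDateCheck {

    // "uu" is a two-digit year based at 2000; STRICT rejects dates like 02/30/13.
    private static final DateTimeFormatter US_DATE =
            DateTimeFormatter.ofPattern("MM/dd/uu").withResolverStyle(ResolverStyle.STRICT);

    /** Returns the parsed date, or empty if the answer is not a valid mm/dd/yy date. */
    static Optional<LocalDate> parseUsDate(String answer) {
        try {
            return Optional.of(LocalDate.parse(answer.trim(), US_DATE));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    /**
     * True when swapping month and day would give a different valid date,
     * so a reversed answer could not be detected from the value alone.
     */
    static boolean isAmbiguous(LocalDate date) {
        return date.getDayOfMonth() <= 12 && date.getDayOfMonth() != date.getMonthValue();
    }

    public static void main(String[] args) {
        // Invented sample answers, not data from the experiment.
        for (String answer : new String[] {"01/09/13", "01/25/13", "25/01/13"}) {
            Optional<LocalDate> parsed = parseUsDate(answer);
            System.out.println(answer + " -> "
                    + parsed.map(d -> d + (isAmbiguous(d) ? " (ambiguous)" : "")).orElse("rejected"));
        }
    }
}
```

An answer such as 25/01/13 is rejected outright because there is no month 25, while 01/09/13 parses but is flagged as ambiguous, which would be a reasonable trigger for asking a second worker.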

As for categorization, it was all over the place. I would rate 3 out of 10 as correct answers (something along the lines of hardware or home improvement). Many people put food; although there was a food item, it was only a small part of the total and the directions were to categorize the most significant part of the purchase. In a full application I would need to implement a drop down menu or multiple choice mechanism to get consistent categories, but the issue of 5 out of 10 categorizing as food is a fundamentally different problem. I will need to experiment with better directions to see if I can improve this accuracy.

MTurk provides a way to have a second person validate the result of the first person. This of course increases cost because it's a separate HIT, and it would slow response time as well, but it appears to be a necessary step in quality control. MTurk also lets you "qualify" people, which might help in this case. For instance, I need people who understand English well enough to decipher the extreme abbreviations that are on some receipts and who can use context, such as knowing this was a hardware store receipt, to help with categorization.
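A different and simpler check than a validation HIT is to compare the answers from several workers and only accept one that a strict majority agrees on. The sketch below is an illustration under my own assumptions, not something MTurk provides: `AnswerVote` and `majority` are hypothetical names, and answers are compared after trimming and lower-casing only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical majority vote over free-text answers to one question. */
final class AnswerVote {

    /** Returns the normalized answer given by more than half of the workers, if any. */
    static Optional<String> majority(List<String> answers) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String answer : answers) {
            String key = answer.trim().toLowerCase(Locale.ROOT);
            if (!key.isEmpty()) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Strict majority: more than half of all submitted answers.
            if (entry.getValue() * 2 > answers.size()) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Invented answers for illustration only.
        List<String> answers = Arrays.asList("Food", "food ", "hardware");
        System.out.println(majority(answers).orElse("no majority"));
    }
}
```

With the split I saw (5 of 10 answering food), a strict majority rule like this would return nothing rather than the wrong category, which at least flags the receipt for another look.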

One fascinating idea might be to try to preprocess receipts with OCR and use MTurk for confirmation and correction instead. Recall that the goal of this proof of concept is to develop a system that can accurately and automatically categorize receipts for a program like Quicken or Mint and make it easier for people to use that information for more detailed budgeting and money management.

Overall I think MTurk presents some unique capabilities that have not been widely exploited especially in consumer applications. However there are challenges and being able to afford MTurk even for just a few cents per transaction will be a barrier for many possible applications.

Friday, January 4, 2013

Implementing receipt processing with the Mechanical Turk

The process of developing for the Mechanical Turk (MTurk) revolves around creating HITs (Human Intelligence Tasks). To use MTurk you write a program (in a choice of languages) that creates one or more HITs which are then processed by humans. A HIT is a self-contained entity that explains the task, gives information needed to solve the task and asks for one or more answers. Each HIT is worth some amount of money specified by the creator (typically a few cents). As HITs are completed you can also automate data collection and payment for the HITs. There are a variety of quality control mechanisms in place but quality control remains a critical part of development.

Amazon has a nice getting started guide called Mechanical Turk Getting Started Guide. MTurk supports a variety of interfaces and languages including the command line, C#, Java, Perl, and Ruby. My work shown here is based on examples from that guide using the Java SDK and Eclipse.

Built into MTurk, the user can provide three types of answers: free form text, multiple choice, and file upload. However, because you can host the HIT on your own website it can actually be anything, an HTML form or a Java applet for instance. For this example I wrote code to display a receipt and ask three free form questions about it: what was the date of purchase, what was the name of the store, and what was the category of the spending.

I was impressed with the speed of development. I went from knowing nothing about MTurk to having answers back from real users in the production HIT environment in about three hours.

The HIT is defined in an XML structure called a QuestionForm documented here. This is the QuestionForm I used for my project:


<?xml version="1.0"?>
<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Overview>
    <Text>
      Your task is to examine the image of the receipt shown below and answer some basic information about it. If any information is unreadable or unavailable for some reason put unknown.
    </Text>
    <Binary>
      <MimeType>
        <Type>image</Type>
        <SubType>jpg</SubType>
      </MimeType>
      <DataURL>http://www.thenextwave.com/images/IMG_0018.jpg</DataURL>
      <AltText>An image of a receipt from a store.</AltText>
    </Binary>
  </Overview>
  <Question>
    <QuestionIdentifier>Business Name</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the name of the business where the purchase was made?</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>Purchase Date</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the date of purchase? Use mm/dd/yy format please.</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>Category</QuestionIdentifier>
    <QuestionContent>
      <Text>What is the general category of the purchase: food, fuel, clothes, etc. Use your best judgment to give an overall category. If no clear category choice is available use other.</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
  <Question>
    <QuestionIdentifier>comments</QuestionIdentifier>
    <QuestionContent>
      <Text>Please help us improve this HIT by including any Questions and/or Comments (optional):</Text>
    </QuestionContent>
    <AnswerSpecification>
      <FreeTextAnswer>
        <NumberOfLinesSuggestion>10</NumberOfLinesSuggestion>
      </FreeTextAnswer>
    </AnswerSpecification>
  </Question>
</QuestionForm>


This is actually pretty easy to read. The first part is a description of the task and then there are four questions. (The last question is a request for suggestions).
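Because the question file is plain XML, it is easy to sanity check before submitting it. Here is a minimal sketch using only the JDK's built-in XML parser; the class name `QuestionFormInspector` and the method `questionIds` are hypothetical and are not part of the MTurk SDK. It only confirms the file is well-formed and lists the question identifiers; it does not validate against Amazon's schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper that lists the QuestionIdentifier values in a QuestionForm. */
final class QuestionFormInspector {

    /** Parses the XML text and returns every QuestionIdentifier in document order. */
    static List<String> questionIds(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the QuestionForm declares a default namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // "*" matches the element in any namespace.
            NodeList nodes = doc.getElementsByTagNameNS("*", "QuestionIdentifier");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(nodes.item(i).getTextContent().trim());
            }
            return ids;
        } catch (Exception e) {
            // Malformed XML ends up here.
            throw new IllegalArgumentException("could not parse QuestionForm: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Trimmed-down stand-in for the real question file; the namespace is a placeholder.
        String xml = "<QuestionForm xmlns=\"http://example.invalid/QuestionForm\">"
                + "<Question><QuestionIdentifier>Business Name</QuestionIdentifier></Question>"
                + "<Question><QuestionIdentifier>Purchase Date</QuestionIdentifier></Question>"
                + "</QuestionForm>";
        System.out.println(questionIds(xml));
    }
}
```

Knowing the identifiers up front is handy later, since they are the natural keys for matching answers back to questions.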

Following the instructions in the getting started guide I loaded the MTurk SDK onto my machine and fired up Eclipse. Here is the simple source code that submits the above question to MTurk:

MturkMain1.java:

package createnewhit;

/*
 * Copyright 2007-2012 Amazon Technologies, Inc.
 * 
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at:
 * 
 * http://aws.amazon.com/apache2.0
 * 
 * This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
 * OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and
 * limitations under the License.
 */ 

import com.amazonaws.mturk.addon.HITProperties;
import com.amazonaws.mturk.addon.HITQuestion;
import com.amazonaws.mturk.addon.QAPValidator;
import com.amazonaws.mturk.requester.HIT;
import com.amazonaws.mturk.service.axis.RequesterService;
import com.amazonaws.mturk.service.exception.ValidationException;
import com.amazonaws.mturk.util.PropertiesClientConfig;

/**
 * This is a try at using the Mechanical Turk to process photographs of receipts.
 * 
 * mturk.properties must be found in the current file path.
 * 
 * The following concepts are covered:
 * - Using the <FormattedContent> functionality in QAP
 * - File based QAP and HIT properties HIT loading 
 * - Validating the correctness of QAP
 * - Using a basic system qualification
 * - Previewing the HIT as HTML
 *
 */
public class MturkMain1
{


    private RequesterService service;

    // Defining the location of the file containing the QAP and the properties of the HIT
    private String rootDir = ".";
    private String questionFile = rootDir + "/receipt_categorize.question";
    private String propertiesFile = rootDir + "/mturk.properties";

    /**
     * Constructor
     *
     */
    public MturkMain1() {
        service = new RequesterService(new PropertiesClientConfig());
    }

    /**
     * Check to see if your account has sufficient funds
     * @return true if there are sufficient funds. False if not.
     */
    public boolean hasEnoughFund() {
        double balance = service.getAccountBalance();
        System.out.println("Got account balance: " + RequesterService.formatCurrency(balance));
        return balance > 0;
    }

    /**
     * Creates the receipt categorization HIT on Mechanical Turk using the
     * question and properties files defined above.
     */
    public void createReceiptCategoryQuestion() {
        try {

            //Loading the HIT properties file.  HITProperties is a helper class that contains the 
            //properties of the HIT defined in the external file.  This feature allows you to define
            //the HIT attributes externally as a file and be able to modify it without recompiling your code.
            //In this sample, the qualification is defined in the properties file.
            HITProperties props = new HITProperties(propertiesFile);

            //Loading the question (QAP) file.  
            HITQuestion question = new HITQuestion(questionFile);

            // Validate the question (QAP) against the XSD Schema before making the call.
            // If there is an error in the question, ValidationException gets thrown.
            // This method is extremely useful in debugging your QAP.  Use it often.
            QAPValidator.validate(question.getQuestion());

            // Create a HIT using the properties and question files
            HIT hit = service.createHIT(null, // HITTypeId 
                    props.getTitle(), 
                    props.getDescription(), props.getKeywords(), // keywords 
                    question.getQuestion(),
                    props.getRewardAmount(), props.getAssignmentDuration(),
                    props.getAutoApprovalDelay(), props.getLifetime(),
                    props.getMaxAssignments(), props.getAnnotation(), // requesterAnnotation 
                    props.getQualificationRequirements(),
                    null // responseGroup
            );
            

            System.out.println("Created HIT: " + hit.getHITId());

            System.out.println("You may see your HIT with HITTypeId '" 
                    + hit.getHITTypeId() + "' here: ");

            System.out.println(service.getWebsiteURL() 
                    + "/mturk/preview?groupId=" + hit.getHITTypeId());
        } catch (ValidationException e) {
            //The validation exceptions will provide good insight into where in the QAP has errors.  
            //However, it is recommended to use other third party XML schema validators to make 
            //it easier to find and fix issues.
            System.err.println("QAP contains an error: " + e.getLocalizedMessage());  

        } catch (Exception e) {
            System.err.println(e.getLocalizedMessage());
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {

        MturkMain1 app = new MturkMain1();

        app.createReceiptCategoryQuestion();
    }
}

Each time this program is executed it creates one or more HITs based on the question file. The HITs go to either the sandbox where you can look at them and work on them yourself for free, or to the production environment where people will work on them for real money.

One last file is needed to make this work: a properties file that defines configuration information, in particular whether you are creating HITs in the sandbox or the production environment.

mturk.properties:
#
# You can find your access keys by going to aws.amazon.com, hovering your mouse over "Your Web Services Account" in the top-right
# corner and selecting View Access Key Identifiers. Be sure to log-in with the same username and password you registered with your
# Mechanical Turk Requester account. 
#
# If you don't yet have a Mechanical Turk Requester account, you can create one by visiting http://requester.mturk.com/

access_key=
secret_key=

# by default, will first load keys from <USER_HOME_DIR>/.aws/auth

######################################
## Basic HIT Properties
######################################
title:Process an image of a receipt and give basic information about it.
description:The task is to review a receipt, enter information about that receipt including date, time, company, and a categorization of the spending such as "groceries"
keywords:receipt, categorize, image
reward:0.11
assignments:10
annotation:sample#image

#
# -------------------
# ADVANCED PROPERTIES
# -------------------
#
# If you want to test your solution in the Amazon Mechanical Turk Developers Sandbox (http://sandbox.mturk.com)
# use the service_url defined below:
service_url=https://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester

# If you want to have your solution work against the Amazon Mechanical Turk Production site (http://www.mturk.com)
# use the service_url defined below:
#service_url=https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester

#list of comma separated retriable errors which will be retried by RetryFilter
retriable_errors=Server.ServiceUnavailable
retry_attempts=10
retry_delay_millis=1000

######################################
## HIT Timing Properties
######################################

# this Assignment Duration value is 60*5 = 5 minutes
assignmentduration:300

# this HIT Lifetime value is 60*60*24*3 = 3 days
hitlifetime:259200

# this Auto Approval period is 60*60*24*15 = 15 days
autoapprovaldelay:1296000
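The timing values are all in seconds, which makes them easy to get wrong by a factor of 60. As a quick check, the sketch below reads the same key:value lines with the standard `java.util.Properties` class (which accepts both `=` and `:` separators) and converts them to readable durations. `HitTimingCheck` and its methods are hypothetical names for illustration; this is not how the SDK's HITProperties class is implemented.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.time.Duration;
import java.util.Properties;

/** Hypothetical sanity check for the timing section of mturk.properties. */
final class HitTimingCheck {

    /** Loads properties from text; lines starting with '#' are treated as comments. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Reads a property holding a number of seconds and returns it as a Duration. */
    static Duration secondsProperty(Properties props, String key) {
        String raw = props.getProperty(key);
        if (raw == null) {
            throw new IllegalArgumentException("missing property: " + key);
        }
        return Duration.ofSeconds(Long.parseLong(raw.trim()));
    }

    public static void main(String[] args) {
        String text = "assignmentduration:300\n"
                + "hitlifetime:259200\n"
                + "autoapprovaldelay:1296000\n";
        Properties props = parse(text);
        System.out.println("assignment duration: "
                + secondsProperty(props, "assignmentduration").toMinutes() + " minutes");
        System.out.println("HIT lifetime: "
                + secondsProperty(props, "hitlifetime").toDays() + " days");
        System.out.println("auto approval delay: "
                + secondsProperty(props, "autoapprovaldelay").toDays() + " days");
    }
}
```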


Next time I'll talk about the effectiveness of using the Mechanical Turk.



Thursday, January 3, 2013

Receipts and the Mechanical Turk

Last time I discussed how using a QR code to categorize receipts would make data entry easier. Unfortunately this solution requires the cooperation of the stores. While easily doable it is hard to get the momentum going to add such a feature. Another thought I had at the same time was using Amazon's Mechanical Turk to do a similar process. I have actually implemented a simple version of this and run a few examples through the production Turk. I think people will find this very interesting both for the specific problem and for a good, yet still fairly simple example of using MTurk (as Amazon likes to call it). See https://www.mturk.com/mturk/welcome for the official Amazon site.

First, for those not familiar, MTurk is a platform that allows the creation of small tasks that are meant to be done by real people in exchange for a small fee, typically a few cents. Examples tend to include "identify the color of the automobile in a picture" or "determine the hours of operation at a particular business". These are trivially easy for a person to accomplish but not for a computer.

My idea is to take a picture of a receipt and extract information from it such as the date of purchase, the name of the business, and the category of spending. Now some of you out there are probably image processing experts and might try to do a full computerized solution to this problem, but I think most people will admit that although you can convert the image to text, trying to parse out information like the date of purchase is really hard. Go ahead and take a look at a few receipts and see what you think, I'll wait...

So the idea is to submit the image to MTurk and let a human do the job in a few seconds. I found this a very interesting experiment both from the software and human perspective. It is true that MTurk costs real money, but I spent less than $5.00 on my experiments and you can spend nothing to use and develop for it; it only costs money when real people actually start working for you.

To keep this from being an enormous blog post I am going to split this up over several posts and include implementation details later. I will end here with a screen shot of what the person saw when they processed my receipt using MTurk. Keep in mind we are still at an experimental stage and more information would be added in the future.



Monday, December 31, 2012

Receipt QR codes

I've always found QR codes to be quite interesting. The idea of tagging physical things in a reliable, easy for computers to read way has many interesting possibilities. Today I was spending some time thinking of a project to work on. I want to learn more about using Amazon's Web Services and wanted a project that would be suitably challenging but not too difficult. Well actually I got a bit sidetracked, but maybe we'll get back to AWS if you follow this blog for a while.

Like many engineers I like to think of myself as organized. I dutifully enter financial data into Quicken for instance. And that's what really got me thinking. I like services like Mint, it's great in fact, but it's not very accurate in tracking how I spend my money. For instance anything spent at a gas station shows up as Gas and Fuel but I can assure you it's just as likely to be chips and soda. So I prefer Quicken; it helps me budget more accurately and save by knowing what I really spend my money on. Wouldn't it be great if printed on the receipt was something I could scan with my smart phone and have it do more accurate data entry?

So I started to think about this in more detail. First, a "standard" QR code, the kind you most typically see, has a capacity of only 174 characters at its highest error correction level. There are higher density QR codes but I have doubts that they are printable on the standard thermal paper based receipt printers in most cash registers. So one thing to consider is going to a different format; there are lots of ways to print machine readable data on a cash register receipt. But it's hard to resist the QR code approach, as its popularity makes it instantly recognizable.

With 174 characters there's enough space for a date/time stamp, a company name, a transaction type (debit/cash/etc) and enough left over for at least 10 rows of category information. The total should probably be implicit as the sum of the categories. Let's use a simple pipe delimited format for an example (XML is too bulky for QR codes)
12311214:31|Joe's gas station|Debit|Fuel|45.00USD|Cash|Food|4.75USD

I made this a little more human readable than it needs to be; in particular, we could use a coding system for the categories, which would save a lot of space. The above is 67 characters long, leaving more than 100. I would love to be able to scan something like this in.
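To show how little code the reading side would need, here is a minimal sketch of a parser for a payload in the format of the example above. Everything in it is my own assumption rather than an existing standard: the class `ReceiptPayload`, the field layout (timestamp, business, then repeating groups of payment type, category and amount), and the rule that each amount ends in a three letter currency code. The timestamp is kept as raw text because its exact format has not been pinned down.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical parser for the pipe-delimited receipt payload sketched in this post. */
final class ReceiptPayload {

    /** One category row: how it was paid, what it was, and how much. */
    static final class Line {
        final String paymentType;
        final String category;
        final BigDecimal amount;
        final String currency;

        Line(String paymentType, String category, String amountWithCurrency) {
            if (amountWithCurrency.length() <= 3) {
                throw new IllegalArgumentException("bad amount: " + amountWithCurrency);
            }
            // Assumption: the last three characters are the currency code, e.g. "45.00USD".
            int split = amountWithCurrency.length() - 3;
            this.paymentType = paymentType;
            this.category = category;
            this.amount = new BigDecimal(amountWithCurrency.substring(0, split));
            this.currency = amountWithCurrency.substring(split);
        }
    }

    final String timestamp; // kept as raw text, not interpreted here
    final String business;
    final List<Line> lines;

    private ReceiptPayload(String timestamp, String business, List<Line> lines) {
        this.timestamp = timestamp;
        this.business = business;
        this.lines = lines;
    }

    /** Splits the payload on '|' and groups the trailing fields in threes. */
    static ReceiptPayload parse(String payload) {
        String[] fields = payload.split("\\|", -1);
        if (fields.length < 5 || (fields.length - 2) % 3 != 0) {
            throw new IllegalArgumentException("unexpected field count: " + fields.length);
        }
        List<Line> lines = new ArrayList<>();
        for (int i = 2; i < fields.length; i += 3) {
            lines.add(new Line(fields[i], fields[i + 1], fields[i + 2]));
        }
        return new ReceiptPayload(fields[0], fields[1], Collections.unmodifiableList(lines));
    }

    /** The receipt total is implicit: the sum of the category amounts (currencies are not checked). */
    BigDecimal total() {
        BigDecimal sum = BigDecimal.ZERO;
        for (Line line : lines) {
            sum = sum.add(line.amount);
        }
        return sum;
    }

    public static void main(String[] args) {
        ReceiptPayload receipt =
                parse("12311214:31|Joe's gas station|Debit|Fuel|45.00USD|Cash|Food|4.75USD");
        System.out.println(receipt.business + ": " + receipt.lines.size()
                + " rows, total " + receipt.total());
    }
}
```

Using BigDecimal rather than double keeps the cents exact, which matters if the total is meant to be derived from the rows.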

I think most stores' cash register software could accommodate this if there was a standard format. Many companies already break out a receipt like this to some extent in the human readable portion. Grocery stores even go into details like dairy vs. produce. Why should they do this? Well it's a convenience to their customers, it's a feature they can offer over competition, and if you added a secure hash of some kind you could tie a paper receipt to all sorts of online reward programs and opportunities.

I suppose some people might object on privacy concerns. A couple of thoughts: first, make sure the bar code is not encrypted (except the hash :), so anyone can see what the data actually says. Keep any personally identifying data out of it. If you do that it doesn't really contain anything sensitive anyway, nothing more sensitive than the original receipt. I suppose if the general population really objected to it you could tie it into the various reward card programs that are quite common. I don't foresee this being a big issue.

Next time I'm going to write about some alternate ideas to accomplish the same goal. What do you think?

Friday, December 28, 2012

Tools Sharp - The Economist

Sirs,

Keeping the tools sharp is a repeated theme of this blog. It's important to remember that this means more than keeping your software skills up to date. It means knowing enough about the world around you to take advantage of unexpected opportunities and also keeping a close eye out for warning signs that might affect your business or your lifestyle.

For many many years I subscribed to US News and World Report. This was the preeminent news magazine of its day. It didn't have the largest subscriber base but it had the broadest and deepest coverage of any American news weekly. Sadly I watched its rapid decline in the early 2000s, focusing more and more on sensationalist and celebrity stories while its pages of news, international and otherwise, were cut back.

Clearly this was a victim of the internet age and it is true that many publications experienced similar sharp declines. I could spend weeks on what happened to beloved cable channels as the reality television format took over like ivy climbing on a wall.

Nevertheless some periodicals made a choice to move to quality instead of the lowest common denominator. Surprisingly Rupert Murdoch has done well with the Wall Street Journal, no doubt because its audience is razor focused on financial information.

As for the American news weekly, we all saw the decline of Time, Newsweek and US News. All using differing combinations of celebrity and shocking news stories to sell issues while slowly minimizing their day to day hard news information that helps you understand how the world works.

Obviously as a person who makes my living off the internet I should probably just move to the many free web sites. Many of these places do good analysis but they don't do a good job putting together an objective picture. CNN used to be the best of this lot and yet they seem affected by the same trend of trivia and celebrity that so many other news sites have fallen into.

Despite the internet I keep several print subscriptions and The Economist is the most important. This magazine (or newspaper as it likes to call itself) presents the broadest and deepest news coverage of any weekly on the planet. It's not cheap, about $100 a year (or roughly 27 cents a day). But it covers every part of the world with a seriousness and purposefulness, yet also with a smart sense of humor, that makes it a must read the moment it hits my mailbox or my iPad. They do a technology review every quarter that is literally filled with good ideas for start up companies.

I have no financial interest in The Economist (at economist.com) but they are the last publication I would stop if forced to cut off my feed of information from my mailbox.

[Note: I edited this slightly to reduce any political references, which only add argument and reduce value, and fixed a couple of spelling mistakes at the same time.]

Sunday, December 23, 2012

Job Search Tools

Last time I briefly discussed what I have learned about the job situation for software engineers as of the end of 2012. I want to also spend some time talking about the tools that I've found to be most effective so far.

First, and really deserving of its own article, is LinkedIn. Any professional should be familiar with LinkedIn and have an up to date profile. Your job search should begin by looking through your contacts to see who's where. Reach out to the people in companies you are interested in working for and especially reach out to those people who know you well who might be in a position to hire you directly. You should always keep your LinkedIn contacts up to date as you work with people. Personally I keep my LinkedIn contacts limited to people I know fairly well... primarily because I personally don't find the friend-of-a-friend connections to be very useful. Your mileage may vary and certain jobs such as sales might encourage different tactics. But for engineering, it's all about knowing someone is capable of doing a quality job or knowing someone is good at estimating, or some other aspect of the job that just doesn't communicate very well to secondary and tertiary tiers of relationships.

Next up are the inevitable job search boards. Here in the Midwest careerlink.com is well known and popular. I have also personally found dice.com a good place to go; I use the feature they provide to email you the newest jobs matching certain search criteria. You can have up to five different such searches in their free tier. There are also the meta-job boards, boards that attempt to collate content from other job boards: simplyhired.com and indeed.com are the best known examples. Be certain to check the careers sections of larger companies you might be interested in. You should know the big players in your industry and check them directly.

If you are conducting a more open ended search or really hoping to move up the food chain I also recommend glassdoor.com, a website that provides inside reviews of companies somewhat like how Amazon products are reviewed. The same caveats must be applied: always throw out the best and worst reviews, but companies with a significant number of reviews should converge to an average. The main point of this site is to sort out great places to work from average or worse places to work.

Finally, remember to research prospective employers before an interview. Find out what's new, what their current products are, how they are doing in the market, etc. All of this information will allow you to ask thoughtful questions during the interview which is a somewhat neglected part of the process. Also, depending on your circumstances of course, you should try to keep some perspective that you are interviewing them just as much as they are interviewing you.