tag:blogger.com,1999:blog-73522367083144555862024-03-04T21:19:04.264-08:00Software Engineer - A JourneyMikehttp://www.blogger.com/profile/08589963072755856010noreply@blogger.comBlogger10125tag:blogger.com,1999:blog-7352236708314455586.post-25547066931695563662013-11-17T00:57:00.003-08:002013-11-17T00:57:35.318-08:00C++ Developers guide to upgrading to OSX 10.9 MavericksSo Apple's shiny new OS is out and there are a few adjustments every C++ developer needs to know. (Note: This is for people using Eclipse or a similar IDE, not XCode).<br />
<br />
First you'll need to go to the App Store and install the latest version of <a href="https://itunes.apple.com/us/app/xcode/id497799835">XCode</a>.<br />
<br />
If you use Eclipse (or probably any IDE) I recommend checking for updates. In Eclipse look for the option "Check for Updates" under the Help menu.<br />
<br />
Next, fire up Terminal; we need to install a few things.<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">$ xcode-select --install</span><br />
<br />
This installs the command line tools that Eclipse needs. You'll be prompted for your admin password.<br />
<br />
Next, if you like to use and install open source software, you'll need to reinstall a few things. You probably already have a directory you download and install packages from; if not, create something like ~/packages and work from there. (Thanks to https://wiki.documentfoundation.org/Development/BuildingOnMac for the following instructions.)<br />
<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">$ curl -O http://mirrors.kernel.org/gnu/autoconf/autoconf-2.65.tar.gz</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ tar -xf autoconf-2.65.tar.gz</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ cd autoconf-2.65</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ ./configure</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ make</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ sudo make install</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">$ curl -O http://mirrors.kernel.org/gnu/automake/automake-1.11.tar.gz</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ tar -xf automake-1.11.tar.gz</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ cd automake-1.11</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ ./configure</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ make</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ sudo make install</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">$ curl -OL http://ftpmirror.gnu.org/libtool/libtool-2.4.2.tar.gz</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ tar -xf libtool-2.4.2.tar.gz</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ cd libtool-2.4.2</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ ./configure</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ make</span><br />
<span style="font-family: Courier New, Courier, monospace;">$ sudo make install</span><br />
<br />
Finally you'll probably need to rebuild and relink any projects you have in Eclipse. Remember to do a clean first so everything gets rebuilt.<br />
<br />
Now you should be good to go.<br />
<br />Mikehttp://www.blogger.com/profile/08589963072755856010noreply@blogger.com2tag:blogger.com,1999:blog-7352236708314455586.post-65264774175730560262013-01-09T22:54:00.000-08:002013-01-09T22:54:33.075-08:00Mechanical Turk - ConclusionsIn this simple proof of concept I asked ten people to process a receipt, giving information about the business, date of purchase and the category of purchase. Some of this is clearly possible to automate with OCR; however, the detail and accuracy required to make this useful make it a good candidate for a Mechanical Turk task.<br />
<br />
So how did people do?<br />
<br />
Well, it took about 20 minutes for all 10 HITs to be processed. That's not too bad, and I'm sure response time is highly sensitive to price. I suspect 11 cents for each of these HITs is on the high side, since processing one probably takes 30 seconds or so at most. That said, this is an interesting design decision you have to make: it's pretty clear that response times are not going to be very predictable, and definitely not close to real time unless you pay a significant premium. If you really have a popular application and people get to know your tasks (something you can easily do with MTurk), then you might begin to achieve quicker and more consistent results, especially since the system allows you to tip particularly diligent workers.<br />
<br />
As for accuracy, all ten people correctly identified the business and 9 out of 10 got the date right. The one person who didn't reversed the month and day despite explicit instructions in the HIT to watch out for this. The context of the receipt should have made it pretty clear that it was from the US, but I suppose it's possible the actual mistake was thinking this was a non-US receipt, and that my instruction led the worker to believe the date should be switched.<br />
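<br />
A strict parse can flag this kind of answer automatically. The following is a minimal Java sketch, not code from the project: the class name ReceiptDateCheck and the sample dates are hypothetical, and it assumes answers arrive as two-digit mm/dd/yy strings. It reports an answer as ambiguous when it is a valid date in both month-first and day-first order, which is exactly the case where a swap cannot be detected from the text alone.<br />
<br />

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

// Hypothetical helper for checking free-text date answers; not part of the MTurk SDK.
final class ReceiptDateCheck {
    // STRICT rejects impossible dates such as 02/30/13 instead of adjusting them.
    static final DateTimeFormatter MONTH_FIRST =
        DateTimeFormatter.ofPattern("MM/dd/uu").withResolverStyle(ResolverStyle.STRICT);
    static final DateTimeFormatter DAY_FIRST =
        DateTimeFormatter.ofPattern("dd/MM/uu").withResolverStyle(ResolverStyle.STRICT);

    // Returns the parsed date, or empty if the answer is not a valid date in this format.
    static Optional<LocalDate> parse(String answer, DateTimeFormatter format) {
        try {
            return Optional.of(LocalDate.parse(answer.trim(), format));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // True when the answer is a valid but different date in both orders,
    // so a month/day swap would go unnoticed.
    static boolean isAmbiguous(String answer) {
        Optional<LocalDate> monthFirst = parse(answer, MONTH_FIRST);
        Optional<LocalDate> dayFirst = parse(answer, DAY_FIRST);
        return monthFirst.isPresent() && dayFirst.isPresent() && !monthFirst.equals(dayFirst);
    }
}
```

<br />
For a hypothetical answer of 01/09/13, isAmbiguous returns true, so the answer deserves a second look; 11/17/13 can only be read month-first and is not flagged.<br />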
<br />
As for categorization, it was all over the place. I would rate 3 out of 10 as correct answers (something along the lines of hardware or home improvement). Many people put food; although there was a food item, it was only a small part of the total, and the directions were to categorize the most significant part of the purchase. In a full application I would need to implement a drop-down menu or multiple choice mechanism to get consistent categories, but the issue of 5 out of 10 categorizing as food is a fundamentally different problem. I will need to experiment with better directions to see if I can improve this accuracy.<br />
<br />
MTurk provides a way to have a second person validate the result of the first person. This of course increases cost because it's a separate HIT, and it would lengthen response time as well, but it appears to be a necessary step in quality control. MTurk also lets you "qualify" people, which might help in this case. For instance, I need people who understand English well enough to decipher the extreme abbreviations that are on some receipts and who can use context, such as this being a hardware store receipt, to help with categorization.<br />
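<br />
Since each receipt was already answered by ten people, a cheaper first check is a plain majority vote over those answers. This is a swapped-in technique, not a built-in MTurk feature, and the sketch below is only illustrative: the class name AnswerVote, the minShare threshold, and any sample answers are assumptions.<br />
<br />

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: picks the most common free-text answer among several workers.
final class AnswerVote {
    // Trim and lower-case so "Food " and "food" count as the same answer.
    static String normalize(String answer) {
        return answer.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the most frequent answer if its share of all answers is at least
    // minShare (for example 0.6), otherwise empty. Ties are broken arbitrarily,
    // so use a minShare above 0.5 if a tie must never win.
    static Optional<String> majority(List<String> answers, double minShare) {
        Map<String, Integer> counts = new HashMap<>();
        for (String answer : answers) {
            counts.merge(normalize(answer), 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        if (best == null || (double) bestCount / answers.size() < minShare) {
            return Optional.empty();
        }
        return Optional.of(best);
    }
}
```

<br />
With nine matching dates out of ten and a threshold of 0.6, the vote returns the agreed date. With five "food", three "hardware" and two other answers, the same threshold returns nothing, and lowering it would simply return "food", so voting alone does not fix a mistake that most workers share.<br />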
<br />
One fascinating idea might be to try to preprocess receipts with OCR and use MTurk for confirmation and correction instead. Recall that the goal of this proof of concept is to develop a system that can accurately and automatically categorize receipts for a program like Quicken or Mint and make it easier for people to use that information for more detailed budgeting and money management.<br />
<br />
Overall I think MTurk presents some unique capabilities that have not been widely exploited, especially in consumer applications. However, there are challenges, and being able to afford MTurk even for just a few cents per transaction will be a barrier for many possible applications.Mikehttp://www.blogger.com/profile/08589963072755856010noreply@blogger.com0tag:blogger.com,1999:blog-7352236708314455586.post-36750504162035450932013-01-04T10:20:00.001-08:002013-01-04T10:20:31.123-08:00Implementing receipt processing with the Mechanical TurkThe process of developing for the Mechanical Turk (MTurk) revolves around creating HITs (Human Intelligence Tasks). To use MTurk you write a program (in a choice of languages) that creates one or more HITs, which are then processed by humans. A HIT is a self-contained entity that explains the task, gives the information needed to solve the task and asks for one or more answers. Each HIT is worth some amount of money specified by the creator (typically a few cents). As HITs are completed you can also automate data collection and payment for the HITs. There are a variety of quality control mechanisms in place, but quality control remains a critical part of development.<br />
<br />
Amazon has a nice getting started guide called <i><a href="http://www.amazon.com/Amazon-Mechanical-Getting-Started-ebook/dp/B007USB0FY/ref=sr_1_1?ie=UTF8&qid=1357322461&sr=8-1&keywords=mechanical+turk+getting+started">Mechanical Turk Getting Started Guide</a></i>. MTurk supports a variety of interfaces including the command line, C#, Java, Perl, and Ruby. My work shown here is based on examples from that guide using the Java SDK and Eclipse.<br />
<br />
MTurk has three built-in answer types: free-form text, multiple choice, and file upload. However, because you can host the HIT on your own website, it can actually be anything, an HTML form or a Java applet for instance. For this example I wrote code to display a receipt and ask three free-form questions about it: what was the date of purchase, what was the name of the store, and what was the category of the spending.<br />
<br />
I was impressed with the speed of development. I went from knowing nothing about MTurk to having answers back from real users in the production HIT environment in about three hours.<br />
<br />
The HIT is defined in an XML structure called a QuestionForm documented <a href="http://docs.aws.amazon.com/AWSMechTurk/2007-03-12/AWSMechanicalTurkRequester/ApiReference_QuestionFormDataStructureArticle.html">here</a>. This is the QuestionForm I used for my project:<br />
<br />
<br />
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><?<span class="s1">xml</span> version="1.0"?></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><QuestionForm <span class="s1">xmlns</span>="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd"></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Overview></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Text></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> Your task is to examine the image of the receipt shown below and answer some basic information about it. If any information is unreadable or unavailable for some reason put unknown.</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </Text></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Binary></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <MimeType></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Type>image</Type></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <SubType><span class="s1">jpg</span></SubType></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </MimeType></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <DataURL>http://www.thenextwave.com/images/IMG_0018.jpg</DataURL></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <AltText>An image of a receipt from a store.</AltText></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </Binary></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </Overview></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionIdentifier>Business Name</QuestionIdentifier></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Text>What is the name of the business where the purchase was made?</Text></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionIdentifier>Purchase Date</QuestionIdentifier></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Text>What is the date of purchase? Use <span class="s1">mm</span>/<span class="s1">dd</span>/<span class="s1">yy</span> format please.</Text></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionIdentifier>Category</QuestionIdentifier></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Text>What is the general category of the purchase: food, fuel, clothes, etc. Use your best judgment to give an overall category. If no clear category choice is available use other.</Text></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <NumberOfLinesSuggestion>1</NumberOfLinesSuggestion></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionIdentifier>comments</QuestionIdentifier></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <Text>Please help us improve this HIT by including any Questions and/or Comments (optional):</Text></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </QuestionContent></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <NumberOfLinesSuggestion>10</NumberOfLinesSuggestion></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </FreeTextAnswer></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </AnswerSpecification></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </Question></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"></QuestionForm></span></div>
<br />
<br />
This is actually pretty easy to read. The first part is a description of the task and then there are four questions. (The last question is a request for suggestions).<br />
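<br />
The free-text Category question could instead offer a fixed list of choices. The following Java sketch generates an AnswerSpecification that uses SelectionAnswer in place of FreeTextAnswer. The element names (SelectionAnswer, StyleSuggestion, Selections, Selection, SelectionIdentifier) follow my reading of the QuestionForm schema and should be checked against the documentation linked above; the class name CategoryQuestion and the category list are hypothetical.<br />
<br />

```java
import java.util.List;

// Hypothetical generator for a multiple-choice AnswerSpecification block.
final class CategoryQuestion {
    // Escape the characters that would otherwise break the XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds an AnswerSpecification with one Selection per category,
    // suggested to render as a drop-down (assumed schema element names).
    static String selectionAnswer(List<String> categories) {
        StringBuilder xml = new StringBuilder();
        xml.append("<AnswerSpecification>\n");
        xml.append("  <SelectionAnswer>\n");
        xml.append("    <StyleSuggestion>dropdown</StyleSuggestion>\n");
        xml.append("    <Selections>\n");
        for (String category : categories) {
            String safe = escape(category);
            xml.append("      <Selection>\n");
            xml.append("        <SelectionIdentifier>").append(safe).append("</SelectionIdentifier>\n");
            xml.append("        <Text>").append(safe).append("</Text>\n");
            xml.append("      </Selection>\n");
        }
        xml.append("    </Selections>\n");
        xml.append("  </SelectionAnswer>\n");
        xml.append("</AnswerSpecification>\n");
        return xml.toString();
    }
}
```

<br />
Calling selectionAnswer with a list such as food, fuel, clothes, hardware and other yields a block that could replace the AnswerSpecification of the Category question. Running the result through the SDK's QAPValidator before submitting would catch any schema mistake in this sketch.<br />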
<br />
Following the instructions in the getting started guide, I loaded the MTurk SDK onto my machine and fired up Eclipse. Here is the simple source code that submits the above question to MTurk:<br />
<br />
MturkMain1.java:<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">package createnewhit;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">/*</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * Copyright 2007-2012 Amazon Technologies, Inc.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * Licensed under the Apache License, Version 2.0 (the "License");</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * you may not use this file except in compliance with the License.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * You may obtain a copy of the License at:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * http://aws.amazon.com/apache2.0</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * OR CONDITIONS OF ANY KIND, either express or implied. See the</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * License for the specific language governing permissions and</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * limitations under the License.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> */ </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">import com.amazonaws.mturk.addon.HITProperties;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">import com.amazonaws.mturk.addon.HITQuestion;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">import com.amazonaws.mturk.addon.QAPValidator;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">import com.amazonaws.mturk.requester.HIT;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">import com.amazonaws.mturk.service.axis.RequesterService;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">import com.amazonaws.mturk.service.exception.ValidationException;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">import com.amazonaws.mturk.util.PropertiesClientConfig;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">/**</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * This is a try at using the Mechanical Turk to process photographs of receipts.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * mturk.properties must be found in the current file path.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * The following concepts are covered:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * - Using the <FormattedContent> functionality in QAP</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * - File based QAP and HIT properties HIT loading </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * - Validating the correctness of QAP</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * - Using a basic system qualification</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * - Previewing the HIT as HTML</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> *</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> */</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">public class MturkMain1</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">{</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> private RequesterService service;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // Defining the location of the file containing the QAP and the properties of the HIT</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> private String rootDir = ".";</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> private String questionFile = rootDir + "/receipt_categorize.question";</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> private String propertiesFile = rootDir + "/mturk.properties";</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> /**</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * Constructor</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> *</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> */</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> public MturkMain1() {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> service = new RequesterService(new PropertiesClientConfig());</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> }</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> /**</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * Check to see if your account has sufficient funds</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * @return true if there are sufficient funds. False if not.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> */</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> public boolean hasEnoughFund() {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> double balance = service.getAccountBalance();</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> System.out.println("Got account balance: " + RequesterService.formatCurrency(balance));</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> return balance > 0;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> }</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> /**</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * Creates the receipt categorization HIT</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * The question and HIT properties are loaded from files, the question is validated,</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * and the HIT is created on Mechanical Turk. No preview file is generated.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> */</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> public void createReceiptCategoryQuestion() {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> try {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> //Loading the HIT properties file. HITProperties is a helper class that contains the </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> //properties of the HIT defined in the external file. This feature allows you to define</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> //the HIT attributes externally as a file and be able to modify it without recompiling your code.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> //In this sample, the qualification is defined in the properties file.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> HITProperties props = new HITProperties(propertiesFile);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> //Loading the question (QAP) file. </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> HITQuestion question = new HITQuestion(questionFile);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // Validate the question (QAP) against the XSD Schema before making the call.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // If there is an error in the question, ValidationException gets thrown.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // This method is extremely useful in debugging your QAP. Use it often.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> QAPValidator.validate(question.getQuestion());</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // Create a HIT using the properties and question files</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> HIT hit = service.createHIT(null, // HITTypeId </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> props.getTitle(), </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> props.getDescription(), props.getKeywords(), // keywords </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> question.getQuestion(),</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> props.getRewardAmount(), props.getAssignmentDuration(),</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> props.getAutoApprovalDelay(), props.getLifetime(),</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> props.getMaxAssignments(), props.getAnnotation(), // requesterAnnotation </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> props.getQualificationRequirements(),</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> null // responseGroup</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> );</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> System.out.println("Created HIT: " + hit.getHITId());</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> System.out.println("You may see your HIT with HITTypeId '" </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> + hit.getHITTypeId() + "' here: ");</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> System.out.println(service.getWebsiteURL() </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> + "/mturk/preview?groupId=" + hit.getHITTypeId());</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> } catch (ValidationException e) {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // The validation exception gives good insight into where the QAP has errors. </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // However, using a third-party XML schema validator as well is recommended, </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> // since it makes issues easier to find and fix.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> System.err.println("QAP contains an error: " + e.getLocalizedMessage()); </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> } catch (Exception e) {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> System.err.println(e.getLocalizedMessage());</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> }</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> }</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> /**</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> * @param args</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> */</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> public static void main(String[] args) {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> MturkMain1 app = new MturkMain1();</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> app.createReceiptCategoryQuestion();</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> }</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">}</span><br />
<div>
<br /></div>
<div>
Each time this program is executed it creates one or more HITs based on the question file. The HITs go to either the sandbox where you can look at them and work on them yourself for free, or to the production environment where people will work on them for real money.</div>
<div>
<br /></div>
<div>
One last file is needed to make this work: a properties file that defines configuration information, in particular whether you are creating HITs in the sandbox or in the production environment.</div>
<div>
<br /></div>
<div>
mturk.properties:</div>
<div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">#</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># You can find your access keys by going to aws.amazon.com, hovering your mouse over "Your Web Services Account" in the top-right</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># corner and selecting View Access Key Identifiers. Be sure to log in with the same <span class="s1">username</span> and password you registered with your</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># Mechanical <span class="s1">Turk</span> Requester account. </span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">#</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># If you don't yet have a Mechanical <span class="s1">Turk</span> Requester account, you can create one by visiting http://requester.mturk.com/</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p3">
<span class="s2"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">access_key=</span></span></div>
<div class="p3">
<span class="s2"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">secret_key=</span></span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># by default, will first load keys from <USER_HOME_DIR>/.<span class="s1">aws</span>/<span class="s1">auth</span></span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">######################################</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">## Basic HIT Properties</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">######################################</span></div>
<div class="p3">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="s2">title:</span>Process<span class="s2"> </span>an<span class="s2"> </span>image<span class="s2"> </span>of<span class="s2"> </span>a<span class="s2"> </span>receipt<span class="s2"> </span>and<span class="s2"> </span>give<span class="s2"> </span>basic<span class="s2"> </span>information<span class="s2"> </span>about<span class="s2"> </span>it.</span></div>
<div class="p3">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="s2">description:</span>The<span class="s2"> </span>task<span class="s2"> </span>is<span class="s2"> </span>to<span class="s2"> </span>review<span class="s2"> </span>a<span class="s2"> </span>receipt,<span class="s2"> </span>enter<span class="s2"> </span>information<span class="s2"> </span>about<span class="s2"> </span>that<span class="s2"> </span>receipt<span class="s2"> </span>including<span class="s2"> </span>date,<span class="s2"> </span>time,<span class="s2"> </span>company,<span class="s2"> </span>and<span class="s2"> </span>a<span class="s2"> </span>categorization<span class="s2"> </span>of<span class="s2"> </span>the<span class="s2"> </span>spending<span class="s2"> </span>such<span class="s2"> </span>as<span class="s2"> </span>"groceries"</span></div>
<div class="p3">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="s2">keywords:</span>receipt,<span class="s2"> </span>categorize,<span class="s2"> </span>image</span></div>
<div class="p4">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">reward:<span class="s3">0.11</span></span></div>
<div class="p4">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">assignments:<span class="s3">10</span></span></div>
<div class="p3">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="s2">annotation:</span>sample#image</span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">#</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># -------------------</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># ADVANCED PROPERTIES</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># -------------------</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">#</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># If you want to test your solution in the Amazon Mechanical <span class="s1">Turk</span> Developers <span class="s1">Sandbox</span> (http://sandbox.mturk.com)</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># use the service_url defined below:</span></div>
<div class="p3">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="s2">service_url=</span>https<span class="s2">:</span>//mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester</span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># If you want to have your solution work against the Amazon <span class="s1">Mechanical</span> <span class="s1">Turk</span> Production site (http://www.mturk.com)</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># use the service_url defined below:</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">#service_url=https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester</span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">#list of comma separated <span class="s1">retriable</span> errors which will be retried by RetryFilter</span></div>
<div class="p3">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="s2">retriable_errors=</span>Server.ServiceUnavailable</span></div>
<div class="p4">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">retry_attempts=<span class="s3">10</span></span></div>
<div class="p4">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">retry_delay_millis=<span class="s3">1000</span></span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">######################################</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">## HIT Timing Properties</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">######################################</span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># this Assignment Duration value is 60*5 = 5 minutes</span></div>
<div class="p4">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">assignmentduration:<span class="s3">300</span></span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># this HIT Lifetime value is 60*60*24*3 = 3 days</span></div>
<div class="p4">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">hitlifetime:<span class="s3">259200</span></span></div>
<div class="p2">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># this Auto Approval period is 60*60*24*15 = 15 days</span></div>
<div class="p4">
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">autoapprovaldelay:<span class="s3">1296000</span></span></div>
</div>
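<br />
The timing values above are plain seconds, and the service_url line decides which environment receives the HITs. As a minimal sketch of checking both before spending real money, the standard java.util.Properties class can read the key=value and key:value lines alike. The class name MturkConfigCheck, the inlined sample text, and the ".sandbox." host test are assumptions made for illustration; none of them come from the MTurk SDK.<br />
<br />

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

// Illustrative helper, not part of the MTurk SDK.
class MturkConfigCheck {

    // A few lines copied from mturk.properties; a real program would read the file itself.
    static final String SAMPLE =
          "service_url=https://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester\n"
        + "assignmentduration:300\n"
        + "hitlifetime:259200\n"
        + "autoapprovaldelay:1296000\n";

    // java.util.Properties accepts both '=' and ':' as the key/value separator.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumption: a sandbox endpoint is recognizable by ".sandbox." in its host name.
    static boolean isSandbox(Properties props) {
        return props.getProperty("service_url", "").contains(".sandbox.");
    }

    // Reads a timing property that is expressed in seconds.
    static long seconds(Properties props, String key) {
        return Long.parseLong(props.getProperty(key).trim());
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        System.out.println("Sandbox: " + isSandbox(props));
        System.out.println("Assignment duration (minutes): "
            + TimeUnit.SECONDS.toMinutes(seconds(props, "assignmentduration")));
        System.out.println("HIT lifetime (days): "
            + TimeUnit.SECONDS.toDays(seconds(props, "hitlifetime")));
        System.out.println("Auto approval delay (days): "
            + TimeUnit.SECONDS.toDays(seconds(props, "autoapprovaldelay")));
    }
}
```

Running a check like this before each batch is a cheap way to avoid posting test HITs to production by accident.<br />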
<br />
<br />
Next time I'll talk about the effectiveness of using the Mechanical Turk.<br />
<br />
<br />
<br />Mike http://www.blogger.com/profile/08589963072755856010 noreply@blogger.com 2<br />
tag:blogger.com,1999:blog-7352236708314455586.post-4609378727042813396 2013-01-03T11:38:00.001-08:00 2013-01-03T11:38:46.972-08:00<br />
Receipts and the Mechanical Turk<br />
<br />
Last time I discussed how using a QR code to categorize receipts would make data entry easier. Unfortunately, this solution requires the cooperation of the stores. While easily doable, it is hard to get the momentum going to add such a feature. Another thought I had at the same time was using Amazon's Mechanical Turk to do a similar process. I have actually implemented a simple version of this and run a few examples through the production Turk. I think people will find this very interesting, both for the specific problem and as a good, yet still fairly simple, example of using MTurk (as Amazon likes to call it). See <a href="https://www.mturk.com/mturk/welcome">https://www.mturk.com/mturk/welcome</a> for the official Amazon site.<br />
<br />
First, for those not familiar, MTurk is a platform that allows the creation of small tasks that are meant to be done by real people in exchange for a small fee, typically a few cents. Examples tend to include "identify the color of the automobile in a picture" or "determine the hours of operation at a particular business". These are trivially easy for a person to accomplish but not for a computer.<br />
<br />
My idea is to take a picture of a receipt and extract information from it such as the date of purchase, the name of the business, and the category of spending. Now some of you out there are probably image processing experts and might try to do a full computerized solution to this problem, but I think most people will admit that although you can convert the image to text, trying to parse out information like the date of purchase is really hard. Go ahead and take a look at a few receipts and see what you think. I'll wait...<br />
<br />
So the idea is to submit the image to MTurk and let a human do the job in a few seconds. I found this a very interesting experiment from both the software and the human perspective. It is true that MTurk costs real money, but I spent less than $5.00 on my experiments, and you can use and develop for it without spending anything; it only costs money when real people actually start working for you.<br />
<br />
To keep this from being an enormous blog post I am going to split this up over several posts and include implementation details later. I will end here with a screenshot of what the person saw when they processed my receipt using MTurk. Keep in mind we are still at an experimental stage and more information would be added in the future.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUy-QP1G7UHNDwYMIK3EtSQ3LRxLDb_gUGa-8hu6VixsiobsJvYE7AeN1AJMAT5sO3O8uj1R7nlS3L3GS2U0mMSeBPZXKPam_-Hg9vwru-imfc_SQel449BnoAqXq3B3Ss1K0IUvowbwA/s1600/MTurk+HIT.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUy-QP1G7UHNDwYMIK3EtSQ3LRxLDb_gUGa-8hu6VixsiobsJvYE7AeN1AJMAT5sO3O8uj1R7nlS3L3GS2U0mMSeBPZXKPam_-Hg9vwru-imfc_SQel449BnoAqXq3B3Ss1K0IUvowbwA/s1600/MTurk+HIT.jpg" /></a></div>
<br />
<br />Mike http://www.blogger.com/profile/08589963072755856010 noreply@blogger.com 12<br />
tag:blogger.com,1999:blog-7352236708314455586.post-5026687226666316724 2012-12-31T22:24:00.001-08:00 2012-12-31T22:28:36.957-08:00<br />
Receipt QR codes<br />
<br />
I've always found QR codes to be quite interesting. The idea of tagging physical things in a reliable way that is easy for computers to read has many interesting possibilities. Today I was spending some time thinking of a project to work on. I want to learn more about using Amazon's Web Services and wanted a project that would be suitably challenging but not too difficult. Well, actually I got a bit sidetracked, but maybe we'll get back to AWS if you follow this blog for a while.<br />
<br />
Like many engineers I like to think of myself as organized. I dutifully enter financial data into Quicken, for instance. And that's what really got me thinking. I like services like Mint, it's great in fact, but it's not very accurate in tracking how I spend my money. For instance, anything spent at a gas station shows up as Gas and Fuel, but I can assure you it's just as likely to be chips and soda. So I prefer Quicken; it helps me budget more accurately and save by knowing what I really spend my money on. Wouldn't it be great if printed on the receipt was something I could scan with my smartphone and have it do more accurate data entry?<br />
<br />
So I started to think about this in more detail. First, a "standard" QR code, the kind you most typically see, has a capacity of only 174 characters at its highest error correction level. There are higher density QR codes but I have doubts that they are printable on the standard thermal paper based receipt printers in most cash registers. So one thing to consider is going to a different format; there are lots of ways to print machine readable data on a cash register receipt. But it's hard to resist the QR code approach, as its popularity makes it instantly recognizable.<br />
<br />
With 174 characters there's enough space for a date/time stamp, a company name, a transaction type (debit/cash/etc.) and enough left over for at least 10 rows of category information. The total should probably be implicit as the sum of the categories. Let's use a simple pipe-delimited format for an example (XML is too bulky for QR codes):
<div style="text-align: center;">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">12311214:31|Joes's gas station|Debit|Fuel|45.00USD|Cash|Food|4.75USD</span></div>
<div style="text-align: center;">
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj96R1zPgOa6WZfWiwu2KZcGoiS1pCeKushIMRnZ8uz6rJ4CL9DGQOGU5tew7OUGjhrVLjzpPZU429o1iLcNBOLam9dz5Ibp1BWVR1U7WzjiTfyYW6cA6iIZbaMbQ7bTt_TWN9l9LOU8rg/s1600/qrcode.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj96R1zPgOa6WZfWiwu2KZcGoiS1pCeKushIMRnZ8uz6rJ4CL9DGQOGU5tew7OUGjhrVLjzpPZU429o1iLcNBOLam9dz5Ibp1BWVR1U7WzjiTfyYW6cA6iIZbaMbQ7bTt_TWN9l9LOU8rg/s200/qrcode.png" width="200" /></a></div>
</div>
I made this a little more human readable than it needs to be; in particular, we could use a coding system for the categories, which would save a lot of space. The above is 68 characters long, leaving more than 100 to spare. I would love to be able to scan something like this in.<br />
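<br />
A reader for this format could be sketched roughly as follows. The field layout is only one reading of the example string: a timestamp, a company name, then repeating groups of payment type, category, and amount with a three-letter currency suffix. The ReceiptCode class and its method names are hypothetical, invented for this illustration.<br />
<br />

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the pipe-delimited receipt string shown above.
class ReceiptCode {

    static final String SAMPLE =
        "12311214:31|Joes's gas station|Debit|Fuel|45.00USD|Cash|Food|4.75USD";

    // Capacity quoted in the post for a "standard" QR code.
    static final int QR_CAPACITY = 174;

    final String timestamp;
    final String company;
    // Each row holds {payment type, category, amount with currency suffix}.
    final List<String[]> rows = new ArrayList<>();

    ReceiptCode(String encoded) {
        String[] fields = encoded.split("\\|");
        // Assumed layout: two header fields followed by one or more groups of three.
        if (fields.length < 5 || (fields.length - 2) % 3 != 0) {
            throw new IllegalArgumentException("Unexpected field count: " + fields.length);
        }
        timestamp = fields[0];
        company = fields[1];
        for (int i = 2; i < fields.length; i += 3) {
            rows.add(new String[] {fields[i], fields[i + 1], fields[i + 2]});
        }
    }

    // The total is implicit: the sum of the row amounts, ignoring the currency suffix.
    BigDecimal total() {
        BigDecimal sum = BigDecimal.ZERO;
        for (String[] row : rows) {
            String amount = row[2].substring(0, row[2].length() - 3);
            sum = sum.add(new BigDecimal(amount));
        }
        return sum;
    }

    public static void main(String[] args) {
        ReceiptCode receipt = new ReceiptCode(SAMPLE);
        System.out.println(receipt.company + " at " + receipt.timestamp);
        for (String[] row : receipt.rows) {
            System.out.println(row[0] + " / " + row[1] + " / " + row[2]);
        }
        System.out.println("Total: " + receipt.total());
        System.out.println("Characters left: " + (QR_CAPACITY - SAMPLE.length()));
    }
}
```

BigDecimal is used for the amounts so that adding cents does not pick up floating point rounding errors.<br />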
<br />
I think most stores' cash register software could accommodate this if there was a standard format. Many companies already break out a receipt like this to some extent in the human readable portion. Grocery stores even go into details like dairy vs. produce. Why should they do this? Well, it's a convenience to their customers, it's a feature they can offer over the competition, and if you added a secure hash of some kind you could tie a paper receipt to all sorts of online reward programs and opportunities.<br />
<br />
I suppose some people might object on privacy grounds. A couple of thoughts: first, make sure the bar code is not encrypted (except the hash :), so anyone can see what the data actually says. Second, keep any personally identifying data out of it. If you do that, it doesn't contain anything more sensitive than the original receipt. I suppose if the general population really objected to it you could tie it into the various reward card programs that are quite common. I don't foresee this being a big issue.<br />
<br />
Next time I'm going to write about some alternate ideas to accomplish the same goal. What do you think?<br />
<br />Mike http://www.blogger.com/profile/08589963072755856010 noreply@blogger.com 2<br />
tag:blogger.com,1999:blog-7352236708314455586.post-1824965990143778034 2012-12-28T23:49:00.000-08:00 2012-12-31T22:37:16.964-08:00<br />
Tools Sharp - The Economist<br />
<br />
Sirs,<br />
<br />
Keeping the tools sharp is a repeated theme of this blog. It's important to remember that this means more than keeping your software skills up to date. It means knowing enough about the world around you to take advantage of unexpected opportunities and also keeping a close eye out for warning signs that might affect your business or your lifestyle.<br />
<br />
For many, many years I subscribed to <i>US News and World Report</i>. This was the preeminent news magazine of its day. It didn't have the largest subscriber base but it had the broadest and deepest coverage of any American news weekly. Sadly, I watched its rapid decline in the early 2000s as it focused more and more on sensationalist and celebrity stories while its pages of news, international and otherwise, were cut back.<br />
<br />
Clearly this was a victim of the internet age, and it is true that many publications experienced similar sharp declines. I could spend weeks on what happened to beloved cable channels as the reality television format took over like ivy climbing on a wall.<br />
<br />
Nevertheless, some periodicals made a choice to move to quality instead of the lowest common denominator. Surprisingly, Rupert Murdoch has done well with the <i>Wall Street Journal</i>, no doubt because its audience is razor focused on financial information.<br />
<br />
As for the American news weekly, we all saw the decline of <i>Time</i>, <i>Newsweek</i> and <i>US News</i>, all of them using differing combinations of celebrity and shocking news stories to sell issues while slowly minimizing the day-to-day hard news that helps you understand how the world works.<br />
<br />
Obviously, as a person who makes my living off the internet, I should probably just move to the many free web sites. Many of these places do good analysis but they don't do a good job putting together an objective picture. CNN used to be the best of this lot, and yet they seem affected by the same trend of trivia and celebrity that so many other news sites have fallen into.<br />
<br />
Despite the internet I keep several print subscriptions and <i>The Economist</i> is the most important. This magazine (or newspaper, as it likes to call itself) presents the broadest and deepest news coverage of any weekly on the planet. It's not cheap, about $100 a year (or about $0.27 a day). But it covers every part of the world with a seriousness and purposefulness, yet also with a smart sense of humor, that makes it a must read the moment it hits my mailbox or my iPad. They do a technology review every quarter that is literally filled with good ideas for startup companies.<br />
<br />
I have no financial interest in The Economist (at <a href="http://economist.com/">economist.com</a>) but they are the last publication I would stop if forced to cut off my feed of information from my mailbox.<br />
<br />
[Note: I edited this slightly to reduce any political references, which only add argument and reduce value, and fixed a couple of spelling mistakes at the same time.]<br />
Mike http://www.blogger.com/profile/08589963072755856010 noreply@blogger.com 0<br />
tag:blogger.com,1999:blog-7352236708314455586.post-8722513597495577801 2012-12-23T16:16:00.003-08:00 2012-12-23T16:16:48.698-08:00<br />
Job Search Tools<br />
<br />
Last time I briefly discussed what I have learned about the job situation for software engineers as of the end of 2012. I want to also spend some time talking about the tools that I've found to be most effective so far.<br />
<br />
First, and really deserving of its own article, is LinkedIn. Any professional should be familiar with LinkedIn and have an up to date profile. Your job search should begin by looking through your contacts and seeing who's where. Reach out to the people in companies you are interested in working for, and especially reach out to those people who know you well who might be in a position to hire you directly. You should always keep your LinkedIn contacts up to date as you work with people. Personally I keep my LinkedIn contacts limited to people I know fairly well... primarily because I personally don't find the friend-of-a-friend connections to be very useful. Your mileage may vary and certain jobs such as sales might encourage different tactics. But for engineering, it's all about knowing someone is capable of doing a quality job or knowing someone is good at estimating, or some other aspect of the job that just doesn't communicate very well to secondary and tertiary tiers of relationships.<br />
<br />
Next up are the inevitable job search boards. Here in the Midwest <a href="http://careerlink.com/">careerlink.com</a> is well known and popular. I have also personally found dice.com a good place to go; I use the feature they provide to email you the newest jobs matching certain search criteria. You can have up to five different such searches in their free tier. There are also the meta-job boards, boards that attempt to collate content from other job boards: <a href="http://simplyhired.com/">simplyhired.com</a> and <a href="http://indeed.com/">indeed.com</a> are the best known examples. Be certain to check the careers sections of larger companies you might be interested in. You should know the big players in your industry and check them directly.<br />
<br />
If you are conducting a more open ended search or really hoping to move up the food chain I also recommend <a href="http://glassdoor.com/">glassdoor.com</a>, a website that provides inside reviews of companies, somewhat like how Amazon products are reviewed. The same caveats apply: always throw out the best and worst reviews, but companies with a significant number of reviews should converge to an average. The main point of this site is to sort out great places to work from average or worse places to work.<br />
<br />
Finally, remember to research prospective employers before an interview. Find out what's new, what their current products are, how they are doing in the market, etc. All of this information will allow you to ask thoughtful questions during the interview which is a somewhat neglected part of the process. Also, depending on your circumstances of course, you should try to keep some perspective that you are interviewing them just as much as they are interviewing you.<br />
<br />
<br />Mike http://www.blogger.com/profile/08589963072755856010 noreply@blogger.com 0<br />
tag:blogger.com,1999:blog-7352236708314455586.post-3135227568879430311 2012-12-20T21:45:00.001-08:00 2012-12-20T21:45:38.504-08:00<br />
The Midwest Job Market<br />
<br />
Living in Lincoln, Nebraska, my job search has focused on the Midwest, primarily cities within a reasonable day's drive. This includes Denver (and the entire Front Range), Minneapolis, Kansas City, Chicago (barely), Des Moines, and of course Omaha, where I have worked for most of my life.<br />
<br />
I decided on an extended job search that looked at locations outside of commuting distance because I wanted to make sure I found the right job. Exactly what that is will of course vary by each individual's situation. For me, I have a child in college and one not far behind so I was definitely looking for a salary that matched my previous job. I also want a company where technology is important, a profit center, not an expense or afterthought. So with those parameters in mind I have been looking in a circle roughly defined as a 500 mile radius centered on Lincoln.<br />
<br />
I have found that there are a reasonable number of jobs in any given metropolitan area. And those jobs form a nice spectrum from entry level to very experienced. Very roughly I see about 50% Microsoft technology and 50% Unix. Microsoft shops tend to want VB, .Net, or maybe C# and Unix shops are looking for Java, scripting and some C/C++. I am honestly quite surprised by the high penetration of Java given its history but it certainly is a nice language to work with and I think its popularity derives from a great library and ease of use.<br />
<br />
Recruiters are generally willing to talk to you and give you that chance to get your foot in the door. However, it appears that competition is pretty intense. Interviews are tougher now than I can ever recall previously and closing the deal is harder than before. The good news is salaries don't seem to have suffered too much. People more knowledgeable than me in the field seem to indicate salaries are holding up well.<br />
<br />
Lessons learned: Study, do your homework, practice your skills while you have some time, maybe write a blog. Do things that get you noticed, like contributing to open source or working on an iOS or Android application. I would also recommend targeting your prospects carefully so you don't waste a lot of time on marginal opportunities. Look for jobs that are a good match for your skills.<br />
<br />
Good luck and persevere.<br />
Mike http://www.blogger.com/profile/08589963072755856010 noreply@blogger.com 0<br />
tag:blogger.com,1999:blog-7352236708314455586.post-6636085129893702550 2012-12-18T22:50:00.001-08:00 2012-12-18T22:50:32.444-08:00<br />
Keeping the Tools Sharp<br />
<br />
Any software engineer who isn't right out of school knows how important it is to keep up with the latest technology. I'm a Unix guy, so if you are going to follow this blog you are going to hear about Unix and its related technology stack, and about Java to the extent it is platform agnostic. Nothing against the Microsoft tools; there are just only so many pencils you can keep sharp at once.<br />
<br />
Let me clarify what I mean by keeping your tools sharp. It not only means keeping up with new versions of software you use (Java EE 6 for instance, huge changes over earlier versions). It also means diving into the tools that you use every day and really understanding them, especially fundamental tools like your IDE or your source control system. Knowing these tools inside and out means you will get a reputation for helping others fix problems and you yourself will be more productive.<br />
<br />
But it also means taking the time to revisit things you haven't thought of in a while. Let's say you are an expert C++ programmer. When was the last time you flipped through Stroustrup's book (<i>The C++ Programming Language</i>, for those not familiar)? It is well worth it even if you write C++ code in your sleep. There is always something new to learn.<br />
<br />
I would also highly advise taking advantage of any tuition reimbursement program your company offers and taking some classes at a local university. I recently took an operating systems class from a local public university. I'll be honest, I wasn't expecting much; I don't write operating systems. But it was actually a great experience. It reminded me about some concepts I don't have to think about very often in my day to day job, like signal handling or interprocess communication. But those are things I should be familiar with; they are tools, they solve problems. Being reminded how operating systems schedule processes or how a real-time system's I/O architecture differs from Unix can be critically important and be the difference between looking like an idiot and being the star of your next project. Go to school occasionally, it is well worth it.<br />
<br />
Keep those tools sharp.Mikehttp://www.blogger.com/profile/08589963072755856010noreply@blogger.com0tag:blogger.com,1999:blog-7352236708314455586.post-49806998259299424192012-12-16T23:17:00.001-08:002013-01-10T22:30:21.568-08:00Remember the Algorithms and Data StructuresIt's no secret to anyone who knows me that I've been looking for a job for the last few weeks. I wanted to share a terrific resource that every single software engineer should know about as they get ready for technical interviews.<br />
<br />
Companies, of course, have a range of strategies for interviews, but typically the best ones (for us software engineers) are going to ask tough technical questions. Many companies make it a point to focus on fundamentals, figuring that engineers who know the fundamentals will be better at learning new things. It doesn't matter what language, operating system, or environment you are interviewing for: if you know your algorithms and data structures, you are going to be more impressive in any interview.<br />
<br />
I've been out of school a while, and I hadn't thought much about different kinds of sort algorithms or how to create a balanced binary tree in a long time. Most of us know that our languages typically have libraries that implement these things, and that's good enough. But a recruiter for one company happened to mention that their engineers recommended that candidates study <i>The Algorithm Design Manual</i> by Steven S. Skiena. This was one of those companies known for asking fundamental computer science questions as part of the interview process. I figured what the heck, it's good practice for any interview, and so I ordered a copy from Amazon.<br />
<br />
Wow, did I quickly learn how much I had known in school but hadn't thought about in ages. This book covers many things that any good engineer will use in their career to develop high quality solutions. You may remember graphs are a great data structure, but do you remember how to calculate the shortest path through one? How about a minimum spanning tree? Do you remember what that is? You should. We all know about NP-complete problems and how we have to use heuristics to solve them as best as possible... how many nodes does your graph have to have before you have to give up on an exact solution? 10, 50, 100? (It's a lot closer to 10 than you might remember).<br />
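<br />
As a refresher on the shortest path question, here is a minimal sketch of Dijkstra's algorithm in Python. This is my own illustration, not code from the book; the function name <i>shortest_distances</i>, the dictionary-of-dictionaries graph layout, and the example graph are all assumptions made for the example, and it assumes edge weights are never negative.<br />
<br />
```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra's algorithm: cheapest known distance from source to each reachable node.

    graph maps each node to a dict of {neighbor: edge_weight}.
    Edge weights are assumed to be non-negative.
    """
    distances = {source: 0}
    queue = [(0, source)]  # min-heap of (distance so far, node)
    while queue:
        dist, node = heapq.heappop(queue)
        # Skip stale heap entries that were superseded by a shorter route.
        if dist > distances.get(node, float("inf")):
            continue
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


# Hypothetical example graph (directed, weighted).
example_graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}

print(shortest_distances(example_graph, "A"))
```
<br />
In this example the direct edge from A to B costs 4, but going through C costs only 3, which is the kind of detour the algorithm exists to find.<br />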
<br />
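To put a rough number on the "closer to 10" remark, here is a small sketch that counts how many orderings a brute-force search would have to check. It assumes the traveling salesman problem as the example and a naive search that fixes the starting city and tries every ordering of the rest; the function name <i>tours_to_check</i> is made up for illustration, and smarter exact methods do better than this.<br />
<br />
```python
import math


def tours_to_check(cities):
    """Orderings a naive exhaustive search tries with the start city fixed."""
    return math.factorial(cities - 1)


for n in (5, 10, 15, 20):
    print(n, "cities:", tours_to_check(n), "tours")
```
<br />
Ten cities already means 362,880 tours, and fifteen means roughly 87 billion, so exhaustive search stops being practical very quickly.<br />
<br />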
We all probably also remember that a few really good sort algorithms are really fast and the rest are crap. But there are differences among the best algorithms, and certain ones are better suited to certain circumstances.<br />
<br />
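One way to see how much the circumstances matter is to count comparisons for the same algorithm on different inputs. The sketch below uses insertion sort purely as an illustration (it is my choice of example, not one taken from the book): on input that is already sorted it does almost no work, while on reversed input it goes quadratic.<br />
<br />
```python
def insertion_sort(items):
    """Return (sorted copy of items, number of comparisons made)."""
    data = list(items)
    comparisons = 0
    for i in range(1, len(data)):
        value = data[i]
        j = i - 1
        # Shift larger elements right until value's slot is found.
        while j >= 0:
            comparisons += 1
            if data[j] <= value:
                break
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = value
    return data, comparisons


_, already_sorted = insertion_sort(range(100))
_, reversed_input = insertion_sort(range(99, -1, -1))
print(already_sorted, reversed_input)  # prints "99 4950"
```
<br />
That gap is one reason a "slow" algorithm can still be a reasonable pick for small or nearly sorted data.<br />
<br />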
Every chapter is well written, with real-world examples, sample code, and practical advice. Even better, the chapters are filled with exercises to give you a chance to practice and make sure you understand the material.<br />
<br />
I also discovered Professor Skiena has his lectures available online. They are a good option for those who don't want to purchase what is admittedly a somewhat expensive book, and a great way to either supplement the material or learn it for those who prefer an oral presentation. Here's the link: http://www.cs.sunysb.edu/~algorith/video-lectures/.<br />
<br />
Here's a link to the book on Amazon (note this is NOT an affiliate link, I am not making any money off this): http://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202/Mikehttp://www.blogger.com/profile/08589963072755856010noreply@blogger.com1