Tutorial: Categorizing Names With The Requester Website

Amazon Mechanical Turk
Happenings at MTurk
Nov 19, 2016

Using the Amazon Mechanical Turk (MTurk) Requester website, Requesters can create Projects containing Human Intelligence Tasks (HITs) for Workers to complete. In this post, we will learn how to create a Project and ask for help from MTurk Workers to categorize content.

In this tutorial, imagine you are doing a research project about the Titanic. As part of your study, it’s important to know which passengers boarded the Titanic at which departure port.

Create a new Project to get started

Sign up and log in to the MTurk Requester Website. Click the “Create” tab and then click the “New Project” sub-heading to get started on your Project. You will see a list of starting points to choose from, including Survey Link, Tagging of an Image, Categorization Project and more. Since this Project is about categorizing data, let’s use the Categorization Project as a starting point.

Edit Project — Enter Properties

Here you will add and edit the properties that define your Project. The properties are grouped into three sections:

1. Describe your HIT to Workers

Here you define the basic attributes of your Project. The Title and Description help convey to Workers what your work is about. Workers looking for certain types of work can search by any combination of Title, Description, and Keyword attributes. Selecting thoughtful keywords helps ensure your work is discovered by Workers.

2. Setting up your HIT

Here you define various attributes of your work. These include:

  • Reward per assignment: When a Worker accepts your HIT, it is considered an Assignment. Here, you specify the reward amount per Assignment.
  • Number of assignments per HIT: The number of unique Workers you’d like to complete the same HIT. Selecting more than one can be useful if you’d like to get a few opinions or judgments.
  • Time allotted per assignment: The amount of time a Worker can work on your HIT before it is automatically returned for another Worker to complete instead. For our Project, completing each HIT shouldn’t take more than a few minutes, but we’ll give Workers up to an hour to ensure they aren’t rushed.
  • HIT expires in: The length of time your Project and HITs will be available for Workers to discover and complete in the MTurk marketplace.
  • Auto-approve and pay Workers in: Once HITs are submitted, you’ll have an opportunity to review and approve them. Once approved, Workers are paid. If you don’t review a submission, the Worker is paid automatically after the length of time you set here (up to 30 days). We’ll specify 3 days to review the work before it’s automatically approved.
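
The settings above determine how much work, and how much Worker reward, a batch represents. The following is a minimal Python sketch of that arithmetic. The function name and the example values (100 HITs, 3 assignments per HIT, a $0.05 reward) are assumptions chosen for illustration and do not come from this tutorial; any MTurk fees are deliberately left out.

```python
from decimal import Decimal


def batch_totals(num_hits, assignments_per_hit, reward_per_assignment):
    """Return (total assignments, total Worker reward) for a batch.

    Fees are not included; this only multiplies out the Project settings.
    """
    total_assignments = num_hits * assignments_per_hit
    # Decimal avoids floating-point rounding surprises with currency amounts.
    total_reward = Decimal(reward_per_assignment) * total_assignments
    return total_assignments, total_reward


# Hypothetical values, for illustration only.
assignments, reward = batch_totals(
    num_hits=100, assignments_per_hit=3, reward_per_assignment="0.05"
)
print(assignments, reward)
```

Raising the number of assignments per HIT multiplies both totals, which is worth keeping in mind when asking several Workers for the same judgment.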

3. Worker requirements

Here you specify the attributes of Workers that are eligible to work on your Project. These can include the Worker’s past performance on MTurk, their Location, and more. You can also specify that only Amazon Mechanical Turk Masters complete your HITs. These are Workers who have demonstrated excellence across a wide range of HITs.

Click “Design Layout” to save your Project and continue.

Edit Project — Design Layout

Now, let’s customize the template to better support our needs for this Project.

Let’s start with the instructions. We want Workers to select the port that each Titanic passenger departed from. We will provide Workers with links to lists of Titanic passengers and ports to help them complete our HITs.

To update the instructions, simply select and edit the text much like you would edit text in Microsoft Word. You may also click “Source” to edit the instructions in HTML.

Updated Instructions
Click “Source” to edit the instructions in HTML

Next, let’s change the selections. We need to remove the excess selections and update the remaining choices to match the three ports: Cherbourg, Queenstown, and Southampton. Once again, select and edit the text just like you would in Microsoft Word.

Updated selections

Now, we want to make sure that when Workers select a Titanic passenger’s port of departure, we can properly record their response. To do this, click “Source” and locate the “value” attribute of each selection’s input element. Update each “value” to match the response you want to record. Make sure that each input ID is unique. Though not required, we also updated the “name” to “Port” so the results, once we receive them, will be easier to read.

<div class="btn-group-vertical" data-toggle="buttons" id="CategoryInputs"><label class="btn btn-default">
<input id="category1" name="Port" required="" type="radio" value="Cherbourg" />Cherbourg</label>
<label class="btn btn-default">
<input id="category2" name="Port" required="" type="radio" value="Queenstown" />Queenstown</label>
<label class="btn btn-default">
<input id="category3" name="Port" required="" type="radio" value="Southampton" />Southampton</label>
</div>
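
If you would like to double-check the edited markup before pasting it back, a small local script can help. The sketch below uses Python's standard html.parser module to confirm that every radio input has a unique ID and that all of them share one name. The class and function names are hypothetical helpers, and this check runs on your own machine; it is not part of the Requester website.

```python
from html.parser import HTMLParser


class RadioCollector(HTMLParser):
    """Collect (id, name, value) for every radio input in a snippet."""

    def __init__(self):
        super().__init__()
        self.radios = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "input" and attr.get("type") == "radio":
            self.radios.append((attr.get("id"), attr.get("name"), attr.get("value")))


def check_radios(html):
    """Return a list of problems; an empty list means the checks passed."""
    parser = RadioCollector()
    parser.feed(html)
    ids = [radio[0] for radio in parser.radios]
    names = {radio[1] for radio in parser.radios}
    problems = []
    if len(ids) != len(set(ids)):
        problems.append("duplicate input id")
    if len(names) > 1:
        problems.append("radio inputs do not share one name")
    return problems


snippet = """
<label><input id="category1" name="Port" type="radio" value="Cherbourg" />Cherbourg</label>
<label><input id="category2" name="Port" type="radio" value="Queenstown" />Queenstown</label>
<label><input id="category3" name="Port" type="radio" value="Southampton" />Southampton</label>
"""
print(check_radios(snippet))
```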

Once you are done, click “Source” to return to the design view.

We are getting close to wrapping up the design of our HIT. The last step is to change this from a HIT that displays an image to one that shows the name of the Titanic passenger. Instead of just changing the label, we want the Titanic passenger text to be different for each HIT. To do this, let’s click “Source” one last time.

Locate the following line in the template:

<img alt="image_url" class="img-responsive center-block" src="${image_url}" />

Update the line to read:

<b>Passenger:</b><br> ${name}

${name} is used so our Passenger name is dynamically loaded for each HIT. More on this later.
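
To get a feel for what this placeholder does, here is a rough local analogy in Python. The standard library's string.Template class happens to use the same ${name} syntax, so it can mimic the substitution. This is only an illustration under that assumption, not a description of how MTurk renders HITs, and the passenger names are made-up placeholders.

```python
from string import Template

# The same line we placed in the HIT template.
hit_line = Template("<b>Passenger:</b><br> ${name}")


def render(passenger_name):
    """Fill the ${name} placeholder with one passenger's name."""
    return hit_line.substitute(name=passenger_name)


# Placeholder names, for illustration only.
for passenger in ["Example Passenger A", "Example Passenger B"]:
    print(render(passenger))
```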

Once you are done, click “Source” to return to the design view.

Now click “Preview” to see how your Project will appear to Workers. Don’t worry about the ${name} variable. Before you publish your Project, you will have an opportunity to see examples of the HITs where this value is replaced with Titanic passenger names.

Click “Finish” to continue. Our Project has been added to our list of Projects but is not yet ready for Workers. If you have created other Projects in the past, you will see them in this list as well.

My Project List

Now let’s add the passenger names to our Project and publish it to the MTurk marketplace.

Publishing your Project

Click “Publish Batch” to get started.

Publish Batch

Here you will be asked to provide a comma-separated values (.csv) file. Any spreadsheet in Excel or a similar tool can be saved as a CSV file. This file will contain the passenger names, one for each HIT.

Our csv file

For each HIT to pull a name from the file, the first row of the file must contain the variable name used in the template. You’ll recall placing the following line in your template:

<b>Passenger:</b><br> ${name}

By setting the first row to “name” in your CSV file, you’re telling MTurk to use the values in that column when creating your HITs. In general, the first row of each column you want substituted must be exactly the text between the { and } characters of the matching template variable.
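
If your names live somewhere other than a spreadsheet, the file can also be produced with a short script. The sketch below uses Python's standard csv module to write a single “name” column. The function name and the passenger names are hypothetical placeholders; the csv module quotes values containing commas automatically, which matters for names written as “Surname, Title. Given name”.

```python
import csv
import io


def build_csv(names):
    """Return CSV text with a 'name' header row and one row per passenger."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name"])  # must match the ${name} template variable
    for passenger_name in names:
        writer.writerow([passenger_name])
    return buffer.getvalue()


# Placeholder names, for illustration only.
csv_text = build_csv(["Placeholder, Miss. Example", "Sample, Mr. John"])
print(csv_text)
```

To save the result for upload, write csv_text to a file such as passengers.csv (the file name is an assumption as well).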

Our CSV file has been validated and is ready to upload

After choosing the CSV file to upload, MTurk will validate that the file is properly formatted. Click “Upload” to continue and preview your HITs.
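
Before uploading, you can run a similar sanity check of your own. The sketch below is a hypothetical local helper, not MTurk's validation: it pulls every ${...} variable out of the template source and reports any that have no matching column in the first row of the CSV file.

```python
import csv
import io
import re


def template_variables(template_html):
    """Return the set of variable names written as ${...} in the template."""
    return set(re.findall(r"\$\{(\w+)\}", template_html))


def missing_columns(template_html, csv_text):
    """Return template variables that have no matching CSV header, sorted."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return sorted(template_variables(template_html) - set(header))


template = "<b>Passenger:</b><br> ${name}"
print(missing_columns(template, "name\nExample Passenger A\n"))
print(missing_columns(template, "passenger\nExample Passenger A\n"))
```

An empty list means every template variable has a column to draw from; anything else names the headers that still need to be added or renamed.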

A Preview of a HIT from our Project

We now see an updated version of our Project. You’ll want to take note of two things. Notice the “HITs available: 1262” at the top right part of the blue rectangle. Our CSV file had 1,262 names in it. Each name was used to create a single HIT, for 1,262 HITs in total. You will also notice that the first passenger name from our CSV file is now visible in our HIT.

Click “Next” to continue.

This is the last page before your Project will be published live to Workers. Here we see a summary of the settings we specified while entering the properties of our HIT. Let’s click “Purchase and Publish” to add all our new HITs to the marketplace.

After clicking “Purchase and Publish” you will be taken to the Batch Details page. Here we can see a summary of the progress of our Project as Workers complete it. This includes the number of assignments completed by Workers, the average time per assignment, and more. We will revisit this page in a future tutorial once Workers have had time to complete our HITs.

Wrap up

Here we walked through how to design a Project where each HIT automatically displays a Titanic passenger’s name and asks Workers to categorize it by departure port. In a future tutorial we will walk through how to retrieve and manage results.

We hope you found this useful. If you have any questions, please post a question to our MTurk forums. To become a Requester, sign up here. Want to contribute as a Worker customer? Get started here.
