Tutorial: Managing MTurk HITs with the AWS Command Line Interface

Earlier this year, Amazon Mechanical Turk (MTurk) introduced the ability to use the MTurk Application Programming Interface (API) from the AWS Software Development Kit (SDK) and Command Line Interface (CLI). This lets Requesters use the MTurk API from any of nine AWS SDKs that are already widely used in the AWS community, with support for Python, Node.js, JavaScript, Java, .NET (including C# and F#), Ruby, Go, PHP, and C++. This makes the MTurk API easier to use and more accessible than ever.

In April, we published a tutorial called “Crowdsourcing from the Command Line” showing how to use MTurk from the AWS Shell, an interactive productivity booster for the AWS CLI.

aws-shell makes it even easier to use the excellent AWS Command Line Interface

In today’s tutorial, we are going to build on our previous AWS Shell tutorial to show Requesters how to use this tool to manage HITs and perform common tasks such as listing HITs, expiring or extending HITs, creating additional assignments for HITs, deleting HITs, approving Assignments, and sending Bonuses to Workers. We’ll step through each individually below.

To get started, you’ll want to set up aws-shell, which you can do by following the MTurk tutorial Crowdsourcing from the Command Line.

Getting started

You can confirm that everything is set up correctly by running mturk get-account-balance in aws-shell. You should see your Prepaid HITs balance, and it should match what you see when you sign in at http://requester.mturk.com/.

You can also use aws-shell with your Requester Sandbox account. Add the “--endpoint-url” parameter to any command and set it to https://mturk-requester-sandbox.us-east-1.amazonaws.com to connect to your Sandbox account instead.

Your MTurk Sandbox balance is a fixed value that is always $10,000

As you may have guessed, aws-shell simply lets you call any operation in the MTurk Requester API.
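If you would rather script the same check from the Python SDK mentioned at the top of this post, here is a minimal sketch. It assumes you pass in an MTurk client object; the helper names endpoint_url and available_balance are hypothetical, and StubClient is only a stand-in so the example runs without credentials. The production endpoint shown is the standard us-east-1 MTurk endpoint.

```python
# Sandbox URL is the one used in this tutorial; the production URL is the
# standard us-east-1 MTurk Requester endpoint.
SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
PRODUCTION_ENDPOINT = "https://mturk-requester.us-east-1.amazonaws.com"


def endpoint_url(use_sandbox=True):
    """Return the endpoint to connect to (hypothetical helper)."""
    return SANDBOX_ENDPOINT if use_sandbox else PRODUCTION_ENDPOINT


def available_balance(client):
    """Call GetAccountBalance and return the balance as a string."""
    return client.get_account_balance()["AvailableBalance"]


class StubClient:
    """Stand-in for a real client; returns the fixed Sandbox balance."""

    def get_account_balance(self):
        return {"AvailableBalance": "10000.00"}


print(endpoint_url())                   # the Sandbox URL
print(available_balance(StubClient()))  # 10000.00
```

With boto3 installed and credentials configured, you would replace the stub with something like boto3.client("mturk", region_name="us-east-1", endpoint_url=endpoint_url()).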

Listing HITs

First, let’s use it to display a list of our recently published HITs using the ListHITs API operation:

$ aws-shell
aws> mturk list-hits --output table --query 'HITs[].{"1. HITId": HITId, "2. Title": Title, "3. Status":HITStatus}' --endpoint-url https://mturk-requester-sandbox.us-east-1.amazonaws.com --max-results 5

This will give you a list of your submitted HITs.

Let’s take a moment to explain what just happened here. First, you called the MTurk ListHITs operation. You can learn more about this and other operations from the MTurk API documentation. You also passed in several new parameters: “--output”, “--query”, and “--max-results”. Let’s explain each of those:

By adding “--output table” you specified that you want to retrieve your results in a “table” format. If you omit this, you’ll receive them in JSON format.
By adding “--max-results 5” you asked that MTurk only return the first 5 results from ListHITs. This is useful if you only care about a small set of results instead of the entire list. 
By adding “--query 'HITs[].{"1. HITId": HITId, "2. Title": Title, "3. Status":HITStatus}'” you filtered the list after MTurk returned the results of ListHITs. This parameter warrants a little more explaining:
As the name suggests, "--query" lets you query the results from any operation. The parameter you pass to it is a string in JMESPath format. You can learn more about JMESPath at http://jmespath.org/. aws-shell retrieves your JSON results, applies the JMESPath query to them, and returns the output of that query. JMESPath can filter out specific results and select only certain fields. In the example above, we asked JMESPath to return only the HITs array from the ListHITs JSON response (this is the HITs[] part) and, for each HIT, only the HITId, Title, and HITStatus fields. We asked it to apply the labels "1. HITId", "2. Title", and "3. Status" to these fields, respectively. You can see those labels as column headers in the table.
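To make the projection concrete, here is a rough plain-Python equivalent of that query. The response below is a hypothetical, trimmed sample shaped like a ListHITs result (the IDs and titles are made up), and project_hits is an illustrative helper, not part of any SDK.

```python
# Hypothetical sample shaped like a ListHITs response; values are invented.
response = {
    "NumResults": 2,
    "HITs": [
        {"HITId": "HIT-EXAMPLE-1", "Title": "Label an image",
         "HITStatus": "Assignable", "Reward": "0.10"},
        {"HITId": "HIT-EXAMPLE-2", "Title": "Transcribe a receipt",
         "HITStatus": "Reviewable", "Reward": "0.50"},
    ],
}


def project_hits(resp):
    """Mimic HITs[].{"1. HITId": HITId, "2. Title": Title, "3. Status": HITStatus}."""
    return [
        {
            "1. HITId": hit.get("HITId"),    # .get mirrors JMESPath's null
            "2. Title": hit.get("Title"),    # for a missing field
            "3. Status": hit.get("HITStatus"),
        }
        for hit in resp.get("HITs", [])
    ]


rows = project_hits(response)
for row in rows:
    print(row)
```

Each dictionary corresponds to one row of the table, and its keys become the column headers.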

There are many ways you can use the power of JMESPath and the “query” parameter in the AWS Shell. Here are a few examples to get you started:

Show only HITs where the Reward amount is greater than $0.25 (Reward is returned as a string, so to_number converts it before the comparison):
HITs[?to_number(Reward) > `0.25`].{Title: Title, HITType: HITGroupId, HITId: HITId, Reward: Reward, Assignments: MaxAssignments}
Show only HITs where the keywords contain the word 'easy', and don't bother putting labels (field names) on any of the columns:
HITs[?contains(Keywords, 'easy')].[Title, Keywords, HITId]
Show the HITId of the HIT with the largest number of assignments in this page of results:
mturk list-hits --query "max_by(HITs, &MaxAssignments).HITId"
There are many more examples you can construct by checking out the excellent tutorial for JMESPath here: http://jmespath.org/tutorial.html
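If it helps to see what these three queries compute, here is a sketch of the same logic in plain Python. The hits list is hypothetical sample data, and the helper names are ours. Note that Reward is a string in the response, so it is converted before comparing; comparing the raw strings would order "10.00" before "2.00".

```python
from decimal import Decimal

# Hypothetical HITs shaped like entries in a ListHITs response.
hits = [
    {"HITId": "HIT-1", "Title": "Easy image labels",
     "Keywords": "image, easy, label", "Reward": "0.10", "MaxAssignments": 3},
    {"HITId": "HIT-2", "Title": "Receipt transcription",
     "Keywords": "transcribe, receipt", "Reward": "10.00", "MaxAssignments": 9},
    {"HITId": "HIT-3", "Title": "Easy survey",
     "Keywords": "survey, easy", "Reward": "2.00", "MaxAssignments": 5},
]


def reward_above(items, threshold):
    """HITs whose Reward, read as a number, exceeds the threshold."""
    return [h for h in items if Decimal(h["Reward"]) > Decimal(threshold)]


def keyword_contains(items, word):
    """Unlabeled [Title, Keywords, HITId] rows whose Keywords contain word."""
    return [[h["Title"], h["Keywords"], h["HITId"]]
            for h in items if word in h.get("Keywords", "")]


def most_assignments(items):
    """HITId of the HIT with the largest MaxAssignments."""
    return max(items, key=lambda h: h["MaxAssignments"])["HITId"]


print([h["HITId"] for h in reward_above(hits, "0.25")])  # ['HIT-2', 'HIT-3']
print(most_assignments(hits))                            # HIT-2
print("10.00" > "2.00")                                  # False: string order
```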

It is important to note, however, that JMESPath queries apply only to the current page of results from ListHITs. If you have many pages of results, you need to retrieve the NextToken from each page and pass it to the next call. Start by including NextToken in your query:

mturk list-hits --query "[max_by(HITs, &MaxAssignments).HITId, NextToken]" --max-results 100

This should return the HITId of the HIT with the largest number of assignments in the first page of 100 results, along with the NextToken to fetch the next 100 results:

[
"3CMIQF80GNQSEKSFJ9SGNAF1LO76QP",
"p1:UfRIQmZ/4+OZ9qdstz/CcNw5sVZQ1dREr/EuNPrNa9aGpKqHNnLw=="
]

At which point you can repeat the operation on the next 100 results with:

mturk list-hits --query "[max_by(HITs, &MaxAssignments).HITId, NextToken]" --max-results 100 --next-token "p1:UfRIQmZ/4+OZ9qdstz/CcNw5sVZQ1dREr/EuNPrNa9aGpKqHNnLw=="
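Repeating this by hand gets tedious, so here is a sketch of the same NextToken loop in Python. It assumes a client object whose list_hits method accepts MaxResults and NextToken, as the boto3 MTurk client does; iter_hits is a hypothetical helper and StubClient serves two made-up pages so the example runs offline.

```python
def iter_hits(client, page_size=100):
    """Yield every HIT, following NextToken until no more pages remain."""
    kwargs = {"MaxResults": page_size}
    while True:
        page = client.list_hits(**kwargs)
        yield from page.get("HITs", [])
        token = page.get("NextToken")
        if not token:
            break  # the last page carries no NextToken
        kwargs["NextToken"] = token


class StubClient:
    """Stand-in client that serves two hypothetical pages of results."""

    _pages = {
        None: {"HITs": [{"HITId": "A", "MaxAssignments": 3}],
               "NextToken": "page-2"},
        "page-2": {"HITs": [{"HITId": "B", "MaxAssignments": 9}]},
    }

    def list_hits(self, MaxResults, NextToken=None):
        return self._pages[NextToken]


all_hits = list(iter_hits(StubClient()))
busiest = max(all_hits, key=lambda h: h["MaxAssignments"])["HITId"]
print(busiest)  # B
```

Because the generator walks every page, the maximum here is taken over all HITs rather than over a single page of 100.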

If you want to take this even further and build scripts that run multiple commands together, or pipe the output of one command into the next, we encourage you to check out the re:Invent talk “Automating AWS with the AWS CLI”, along with its slides and code samples.

Expiring or Extending HITs

Now that you can list your HITs, expiring or extending them takes a single command. To extend a HIT’s expiration, use a command like the following, where expire-at is a timestamp formatted per the ISO 8601 standard:

mturk update-expiration-for-hit --hit-id 31D0ZWOD0AZ5DTZJSOFDO9Q5457A0X --expire-at 2017-11-05

If you need to expire a HIT immediately, simply provide a value of 0 as the expire-at parameter like this:

mturk update-expiration-for-hit --hit-id 31D0ZWOD0AZ5DTZJSOFDO9Q5457A0X --expire-at 0
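If you want to extend a HIT by a number of days rather than typing a date, a small helper can build the ISO 8601 value for you. This is only a sketch: expire_at_in is a hypothetical name, and it assumes you pass a timezone-aware start time (it defaults to the current UTC time).

```python
from datetime import datetime, timedelta, timezone


def expire_at_in(days, now=None):
    """Return an ISO 8601 UTC timestamp `days` after `now` for --expire-at."""
    now = now or datetime.now(timezone.utc)
    target = (now + timedelta(days=days)).astimezone(timezone.utc)
    return target.strftime("%Y-%m-%dT%H:%M:%SZ")


# Fixed start time so the result is reproducible.
start = datetime(2017, 10, 29, 12, 0, 0, tzinfo=timezone.utc)
print(expire_at_in(7, now=start))  # 2017-11-05T12:00:00Z
```

You can paste the printed value directly after --expire-at in the update-expiration-for-hit command.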

Creating additional assignments for HITs

If you want more Workers to complete the same HIT, you can create additional assignments for that HIT. This is useful when you want to ask more Workers for their judgements, perhaps to break a tie or gather more confidence in a result. You can accomplish this with a command like this:

mturk create-additional-assignments-for-hit --hit-id 31D0ZWOD0AZ5DTZJSOFDO9Q5457A0X --number-of-additional-assignments 1

Deleting HITs

As you write and test your code, you will want to delete your test HITs periodically so that they don’t clutter up your view when listing HITs.

If all the assignments for a HIT have been submitted and approved or rejected, you can delete it immediately like so:

mturk delete-hit --hit-id 38EHZ67RIMPQTVIAYG4LTSGY8KLGMG --endpoint-url https://mturk-requester-sandbox.us-east-1.amazonaws.com

If not, the HIT has to be expired first, and then deleted:

mturk update-expiration-for-hit --hit-id 3IVEC1GSLPW3O8MMD06Q6DKB217J1F --expire-at 0 --endpoint-url https://mturk-requester-sandbox.us-east-1.amazonaws.com
mturk delete-hit --hit-id 3IVEC1GSLPW3O8MMD06Q6DKB217J1F --endpoint-url https://mturk-requester-sandbox.us-east-1.amazonaws.com
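When cleaning up many test HITs, you may prefer to wrap the two steps in one function. The sketch below assumes a client with update_expiration_for_hit and delete_hit methods, as in the boto3 MTurk client; expire_and_delete is a hypothetical helper and RecordingClient only records calls so the example runs offline. The delete call can still be refused if the HIT has submitted assignments that have not yet been approved or rejected.

```python
from datetime import datetime, timezone

# Any time in the past expires the HIT immediately.
PAST = datetime(2015, 1, 1, tzinfo=timezone.utc)


def expire_and_delete(client, hit_id):
    """Expire a HIT, then delete it (both calls target the same HIT)."""
    client.update_expiration_for_hit(HITId=hit_id, ExpireAt=PAST)
    client.delete_hit(HITId=hit_id)


class RecordingClient:
    """Stand-in client that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def update_expiration_for_hit(self, HITId, ExpireAt):
        self.calls.append(("update_expiration_for_hit", HITId))

    def delete_hit(self, HITId):
        self.calls.append(("delete_hit", HITId))


recorder = RecordingClient()
expire_and_delete(recorder, "3IVEC1GSLPW3O8MMD06Q6DKB217J1F")
print(recorder.calls)
```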

Retrieving and approving assignments

To retrieve the assignments for a given HIT, you can issue a command like the following:

mturk list-assignments-for-hit --hit-id 31D0ZWOD0AZ5DTZJSOFDO9Q5457A0X --query "Assignments[].{AssignmentId: AssignmentId, Status: AssignmentStatus, WorkerId: WorkerId}" --output "table"

This will retrieve all the assignments with their status and WorkerIds. Using this command, we can see that there is one assignment that was submitted but not yet approved.

We can approve that assignment with the following command:

mturk approve-assignment --assignment-id "3U8YCDAGXPG0MT4U7MH13T0X16HQ0I" --requester-feedback "Thanks for completing my HIT!"

If you previously rejected an assignment but realize you made an error, you can use this same command with the “--override-rejection” flag to reverse a rejection made within the last 30 days, as long as the HIT has not been deleted. It can be used as follows:

mturk reject-assignment --assignment-id "3U8YCDAGXPG0MT4U7MH13T0X16HQ0I" --requester-feedback "Your response appears to be incorrect"
mturk approve-assignment --assignment-id "3U8YCDAGXPG0MT4U7MH13T0X16HQ0I" --requester-feedback "Our apologies, it turns out it was correct." --override-rejection
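To approve everything that is waiting on a HIT in one go, you could combine the list and approve steps. This sketch assumes a client with list_assignments_for_hit and approve_assignment methods, as in the boto3 MTurk client; approve_submitted is a hypothetical helper, the stub data is invented, and only the first page of assignments is handled.

```python
def approve_submitted(client, hit_id, feedback):
    """Approve each Submitted assignment on a HIT; return the approved IDs."""
    resp = client.list_assignments_for_hit(
        HITId=hit_id, AssignmentStatuses=["Submitted"]
    )
    approved = []
    for assignment in resp.get("Assignments", []):
        client.approve_assignment(
            AssignmentId=assignment["AssignmentId"],
            RequesterFeedback=feedback,
        )
        approved.append(assignment["AssignmentId"])
    return approved


class StubClient:
    """Stand-in client with one hypothetical submitted assignment."""

    def __init__(self):
        self.approved = []

    def list_assignments_for_hit(self, HITId, AssignmentStatuses):
        return {"Assignments": [
            {"AssignmentId": "ASSIGNMENT-1", "WorkerId": "WORKER-1",
             "AssignmentStatus": "Submitted"},
        ]}

    def approve_assignment(self, AssignmentId, RequesterFeedback):
        self.approved.append(AssignmentId)


stub = StubClient()
result = approve_submitted(stub, "HIT-EXAMPLE", "Thanks for completing my HIT!")
print(result)  # ['ASSIGNMENT-1']
```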

Sending Bonuses to Workers

It is always a great idea to reward great work from your MTurk Workers. To send a bonus, use a command like the following:

mturk send-bonus --assignment-id "3U8YCDAGXPG0MT4U7MH13T0X16HQ0I" --worker-id "A25G8HPJE1QPA2" --bonus-amount "0.01" --reason "Apologies for the rejection earlier. I've reversed it." --unique-request-token "Mistake reversal for A25G8HPJE1QPA2"

Let’s take a moment to explain the “unique-request-token” part of the command above. Imagine you ran the “send-bonus” command above again. It would pay the Worker another bonus, right? It turns out, it will not. Instead, the API notices that the same “unique-request-token” was used twice and responds with an error rather than sending a second bonus.

This is a concept called idempotency. It helps make sure that you have a way to avoid mistakenly performing an operation multiple times. Now, if you want to pay a bonus to the same Worker for the same Assignment multiple times, you can either remove the “unique-request-token” or, as a better approach, change the token (in this case, perhaps you’d change it to “Mistake reversal #2 for A25G8HPJE1QPA2”).
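In a script, one way to get this protection automatically is to derive the token from the details of the bonus, so that a retry of the same bonus reuses the same token while a genuinely new bonus gets a new one. The following is a sketch of that idea; bonus_token is a hypothetical helper, and the choice of a truncated SHA-256 hash is ours, not something the API requires.

```python
import hashlib


def bonus_token(assignment_id, worker_id, purpose):
    """Derive a repeatable token from the fields that identify one bonus."""
    raw = "|".join([assignment_id, worker_id, purpose])
    # Truncated to 32 hex characters to keep the token short.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]


first = bonus_token("3U8YCDAGXPG0MT4U7MH13T0X16HQ0I", "A25G8HPJE1QPA2",
                    "Mistake reversal")
retry = bonus_token("3U8YCDAGXPG0MT4U7MH13T0X16HQ0I", "A25G8HPJE1QPA2",
                    "Mistake reversal")
second = bonus_token("3U8YCDAGXPG0MT4U7MH13T0X16HQ0I", "A25G8HPJE1QPA2",
                     "Mistake reversal #2")

print(first == retry)   # True: a retry is recognized as the same request
print(first == second)  # False: a deliberate second bonus gets a new token
```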

Wrapping up

That’s it. We’ve shown you how to perform several useful operations on your MTurk HITs using the AWS Shell, powered by the AWS Command Line Interface. In today’s tutorial, you learned how to manage HITs and perform common tasks such as listing HITs, expiring or extending HITs, creating additional assignments for HITs, deleting HITs, approving Assignments, and sending Bonuses to Workers.

We hope you found this useful. If you have any questions, please post a question to our MTurk forums. To become a Requester, sign up here. Want to contribute as a Worker customer? Get started here.
