Tutorial: MTurk using Python in Jupyter Notebook

Published in Happenings at MTurk, Jun 11, 2018

IPython notebooks are a powerful tool for data scientists to analyze data and train machine learning models within the Jupyter Notebook interface. In this tutorial we’ll explain how to use MTurk to annotate training data, all from within the Jupyter application. To get started quickly you can download the notebook and HTML for this tutorial.

Jupyter Notebook setup

Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. It was born out of the IPython project and, while it is still most commonly used with Python, it supports over 40 languages including R, Julia, and Scala.

The easiest way to get started with Jupyter is to install the Anaconda Distribution of Python 3.x. This includes Python, Jupyter Notebook, and a number of other useful packages. If you don’t want to install Anaconda, you can follow the directions to install Jupyter using pip instead.

Once you’ve installed Jupyter, simply run the command jupyter notebook from the Terminal or Command Prompt in the directory where you want to store your notebooks and assets. This will launch Jupyter on your computer and open a web browser window.

If you opened Jupyter in the directory where you stored the sample notebook and HTML, you’ll see those files in your list.

Within each notebook you’ll have the ability to define cells. When working in Python the most common are code and markdown. Code cells are where you place the Python code you want to run and markdown cells are used for providing descriptions of the code and the steps you are taking.

To run a code cell, select it and then either click the Run Cell button or press Control-Enter. Any output from the code in that cell will appear below the cell.

Account Setup

If you haven’t already, you’ll need to set up MTurk and AWS accounts that are linked together to use MTurk with Python. The MTurk account will be used to post tasks to the MTurk crowd, and the AWS account will be used to connect to MTurk via the API and to provide access to any additional AWS resources needed to execute your task.

If you don’t have an AWS account already, visit https://aws.amazon.com and create an account you can use for your project.
If you don’t have an MTurk Requester account already, visit https://requester.mturk.com and create a new account.

After you’ve set up your accounts, you will need to link them. While logged into both the root of your AWS account and your MTurk account, visit https://requester.mturk.com/developer to link them together.

Configuring Your Profile

To call MTurk you will need to configure your computer with a profile that has the right credentials. To get started, create a new AWS IAM User or select an existing one you plan to use, and add the AmazonMechanicalTurkFullAccess policy to that user. Then select the Security Credentials tab, create a new Access Key, and copy the Access Key ID and Secret Access Key for future use.

The easiest way to configure your computer with this account is to install the AWS Command Line Interface (CLI). You can install it from the command line by typing pip install awscli. After it is installed, run aws configure --profile mturk to create an mturk profile on your computer that you will use when calling the API. When prompted, provide the Access Key ID and Secret Access Key you captured above. For the region, you can enter “us-east-1” and leave the output format as None.

pip install awscli
aws configure --profile mturk
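If a later API call fails with a credentials error, it can help to confirm that the profile was actually written. The following is a minimal sketch, assuming the AWS CLI stored the profile in the usual ~/.aws/credentials file as an INI section named after the profile. The has_profile helper is a hypothetical name introduced here for illustration.

```python
import configparser
import os


def has_profile(credentials_path, profile_name):
    """Return True if the file has a section for profile_name with both keys set."""
    parser = configparser.ConfigParser()
    parser.read(credentials_path)  # a missing file is silently skipped
    if not parser.has_section(profile_name):
        return False
    section = parser[profile_name]
    return 'aws_access_key_id' in section and 'aws_secret_access_key' in section


# Assumed default location of the credentials file on macOS and Linux
default_path = os.path.expanduser('~/.aws/credentials')
print(has_profile(default_path, 'mturk'))
```

The check only looks for the section and key names. It never prints the keys themselves, so it is safe to run in a shared notebook.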

More libraries to install

As a final step we will install the boto3 and xmltodict libraries. The boto3 package is an easy-to-use Python library for accessing AWS. The xmltodict library makes it much easier to work with the XML data returned by MTurk.

pip install boto3
pip install xmltodict

Getting Started in Jupyter

Now that we’ve installed all of the necessary tools, we can get started with our first notebook. Start by creating a new Python 3 notebook in Jupyter or opening the sample notebook you downloaded earlier.

First we’ll import the boto3, xmltodict, and json packages.

import boto3
import xmltodict
import json

Next we’ll create an MTurk client we’ll use to make requests. MTurk has two environments you can work in. The Production environment is for publishing tasks you want Workers to complete. The Sandbox is a test environment you can use to test your task before posting it to Workers. There is no cost to use the Sandbox but because it is only used for testing, items posted there won’t be completed unless you complete them yourself.

The code below will configure a client to connect to one of the two environments based on the value of create_hits_in_production. Note that in the definition of the session variable below, we reference the mturk profile created earlier. You can exclude this if you are using the default profile.

create_hits_in_production = False
environments = {
  "production": {
    "endpoint": "https://mturk-requester.us-east-1.amazonaws.com",
    "preview": "https://www.mturk.com/mturk/preview"
  },
  "sandbox": {
    "endpoint": 
          "https://mturk-requester-sandbox.us-east-1.amazonaws.com",
    "preview": "https://workersandbox.mturk.com/mturk/preview"
  },
}
mturk_environment = environments["production"] if create_hits_in_production else environments["sandbox"]

session = boto3.Session(profile_name='mturk')
client = session.client(
    service_name='mturk',
    region_name='us-east-1',
    endpoint_url=mturk_environment['endpoint'],
)

Once you’ve created your client, you can check that it’s set up correctly by getting your current MTurk account balance. Note that in the Sandbox environment your balance is always $10,000.

print(client.get_account_balance()['AvailableBalance'])
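The same call can double as a guard before publishing to Production. The sketch below assumes AvailableBalance comes back as a decimal string, and it uses a hypothetical StubClient in place of the real boto3 client so that it runs offline. The can_afford helper is likewise an illustrative name, not part of boto3.

```python
from decimal import Decimal


def can_afford(client, planned_cost):
    """Return True if the available balance covers planned_cost (in USD)."""
    balance = Decimal(client.get_account_balance()['AvailableBalance'])
    return balance >= Decimal(str(planned_cost))


class StubClient:
    """Offline stand-in for the MTurk client, returning the Sandbox balance."""

    def get_account_balance(self):
        return {'AvailableBalance': '10000.00'}


print(can_afford(StubClient(), '1.20'))
```

In a real notebook you would pass the client created above instead of StubClient().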

Define your task

For this project we are going to get the sentiment of a set of tweets that we plan to train a model to evaluate. We will create an MTurk Human Intelligence Task (HIT) for each tweet and assign each tweet to five Workers so we can correct for bias and quality.

tweets = ['in science class right now... urgh... stupid project..',
          'hmmm what to have for breaky?... Honey on toast ',
          'Doing home work  x',
          'Headed out of town for a few days. Will miss my girls']

To submit tasks to MTurk you need to create an HTML template that will be shown to Workers for each item. You can either use the HTML template you downloaded earlier or create a template within Jupyter by selecting New->Text File from the Jupyter Home page.

We can rename this file as SentimentQuestion.html and paste in the HTML from the examples to get started.

You may have noted that in the middle of this HTML is a reference to a variable named content. We will be replacing this ${content} variable with each tweet when we publish our tasks.

<p class="well">${content}</p>

Now we can read this HTML in from the file and wrap it with the question layout XML required by MTurk. We’ll use this question_xml variable later on.

# Read the template, closing the file as soon as we are done with it
with open('./SentimentQuestion.html', 'r') as f:
    html_layout = f.read()
QUESTION_XML = """<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
        <HTMLContent><![CDATA[{}]]></HTMLContent>
        <FrameHeight>650</FrameHeight>
        </HTMLQuestion>"""
question_xml = QUESTION_XML.format(html_layout)
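Tweets can contain characters such as <, & or ]]> that would break the HTML or the CDATA wrapper if substituted as-is. The following sketch, using only the standard library, escapes the text before substitution and confirms that the resulting XML still parses. The sample_layout string is a hypothetical stand-in for the contents of SentimentQuestion.html, and render_question and html_content are illustrative helper names.

```python
import html
import xml.etree.ElementTree as ET

QUESTION_XML = """<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
        <HTMLContent><![CDATA[{}]]></HTMLContent>
        <FrameHeight>650</FrameHeight>
        </HTMLQuestion>"""

# Hypothetical stand-in for the template file used in this tutorial
sample_layout = '<html><body><p class="well">${content}</p></body></html>'


def render_question(layout, text):
    """Escape text, substitute it for ${content}, and wrap the HTML in question XML."""
    safe_text = html.escape(text)
    return QUESTION_XML.format(layout.replace('${content}', safe_text))


def html_content(question_xml):
    """Parse the XML (raises ParseError if malformed) and return the embedded HTML."""
    root = ET.fromstring(question_xml)
    for child in root:
        if child.tag.endswith('HTMLContent'):
            return child.text
    raise ValueError('HTMLContent element not found')


question = render_question(sample_layout, 'I <3 toast & honey')
print(html_content(question))
```

Because html.escape turns > into &gt;, a tweet containing ]]> can no longer close the CDATA section early.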

In our last setup step, we’ll define the attributes that will be applied to each HIT. As we mentioned above, we’ll have five Workers review each tweet. We specify this in the definition below, along with parameters indicating that the HIT remains live on the worker.mturk.com website for no more than an hour and that Workers must provide a response for each item within ten minutes. Each response has a reward of $0.05, so the total Worker reward for each HIT is $0.25, plus $0.05 in MTurk fees. An appropriate title, description, and keywords are also provided to let Workers know what is involved in this task.

TaskAttributes = {
    'MaxAssignments': 5,           
    # How long the task will be available on MTurk (1 hour)     
    'LifetimeInSeconds': 60*60,
    # How long Workers have to complete each item (10 minutes)
    'AssignmentDurationInSeconds': 60*10,
    # The reward you will offer Workers for each response
    'Reward': '0.05',                     
    'Title': 'Provide sentiment for a Tweet',
    'Keywords': 'sentiment, tweet',
    'Description': 'Rate the sentiment of a tweet on a scale of 1 to 10.'
}
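To see what the whole batch costs, the per-HIT arithmetic above can be extended to all four tweets. This is a rough sketch: the 20% fee rate is an assumption inferred from the $0.05 fee on $0.25 of rewards quoted above, and estimate_cost is a hypothetical helper. Check the current MTurk pricing page before relying on the numbers.

```python
from decimal import Decimal


def estimate_cost(num_hits, max_assignments, reward, fee_rate='0.20'):
    """Return (rewards, fees, total) in USD for a batch of identical HITs."""
    reward_total = Decimal(reward) * max_assignments * num_hits
    fees = reward_total * Decimal(fee_rate)  # assumed flat fee rate
    return reward_total, fees, reward_total + fees


# Four tweets, five Workers each, $0.05 per response
rewards, fees, total = estimate_cost(num_hits=4, max_assignments=5, reward='0.05')
print(rewards, fees, total)
```

Decimal is used instead of float so that amounts such as 0.05 add up exactly.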

Create your tasks

Now we’re ready to publish these tasks to MTurk so they can be viewed and completed by Workers. First we’ll create a results array to contain information about each HIT we submit. We’ll also create a variable to contain the ID of the HIT Type that is generated for this task. Then we’ll loop through each tweet in the set and create a HIT using the attributes we defined earlier and the question_xml we created. Note that we’re replacing the content variable with the tweet. As we go, we’ll append the tweet and the generated HIT ID to the results variable. The last step is to print a link to the HITs so you can view them on https://worker.mturk.com or https://workersandbox.mturk.com.

results = []
hit_type_id = ''

for tweet in tweets:
    response = client.create_hit(
        **TaskAttributes,
        Question=question_xml.replace('${content}',tweet)
    )
    hit_type_id = response['HIT']['HITTypeId']
    results.append({
        'tweet': tweet,
        'hit_id': response['HIT']['HITId']
    })
    
print("You can view the HITs here:")
print(mturk_environment['preview']+"?groupId={}".format(hit_type_id))
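The results list lives only in the notebook’s memory, so restarting the kernel before Workers finish would lose the HIT IDs. One option is to write the list to disk right after creating the HITs. This is a minimal sketch using the standard json module, where the file name and the EXAMPLE_HIT_ID value are hypothetical placeholders.

```python
import json
import os
import tempfile


def save_results(results, path):
    """Write the list of HIT records to a JSON file."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(results, f, indent=2)


def load_results(path):
    """Read the list of HIT records back from a JSON file."""
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)


# Placeholder record shaped like the entries appended in the loop above
example = [{'tweet': 'Doing home work  x', 'hit_id': 'EXAMPLE_HIT_ID'}]
path = os.path.join(tempfile.gettempdir(), 'mturk_hits_example.json')
save_results(example, path)
print(load_results(path) == example)
```

In the notebook you would call save_results(results, ...) with a path next to the notebook rather than the temporary directory used here.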

Get Results

Depending on the task, results will be available anywhere from a few minutes to a few hours after publishing. In Jupyter we can run the following to retrieve the status of each HIT and the responses that Workers have provided. Because we’re just updating the results array, we can run this as often as we wish until the HITs are completed.

For each item in the results array we perform the following steps:

Get the current status of the HIT and store it in the results array.
Get a list of the Assignments that have been completed for each item and store the count of Assignments completed into the results array.
Loop through each Assignment and capture the details of the Assignment and the results to an array of answers.
Approve each Assignment so that the $0.05 reward will be distributed to Workers.
Store the answers in the results array and compute an average response for the item.
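The second step reads a single page of up to 10 Assignments, which is enough for the five requested here. For HITs with more Assignments than fit on one page, a sketch along the following lines could follow the NextToken value returned with each page. The list_all_assignments helper and the StubClient class are hypothetical; the stub only mimics paging so the example runs offline.

```python
def list_all_assignments(client, hit_id, statuses=('Submitted', 'Approved'), page_size=10):
    """Collect every Assignment for a HIT by following NextToken across pages."""
    assignments = []
    kwargs = {
        'HITId': hit_id,
        'AssignmentStatuses': list(statuses),
        'MaxResults': page_size,
    }
    while True:
        page = client.list_assignments_for_hit(**kwargs)
        assignments.extend(page['Assignments'])
        token = page.get('NextToken')
        if not token:
            return assignments
        kwargs['NextToken'] = token


class StubClient:
    """Offline stand-in that serves fake Assignments one page at a time."""

    def __init__(self, total=12):
        self.items = [{'AssignmentId': 'A{}'.format(i)} for i in range(total)]

    def list_assignments_for_hit(self, HITId, AssignmentStatuses, MaxResults, NextToken=None):
        start = int(NextToken or 0)
        end = start + MaxResults
        page = {'Assignments': self.items[start:end]}
        if end < len(self.items):
            page['NextToken'] = str(end)
        return page


print(len(list_all_assignments(StubClient(), 'EXAMPLE_HIT_ID')))
```

The tutorial’s loop below keeps the single call, since five Assignments always fit on one page.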

for item in results:
    
    # Get the status of the HIT
    hit = client.get_hit(HITId=item['hit_id'])
    item['status'] = hit['HIT']['HITStatus']

    # Get a list of the Assignments that have been submitted
    assignmentsList = client.list_assignments_for_hit(
        HITId=item['hit_id'],
        AssignmentStatuses=['Submitted', 'Approved'],
        MaxResults=10
    )
    assignments = assignmentsList['Assignments']
    item['assignments_submitted_count'] = len(assignments)

    answers = []
    for assignment in assignments:
    
        # Retrieve the attributes for each Assignment
        worker_id = assignment['WorkerId']
        assignment_id = assignment['AssignmentId']
        
        # Retrieve the value submitted by the Worker from the XML
        answer_dict = xmltodict.parse(assignment['Answer'])
        answer = answer_dict['QuestionFormAnswers']['Answer']['FreeText']
        answers.append(int(answer))
        
        # Approve the Assignment (if it hasn't been already)
        if assignment['AssignmentStatus'] == 'Submitted':
            client.approve_assignment(
                AssignmentId=assignment_id,
                OverrideRejection=False
            )
    
    # Add the answers that have been retrieved for this item
    item['answers'] = answers
    if len(answers) > 0:
        item['avg_answer'] = sum(answers)/len(answers)

print(json.dumps(results,indent=2))
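The lookup answer_dict['QuestionFormAnswers']['Answer']['FreeText'] works when the form has a single input, because xmltodict returns a dict for one Answer element but a list when several repeat. If your template collects more than one field, a sketch like the following, which uses only the standard library, maps each field name to its value. The sample XML and its field names (sentiment, comment) are hypothetical, and parse_answers is an illustrative helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical Answer payload with two form fields
SAMPLE_ANSWER_XML = """<QuestionFormAnswers xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionFormAnswers.xsd">
  <Answer>
    <QuestionIdentifier>sentiment</QuestionIdentifier>
    <FreeText>7</FreeText>
  </Answer>
  <Answer>
    <QuestionIdentifier>comment</QuestionIdentifier>
    <FreeText>Sounds cheerful</FreeText>
  </Answer>
</QuestionFormAnswers>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit('}', 1)[-1]


def parse_answers(answer_xml):
    """Return a dict mapping each QuestionIdentifier to its FreeText value."""
    root = ET.fromstring(answer_xml)
    answers = {}
    for answer in root:
        if local_name(answer.tag) != 'Answer':
            continue
        fields = {local_name(child.tag): (child.text or '') for child in answer}
        answers[fields.get('QuestionIdentifier', '')] = fields.get('FreeText', '')
    return answers


parsed = parse_answers(SAMPLE_ANSWER_XML)
print(int(parsed['sentiment']), parsed['comment'])
```

Values arrive as strings, so numeric ratings still need the int() conversion used in the loop above.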

At the end of the task we’ll get a completed array of results that you can then use to train a model or process further.

[
  {
    "tweet": "in science class right now... urgh... stupid project..",
    "hit_id": "3SU800BH87Y9GXNK9PUO0R56NK3UQ6",
    "status": "Reviewable",
    "assignments_submitted_count": 5,
    "answers": [
      2,
      2,
      1,
      4,
      2
    ],
    "avg_answer": 2.2
  },
  {
    "tweet": "hmmm what to have for breaky?... Honey on toast ",
    "hit_id": "3YLPJ8OXX9JU8WUHGXYYHV35LA2X4J",
    "status": "Reviewable",
    "assignments_submitted_count": 5,
    "answers": [
      7,
      5,
      8,
      7,
      6
    ],
    "avg_answer": 6.6
  },
  {
    "tweet": "Doing home work  x",
    "hit_id": "3VZYA8PITP447PS6RQS1D9M8BZD502",
    "status": "Reviewable",
    "assignments_submitted_count": 5,
    "answers": [
      7,
      5,
      8,
      5,
      5
    ],
    "avg_answer": 6.0
  },
  {
    "tweet": "Headed out of town for a few days. Will miss my girls",
    "hit_id": "3ULIZ0H1VBB4EEDON9W1RE5IGRJ515",
    "status": "Reviewable",
    "assignments_submitted_count": 5,
    "answers": [
      8,
      4,
      5,
      5,
      6
    ],
    "avg_answer": 5.6
  }
]
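Before using these ratings as training labels, it can be worth checking how much the five Workers agree. As a rough sketch, the code below computes the median, which is less sensitive than the mean to a single outlying rating, and the spread between the highest and lowest answer for each tweet. The summarize helper is a hypothetical name, and the sample reuses two records from the output above.

```python
from statistics import median


def summarize(results):
    """Return the median rating and max-min spread for each item that has answers."""
    rows = []
    for item in results:
        answers = item.get('answers', [])
        if not answers:
            continue  # no responses yet for this HIT
        rows.append({
            'tweet': item['tweet'],
            'median': median(answers),
            'spread': max(answers) - min(answers),
        })
    return rows


sample = [
    {'tweet': 'in science class right now... urgh... stupid project..',
     'answers': [2, 2, 1, 4, 2]},
    {'tweet': 'Headed out of town for a few days. Will miss my girls',
     'answers': [8, 4, 5, 5, 6]},
]
for row in summarize(sample):
    print(row['median'], row['spread'], row['tweet'])
```

Items with a large spread are candidates for additional Assignments or manual review before they go into a training set.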