Comments on MTurk Data integrity.

Please feel free to comment on my previous post about data contamination on MTurk below.

23 thoughts on “Comments on MTurk Data integrity.”

Nathan Favero August 14, 2018 at 5:34 pm

Thanks for sharing this. Very helpful. Two quick thoughts (one of which I mentioned briefly on Twitter already):

1. If doing experiments on MTurk, I would expect that having some bots/repeat respondents in your data should generally bias estimates of treatment effects towards 0. Thus, tests of treatment effects will tend to be conservative. This is definitely something to be aware of, and if we care about precisely estimating effect sizes (e.g., comparing to effect sizes from other studies), we should acknowledge that our estimates are probably biased towards zero. This isn’t all that different from a case where there’s (simple) measurement error.

2. This isn’t so different from the problem of inattentive subjects, except for the scale of it. Good manipulation checks should catch this kind of thing, from what I can tell.
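To make the first point concrete, here is a minimal sketch of the dilution, assuming bots give the same answer regardless of treatment arm and are equally common in both arms. Under that assumption the observed difference in means shrinks to (1 - bot share) times the true effect. The function names and the toy numbers are hypothetical, not taken from any of the studies discussed.

```python
from statistics import mean


def build_groups(n_real, n_bots, control_mean, true_effect, bot_answer):
    """Build deterministic toy data for one treated and one control arm.

    Assumption: real respondents shift by `true_effect` under treatment,
    while bots give `bot_answer` no matter which arm they land in.
    """
    treated = [control_mean + true_effect] * n_real + [bot_answer] * n_bots
    control = [control_mean] * n_real + [bot_answer] * n_bots
    return treated, control


def observed_effect(treated, control):
    """Naive difference in means, as an analyst unaware of the bots would compute it."""
    return mean(treated) - mean(control)


# 20% bots in each arm, true effect of 1.0 scale point (hypothetical values).
treated, control = build_groups(
    n_real=80, n_bots=20, control_mean=3.0, true_effect=1.0, bot_answer=4.0
)
print(round(observed_effect(treated, control), 3))  # about 0.8 = (1 - 0.2) * 1.0
```

If bots were unevenly distributed across arms, or responded to the treatment text in some systematic way, the bias would not necessarily point towards zero.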

billy August 15, 2018 at 10:05 am

Too bad you did not post the original qualifications. It is obvious you are skilled with statistics but have little experience on MTurk.
It is quite a broad stroke to say MTurk gives you bad data when you did not solicit high-quality participants to begin with.

Sean Dennis August 17, 2018 at 5:55 pm

My co-authors and I have just posted the following working paper to SSRN that investigates the root cause of this issue. Importantly, we find no evidence of bots.

https://ssrn.com/abstract=3233954

Timothy Ryan (Post author) August 18, 2018 at 11:20 am

Thanks for posting this Sean! Looks like a helpful addition to the conversation.

I think some of the parties in the conversation might be talking past each other as concerns “bots.” For some people, “bots” seems to mean completely automated responses. For other people, it means something more like computer-assisted or computer-subsidized responding.

I look at responses that have identical text and that came in *at the same time,* and I think, “Look! A bot!” Maybe that’s a misuse of the term, but what I really mean is that it’s clear that one entity is filling out the survey multiple times–and in a way such that computers are helping the effort. Maybe that’s having two browser windows open simultaneously–each logged in from a different account–and using copy/paste to reproduce the same text. And maybe you wouldn’t call that kind of activity a bot. But in any event, it’s fraudulent and damaging to the dataset.

In your paper, you also note receiving verbatim repeated text in open-ended responses, though I missed it if you said whether any of these came in simultaneously. But I think we’re looking at the same phenomenon here, and I think it’s clear that computers are helping a single individual fill out a survey thoughtlessly, and multiple times. Whether that counts as a “bot” or not might just be a matter of semantics.
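The check described here (identical open-ended text arriving at nearly the same time) could be sketched roughly as follows. This is only an illustration, not the procedure used in the post or in the SSRN paper: the `(response_id, submitted_at, text)` layout, the five-minute window, and the whitespace/case normalization are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_simultaneous_duplicates(responses, window=timedelta(minutes=5)):
    """Return sorted IDs of responses whose normalized open-ended text
    matches another response submitted within `window` of it.

    `responses` is an iterable of (response_id, submitted_at, text) tuples
    (a hypothetical layout; adapt it to your survey export).
    """
    by_text = defaultdict(list)
    for resp_id, submitted_at, text in responses:
        # Normalize case and whitespace so trivial copy/paste variants match.
        key = " ".join(text.lower().split())
        if key:  # ignore empty answers
            by_text[key].append((submitted_at, resp_id))

    flagged = set()
    for group in by_text.values():
        group.sort()  # order by submission time
        # If any two submissions fall within the window, some adjacent pair does too.
        for (t1, id1), (t2, id2) in zip(group, group[1:]):
            if t2 - t1 <= window:
                flagged.update((id1, id2))
    return sorted(flagged)


example = [
    ("r1", datetime(2018, 8, 1, 12, 0), "Good survey"),
    ("r2", datetime(2018, 8, 1, 12, 1), "good  survey "),
    ("r3", datetime(2018, 8, 1, 18, 0), "good survey"),
    ("r4", datetime(2018, 8, 1, 12, 0), "I thought about taxes"),
]
print(flag_simultaneous_duplicates(example))  # ['r1', 'r2']
```

A flag from a check like this says nothing about whether the responder was fully automated or a person with two browser windows open; it only marks the pattern for a closer look.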

Patrick Comer August 18, 2018 at 6:32 pm

Tim, love the analysis. I run Lucid, which is the largest marketplace for survey responses.

Fraudulent responses via bot or bot-like behavior started in earnest in the summer of 2016. MTurk is just one source in the highly active survey sample marketplace. As systems have become more automated and programmatic, the ability for fraud at scale became more profitable. Simply put: we saw the peak of this two years ago. Slightly surprised that it took this long to discover in MTurk responses.

What did we have to do to combat it? 1) millions of dollars spent on fraud detection and security software, 2) technical integrations between sources and survey software, and 3) training for buyers and sellers of respondents. It’s not a silver-bullet answer but rather a long grind to reduce fraud and error in the survey process. Over two quarters we were able to cut the fraud rate in half.

I’m passionate about quality of responses. Check out more info here: https://luc.id/quality/

swiv August 19, 2018 at 11:52 pm

If your university is in the habit of offering particularly well-paying surveys, you can also be assured that approximately the same 50 people are taking part in a good number of them:

https://www.reddit.com/r/mturk/comments/98ko4r/this_automatic_accept_script/

https://www.reddit.com/r/mturk/comments/98j1d6/turkerhub_has_a_secret_script_members_pay_for/

Roger Bell August 31, 2018 at 4:14 pm

Thank you for the detailed information.

Repelis January 7, 2019 at 4:14 am

Good reading. Nice article

Beatriz January 23, 2019 at 5:15 pm

Looking now at some old data where I had quality problems. Participants with duplicate locations were more likely to fail the open-ended questions. Also, my age question asked respondents to select their birth year from a drop-down menu. Interestingly, there were disproportionately many answers of 1987, 1988, and 1989 (so around the age of 30 that you found). Most of the failed text responses are versions of “good” and “nice”.
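For anyone who wants to run the same comparison on their own data, a rough sketch might look like this. It assumes each row is a `(location, failed)` pair, where `location` is whatever the survey platform records (for example a latitude/longitude tuple) and `failed` marks a failed open-ended check. The function name and the toy rows are hypothetical.

```python
from collections import Counter


def fail_rate_by_duplicate_location(rows):
    """Compare open-ended failure rates for duplicated vs. unique locations.

    `rows` is an iterable of (location, failed) pairs (assumed layout).
    Returns (rate among duplicated locations, rate among unique locations);
    a rate is None when its group is empty.
    """
    rows = list(rows)
    counts = Counter(loc for loc, _ in rows)
    dup = [failed for loc, failed in rows if counts[loc] > 1]
    uniq = [failed for loc, failed in rows if counts[loc] == 1]

    def rate(flags):
        return sum(flags) / len(flags) if flags else None

    return rate(dup), rate(uniq)


toy_rows = [
    ((40.0, -75.0), True),
    ((40.0, -75.0), True),
    ((40.0, -75.0), False),
    ((35.1, -80.2), False),
    ((41.9, -87.6), False),
    ((29.7, -95.4), True),
]
dup_rate, uniq_rate = fail_rate_by_duplicate_location(toy_rows)
print(round(dup_rate, 2), round(uniq_rate, 2))  # 0.67 0.33
```

Shared locations can also arise innocently (for example, several people behind one campus network), so a higher failure rate in the duplicated group is suggestive rather than conclusive.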