Part 1: Bot Detection Preliminaries
detranli = DEtection of RANdom LIkert-type responses
# Used for visualizations, installing the detranli package
install.packages(c("ggplot2","GGally"), type="binary")
install.packages("devtools", type="binary")
# stable version (recommended)
devtools::install_github("michaeljohnilagan/detranli")
# experimental version
#devtools::install_github("falkcarl/detranli", type="binary")
# some example datasets
install.packages(c("psychTools","qgraph"), type="binary")
Online data collection is fast, cheap, and facilitates access to hard-to-reach populations
Since participants are compensated 💰, there is an incentive to complete many surveys in a short time
Humans may exhibit these partially, while bots may do so for all items in the survey
“I feel that I have a number of good qualities”
(Strongly Disagree) 1 2 3 4 5 (Strongly Agree)
Row | Item 1 | Item 2 | Item 3 | Item 4 | … |
---|---|---|---|---|---|
1 | 4 | 2 | 3 | 4 | … |
2 | 1 | 1 | 1 | 1 | … |
3 | 5 | 4 | 2 | 4 | … |
4 | 5 | 2 | 5 | 5 | … |
5 | 1 | 2 | 2 | 3 | … |
6 | 5 | 2 | 3 | 3 | … |
… | … | … | … | … | … |
Truth | 🚩 Flag | 👍 Spare |
---|---|---|
🤖 Bot | True positive ✔️ | False negative ❌ |
👶 Human | False positive ❌ | True negative ✔️ |
ID | Truth |
---|---|
1 | 👶 |
2 | 👶 |
3 | 👶 |
4 | 👶 |
5 | 👶 |
6 | 👶 |
7 | 🤖 |
8 | 🤖 |
9 | 🤖 |
10 | 🤖 |
ID | Truth | Decision |
---|---|---|
1 | 👶 | 👍 |
2 | 👶 | 🚩 |
3 | 👶 | 👍 |
4 | 👶 | 👍 |
5 | 👶 | 🚩 |
6 | 👶 | 👍 |
ID | Truth | Decision |
---|---|---|
7 | 🤖 | 🚩 |
8 | 🤖 | 🚩 |
9 | 🤖 | 🚩 |
10 | 🤖 | 👍 |
ID | Truth | Decision |
---|---|---|
1 | 👶 | 👍 |
2 | 👶 | 🚩 |
3 | 👶 | 👍 |
4 | 👶 | 👍 |
5 | 👶 | 🚩 |
6 | 👶 | 👍 |
7 | 🤖 | 🚩 |
8 | 🤖 | 🚩 |
9 | 🤖 | 🚩 |
10 | 🤖 | 👍 |
Truth | Decision: Spare | Decision: Flag |
---|---|---|
Human | 4 | 2 |
Bot | 1 | 3 |
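The cross-tabulation of truth against decision can be reproduced in base R. This is a minimal sketch: `truth` and `decision` are illustrative names that simply transcribe the ten-respondent example (IDs 2 and 5 flagged among humans, ID 10 spared among bots).

```r
# Toy example: IDs 1-6 are humans, IDs 7-10 are bots.
truth <- factor(c(rep("Human", 6), rep("Bot", 4)),
                levels = c("Human", "Bot"))

# Decisions transcribed from the example tables, in ID order.
decision <- factor(
  c("Spare", "Flag", "Spare", "Spare", "Flag", "Spare",
    "Flag", "Flag", "Flag", "Spare"),
  levels = c("Spare", "Flag")
)

# Rows are the truth, columns are the decision.
confusion <- table(Truth = truth, Decision = decision)
print(confusion)
```

The four cells correspond to true negatives (Human, Spare), false positives (Human, Flag), false negatives (Bot, Spare), and true positives (Bot, Flag).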
“Thankfully, because their answers clearly aren’t human, there are several methods for detecting bots in your data (see Dupuis et al., 2018)” - Blog for Prolific Academic (2022)
Humans answer based on item content, which shapes the item means and the correlations among items
Bots do not follow the same structure; item responses are independent
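One way to see what independent item responses look like is to simulate them. The sketch below assumes that a bot picks each response category uniformly at random; that is an illustrative assumption, and `n_bots` and the seed are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_bots  <- 500  # hypothetical number of simulated bots
n_items <- 4    # matches the four-item example
n_cats  <- 5    # five-point Likert scale

# Each response is drawn independently and uniformly from 1..5 (assumption).
bots <- matrix(
  sample(seq_len(n_cats), n_bots * n_items, replace = TRUE),
  nrow = n_bots, ncol = n_items,
  dimnames = list(NULL, paste0("item", seq_len(n_items)))
)

# Item means should sit near the scale midpoint (3), and
# inter-item correlations should be near zero.
round(colMeans(bots), 2)
round(cor(bots), 2)
```

Human data would typically show item means that differ by item content and non-zero correlations among related items.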
Row | Item 1 | Item 2 | Item 3 | Item 4 |
---|---|---|---|---|
1 | 4 | 2 | 3 | 4 |
2 | 1 | 1 | 1 | 1 |
3 | 5 | 4 | 2 | 4 |
4 | 5 | 2 | 5 | 5 |
5 | 1 | 2 | 2 | 3 |
6 | 5 | 2 | 3 | 3 |
Row | Item 1 | Item 2 | Item 3 | Item 4 | IRV |
---|---|---|---|---|---|
1 | 4 | 2 | 3 | 4 | .96 |
2 | 1 | 1 | 1 | 1 | .00 |
3 | 5 | 4 | 2 | 4 | 1.26 |
4 | 5 | 2 | 5 | 5 | 1.50 |
5 | 1 | 2 | 2 | 3 | .82 |
6 | 5 | 2 | 3 | 3 | 1.26 |
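The IRV column matches the within-person standard deviation of each row. A minimal base R sketch, with `X` as an illustrative name for the six-row example:

```r
# The six example respondents (rows) on four items (columns).
X <- rbind(
  c(4, 2, 3, 4),
  c(1, 1, 1, 1),
  c(5, 4, 2, 4),
  c(5, 2, 5, 5),
  c(1, 2, 2, 3),
  c(5, 2, 3, 3)
)

# Intra-individual response variability: standard deviation within each row.
irv <- apply(X, 1, sd)
round(irv, 2)
# 0.96 0.00 1.26 1.50 0.82 1.26
```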
Row | Item 1 | Item 2 | Item 3 | Item 4 | longstring |
---|---|---|---|---|---|
1 | 4 | 2 | 3 | 4 | 1 |
2 | 1 | 1 | 1 | 1 | 4 |
3 | 5 | 4 | 2 | 4 | 1 |
4 | 5 | 2 | 5 | 5 | 2 |
5 | 1 | 2 | 2 | 3 | 2 |
6 | 5 | 2 | 3 | 3 | 2 |
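The longstring column is the length of the longest run of identical consecutive responses in a row. A minimal base R sketch using `rle()`, with `X` as an illustrative name for the example data:

```r
# The six example respondents (rows) on four items (columns).
X <- rbind(
  c(4, 2, 3, 4),
  c(1, 1, 1, 1),
  c(5, 4, 2, 4),
  c(5, 2, 5, 5),
  c(1, 2, 2, 3),
  c(5, 2, 3, 3)
)

# rle() encodes runs of equal values; the longest run is the longstring.
longstring <- apply(X, 1, function(r) max(rle(r)$lengths))
longstring
# 1 4 1 2 2 2
```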
Row | Item 1 | Item 2 | Item 3 | Item 4 | PTC |
---|---|---|---|---|---|
1 | 4 | 2 | 3 | 4 | .99 |
2 | 1 | 1 | 1 | 1 | -1.00 |
3 | 5 | 4 | 2 | 4 | .47 |
4 | 5 | 2 | 5 | 5 | .81 |
5 | 1 | 2 | 2 | 3 | -.11 |
6 | 5 | 2 | 3 | 3 | .82 |
Statistic | Item 1 | Item 2 | Item 3 | Item 4 |
---|---|---|---|---|
Means | 3.5 | 2.17 | 2.67 | 3.33 |
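The PTC values for rows 1 and 3 to 6 can be reproduced by correlating each row with the vector of item means. This is a sketch, not the package's implementation. Row 2 has zero variance, so `cor()` returns `NA` for it; the -1.00 shown in the table presumably reflects a convention for such rows that this sketch does not attempt to reproduce.

```r
# The six example respondents (rows) on four items (columns).
X <- rbind(
  c(4, 2, 3, 4),
  c(1, 1, 1, 1),
  c(5, 4, 2, 4),
  c(5, 2, 5, 5),
  c(1, 2, 2, 3),
  c(5, 2, 3, 3)
)

# Item means computed over the whole sample.
item_means <- colMeans(X)

# Person-total correlation: each row correlated with the item means.
# suppressWarnings() hides the zero-variance warning for row 2.
ptc <- suppressWarnings(apply(X, 1, function(r) cor(r, item_means)))
round(ptc, 2)
# 0.99   NA 0.47 0.81 -0.11 0.82
```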
Multivariate version of the “z-score” standardization
Under a univariate normal distribution, z-scores far from zero are less likely
Under a multivariate normal distribution, locations with large Mahalanobis distance are less likely
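With a single variable, the Mahalanobis distance reduces to the absolute z-score. The sketch below checks this on the Item 1 responses from the example. Note that base R's `mahalanobis()` returns the squared distance, so a square root is taken; `X1` is an illustrative name.

```r
# Item 1 responses from the six-row example.
x <- c(4, 1, 5, 5, 1, 5)

# Ordinary z-scores.
z <- (x - mean(x)) / sd(x)

# mahalanobis() expects one row per observation, so use a one-column matrix.
X1 <- matrix(x, ncol = 1)
d  <- sqrt(mahalanobis(X1, center = colMeans(X1), cov = cov(X1)))

# In one dimension the Mahalanobis distance equals |z|.
all.equal(d, abs(z))
# TRUE
```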
Point | Coordinates | Euclidean distance from center | Mahalanobis distance from center |
---|---|---|---|
Center | \((0, 0)\) | 0 | 0 |
Blue | \((+2, +2)\) | 2.83 | 4.22 |
Red | \((-2, +2)\) | 2.83 | 28.22 |
Column means and covariances
item1 item2 item3 item4
3.500 2.167 2.667 3.333
item1 item2 item3 item4
item1 3.9 1.100 1.800 2.000
item2 1.1 0.967 0.067 0.733
item3 1.8 0.067 1.867 1.533
item4 2.0 0.733 1.533 1.867
Result
item1 item2 item3 item4 mahal
1 4 2 3 4 2.041
2 1 1 1 1 1.739
3 5 4 2 4 1.970
4 5 2 5 5 1.877
5 1 2 2 3 1.739
6 5 2 3 3 1.543
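The column means, covariance matrix, and `mahal` column can be reproduced in base R. This is a sketch with illustrative variable names. Because `mahalanobis()` returns squared distances, the square root is taken to match the values shown.

```r
# The six example respondents (rows) on four items (columns).
X <- rbind(
  c(4, 2, 3, 4),
  c(1, 1, 1, 1),
  c(5, 4, 2, 4),
  c(5, 2, 5, 5),
  c(1, 2, 2, 3),
  c(5, 2, 3, 3)
)
colnames(X) <- paste0("item", 1:4)

center <- colMeans(X)  # column means
S      <- cov(X)       # sample covariance matrix

# Distance of each row from the center, on the unsquared scale.
mahal <- sqrt(mahalanobis(X, center, S))
round(mahal, 3)
# 2.041 1.739 1.970 1.877 1.739 1.543
```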
“Thankfully, because their answers clearly aren’t human, there are several methods for detecting bots in your data (see Dupuis et al., 2018)” - Blog for Prolific Academic (2022)
Dupuis et al. did not show researchers how to use nonresponsivity indices (NRIs) to flag bots
Do any of these strategies achieve high classification accuracy? (\(\approx\) 10:00)
See how well these strategies fare in various samples
Require more knowledge about the inventory, such as reverse-coding of items and/or underlying factor structure
number of items | critical value for \(\alpha=0.1\) | critical value for \(\alpha=0.05\) |
---|---|---|
15 | 4.72 | 5.00 |
20 | 5.33 | 5.60 |
25 | 5.86 | 6.14 |
30 | 6.34 | 6.62 |
35 | 6.79 | 7.06 |
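The tabulated critical values agree with the square root of a chi-square quantile whose degrees of freedom equal the number of items. The source does not state the formula, so treat this as an inferred reconstruction rather than the documented method.

```r
n_items <- c(15, 20, 25, 30, 35)
alphas  <- c("alpha=0.1" = 0.10, "alpha=0.05" = 0.05)

# Squared Mahalanobis distance is approximately chi-square with df = number
# of items under multivariate normality, so take the square root of the
# upper quantile to get a cutoff on the distance scale.
crit <- sapply(alphas, function(a) sqrt(qchisq(1 - a, df = n_items)))
rownames(crit) <- n_items
round(crit, 2)
```

A respondent whose Mahalanobis distance exceeds the cutoff for the chosen \(\alpha\) would be flagged under this rule.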
$confusion
yhat
y flag spare
0 1 74
1 12 13
$outcomemeasures
acc spec sens flagrate
0.8600000 0.9866667 0.4800000 0.1300000
$confusion
yhat
y flag spare
0 0 75
1 8 17
$outcomemeasures
acc spec sens flagrate
0.83 1.00 0.32 0.08
$confusion
yhat
y flag spare
0 0 75
1 11 14
$outcomemeasures
acc spec sens flagrate
0.86 1.00 0.44 0.11
$confusion
yhat
y flag spare
0 16 59
1 25 0
$outcomemeasures
acc spec sens flagrate
0.8400000 0.7866667 1.0000000 0.4100000
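The outcome measures follow directly from the four confusion-matrix counts. The helper below is a hypothetical function written for illustration, not part of detranli; it assumes `y = 1` denotes a bot and `y = 0` a human, which is consistent with the printed values.

```r
# Hypothetical helper: outcome measures from confusion-matrix counts.
#   tp = bots flagged,   fn = bots spared,
#   fp = humans flagged, tn = humans spared
outcome_measures <- function(tp, fn, fp, tn) {
  n <- tp + fn + fp + tn
  c(
    acc      = (tp + tn) / n,    # overall accuracy
    spec     = tn / (tn + fp),   # specificity: humans correctly spared
    sens     = tp / (tp + fn),   # sensitivity: bots correctly flagged
    flagrate = (tp + fp) / n     # proportion of the sample flagged
  )
}

# Counts from the first output block.
outcome_measures(tp = 12, fn = 13, fp = 1, tn = 74)
# acc 0.86, spec 0.9866667, sens 0.48, flagrate 0.13
```

Applying the same helper to the last output block (`tp = 25, fn = 0, fp = 16, tn = 59`) shows the trade-off: sensitivity reaches 1 while specificity drops to about 0.79.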
Workshop materials: https://osf.io/vnuew/
Contact: carl.falk@mcgill.ca