Published on Tue Dec 05 2017

FlagIt: A System for Minimally Supervised Human Trafficking Indicator Mining

Mayank Kejriwal, Jiayuan Ding, Runqi Shao, Anoop Kumar, Pedro Szekely

Abstract

In this paper, we describe and study the indicator mining problem in the online sex advertising domain. We present an in-development system, FlagIt (Flexible and adaptive generation of Indicators from text), which combines the benefits of both a lightweight expert system and classical semi-supervision (heuristic re-labeling) with recently released state-of-the-art unsupervised text embeddings to tag millions of sentences with indicators that are highly correlated with human trafficking. The FlagIt technology stack is open source. In preliminary evaluations involving five indicators, FlagIt shows promising performance compared to several alternatives. The system is being actively developed, refined, and integrated into a domain-specific search system used by over 200 law enforcement agencies to combat human trafficking, and is being aggressively extended to mine at least six more indicators with minimal programming effort. FlagIt is a good example of a system that operates in limited-label settings and that requires creative combinations of established machine learning techniques to produce outputs usable by real-world non-technical analysts.

Thu Mar 09 2017
Artificial Intelligence
Information Extraction in Illicit Domains
Human trafficking is a challenging problem with the potential for widespread impact. Such domains employ atypical language models, have "long tails" and suffer from the problem of concept drift. We propose a lightweight, feature-agnostic Information Extraction (IE) paradigm specifically designed for such domains.
Sun Jun 10 2018
Machine Learning
LexNLP: Natural language processing and information extraction for legal and regulatory texts
LexNLP is an open source Python package focused on natural language processing and machine learning for legal and regulatory text. The package includes functionality to segment documents, identify key text and build unsupervised and supervised models.
Tue May 30 2017
Artificial Intelligence
Semi-Supervised Learning for Detecting Human Trafficking
Human trafficking is one of the most atrocious crimes and among the challenges facing law enforcement. We leverage textual data from the website "Backpage", used for classified advertisements, to discern potential patterns of human trafficking activities.
Sun Dec 03 2017
Artificial Intelligence
Always Lurking: Understanding and Mitigating Bias in Online Human Trafficking Detection
Web-based human trafficking activity has increased in recent years. The use of intelligent systems to detect trafficking can have a direct impact on investigative resource allocation and decision-making.
Mon Feb 22 2016
Artificial Intelligence
Empath: Understanding Topic Signals in Large-Scale Text
Empath draws connotations between words and phrases by deep learning. It analyzes text across 200 built-in, pre-validated categories.
Mon Apr 27 2020
NLP
"Call me sexist, but...": Revisiting Sexism Detection Using Psychological Scales and Adversarial Samples
Research has focused on automated methods to effectively detect sexism online. Although overt sexism seems easy to spot, its subtle forms and manifold expressions are not. Results indicate that current machine learning models pick up on a very narrow set of linguistic markers of sexism.