Published on Sat Sep 01 2018

Extractive Adversarial Networks: High-Recall Explanations for Identifying Personal Attacks in Social Media Posts

Samuel Carton, Qiaozhu Mei, Paul Resnick

Abstract

We introduce an adversarial method for producing high-recall explanations of neural text classifier decisions. Building on an existing architecture for extractive explanations via hard attention, we add an adversarial layer which scans the residual of the attention for remaining predictive signal. Motivated by the important domain of detecting personal attacks in social media comments, we additionally demonstrate the importance of manually setting a semantically appropriate "default" behavior for the model by explicitly manipulating its bias term. We develop a validation set of human-annotated personal attacks to evaluate the impact of these changes.
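
A minimal sketch of the setup described above, assuming a PyTorch-style implementation: an extractor selects a hard token mask (the explanation), a predictor classifies from the masked tokens, and an adversarial predictor scans the residual (the unselected tokens) for leftover signal; rewarding the extractor when the adversary fails pushes predictive tokens into the explanation and raises recall. All module names, the GRU encoder, the straight-through mask, the default_logit value, and the loss weighting lam below are illustrative assumptions, not the authors' released code.

# Illustrative PyTorch sketch of the adversarial extractive architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps token ids to per-token hidden states (embedding + GRU placeholder)."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, tokens):                      # tokens: (batch, seq_len) int64
        states, _ = self.rnn(self.embed(tokens))
        return states                               # (batch, seq_len, hidden)

class Extractor(nn.Module):
    """Scores each token; a hard 0/1 mask marks the extracted explanation."""
    def __init__(self, hidden_size):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, states):
        probs = torch.sigmoid(self.score(states)).squeeze(-1)  # (batch, seq_len)
        hard = (probs > 0.5).float()
        # Straight-through estimator: the forward pass uses the hard mask,
        # gradients flow through the soft probabilities.
        return hard + probs - probs.detach()

class Predictor(nn.Module):
    """Classifies from masked token states. Initializing the output bias sets the
    model's "default" behavior: an empty mask yields the bias alone, so a negative
    value means "not an attack" when nothing is extracted."""
    def __init__(self, hidden_size, default_logit=-2.0):   # -2.0 is an assumed value
        super().__init__()
        self.out = nn.Linear(hidden_size, 1)
        self.out.bias.data.fill_(default_logit)

    def forward(self, states, mask):
        denom = mask.sum(dim=1, keepdim=True).clamp(min=1.0)
        pooled = (states * mask.unsqueeze(-1)).sum(dim=1) / denom
        return self.out(pooled).squeeze(-1)         # (batch,) logits

def compute_losses(tokens, labels, encoder, extractor, predictor, adversary, lam=1.0):
    """labels: float tensor in {0, 1}. Returns the component losses; in training,
    the adversary minimizes adv_loss on the residual, while the encoder, extractor,
    and predictor minimize main_loss - lam * adv_loss, so the extractor is rewarded
    when the adversary finds no usable signal outside the explanation."""
    states = encoder(tokens)
    mask = extractor(states)                        # explanation tokens
    main_logit = predictor(states, mask)            # predicts from the explanation
    adv_logit = adversary(states, 1.0 - mask)       # scans the residual
    main_loss = F.binary_cross_entropy_with_logits(main_logit, labels)
    adv_loss = F.binary_cross_entropy_with_logits(adv_logit, labels)
    return main_loss, adv_loss, main_loss - lam * adv_loss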

Related papers

Mon Jul 12 2021
NLP
Lumen: A Machine Learning Framework to Expose Influence Cues in Text
Lumen is a learning-based framework that exposes influence cues in text. Cues include persuasion, emotion, framing, guilt/blame, and emphasis. Lumen was trained with a newly developed dataset of 3K texts.

Mon May 01 2017
NLP
"Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection
Automatic fake news detection is a challenging problem in deception detection. We collected 12.8K manually labeled short statements in various contexts from PolitiFact.com, spanning a decade. This new dataset is an order of magnitude larger than the previously largest public fake news datasets.

Sat Jul 27 2019
Artificial Intelligence
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
Machine learning algorithms are often vulnerable to adversarial examples that differ only imperceptibly from their original counterparts. In this paper, we present TextFooler, a simple but strong baseline to generate natural adversarial text.

Tue Oct 06 2020
NLP
Detecting Attackable Sentences in Arguments
We present a first large-scale analysis of sentence attackability in online arguments. Finding attackable sentences in an argument is the first step toward successful refutation in argumentation. We demonstrate that machine learning models can automatically detect attackable sentences in arguments.

Tue Jan 21 2020
Artificial Intelligence
Deceptive AI Explanations: Creation and Detection
Artificial intelligence (AI) comes with great opportunities but also great risks. Automatically generated explanations for decisions are deemed helpful for better understanding AI, increasing transparency, and fostering trust. But given, for example, economic incentives to create dishonest AI, can we trust its explanations?

Thu Dec 13 2018
Machine Learning
TextBugger: Generating Adversarial Text Against Real-world Applications
Deep Learning-based Text Understanding (DLTU) is the backbone technique behind various applications, including question answering, machine translation, and text classification. The security vulnerabilities of DLTU are still largely unknown, which is highly concerning given its increasing use in security-sensitive applications.