An Accessible Python-based Author Identification Process
Anthony Breitzman
Author identification, also known as ‘author attribution’ and more recently ‘forensic linguistics’, involves identifying the true authors of anonymous texts. The Federalist Papers are 85 documents written anonymously by a combination of Alexander Hamilton, John Jay, and James Madison in the late 1780s in support of adopting the American Constitution. All but 12 documents have confirmed authors based on lists provided before the authors’ deaths. Mosteller and Wallace provided evidence of authorship for the 12 disputed documents in 1963; however, their analysis is not readily accessible to non-statisticians. In this paper we replicate the analysis in a much more accessible way using modern text mining methods and Python. One surprising result is the usefulness of filler words in identifying writing styles. The method described here can be applied to other authorship questions, such as linking the Unabomber manifesto with Ted Kaczynski or identifying Shakespeare's collaborators. Although the authorship of the Federalist Papers has been studied before, the contribution of this paper is a process and a set of tools that Python programmers can use easily, with methods that do not rely on any knowledge of statistics or machine learning.
Federalist, Author Identification, Attribution, Forensic Linguistics, Text-Mining
DOI: 10.25080/gerudo-f2bc6f59-003