We discuss a real-world application of a recently proposed machine learning method for authorship verification. Authorship verification is considered an extremely difficult task in computational text classification, because it does not assume that the true author of an anonymous text is among the available candidate authors. To determine whether two documents were written by the same author, the verification method discussed here uses repeated feature subsampling and a pool of impostor authors. We use this technique to attribute a newly discovered Latin text from antiquity (the Compendiosa expositio) to Apuleius. This North African writer was one of the most important authors of the Roman Empire in the 2nd century and wrote one of the world's first novels. The attribution has profound and wide-reaching cultural value, because it has been over a century since a new text by a major author from antiquity was discovered. This research therefore illustrates the rapidly growing potential of computational methods for studying the global textual heritage.
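The general idea of verification by repeated feature subsampling against a pool of impostors can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names, the min-max distance, the 50% feature fraction, and the default of 100 iterations are all assumptions made for illustration. Documents are assumed to be already represented as equal-length vectors of non-negative feature frequencies (for example, relative frequencies of common words).

```python
import random
from typing import Optional, Sequence


def min_max_distance(a: Sequence[float], b: Sequence[float], idx: Sequence[int]) -> float:
    """Min-max distance restricted to the feature indices in idx (an assumed metric)."""
    num = sum(min(a[i], b[i]) for i in idx)
    den = sum(max(a[i], b[i]) for i in idx)
    return 1.0 - num / den if den > 0 else 0.0


def impostor_score(
    unknown: Sequence[float],
    candidate: Sequence[float],
    impostors: Sequence[Sequence[float]],
    iterations: int = 100,             # assumed value
    feature_fraction: float = 0.5,     # assumed value
    impostors_per_round: Optional[int] = None,
    seed: int = 0,
) -> float:
    """Fraction of rounds in which the candidate is closer to the unknown
    document than every sampled impostor, each round using a random
    subset of the features."""
    rng = random.Random(seed)
    n_features = len(unknown)
    k = max(1, int(n_features * feature_fraction))
    m = impostors_per_round or len(impostors)
    wins = 0
    for _ in range(iterations):
        # Draw a fresh random feature subset and impostor subset each round.
        idx = rng.sample(range(n_features), k)
        round_impostors = rng.sample(list(impostors), min(m, len(impostors)))
        d_candidate = min_max_distance(unknown, candidate, idx)
        d_best_impostor = min(
            min_max_distance(unknown, imp, idx) for imp in round_impostors
        )
        if d_candidate < d_best_impostor:
            wins += 1
    return wins / iterations
```

A score near 1 means the candidate beat the impostors under most feature subsets, which supports same-author verification. A score near 0 means some impostor was usually at least as close. Where to place the decision threshold between those extremes is a separate calibration question that this sketch does not address.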
|Journal||Journal of the Association for Information Science and Technology|
|Early online date||23 Dec 2015|
|Publication status||Published - Jan 2016|