Shakespeare on the Tree (2.0)
The present article borrows from biology the idea of identifying and assessing phenotypic and genotypic traits shared by different individuals in order to group them into families. The aim of the research is to ascertain whether it is possible to create phylogenetic trees of Shakespeare’s plays and to what extent such tools may prove useful to Shakespearean scholars. Treating each play as a single individual with a distinctive DNA of its own, and closely following the procedures used in molecular biology, the author uses a modified zipping algorithm to extract the character strings (DNA sequences) shared by pairs of texts. These pairs are then plotted using an algorithm specifically designed to create phylogenies. The final sections of the paper present four phylogenies and discuss how they may prove useful in different fields of textual criticism. The first shows the effectiveness of the procedure in text recognition. In the second, text recognition is made more difficult by increasing the number of text pairs to be analysed. The third deals with language recognition by showing how a play written in a different language is recognised as such and isolated from the rest of the Shakespearean corpus. Finally, the fourth tree sketches a methodology for tackling authorship attribution.
Keywords: Phylogeny, Cladogram, Language recognition, Author recognition, DNA sequencing
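
The abstract does not spell out the modified zipping algorithm, so the following is only a rough sketch of the general idea of comparing text pairs by compression. It swaps in the normalized compression distance (NCD) computed with Python's standard `zlib` module, which is not necessarily the measure used by the author. The function names (`compressed_size`, `ncd`, `distance_matrix`, `closest_pair`) and the toy strings are hypothetical and serve illustration only. A matrix of this kind is the sort of input that a tree-building algorithm would consume.

```python
import zlib
from itertools import combinations


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower values mean more shared strings.

    Assumption: this stands in for the paper's modified zipping algorithm,
    whose details are not given in the abstract.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(texts: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise NCD for every unordered pair of texts, keyed by sorted name pair."""
    encoded = {name: text.encode("utf-8") for name, text in texts.items()}
    return {
        (a, b): ncd(encoded[a], encoded[b])
        for a, b in combinations(sorted(encoded), 2)
    }


def closest_pair(dist: dict[tuple[str, str], float]) -> tuple[str, str]:
    """The pair with the smallest distance, i.e. the first candidates to join in a tree."""
    return min(dist, key=dist.get)


# Toy data (hypothetical): two near-identical English strings and one Italian string.
toy_texts = {
    "english_a": "to be or not to be that is the question " * 20,
    "english_b": "to be or not to be that is the answer " * 20,
    "italian": "essere o non essere questo e il problema " * 20,
}

if __name__ == "__main__":
    matrix = distance_matrix(toy_texts)
    for pair, value in sorted(matrix.items(), key=lambda item: item[1]):
        print(pair, round(value, 3))
    print("closest pair:", closest_pair(matrix))
```

On such toy input the two English strings should come out closer to each other than either is to the Italian one, which mirrors, in miniature, the language recognition scenario described above. Real plays are far longer than zlib's 32 KB window, so a faithful experiment would need a compressor or chunking strategy suited to long texts.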