The Corpus of Early Modern English Trials (1650-1700): Building of the Corpus and Hypotheses of Normalization


Emma Pasquali, eCampus University of Novedrate




This paper discusses the stages in building the Corpus of Early Modern English Trials (1650-1700), henceforth EMET, a highly specialized historical corpus of trial proceedings comprising 1.8 million words. The corpus was created mainly to shed light on the pragmatic aspects of Early Modern spoken English, since trial proceedings are considered records of authentic dialogues (Culpeper and Kytö 2010, 17). More specifically, the EMET was created to investigate the pragmatic influences on the choice of the second person pronoun, which coexisted in the forms thou and you, and on the choice of any T- or Y-form used during the Restoration: thee, prithee, prethee, prethy, pray thee, thy, thy self, thyself, thine, you, ye, your, your self, yourself, yours and pray you.
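As an illustration of what retrieving these forms involves, the following is a minimal Python sketch that counts the T- and Y-forms listed above in a plain-text passage. It is not the procedure used for the EMET, which is analysed with #LancsBox; the names `T_FORMS`, `Y_FORMS` and `count_forms` are hypothetical, and the sample sentence is invented for the example. Multi-word forms such as pray thee and thy self are tried before their shorter components, so they are not counted twice.

```python
import re
from collections import Counter

# Forms taken from the list in the abstract; the grouping into two sets is
# an assumption made for this illustration.
T_FORMS = ["thou", "thee", "prithee", "prethee", "prethy", "pray thee",
           "thy", "thy self", "thyself", "thine"]
Y_FORMS = ["you", "ye", "your", "your self", "yourself", "yours", "pray you"]


def _compile(forms):
    # Longest forms first, so "thy self" is tried before "thy".
    ordered = sorted(forms, key=len, reverse=True)
    # Allow any run of whitespace inside multi-word forms.
    parts = [r"\s+".join(re.escape(w) for w in f.split()) for f in ordered]
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


_PATTERN = _compile(T_FORMS + Y_FORMS)


def count_forms(text):
    """Return a Counter of lower-cased T- and Y-forms found in text."""
    counts = Counter()
    for match in _PATTERN.finditer(text):
        # Normalise case and internal whitespace before counting.
        counts[" ".join(match.group(0).lower().split())] += 1
    return counts


if __name__ == "__main__":
    sample = "I pray thee tell me, hast thou seen thy self? Yes, you and your self."
    counts = count_forms(sample)
    t_total = sum(n for form, n in counts.items() if form in T_FORMS)
    y_total = sum(n for form, n in counts.items() if form in Y_FORMS)
    print(counts)
    print("T-forms:", t_total, "Y-forms:", y_total)
```

A sketch of this kind only matches the spellings it is given, which is one reason the spelling variation of the period makes normalization relevant before any quantitative analysis.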

The first part of the essay briefly explores the consultation of the archives, the criteria behind the selection of the trials, and the technical steps needed to upload a corpus to #LancsBox and study it. The EMET itself is then presented: number of documents, total number of tokens, average number of tokens per text, and types of charges involved.

The essay then focuses on editing, normalization and POS tagging. More specifically, it illustrates how trials, and historical documents in general, should be edited so that they can be successfully analysed with corpus linguistics tools. Different hypotheses of normalization of the EMET are then compared in detail and discussed. Once the normalization parameters that best suit the corpus have been determined, the advantages of this process are highlighted. Lastly, the essay examines the issues arising from the normalization process, which mainly concern proper nouns, badly preserved documents (i.e., noisy texts), and Latin and other foreign terms.




How to Cite

Pasquali, E. (2023). The Corpus of Early Modern English Trials (1650-1700): Building of the Corpus and Hypotheses of Normalization. Status Quaestionis, (25).