Hungarian Generative Diachronic Syntax (HGDS)

The project had two main objectives: first, to investigate the history of Hungarian syntax, and second, to build an electronic database containing all the sources from the Old Hungarian period (ca. 2 million tokens).

Since a significant part of the source material was available only in print, it had to be converted into an electronic format. Uniformity is a basic requirement for running queries over the whole corpus, so we use standard (UTF-8 encoded) Unicode characters. 11 codices and 23 smaller texts have been normalized, that is, converted into Modern Hungarian spelling as far as possible. Using a morphological analyzer that was originally developed for Modern Hungarian and then adapted for Old Hungarian, 4 codices have been morphologically analyzed and disambiguated.
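
To illustrate why uniform character encoding matters for querying, the following is a minimal sketch, not the project's actual pipeline. It assumes that tokens are brought to Unicode NFC form before being stored as UTF-8; the choice of NFC and the helper name `normalize_token` are assumptions made for this example only.

```python
import unicodedata


def normalize_token(token: str) -> str:
    """Return the token in Unicode NFC form (an assumed choice of normalization)."""
    return unicodedata.normalize("NFC", token)


# The same word typed in two ways: with a precomposed "ő" (U+0151),
# and with a plain "o" followed by a combining double acute accent (U+030B).
composed = "\u0151r"
decomposed = "o\u030br"

# Without normalization the two strings differ, so a query would miss one of them.
assert composed != decomposed

# After normalization both spellings are the same sequence of code points.
assert normalize_token(composed) == normalize_token(decomposed)

# The normalized token can then be stored as UTF-8 bytes.
utf8_bytes = normalize_token(decomposed).encode("utf-8")
```

Under these assumptions, a search for a word matches every occurrence regardless of how the accented letters were typed in during digitization.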

All of the Old Hungarian sources (47 codices, 24 smaller texts, and 244 personal letters) are available via a corpus query interface called the Old Hungarian Concordance.

As a result, a very important part of our cultural heritage has become available and electronically searchable for researchers and for anyone interested in historical linguistics.
Grant ID: NK 78074 – Funding: Hungarian Scientific Research Fund (OTKA)
