PLoS One. 2017 Feb 23;12(2):e0172493.

doi: 10.1371/journal.pone.0172493. eCollection 2017.

GreekLex 2: A comprehensive lexical database with part-of-speech, syllabic, phonological, and stress information

Authors

Antonios Kyparissiadis¹, Walter J B van Heuven¹, Nicola J Pitchford¹, Timothy Ledgeway¹

Affiliation

¹ School of Psychology, University of Nottingham, Nottingham, United Kingdom.

PMID: 28231303
PMCID: PMC5322960
DOI: 10.1371/journal.pone.0172493

Abstract

Databases containing lexical properties of any given orthography are crucial for psycholinguistic research. In the last ten years, a number of lexical databases have been developed for Greek. However, these lack important part-of-speech information. Furthermore, there is a need for alternative procedures for calculating syllabic measurements and stress information, and for combining several metrics to investigate linguistic properties of the Greek language. To address these issues, we present a new extensive lexical database of Modern Greek (GreekLex 2) with part-of-speech information for each word, accurate syllabification, and orthographic information predictive of stress, as well as several measurements of word similarity and phonetic information. The addition of detailed statistical information about Greek part-of-speech, syllabification, and stress neighbourhood allowed novel analyses of stress distribution within different grammatical categories and syllabic lengths to be carried out. Results showed that the statistical preponderance of stress on the pre-final syllable that is reported for the Greek language depends on grammatical category. Additionally, analyses showed that more than 90% of the tokens in the database would be stressed correctly by relying solely on stress neighbourhood information. The database and the scripts for orthographic and phonological syllabification as well as phonetic transcription are available at http://www.psychology.nottingham.ac.uk/greeklex/.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Fig 1. Consonant-type classification.**
Consonant types were classified according to the Manner of Articulation, Place of Articulation, and Voicing scales.

**Fig 2. Distribution of stress position by number of syllables.**
Distribution of stress position by types (2A) and tokens (2B). Syllabic lengths greater than 7 (up to 11) are not shown in the graph, as together they represent less than 1% of types and 0.1% of tokens in the whole set.

**Fig 3. Distribution of stress position by part-of-speech category.**
Counts for disyllables (3A), trisyllables (3B), and all polysyllables (3C). Only adjectives, nouns and verbs were considered for these calculations.

**Fig 4. Frequency of N (Neighbourhood) counts.**
Frequency counts for the clustered OLD20 Levenshtein Distance values (based on [9]) and Coltheart’s N orthographic similarity values (based on [8]). OLD20 values are clustered around their closest integer (e.g. a value of 2 represents the counts of all values between 1.5 and 2.49). Coltheart’s N values above 10 are not shown in the graph, as together they represent less than 0.5% of the whole set.

**Fig 5. Orthographic similarity as a function of length.**
Distributions of mean OLD20 Levenshtein Distance (based on [9]) and Coltheart’s N orthographic similarity (based on [8]) as a function of word length measured in letters.
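
The two orthographic similarity measures named in Figs 4 and 5 can be illustrated with a minimal Python sketch. It assumes the commonly used definitions: Coltheart’s N as the number of same-length words that differ from a target in exactly one letter position, and OLD20 as the mean Levenshtein distance from a target to its 20 closest words in the lexicon. The function names (`levenshtein`, `coltheart_n`, `old_n`, `cluster`) and the toy lexicon are hypothetical and are not taken from the scripts distributed with the database; `cluster` mirrors the binning described in the Fig 4 caption.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions and substitutions."""
    previous: List[int] = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def coltheart_n(word: str, lexicon: Iterable[str]) -> int:
    """Count same-length lexicon words differing from `word` in exactly one position."""
    return sum(
        1
        for other in lexicon
        if len(other) == len(word)
        and sum(x != y for x, y in zip(word, other)) == 1
    )


def old_n(word: str, lexicon: Iterable[str], n: int = 20) -> float:
    """Mean Levenshtein distance from `word` to its `n` closest other words (OLD20 when n=20)."""
    distances = sorted(levenshtein(word, other) for other in lexicon if other != word)
    closest = distances[:n]  # fewer than n words are used if the lexicon is small
    if not closest:
        raise ValueError("lexicon contains no word other than the target")
    return sum(closest) / len(closest)


def cluster(value: float) -> int:
    """Bin a non-negative value to its closest integer, with halves going up (1.5 -> 2)."""
    # int(value + 0.5) is used because Python's round() rounds halves to even.
    return int(value + 0.5)


if __name__ == "__main__":
    toy_lexicon = ["cat", "cot", "cut", "coat", "dog"]  # assumed toy data
    print(coltheart_n("cat", toy_lexicon))
    print(old_n("cat", toy_lexicon, n=2))
    print(cluster(old_n("cat", toy_lexicon)))
```

For Greek input, accented letters should be in a single, consistent Unicode normalisation form (for example NFC) before comparison, otherwise a stressed vowel may count as two characters.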

Cited by

Chipola: A Chinese Podcast Lexical Database for capturing spoken language nuances and predicting behavioral data.
Zhao N, Lei L. Behav Res Methods. 2025 May 8;57(6):166. doi: 10.3758/s13428-025-02697-0. PMID: 40341999
Shabd: A psycholinguistic database for Hindi.
Verma A, Sikarwar V, Yadav H, Jaganathan R, Kumar P. Behav Res Methods. 2022 Apr;54(2):830-844. doi: 10.3758/s13428-021-01625-2. Epub 2021 Aug 6. PMID: 34357542
HelexKids: A word frequency database for Greek and Cypriot primary school children.
Terzopoulos AR, Duncan LG, Wilson MA, Niolaki GZ, Masterson J. Behav Res Methods. 2017 Feb;49(1):83-96. doi: 10.3758/s13428-015-0698-5. PMID: 26822666 Free PMC article.
Syllables and their beginnings have a special role in the mental lexicon.
Sun Y, Poeppel D. Proc Natl Acad Sci U S A. 2023 Sep 5;120(36):e2215710120. doi: 10.1073/pnas.2215710120. Epub 2023 Aug 28. PMID: 37639606 Free PMC article.
Universal linguistic hierarchies are not innately wired. Evidence from multiple adjectives.
Leivada E, Westergaard M. PeerJ. 2019 Aug 1;7:e7438. doi: 10.7717/peerj.7438. eCollection 2019. PMID: 31396461 Free PMC article.

References

1. Ktori M, van Heuven WJB, Pitchford NJ. GreekLex: a lexical database of Modern Greek. Behav Res Methods. 2008; 40(3): 773–783.
2. Protopapas A, Tzakosta M, Chalamandaris A, Tsiakoulis P. IPLR: An online resource for Greek word-level and sublexical information. Lang Resour Eval. 2012; 46(3): 449–459.
3. Dimitropoulou M, Duñabeitia JA, Avilés A, Corral J, Carreiras M. Subtitle-based word frequencies as the best estimate of reading behavior: The case of Greek. Front Psychol. 2010; 1(218): 1–12.
4. Gimenes M, New B. Worldlex: Twitter and blog word frequencies for 66 languages. Behav Res Methods. 2015; 48(3): 963–972.
5. Hatzigeorgiu N, Gavrilidou M, Piperidis S, Carayannis G, Papakostopoulou A, Spiliotopoulou A, et al. Design and implementation of the online ILSP Greek Corpus. Paper presented at LREC 2000, the Second International Conference on Language Resources and Evaluation, Athens, Greece; 2000. Retrieved from http://www.lrec-conf.org/proceedings/lrec2000/pdf/336.pdf
