PUNCT


This page still pertains to UD version 1.

`PUNCT`: punctuation

Definition

Punctuation marks in Ancient Greek texts have generally been added by modern editors. Four main punctuation marks are found in modern editions: the comma (COMMA “U+002C”), the period (FULL STOP “U+002E”), the point above the line (MIDDLE DOT “U+00B7”, corresponding to an English colon or semicolon), and the question mark (SEMICOLON “U+003B”).

The mark for elision (Smyth 1920: 23-24) is the apostrophe (COMBINING COMMA ABOVE “U+0313”). Crasis (Smyth 1920: 22-23) and aphaeresis (Smyth 1920: 24) are signaled by a smooth breathing (COMBINING COMMA ABOVE “U+0313”) standing either on the vowel/diphthong resulting from crasis or for an elided ε at the beginning of a word (aphaeresis).
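The code points named above can be checked against the Unicode character database. The following is a minimal sketch using Python's standard `unicodedata` module; the `MARKS` table and the `describe` helper are illustrative names, not part of any UD tooling.

```python
import unicodedata

# Code points exactly as given in the definition above.
MARKS = {
    "comma": "\u002C",
    "period": "\u002E",
    "point above the line": "\u00B7",
    "question mark": "\u003B",
    "elision / crasis / aphaeresis mark": "\u0313",
}


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for label, ch in MARKS.items():
    print(f"{label}: {describe(ch)}")
```

Note that U+0313 is a combining character (general category `Mn`), so when it appears as a token of its own it may render attached to whatever precedes it.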

References

Smyth, Herbert Weir. 1920. A Greek Grammar for Colleges. New York: American Book Company (Perseus Digital Library; Internet Archive).

Treebank Statistics (UD_Ancient_Greek)

There are 8 PUNCT lemmas (0%), 8 PUNCT types (0%) and 22603 PUNCT tokens (12%). Out of 14 observed tags, the rank of PUNCT is: 13 in number of lemmas, 14 in number of types and 3 in number of tokens.

The 10 most frequent PUNCT lemmas: ,, ., ·, ;, ̓, …, ?, †

The 10 most frequent PUNCT types: ,, ., ·, ;, ̓, …, ?, †

The 10 most frequent ambiguous lemmas:

The 10 most frequent ambiguous types:

Morphology

The form / lemma ratio of PUNCT is 1.000000 (the average of all parts of speech is 3.038201).
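The page does not spell out how this ratio is computed. Assuming it is the number of distinct forms counted per lemma, divided by the number of distinct lemmas, the 8 PUNCT types over 8 PUNCT lemmas reported above give exactly 1. A small sketch under that assumption, with made-up sample pairs and a hypothetical helper name:

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """Distinct (lemma, form) combinations divided by distinct lemmas.

    `pairs` is an iterable of (form, lemma) tuples. This definition is an
    assumption; the statistics page does not state its exact formula.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    n_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return n_forms / len(forms_by_lemma)


# Punctuation: each lemma has exactly one form, however often it repeats.
punct_pairs = [(",", ","), (".", "."), (",", ","), ("·", "·")]
print(form_lemma_ratio(punct_pairs))

# An inflected word (illustrative sample): several forms share one lemma.
verb_pairs = [("λύω", "λύω"), ("λύεις", "λύω"), ("λύει", "λύω")]
print(form_lemma_ratio(verb_pairs))
```

Under this reading, a ratio of 1 simply reflects that punctuation does not inflect, while inflected parts of speech pull the overall average up.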

The 1st highest number of forms (1) was observed with the lemma “,”: ,.

The 2nd highest number of forms (1) was observed with the lemma “.”: ..

The 3rd highest number of forms (1) was observed with the lemma “…”: ….

PUNCT does not occur with any features.

Relations

PUNCT nodes are attached to their parents using 2 different relations: grc-dep/punct (22590; 100% instances), grc-dep/advmod (13; 0% instances)

Parents of PUNCT nodes belong to 14 different parts of speech: VERB (18483; 82% instances), SCONJ (1346; 6% instances), NOUN (1180; 5% instances), ADJ (827; 4% instances), CCONJ (313; 1% instances), PRON (156; 1% instances), ADV (112; 0% instances), ADP (93; 0% instances), INTJ (47; 0% instances), PART (19; 0% instances), X (10; 0% instances), NUM (9; 0% instances), DET (5; 0% instances), PUNCT (3; 0% instances)
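Counts like these can be reproduced from a treebank file. The sketch below assumes the data is in the ten-column, tab-separated CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC); the two-clause sample sentence and the function name are invented for illustration and are not taken from UD_Ancient_Greek.

```python
from collections import Counter

# Invented sample in CoNLL-U layout (not from the treebank).
SAMPLE = (
    "1\tἦλθον\tἔρχομαι\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t,\t,\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "3\tεἶδον\tὁράω\tVERB\t_\t_\t1\tconj\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)


def punct_attachment_counts(conllu: str):
    """Count relations of PUNCT nodes and the POS of their parents."""
    relations, parent_pos = Counter(), Counter()
    for block in conllu.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep plain integer IDs only (skips ranges such as 1-2).
        tokens = {row[0]: row for row in rows if row[0].isdigit()}
        for row in tokens.values():
            if row[3] != "PUNCT":
                continue
            relations[row[7]] += 1
            parent = tokens.get(row[6])  # HEAD 0 has no token row
            parent_pos[parent[3] if parent else "ROOT"] += 1
    return relations, parent_pos


relations, parent_pos = punct_attachment_counts(SAMPLE)
print(relations.most_common())
print(parent_pos.most_common())
```

Run over the full treebank, the two counters would correspond to the relation list and the parent part-of-speech list given above.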

22589 (100%) PUNCT nodes are leaves.

8 (0%) PUNCT nodes have one child.

5 (0%) PUNCT nodes have two children.

1 (0%) PUNCT node has three or more children.

The highest child degree of a PUNCT node is 4.

Children of PUNCT nodes are attached using 10 different relations: grc-dep/ccomp (5; 23% instances), grc-dep/obj (4; 18% instances), grc-dep/nsubj (3; 14% instances), grc-dep/punct (3; 14% instances), grc-dep/advmod (2; 9% instances), grc-dep/csubj (1; 5% instances), grc-dep/nmod (1; 5% instances), grc-dep/obl (1; 5% instances), grc-dep/vocative (1; 5% instances), grc-dep/xcomp (1; 5% instances)

Children of PUNCT nodes belong to 6 different parts of speech: VERB (8; 36% instances), NOUN (6; 27% instances), PRON (3; 14% instances), PUNCT (3; 14% instances), ADJ (1; 5% instances), PART (1; 5% instances)
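The figures in this section can be cross-checked against one another with simple arithmetic. The sketch below only re-adds the numbers printed above; the variable names are illustrative, and the rounding rule (nearest whole percent) is an assumption about how the page derives its percentages.

```python
# Numbers copied from the statistics above.
total_tokens = 22603
degree_counts = {"leaf": 22589, "one child": 8, "two children": 5, "three or more": 1}
max_degree = 4  # the single node with three or more children must have 4

child_relations = {
    "ccomp": 5, "obj": 4, "nsubj": 3, "punct": 3, "advmod": 2,
    "csubj": 1, "nmod": 1, "obl": 1, "vocative": 1, "xcomp": 1,
}
child_pos = {"VERB": 8, "NOUN": 6, "PRON": 3, "PUNCT": 3, "ADJ": 1, "PART": 1}


def percent(count: int, total: int) -> int:
    """Whole-number percentage (assumed rounding rule)."""
    return round(100 * count / total)


# Every PUNCT token falls into exactly one degree class.
nodes_accounted_for = sum(degree_counts.values())

# Children implied by the degree classes: 8*1 + 5*2 + 1*4.
children_from_degrees = (
    degree_counts["one child"] * 1
    + degree_counts["two children"] * 2
    + degree_counts["three or more"] * max_degree
)

print(nodes_accounted_for, total_tokens)
print(children_from_degrees, sum(child_relations.values()), sum(child_pos.values()))
print(percent(child_relations["ccomp"], children_from_degrees))
```

All three child totals agree, which also explains why relations with a handful of instances show sizeable percentages here: they are shares of the children of PUNCT nodes, not of all PUNCT tokens.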