
This page still pertains to UD version 1.

PUNCT: punctuation

Definition

Punctuation marks in Ancient Greek texts have in general been added by modern editors. Four main punctuation marks are found in modern editions: the comma (COMMA “U+002C”), the period (FULL STOP “U+002E”), the point above the line (MIDDLE DOT “U+00B7”, corresponding to an English colon or semicolon), and the question mark (SEMICOLON “U+003B”).
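The code points listed above can be checked against the Unicode character database. The following is a minimal sketch using Python's standard unicodedata module; the dictionary keys and the describe helper are illustrative names, not part of any UD tool. The last two lines assume that input text may contain the Greek-specific lookalikes U+037E and U+0387, which NFC normalization folds onto the code points named above.

```python
import unicodedata

# The four punctuation marks named above, keyed by their role in modern editions.
GREEK_PUNCT = {
    "comma": "\u002C",
    "period": "\u002E",
    "point above the line": "\u00B7",
    "question mark": "\u003B",
}

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for role, ch in GREEK_PUNCT.items():
    print(f"{role}: {describe(ch)}")

# Greek-specific lookalikes are canonically equivalent to the code points above.
assert unicodedata.normalize("NFC", "\u037E") == "\u003B"  # GREEK QUESTION MARK
assert unicodedata.normalize("NFC", "\u0387") == "\u00B7"  # GREEK ANO TELEIA
```

Normalizing to NFC before tokenization is one way to keep a single lemma per mark; whether the treebank pipeline does so is not stated on this page.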

The mark for elision (Smyth 1920: 23-24) is the apostrophe (COMBINING COMMA ABOVE “U+0313”). Crasis (Smyth 1920: 22-23) and aphaeresis (Smyth 1920: 24) are signaled by a smooth breathing (COMBINING COMMA ABOVE “U+0313”) standing either on the vowel/diphthong resulting from crasis or for an elided ε at the beginning of a word (aphaeresis).
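Because U+0313 is a combining character, it attaches visually to whatever precedes it, which is worth keeping in mind when it appears as a token of its own. A short sketch, again with the standard unicodedata module (the constant name COMMA_ABOVE is an illustrative choice), shows its properties and how a precomposed vowel with smooth breathing decomposes to the same code point:

```python
import unicodedata

COMMA_ABOVE = "\u0313"  # the code point named above for elision, crasis and aphaeresis

print(unicodedata.name(COMMA_ABOVE))      # COMBINING COMMA ABOVE
print(unicodedata.category(COMMA_ABOVE))  # Mn (nonspacing mark)

# Precomposed alpha with smooth breathing (U+1F00) decomposes to alpha + U+0313.
decomposed = unicodedata.normalize("NFD", "\u1F00")
print([f"U+{ord(c):04X}" for c in decomposed])  # ['U+03B1', 'U+0313']
```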

References

Smyth, Herbert Weir. 1920. A Greek Grammar for Colleges. New York: American Book Company (Perseus Digital Library; Internet Archive).


Treebank Statistics (UD_Ancient_Greek)

There are 8 PUNCT lemmas (0%), 8 PUNCT types (0%) and 22603 PUNCT tokens (12%). Out of 14 observed tags, the rank of PUNCT is: 13 in number of lemmas, 14 in number of types and 3 in number of tokens.

The most frequent PUNCT lemmas (all 8, separated by spaces): ,  .  ·  ;   ̓  …  ?  †

The most frequent PUNCT types (all 8, separated by spaces): ,  .  ·  ;   ̓  …  ?  †
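Counts of this kind (lemmas, types, tokens, and their share of the whole treebank) can be derived from a CoNLL-U file in a single pass. The sketch below is only an illustration: the five-token sample is hypothetical placeholder data, pos_stats is an invented helper name, and the way shares are computed (distinct lemmas or forms of the tag divided by all distinct lemmas or forms) is an assumption about how the page's percentages were obtained.

```python
# Hypothetical five-token sentence in CoNLL-U column order; not treebank data.
ROWS = [
    "1 w1 l1 NOUN _ _ 2 nsubj _ _",
    "2 w2 l2 VERB _ _ 0 root _ _",
    "3 , , PUNCT _ _ 2 punct _ _",
    "4 w3 l3 NOUN _ _ 2 obj _ _",
    "5 . . PUNCT _ _ 2 punct _ _",
]
SAMPLE = "\n".join("\t".join(r.split()) for r in ROWS)

def pos_stats(conllu, upos="PUNCT"):
    """Return (count, share) for lemmas, types and tokens of one UPOS tag."""
    all_lemmas, all_types = set(), set()
    pos_lemmas, pos_types = set(), set()
    total = pos_tokens = 0
    for line in conllu.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # comments, blank lines, multiword ranges, empty nodes
        form, lemma, tag = cols[1], cols[2], cols[3]
        total += 1
        all_types.add(form)
        all_lemmas.add(lemma)
        if tag == upos:
            pos_tokens += 1
            pos_types.add(form)
            pos_lemmas.add(lemma)
    return {
        "lemmas": (len(pos_lemmas), len(pos_lemmas) / len(all_lemmas)),
        "types": (len(pos_types), len(pos_types) / len(all_types)),
        "tokens": (pos_tokens, pos_tokens / total),
    }

for name, (count, share) in pos_stats(SAMPLE).items():
    print(f"{count} PUNCT {name} ({share:.0%})")
```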

The 10 most frequent ambiguous lemmas:

The 10 most frequent ambiguous types:

Morphology

The form / lemma ratio of PUNCT is 1.000000 (the average of all parts of speech is 3.038201).

The 1st highest number of forms (1) was observed with the lemma “,”: ,.

The 2nd highest number of forms (1) was observed with the lemma “.”: ..

The 3rd highest number of forms (1) was observed with the lemma “…”: .
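The form / lemma ratio and the per-lemma form counts above can be illustrated with a small sketch. It assumes the ratio is the average number of distinct forms per lemma, which matches the reported value of 1 for PUNCT (8 types over 8 lemmas) but is not confirmed by this page. The (lemma, form) pairs are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical (lemma, form) pairs for two tags; not treebank data.
OBSERVED = {
    "PUNCT": [(",", ","), (".", "."), (",", ",")],
    "VERB": [("l1", "f1a"), ("l1", "f1b"), ("l1", "f1c"), ("l2", "f2a")],
}

def forms_per_lemma(pairs):
    """Group the distinct forms observed for each lemma."""
    forms = defaultdict(set)
    for lemma, form in pairs:
        forms[lemma].add(form)
    return forms

def form_lemma_ratio(pairs):
    """Average number of distinct forms per lemma (assumed definition)."""
    forms = forms_per_lemma(pairs)
    return sum(len(f) for f in forms.values()) / len(forms)

for tag, pairs in OBSERVED.items():
    print(tag, f"{form_lemma_ratio(pairs):.6f}")

# Lemmas ranked by number of distinct forms, highest first.
ranked = sorted(forms_per_lemma(OBSERVED["VERB"]).items(),
                key=lambda item: len(item[1]), reverse=True)
print([(lemma, len(forms)) for lemma, forms in ranked])
```

A ratio of exactly 1 means that every lemma has a single surface form, as expected for punctuation, which does not inflect.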

PUNCT does not occur with any features.

Relations

PUNCT nodes are attached to their parents using 2 different relations: grc-dep/punct (22590; 100% instances), grc-dep/advmod (13; 0% instances)

Parents of PUNCT nodes belong to 14 different parts of speech: VERB (18483; 82% instances), SCONJ (1346; 6% instances), NOUN (1180; 5% instances), ADJ (827; 4% instances), CCONJ (313; 1% instances), PRON (156; 1% instances), ADV (112; 0% instances), ADP (93; 0% instances), INTJ (47; 0% instances), PART (19; 0% instances), X (10; 0% instances), NUM (9; 0% instances), DET (5; 0% instances), PUNCT (3; 0% instances)
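Distributions like the two above can be obtained by following each PUNCT node's HEAD column to its parent. The following is a minimal sketch over one hypothetical sentence; the sample rows and the helper names parse and punct_attachments are illustrative, and relation labels are written without the grc-dep/ prefix used on this page.

```python
from collections import Counter

# Hypothetical sentence in CoNLL-U column order; not treebank data.
ROWS = [
    "1 w1 l1 NOUN _ _ 2 nsubj _ _",
    "2 w2 l2 VERB _ _ 0 root _ _",
    "3 , , PUNCT _ _ 4 punct _ _",
    "4 w3 l3 NOUN _ _ 2 obj _ _",
    "5 . . PUNCT _ _ 2 punct _ _",
]

def parse(rows):
    """Map token ID -> (UPOS, HEAD, DEPREL) for one sentence."""
    nodes = {}
    for row in rows:
        cols = row.split()
        nodes[int(cols[0])] = (cols[3], int(cols[6]), cols[7])
    return nodes

def punct_attachments(nodes):
    """Count the relations of PUNCT nodes and the UPOS tags of their parents."""
    relations, parent_pos = Counter(), Counter()
    for upos, head, deprel in nodes.values():
        if upos != "PUNCT":
            continue
        relations[deprel] += 1
        # HEAD 0 is the artificial root, which has no UPOS tag.
        parent_pos[nodes[head][0] if head else "ROOT"] += 1
    return relations, parent_pos

relations, parent_pos = punct_attachments(parse(ROWS))
print(dict(relations))   # {'punct': 2}
print(dict(parent_pos))  # {'NOUN': 1, 'VERB': 1}
```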

22589 (100%) PUNCT nodes are leaves.

8 (0%) PUNCT nodes have one child.

5 (0%) PUNCT nodes have two children.

1 (0%) PUNCT node has three or more children.

The highest child degree of a PUNCT node is 4.
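The child-degree figures above can be reproduced by counting, for every PUNCT node, how many other nodes name it as their head. The sketch below uses hypothetical (ID, UPOS, HEAD) triples and an invented helper name; the final assertions only check that the figures reported on this page are mutually consistent.

```python
from collections import Counter

# Hypothetical (ID, UPOS, HEAD) triples; token 3 is a PUNCT node heading token 4.
NODES = [
    (1, "NOUN", 2),
    (2, "VERB", 0),
    (3, "PUNCT", 2),
    (4, "NOUN", 3),
    (5, "PUNCT", 2),
]

def punct_child_degrees(nodes):
    """Map child degree -> number of PUNCT nodes with that many children."""
    children = Counter(head for _, _, head in nodes)
    return Counter(children[i] for i, upos, _ in nodes if upos == "PUNCT")

degrees = punct_child_degrees(NODES)
print("leaves:", degrees[0], "one child:", degrees[1], "highest degree:", max(degrees))

# Consistency of the figures reported above: the degree classes add up to all
# PUNCT tokens, and the children implied by them (the single node with three or
# more children has the highest degree, 4) add up to 22.
assert 22589 + 8 + 5 + 1 == 22603
assert 8 * 1 + 5 * 2 + 1 * 4 == 22
```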

Children of PUNCT nodes are attached using 10 different relations: grc-dep/ccomp (5; 23% instances), grc-dep/obj (4; 18% instances), grc-dep/nsubj (3; 14% instances), grc-dep/punct (3; 14% instances), grc-dep/advmod (2; 9% instances), grc-dep/csubj (1; 5% instances), grc-dep/nmod (1; 5% instances), grc-dep/obl (1; 5% instances), grc-dep/vocative (1; 5% instances), grc-dep/xcomp (1; 5% instances)

Children of PUNCT nodes belong to 6 different parts of speech: VERB (8; 36% instances), NOUN (6; 27% instances), PRON (3; 14% instances), PUNCT (3; 14% instances), ADJ (1; 5% instances), PART (1; 5% instances)

