Treebank Statistics: UD_Czech: POS Tags: PROPN
There are 15241 PROPN lemmas (25%), 21954 PROPN types (17%) and 84031 PROPN tokens (6%).
Out of 17 observed tags, the rank of PROPN is: 2 in number of lemmas, 3 in number of types and 6 in number of tokens.
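The three counts above can be reproduced from any CoNLL-U file by collecting, per UPOS tag, the set of lemmas, the set of word forms (types) and the number of tokens. The following is a minimal sketch, not the script that generated this page: the sample sentences are invented, the function name `count_by_upos` is hypothetical, and it assumes that types are counted case-sensitively.

```python
from collections import defaultdict

# Toy CoNLL-U fragment, invented for illustration (not taken from UD_Czech).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
SAMPLE = """\
1\tJan\tJan\tPROPN\t_\t_\t2\tnsubj\t_\t_
2\tbydlí\tbydlet\tVERB\t_\t_\t0\troot\t_\t_
3\tv\tv\tADP\t_\t_\t4\tcase\t_\t_
4\tPraze\tPraha\tPROPN\t_\t_\t2\tobl\t_\t_
5\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

1\tPraha\tPraha\tPROPN\t_\t_\t0\troot\t_\t_
"""


def count_by_upos(conllu_text):
    """Return {UPOS: (number of lemmas, number of types, number of tokens)}."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


stats = count_by_upos(SAMPLE)
print(stats["PROPN"])  # (2, 3, 3): lemmas Jan/Praha, types Jan/Praze/Praha
```

Ranking PROPN among the observed tags then amounts to sorting the tags by each of the three counts in turn.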
The 10 most frequent PROPN lemmas: Praha, ČR, Evropa, LN, Jan, Jiří, Německo, Brno, ODS, USA
The 10 most frequent PROPN types: Praha, ČR, Praze, LN, ODS, USA, J, Jiří, Jan, OSN
The 10 most frequent ambiguous lemmas: J (PROPN 422, ADJ 30), M (PROPN 244, NOUN 8, ADJ 1), V (PROPN 210, NUM 23, NOUN 7, ADJ 5), A (PROPN 172, NOUN 8, ADJ 8), York (PROPN 165, ADJ 5), P (PROPN 136, ADJ 4, NOUN 2), čt (NOUN 4, PROPN 2), S (PROPN 116, ADJ 12, NOUN 2), Washington (PROPN 111, ADJ 1), r (NOUN 55, ADV 1, PROPN 1)
The 10 most frequent ambiguous types: J (PROPN 422, ADJ 30, NOUN 3), M (PROPN 244, NOUN 51, X 3, ADJ 1), V (ADP 3736, PROPN 210, NUM 23, NOUN 15, ADJ 6, ADV 2), A (CCONJ 1042, PROPN 172, NOUN 93, ADJ 19, X 4), Rusko (PROPN 163, ADJ 3), Německo (PROPN 144, ADJ 2), P (PROPN 136, NOUN 124, ADJ 17, ADP 1), S (ADP 470, PROPN 117, NOUN 38, ADJ 14, X 3), r (NOUN 433, PROPN 1, ADV 1), F (PROPN 99, NOUN 27, ADJ 10)
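An ambiguous type, as listed above, is a word form that occurs with more than one UPOS tag. A rough sketch of how such a list could be derived from (form, tag) pairs follows; the data is invented, the helper name `ambiguous_forms` is hypothetical, and the ordering used here (by frequency under the tag of interest) is an assumption that may not match the ordering used on this page.

```python
from collections import Counter, defaultdict

# Invented (form, UPOS) observations, not real treebank counts.
TAGGED = (
    [("V", "ADP")] * 3
    + [("V", "PROPN")] * 2
    + [("A", "CCONJ")] * 2
    + [("A", "PROPN")]
    + [("Praha", "PROPN")] * 4
)


def ambiguous_forms(tagged, upos):
    """List forms seen with `upos` and at least one other tag."""
    by_form = defaultdict(Counter)
    for form, tag in tagged:
        by_form[form][tag] += 1
    hits = [
        (form, tags.most_common())
        for form, tags in by_form.items()
        if upos in tags and len(tags) > 1
    ]
    # Assumed ordering: most frequent under the tag of interest first.
    hits.sort(key=lambda item: -by_form[item[0]][upos])
    return hits


result = ambiguous_forms(TAGGED, "PROPN")
for form, tags in result:
    print(form, ", ".join(f"{tag} {n}" for tag, n in tags))
```

The same function applied to (lemma, tag) pairs would give the ambiguous lemmas. "Praha" is left out of the result because it only ever appears as PROPN in the toy data.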
Morphology
The form / lemma ratio of PROPN is 1.440457 (the average of all parts of speech is 2.181792).
The 1st highest number of forms (11) was observed with the lemma “Čech”: ČECH, ČEŠI, Čech, Čecha, Čechem, Čechovi, Čechy, Čechů, Čechům, Češi, Češích.
The 2nd highest number of forms (10) was observed with the lemma “Jan”: JAN, JANA, Jan, Jana, Janem, Janovi, Janové, Janu, Jany, Janů.
The 3rd highest number of forms (10) was observed with the lemma “Němec”: NĚMCI, NĚMCŮ, NĚMEC, Němce, Němcem, Němci, Němcích, Němců, Němcům, Němec.
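The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. The sketch below checks that arithmetic and shows one way the per-lemma form lists could be built; the (form, lemma) pairs are a small invented sample and `forms_per_lemma` is a hypothetical helper name.

```python
from collections import defaultdict

# Counts reported on this page.
PROPN_TYPES = 21954
PROPN_LEMMAS = 15241

ratio = PROPN_TYPES / PROPN_LEMMAS
print(f"{ratio:.6f}")  # 1.440457

# Invented (form, lemma) observations for illustration.
PAIRS = [
    ("Čech", "Čech"),
    ("Čecha", "Čech"),
    ("ČECH", "Čech"),
    ("Čech", "Čech"),
    ("Jan", "Jan"),
    ("Jana", "Jan"),
]


def forms_per_lemma(pairs):
    """Return (lemma, sorted distinct forms), largest form set first."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return sorted(
        ((lemma, sorted(fs)) for lemma, fs in forms.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


ranking = forms_per_lemma(PAIRS)
for lemma, fs in ranking:
    print(lemma, len(fs), fs)
```

Forms are compared case-sensitively here, which matches the lists above where ČECH and Čech are counted as separate forms.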
PROPN occurs with 9 features: NameType (84031; 100% instances), Polarity (84031; 100% instances), Gender (82083; 98% instances), Number (68761; 82% instances), Case (66478; 79% instances), Animacy (48949; 58% instances), Abbr (13042; 16% instances), Foreign (3684; 4% instances), Style (155; 0% instances)
PROPN occurs with 46 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, NameType=Com, NameType=Com,Geo, NameType=Com,Giv, NameType=Com,Giv,Sur, NameType=Com,Nat, NameType=Com,Pro, NameType=Com,Sur, NameType=Geo, NameType=Geo,Giv, NameType=Geo,Giv,Sur, NameType=Geo,Oth, NameType=Geo,Pro, NameType=Geo,Sur, NameType=Giv, NameType=Giv,Nat, NameType=Giv,Oth, NameType=Giv,Pro, NameType=Giv,Pro,Sur, NameType=Giv,Sur, NameType=Nat, NameType=Nat,Sur, NameType=Oth, NameType=Pro, NameType=Pro,Sur, NameType=Sur, Number=Plur, Number=Sing, Polarity=Pos, Style=Arch, Style=Coll, Style=Expr, Style=Rare
PROPN occurs with 613 feature combinations.
The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc|NameType=Sur|Number=Sing|Polarity=Pos (14098 tokens).
Examples: Klaus, Havel, Svoboda, Mečiar, Jelcin, John, Zeman, Němec, Novák, Benda
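The feature statistics above (features with their coverage, distinct feature-value pairs, and whole feature combinations) can all be tallied from the FEATS column in one pass. This is a minimal sketch under stated assumptions: the FEATS strings are an invented sample, `feature_stats` is a hypothetical name, and percentages are rounded to the nearest integer, which agrees with figures such as 13042 of 84031 being shown as 16%.

```python
from collections import Counter

# Invented FEATS values for four PROPN tokens.
FEATS_COLUMN = [
    "Animacy=Anim|Case=Nom|Gender=Masc|NameType=Sur|Number=Sing|Polarity=Pos",
    "Animacy=Anim|Case=Nom|Gender=Masc|NameType=Sur|Number=Sing|Polarity=Pos",
    "Case=Loc|Gender=Fem|NameType=Geo|Number=Sing|Polarity=Pos",
    "Abbr=Yes|NameType=Com|Polarity=Pos",
]


def feature_stats(feats_column):
    """Tally features, feature-value pairs and full combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        if feats == "_":
            continue  # token without any features
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            # A multi-valued pair such as NameType=Com,Geo stays one pair.
            pairs[pair] += 1
    total = len(feats_column)
    coverage = {
        name: (n, round(100 * n / total)) for name, n in features.most_common()
    }
    return coverage, pairs, combos


coverage, pairs, combos = feature_stats(FEATS_COLUMN)
print(coverage["Case"])       # (3, 75)
print(len(pairs))             # 11 distinct feature-value pairs
print(combos.most_common(1))  # the Sur/Nom/Sing combination, seen twice
```

Note that Python's `round` uses banker's rounding for exact halves, so a different tie-breaking rule in the original statistics would only show up in those rare cases.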
Relations
PROPN nodes are attached to their parents using 27 different relations: nmod (26735; 32% instances), nsubj (14713; 18% instances), flat (13595; 16% instances), conj (7679; 9% instances), obl (6762; 8% instances), root (5373; 6% instances), dep (2878; 3% instances), obj (2057; 2% instances), appos (1361; 2% instances), obl:arg (751; 1% instances), iobj (558; 1% instances), orphan (495; 1% instances), flat:foreign (431; 1% instances), nsubj:pass (351; 0% instances), advcl (155; 0% instances), obl:agent (34; 0% instances), xcomp (32; 0% instances), vocative (21; 0% instances), cc (18; 0% instances), ccomp (8; 0% instances), amod (6; 0% instances), case (6; 0% instances), acl (5; 0% instances), parataxis (3; 0% instances), csubj (2; 0% instances), csubj:pass (1; 0% instances), punct (1; 0% instances)
Parents of PROPN nodes belong to 15 different parts of speech: NOUN (26293; 31% instances), PROPN (26160; 31% instances), VERB (22399; 27% instances), no part of speech, i.e. attached to the artificial root (5373; 6% instances), ADJ (2773; 3% instances), NUM (403; 0% instances), ADV (375; 0% instances), DET (109; 0% instances), PRON (100; 0% instances), PART (18; 0% instances), ADP (12; 0% instances), SYM (7; 0% instances), CCONJ (5; 0% instances), INTJ (2; 0% instances), PUNCT (2; 0% instances)
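Both distributions above describe the incoming edge of each PROPN node: its relation label and the part of speech of its head. A minimal sketch of that tally for one sentence is shown here; the sentence is invented, `incoming_stats` is a hypothetical name, and the label "ROOT" for nodes with head 0 is a placeholder chosen for this example.

```python
from collections import Counter

# Invented sentence as (ID, FORM, UPOS, HEAD, DEPREL) tuples.
SENTENCE = [
    (1, "Václav", "PROPN", 3, "nsubj"),
    (2, "Havel", "PROPN", 1, "flat"),
    (3, "navštívil", "VERB", 0, "root"),
    (4, "Prahu", "PROPN", 3, "obj"),
    (5, ".", "PUNCT", 3, "punct"),
]


def incoming_stats(sentence, upos):
    """Count relation labels and parent UPOS tags for nodes tagged `upos`."""
    tag_by_id = {tid: tag for tid, _, tag, _, _ in sentence}
    relations, parent_pos = Counter(), Counter()
    for _tid, _form, tag, head, deprel in sentence:
        if tag != upos:
            continue
        relations[deprel] += 1
        # Head 0 is the artificial root, which has no UPOS tag of its own.
        parent_pos[tag_by_id.get(head, "ROOT")] += 1
    return relations, parent_pos


relations, parent_pos = incoming_stats(SENTENCE, "PROPN")
print(dict(relations))   # {'nsubj': 1, 'flat': 1, 'obj': 1}
print(dict(parent_pos))  # {'VERB': 2, 'PROPN': 1}
```

Because every node attached by `root` has head 0, the count of root relations and the count of parents without a part of speech coincide, as they do in the figures above (5373 in both lists).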
33788 (40%) PROPN nodes are leaves.
27275 (32%) PROPN nodes have one child.
12895 (15%) PROPN nodes have two children.
10073 (12%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 29.
Children of PROPN nodes are attached using 31 different relations: punct (19366; 21% instances), case (16403; 18% instances), flat (13613; 15% instances), nmod (13165; 14% instances), conj (8172; 9% instances), amod (5217; 6% instances), cc (3647; 4% instances), dep (3377; 4% instances), nummod (1629; 2% instances), flat:foreign (1570; 2% instances), acl (1565; 2% instances), appos (1312; 1% instances), advmod:emph (1233; 1% instances), orphan (493; 1% instances), xcomp (392; 0% instances), mark (325; 0% instances), det (113; 0% instances), advmod (84; 0% instances), parataxis (80; 0% instances), cop (67; 0% instances), obl (67; 0% instances), nsubj (62; 0% instances), nummod:gov (40; 0% instances), advcl (9; 0% instances), obj (8; 0% instances), det:numgov (5; 0% instances), aux (3; 0% instances), det:nummod (3; 0% instances), ccomp (2; 0% instances), obl:arg (2; 0% instances), expl:pv (1; 0% instances)
Children of PROPN nodes belong to 16 different parts of speech: PROPN (26160; 28% instances), PUNCT (19368; 21% instances), ADP (16536; 18% instances), NOUN (12888; 14% instances), ADJ (6624; 7% instances), CCONJ (3993; 4% instances), NUM (2594; 3% instances), VERB (1849; 2% instances), ADV (1092; 1% instances), SCONJ (335; 0% instances), DET (257; 0% instances), PART (175; 0% instances), AUX (70; 0% instances), PRON (52; 0% instances), SYM (24; 0% instances), INTJ (8; 0% instances)
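The child-side figures (leaves, one child, two children, three or more, and the relation and part of speech of each child) can be sketched in the same style. The sentence below is invented, `outgoing_stats` is a hypothetical name, and degrees of three or more are pooled into a single bucket to mirror the grouping used on this page.

```python
from collections import Counter

# Invented sentence as (ID, FORM, UPOS, HEAD, DEPREL) tuples.
SENTENCE = [
    (1, "V", "ADP", 2, "case"),
    (2, "Praze", "PROPN", 0, "root"),
    (3, "a", "CCONJ", 4, "cc"),
    (4, "Brně", "PROPN", 2, "conj"),
    (5, ".", "PUNCT", 2, "punct"),
]


def outgoing_stats(sentence, upos):
    """Child-degree histogram plus child relations and child UPOS tags."""
    # Start every node of interest at degree 0 so that leaves are counted.
    degree = {tid: 0 for tid, _, tag, _, _ in sentence if tag == upos}
    child_rel, child_pos = Counter(), Counter()
    for _tid, _form, tag, head, deprel in sentence:
        if head in degree:
            degree[head] += 1
            child_rel[deprel] += 1
            child_pos[tag] += 1
    # Bucket 3 stands for "three or more children".
    histogram = Counter(min(d, 3) for d in degree.values())
    return histogram, child_rel, child_pos


histogram, child_rel, child_pos = outgoing_stats(SENTENCE, "PROPN")
print(dict(histogram))  # {3: 1, 1: 1}
print(dict(child_pos))  # {'ADP': 1, 'PROPN': 1, 'CCONJ': 1, 'PUNCT': 1}
```

As a sanity check on the reported figures, the four degree buckets above sum to the PROPN token count: 33788 + 27275 + 12895 + 10073 = 84031.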