Statistics of DET in UD


This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PUD: POS Tags: `DET`

There are 26 DET lemmas (0%), 117 DET types (2%) and 813 DET tokens (4%). Out of 15 observed tags, the rank of DET is: 8 in number of lemmas, 7 in number of types and 8 in number of tokens.
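Counts of this kind can be derived from a treebank in CoNLL-U format. The following is only a minimal sketch: the sample sentence is hand-made for illustration (it is not taken from UD_Czech-PUD), the helper names `row` and `pos_counts` are hypothetical, and it assumes that a "type" is a distinct word form exactly as written, without case normalization.

```python
def row(*cols):
    """Join the ten CoNLL-U columns of one token with tabs."""
    return "\t".join(cols)

# Hypothetical sample sentence, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = to je ten dům",
    row("1", "to", "ten", "DET", "_", "_", "4", "nsubj", "_", "_"),
    row("2", "je", "být", "AUX", "_", "_", "4", "cop", "_", "_"),
    row("3", "ten", "ten", "DET", "_", "_", "4", "det", "_", "_"),
    row("4", "dům", "dům", "NOUN", "_", "_", "0", "root", "_", "_"),
    "",
])

def pos_counts(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            lemmas.add(cols[2])  # column 3: LEMMA
            types.add(cols[1])   # column 2: FORM (assumed case-sensitive)
            tokens += 1
    return len(lemmas), len(types), tokens

print(pos_counts(SAMPLE, "DET"))
```

On the sample, the two DET tokens "to" and "ten" share the lemma "ten", so the function reports one lemma, two types and two tokens.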

The 10 most frequent DET lemmas: ten, který, jeho, tento, svůj, mnoho, můj, všechen, některý, několik

The 10 most frequent DET types: to, který, jeho, které, která, jejich, své, mnoho, toho, její

The most frequent ambiguous lemmas (at most 10 are listed): ten (DET 180, ADV 2, PRON 1), jenž (PRON 19, DET 8), každý (ADJ 14, DET 2), málo (ADV 6, DET 2)

The most frequent ambiguous types (at most 10 are listed): to (DET 78, ADV 3), jehož (DET 3, PRON 1), každý (ADJ 6, DET 2), málo (ADV 1, DET 1)

to
- DET 78: Co říká a co dělá , to je – no , je to neuvěřitelné .
- ADV 3: Blokáda Villeneuvovy flotily vyústila v invazivní plán vzdát se Britských ostrovů , a to i v důsledku nového vývoje na kontinentu .
jehož
- DET 3: Donald Trump je nabubřelý , arogantní a sebestředný člověk , kterému nezáleží na ostatních a jehož povaha škodí Spojeným státům .
- PRON 1: Cuaron , jehož nejnovějším filmem je oskarová Gravitace , podle dostupných zpráv v době incidentu u natáčení nebyl .
každý
- ADJ 6: Ne každý se nad to dokáže povznést .
- DET 2: Oceněná stavba , navržená Juanem Carlosem Salasem , má sošný vzhled a každý detail má svůj význam .
málo
- ADV 1: Po sedmnácti dnech bylo naživu přes 70 % rostlin ze semínek ze Země , tedy jen o málo více , než úspěšnost těsně přes 66 % u rostlin z vesmírných semínek .
- DET 1: V Pchjongjangu jsem sice viděl pár lidí používat chytré telefony , ale bylo jich opravdu málo .

Morphology

The form / lemma ratio of DET is 4.500000 (the average of all parts of speech is 1.427558).
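The reported ratio is consistent with dividing the number of DET types by the number of DET lemmas given at the top of this page. A short check, assuming that this is how the figure is defined:

```python
# Counts reported above for DET in UD_Czech-PUD.
det_types = 117
det_lemmas = 26

# Assumption: form / lemma ratio = distinct word forms / distinct lemmas.
ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # prints 4.500000
```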

The highest number of forms (13) was observed with the lemma “ten”: ta, ten, to, toho, tom, tomu, tou, ty, té, tím, těch, těm, těmi.

The second-highest number of forms (12) was observed with the lemma “tento”: tato, tento, tohoto, tomto, toto, tuto, tyto, této, tímto, těchto, těmito, těmto.

The third-highest number of forms (11) was observed with the lemma “který”: kterou, která, které, kterého, kterém, kterému, který, kterých, kterým, kterými, kteří.

DET occurs with 13 features: PronType (813; 100% instances), Case (721; 89% instances), Number (683; 84% instances), Gender (627; 77% instances), Poss (226; 28% instances), Number[psor] (139; 17% instances), Person (139; 17% instances), Reflex (87; 11% instances), Animacy (86; 11% instances), Gender[psor] (82; 10% instances), NumType (46; 6% instances), Abbr (13; 2% instances), Polarity (1; 0% instances)

DET occurs with 35 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Fem,Neut, Gender=Masc, Gender=Masc,Neut, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc,Neut, NumType=Card, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Polarity=Pos, Poss=Yes, PronType=Dem, PronType=Emp, PronType=Ind, PronType=Int,Rel, PronType=Neg, PronType=Prs, PronType=Rel, PronType=Tot, Reflex=Yes

DET occurs with 143 feature combinations. The most frequent feature combination is Case=Nom|Gender=Neut|Number=Sing|PronType=Dem (87 tokens). Examples: to, toto, tohle
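A feature combination such as the one above is a `|`-separated list of `Feature=Value` pairs, with `_` standing for an empty feature set in CoNLL-U. A minimal sketch of splitting such a string into a dictionary follows; the function name `parse_feats` is hypothetical, and multi-valued features such as `Gender=Fem,Neut` are deliberately left as a single string.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dictionary."""
    if feats == "_":
        return {}  # "_" marks a token with no features
    # Split on the first "=" only, so bracketed names like Number[psor] stay intact.
    return dict(pair.split("=", 1) for pair in feats.split("|"))

most_frequent = parse_feats("Case=Nom|Gender=Neut|Number=Sing|PronType=Dem")
print(most_frequent["PronType"])  # prints Dem
```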

Relations

DET nodes are attached to their parents using 17 different relations: det (407; 50% instances), nsubj (214; 26% instances), obj (53; 7% instances), obl (38; 5% instances), det:numgov (30; 4% instances), nsubj:pass (17; 2% instances), nmod (13; 2% instances), det:nummod (10; 1% instances), iobj (10; 1% instances), conj (5; 1% instances), fixed (4; 0% instances), acl (3; 0% instances), advcl (3; 0% instances), root (3; 0% instances), amod (1; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances)
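The percentages in this list can be reproduced from the raw counts. The sketch below assumes that each share is the relation count divided by the total number of DET tokens, rounded to the nearest integer; the variable names are illustrative.

```python
from collections import Counter

# Relation counts copied from the list above.
deprel_counts = Counter({
    "det": 407, "nsubj": 214, "obj": 53, "obl": 38, "det:numgov": 30,
    "nsubj:pass": 17, "nmod": 13, "det:nummod": 10, "iobj": 10, "conj": 5,
    "fixed": 4, "acl": 3, "advcl": 3, "root": 3, "amod": 1, "ccomp": 1,
    "csubj": 1,
})

total = sum(deprel_counts.values())  # should equal the 813 DET tokens
shares = {rel: round(100 * n / total) for rel, n in deprel_counts.items()}

print(total)
for rel, n in deprel_counts.most_common(3):
    print(f"{rel} ({n}; {shares[rel]}% instances)")
```

The counts sum to 813, matching the token total reported at the top of the page, and relations with four or fewer instances round down to 0%.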

Parents of DET nodes belong to 11 different parts of speech: NOUN (472; 58% instances), VERB (243; 30% instances), ADJ (65; 8% instances), ADV (7; 1% instances), DET (6; 1% instances), PRON (6; 1% instances), PROPN (4; 0% instances), CCONJ (3; 0% instances), NUM (3; 0% instances), [no POS: artificial root] (3; 0% instances), AUX (1; 0% instances)

702 (86%) DET nodes are leaves.

74 (9%) DET nodes have one child.

28 (3%) DET nodes have two children.

9 (1%) DET nodes have three or more children.

The highest child degree of a DET node is 5.
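The child degree of a node is the number of tokens whose HEAD points to it. A minimal sketch of computing it for DET nodes is shown below; the sentence is hypothetical (not taken from the treebank), and the function name `child_degrees` is illustrative.

```python
from collections import Counter

# (ID, UPOS, HEAD) triples for one hypothetical sentence, "o tom ví každý";
# in a CoNLL-U file these values come from columns 1, 4 and 7.
SENTENCE = [(1, "ADP", 2), (2, "DET", 3), (3, "VERB", 0), (4, "DET", 3)]

def child_degrees(sentence, upos="DET"):
    """Map the ID of each node with the given UPOS to its number of children."""
    n_children = Counter(head for _, _, head in sentence)
    return {node_id: n_children[node_id]
            for node_id, tag, _ in sentence if tag == upos}

degrees = child_degrees(SENTENCE)
leaves = sum(1 for d in degrees.values() if d == 0)
print(degrees, leaves, max(degrees.values()))
```

In the sample, the DET node 2 has one child (the preposition) and the DET node 4 is a leaf. On this page, the four degree buckets (702, 74, 28 and 9) add up to the 813 DET tokens.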

Children of DET nodes are attached using 18 different relations: case (58; 35% instances), acl (36; 22% instances), punct (27; 16% instances), cop (7; 4% instances), nmod (7; 4% instances), advmod (6; 4% instances), nsubj (6; 4% instances), advmod:emph (4; 2% instances), amod (3; 2% instances), cc (2; 1% instances), conj (2; 1% instances), advcl (1; 1% instances), appos (1; 1% instances), det (1; 1% instances), mark (1; 1% instances), obl (1; 1% instances), orphan (1; 1% instances), vocative (1; 1% instances)

Children of DET nodes belong to 13 different parts of speech: ADP (58; 35% instances), PUNCT (27; 16% instances), VERB (27; 16% instances), NOUN (12; 7% instances), ADJ (9; 5% instances), ADV (7; 4% instances), AUX (7; 4% instances), DET (6; 4% instances), PART (3; 2% instances), PRON (3; 2% instances), PROPN (3; 2% instances), CCONJ (2; 1% instances), SCONJ (1; 1% instances)
