This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PUD: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut. Some words have combined values of the feature; 3 combinations have been observed: Fem|Masc, Fem|Neut, Masc|Neut.

This is a layered feature with the following layers: Gender, Gender[psor].
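Combined values and layers both show up in the FEATS column of a CoNLL-U file. The following is a minimal sketch of how such a string could be read, not the code behind this page. The function names `parse_feats` and `split_layer` and the example FEATS string are made up for illustration.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into {feature name: set of values}."""
    parsed = {}
    if feats == "_":  # "_" marks an empty FEATS column
        return parsed
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        # A combined value such as "Fem,Neut" is split into its parts.
        parsed[name] = set(value.split(","))
    return parsed


def split_layer(name):
    """Split a feature name into (base feature, layer); layer is None if absent."""
    base, bracket, rest = name.partition("[")
    return (base, rest.rstrip("]")) if bracket else (base, None)


# Hypothetical FEATS string, made up for illustration.
example = parse_feats("Case=Nom|Gender=Fem,Neut|Gender[psor]=Masc|Number=Sing")
print(sorted(example["Gender"]))      # ['Fem', 'Neut']
print(split_layer("Gender[psor]"))    # ('Gender', 'psor')
print(split_layer("Gender"))          # ('Gender', None)
```

Keeping `Gender` and `Gender[psor]` as separate dictionary keys means a lookup of the plain layer never picks up the possessor layer by accident.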

9481 tokens (51%) have a non-empty value of Gender. 6232 types (82%) occur at least once with a non-empty value of Gender. 4159 lemmas (79%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (4337; 23% instances), ADJ (2244; 12% instances), PROPN (966; 5% instances), VERB (888; 5% instances), DET (627; 3% instances), AUX (251; 1% instances), PRON (106; 1% instances), NUM (62; 0% instances).
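Counts of this kind can be approximated directly from a CoNLL-U file. The sketch below is an assumption about how one might do it, not the script that produced the figures above: `gender_coverage` is a hypothetical helper, and the input sentence is a toy example rather than treebank data.

```python
from collections import Counter


def gender_coverage(conllu_lines):
    """Return (token count, tokens with Gender, per-UPOS Counter of the latter)."""
    tokens = 0
    with_gender = 0
    per_upos = Counter()
    for line in conllu_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comment lines
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        tokens += 1
        feats = cols[5]
        names = set() if feats == "_" else {p.split("=")[0] for p in feats.split("|")}
        # Only the plain layer counts here; "Gender[psor]" is a different name.
        if "Gender" in names:
            with_gender += 1
            per_upos[cols[3]] += 1
    return tokens, with_gender, per_upos


# Toy sentence, made up for illustration (not taken from the treebank).
rows = [
    ("1", "Začala", "začít", "VERB", "_", "Gender=Fem|Number=Sing|Tense=Past", "0", "root", "_", "_"),
    ("2", "nová", "nový", "ADJ", "_", "Case=Nom|Gender=Fem|Number=Sing", "3", "amod", "_", "_"),
    ("3", "éra", "éra", "NOUN", "_", "Case=Nom|Gender=Fem|Number=Sing", "1", "nsubj", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
]
sample = ["# text = Začala nová éra."] + ["\t".join(r) for r in rows]

tokens, with_gender, per_upos = gender_coverage(sample)
print(tokens, with_gender, dict(per_upos))  # 4 3 {'VERB': 1, 'ADJ': 1, 'NOUN': 1}
```

Type and lemma coverage could be obtained the same way by collecting the FORM and LEMMA columns into sets. Whether forms should be case-folded first is a choice this sketch leaves open.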

NOUN

4337 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Polarity=Pos (4329; 100%), Number=Sing (3086; 71%), Animacy=EMPTY (2423; 56%).

NOUN tokens may have the following values of Gender:

Paradigm rok (Gender values: Masc; Neut)
Animacy=Inan|Case=Acc|Number=Sing: rok
Animacy=Inan|Case=Acc|Number=Plur: roky
Animacy=Inan|Case=Gen|Number=Sing: roku, roka
Animacy=Inan|Case=Ins|Number=Sing: rokem
Animacy=Inan|Case=Loc|Number=Sing: roce, roku
Animacy=Inan|Case=Nom|Number=Sing: rok
Case=Acc|Number=Plur|Style=Arch: léta
Case=Gen|Number=Plur: let
Case=Ins|Number=Plur: lety
Case=Loc|Number=Plur: letech

Gender seems to be a lexical feature of NOUN: 100% of lemmas (1856) occur with only one value of Gender.
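The share of lemmas with a single Gender value can be checked with a few lines of code. This is a sketch under stated assumptions: `single_value_share` is a hypothetical helper, and the observation list is a toy input. Only the lemma rok, which the paradigm above shows with both Masc and Neut forms, comes from this page.

```python
from collections import defaultdict


def single_value_share(observations):
    """Count lemmas seen with exactly one Gender value.

    observations: iterable of (lemma, gender value) pairs for one UPOS tag.
    Returns (lemmas with a single value, all lemmas).
    """
    values_by_lemma = defaultdict(set)
    for lemma, gender in observations:
        values_by_lemma[lemma].add(gender)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


# Toy observations, made up for illustration.
observations = [
    ("rok", "Masc"),
    ("rok", "Neut"),
    ("éra", "Fem"),
    ("éra", "Fem"),
    ("město", "Neut"),
]
single, total = single_value_share(observations)
print(f"{single} of {total} lemmas occur with only one value of Gender")
# 2 of 3 lemmas occur with only one value of Gender
```

A handful of lemmas like rok can carry more than one value and still leave the rounded percentage at 100%.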

ADJ

2244 ADJ tokens (98% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Polarity=Pos (2075; 92%), VerbForm=EMPTY (1976; 88%), Voice=EMPTY (1976; 88%), Degree=Pos (1762; 79%), Number=Sing (1473; 66%), Animacy=EMPTY (1368; 61%).

ADJ tokens may have the following values of Gender:

Paradigm známý (Gender values: Fem,Neut; Masc; Fem; Neut)
Animacy=Anim|Case=Nom|Degree=Sup|Number=Sing|Polarity=Pos: nejznámější
Animacy=Inan|Case=Acc|Degree=Pos|Number=Sing|Polarity=Pos: známý
Animacy=Inan|Case=Gen|Degree=Pos|Number=Plur|Polarity=Pos: známých
Animacy=Inan|Case=Nom|Degree=Pos|Number=Sing|Polarity=Pos: známý
Animacy=Inan|Case=Nom|Degree=Pos|Number=Plur|Polarity=Pos: známé
Case=Nom|Degree=Pos|Number=Sing|Polarity=Neg: neznámé
Case=Nom|Degree=Pos|Number=Sing|Polarity=Pos: známá
Number=Sing|Polarity=Pos|Variant=Short: známo
Number=Plur,Sing|Polarity=Pos|Variant=Short: známa

PROPN

966 PROPN tokens (89% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Polarity=Pos (966; 100%), Foreign=EMPTY (897; 93%), Number=Sing (834; 86%).

PROPN tokens may have the following values of Gender:

Paradigm Karel (Gender values: Masc; Fem)
Animacy=Anim|Case=Gen|NameType=Giv: Karla
Animacy=Anim|Case=Nom|NameType=Giv: Karel
Case=Acc|NameType=Sur: Karel
Case=Nom|NameType=Sur: Karel

Gender seems to be a lexical feature of PROPN: 99% of lemmas (681) occur with only one value of Gender.

VERB

888 VERB tokens (51% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Voice=Act (888; 100%), Mood=EMPTY (888; 100%), Person=EMPTY (888; 100%), Tense=Past (887; 100%), VerbForm=Part (886; 100%), Polarity=Pos (852; 96%), Animacy=EMPTY (700; 79%), Number=Sing (498; 56%).

VERB tokens may have the following values of Gender:

Paradigm začít (Gender values: Fem,Masc; Fem,Neut; Masc; Fem; Neut)
Animacy=Anim|Number=Plur: začali
Animacy=Inan|Number=Plur: začaly
Number=Sing: začal; začala; začalo
Number=Plur,Sing: začala

DET

627 DET tokens (77% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Gender[psor]=EMPTY (606; 97%), Number[psor]=EMPTY (582; 93%), Person=EMPTY (582; 93%), Reflex=EMPTY (553; 88%), Animacy=EMPTY (541; 86%), Poss=EMPTY (508; 81%), Number=Sing (502; 80%), Case=Nom (317; 51%).

DET tokens may have the following values of Gender:

Paradigm ten (Gender values: Masc; Masc,Neut; Fem; Neut)
Animacy=Anim|Case=Acc|Number=Plur: ty
Animacy=Inan|Case=Acc|Number=Sing: ten
Animacy=Inan|Case=Nom|Number=Plur: ty
Case=Acc|Number=Sing: to
Case=Acc|Number=Plur: ty
Case=Dat|Number=Sing: tomu
Case=Gen|Number=Sing: toho
Case=Ins|Number=Sing: tím; tou
Case=Loc|Number=Sing: tom
Case=Nom|Number=Sing: ten; ta; to
Case=Nom|Number=Plur: ty

AUX

251 AUX tokens (39% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Voice=Act (251; 100%), VerbForm=Part (251; 100%), Mood=EMPTY (251; 100%), Tense=Past (251; 100%), Person=EMPTY (251; 100%), Polarity=Pos (233; 93%), Number=Sing (149; 59%).

AUX tokens may have the following values of Gender:

Paradigm být (Gender values: Fem,Masc; Fem,Neut; Masc; Fem; Neut)
Animacy=Anim|Number=Plur|Polarity=Pos: byli
Animacy=Inan|Number=Plur|Polarity=Neg: nebyly
Animacy=Inan|Number=Plur|Polarity=Pos: byly
Aspect=Imp|Number=Sing|Polarity=Neg: byl
Number=Sing|Polarity=Neg: nebyl; nebylo
Number=Sing|Polarity=Pos: byl; byla; bylo
Number=Plur,Sing|Polarity=Neg: nebyla
Number=Plur,Sing|Polarity=Pos: byla

PRON

106 PRON tokens (18% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (106; 100%), Number=Sing (88; 83%), Variant=EMPTY (86; 81%), PronType=Prs (71; 67%), Person=3 (71; 67%), PrepCase=EMPTY (65; 61%).

PRON tokens may have the following values of Gender:

Paradigm on (Gender values: Masc; Masc,Neut; Fem; Neut)
Animacy=Anim|Case=Nom|Number=Plur: oni
Case=Acc|Number=Sing|PrepCase=Pre: něj, něho, ho
Case=Acc|Number=Sing: ji; je
Case=Acc|Number=Sing|Variant=Short: ho
Case=Dat|Number=Sing|PrepCase=Pre: němu
Case=Dat|Number=Sing:
Case=Dat|Number=Sing|Variant=Short: mu
Case=Gen|Number=Sing|PrepCase=Pre: něj
Case=Gen|Number=Sing:
Case=Ins|Number=Sing|PrepCase=Pre: ním
Case=Ins|Number=Sing: jím
Case=Loc|Number=Sing|PrepCase=Pre: něm
Case=Nom|Number=Sing: on; ona

NUM

62 NUM tokens (14% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (62; 100%), NumValue=1,2,3 (62; 100%), NumForm=Word (62; 100%), Number=Sing (38; 61%).

NUM tokens may have the following values of Gender:

Paradigm jeden (Gender values: Masc; Masc,Neut; Fem; Neut)
Case=Acc: jeden; jednu; jedno
Case=Gen: jednoho; jedné
Case=Ins: jedním; jednou
Case=Loc: jednom; jedné
Case=Nom: jeden; jedna; jedno

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (1705; 99%), VERB –[nsubj]–> PROPN (142; 65%), ADJ –[aux:pass]–> AUX (127; 78%), PROPN –[flat]–> PROPN (124; 88%), PROPN –[amod]–> ADJ (87; 99%), VERB –[conj]–> VERB (75; 63%), PROPN –[nmod]–> NOUN (60; 87%), ADJ –[nsubj]–> NOUN (54; 78%), ADJ –[conj]–> ADJ (44; 86%), PROPN –[conj]–> PROPN (33; 57%).
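One way to arrive at counts like these is to compare the Gender value of every child with that of its parent. The sketch below rests on two assumptions not confirmed by this page: that only pairs where both nodes have a non-empty Gender are considered, and that agreement means identical value strings. The helper `agreement_counts` and the toy sentence are hypothetical.

```python
from collections import Counter


def agreement_counts(sentence):
    """Count parent-child pairs that agree in Gender, per relation pattern.

    sentence: list of dicts with keys id, head, upos, deprel, gender
    (gender is None when the token has no Gender value).
    Returns (agreeing pairs, all pairs where both nodes have Gender),
    each keyed by (parent UPOS, relation, child UPOS).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, total


# Toy sentence "Začala nová éra.", made up for illustration.
sentence = [
    {"id": 1, "head": 0, "upos": "VERB", "deprel": "root", "gender": "Fem"},
    {"id": 2, "head": 3, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 3, "head": 1, "upos": "NOUN", "deprel": "nsubj", "gender": "Fem"},
    {"id": 4, "head": 1, "upos": "PUNCT", "deprel": "punct", "gender": None},
]
agree, total = agreement_counts(sentence)
for (parent, rel, child), n in total.items():
    print(f"{parent} -[{rel}]-> {child}: {agree[(parent, rel, child)]} of {n}")
# NOUN -[amod]-> ADJ: 1 of 1
# VERB -[nsubj]-> NOUN: 1 of 1
```

Combined values such as Fem,Neut would need a looser test, for example a non-empty intersection of the two value sets, if they should count as agreeing with a single value.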