
This page still pertains to UD version 1.

Gender: gender

Gender is usually a lexical feature of nouns and an inflectional feature of other parts of speech (adjectives, verbs) that mark agreement with nouns. In Bulgarian, gender is grammatical.

There are three genders: masculine (m), feminine (f) and neuter (n).

Masc: masculine gender

Nouns denoting male persons are masculine. Other nouns may also be grammatically masculine, without any relation to sex.

Example: [bg] замък / zamak “castle”

Fem: feminine gender

Nouns denoting female persons are feminine. Other nouns may also be grammatically feminine, without any relation to sex.

Example: [bg] маса / masa “table”

Neut: neuter gender

Neither masculine nor feminine (grammatically).

Example: [bg] дете / dete “child”
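In CoNLL-U files the gender value is stored in the FEATS column as a `Gender=...` pair. The following is a minimal sketch of how it could be read; the helper name `parse_feats` and the FEATS strings for the three example nouns are illustrative assumptions, not copied from the treebank.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Assumed FEATS strings for the three example nouns (illustrative only).
examples = {
    "замък": "Definite=Ind|Gender=Masc|Number=Sing",
    "маса": "Definite=Ind|Gender=Fem|Number=Sing",
    "дете": "Definite=Ind|Gender=Neut|Number=Sing",
}

# Map each word form to its Gender value (None if the feature is absent).
genders = {form: parse_feats(feats).get("Gender") for form, feats in examples.items()}
print(genders)
```

A token without the feature (FEATS equal to `_`) yields an empty dict, so `.get("Gender")` returns `None` for it.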


Treebank Statistics (UD_Bulgarian)

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

52997 tokens (38%) have a non-empty value of Gender. 18352 types (74%) occur at least once with a non-empty value of Gender. 10734 lemmas (76%) occur at least once with a non-empty value of Gender. The feature is used with 9 part-of-speech tags: bg-pos/NOUN (30163; 21% instances), bg-pos/ADJ (8547; 6% instances), bg-pos/PROPN (7546; 5% instances), bg-pos/PRON (2934; 2% instances), bg-pos/VERB (1663; 1% instances), bg-pos/DET (1525; 1% instances), bg-pos/NUM (464; 0% instances), bg-pos/AUX (154; 0% instances), bg-pos/ADP (1; 0% instances).
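Counts of this kind can be reproduced by scanning the FEATS column of a CoNLL-U file. The sketch below assumes a tiny hand-made sentence (hypothetical, not taken from UD_Bulgarian) and a hypothetical helper `read_tokens`; it tallies, per part-of-speech tag, how many tokens carry a non-empty Gender.

```python
from collections import Counter

# Hand-made CoNLL-U rows (hypothetical sample, not from the treebank).
ROWS = [
    ("1", "Той", "той", "PRON", "_", "Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs", "2", "nsubj", "_", "_"),
    ("2", "видя", "видя", "VERB", "_", "Aspect=Perf|Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin|Voice=Act", "0", "root", "_", "_"),
    ("3", "новата", "нов", "ADJ", "_", "Definite=Def|Degree=Pos|Gender=Fem|Number=Sing", "4", "amod", "_", "_"),
    ("4", "маса", "маса", "NOUN", "_", "Definite=Ind|Gender=Fem|Number=Sing", "2", "obj", "_", "_"),
    ("5", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def read_tokens(conllu):
    """Yield (upos, feats_dict) for every ordinary token line."""
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword ranges and empty nodes
            continue
        feats = {} if cols[5] == "_" else dict(f.split("=", 1) for f in cols[5].split("|"))
        yield cols[3], feats


total = Counter()
with_gender = Counter()
for upos, feats in read_tokens(SAMPLE):
    total[upos] += 1
    if "Gender" in feats:
        with_gender[upos] += 1

# Share of tokens with Gender, per tag and overall.
share_by_upos = {upos: with_gender[upos] / total[upos] for upos in total}
overall_share = sum(with_gender.values()) / sum(total.values())
```

On the sample, three of the five tokens carry Gender, so `overall_share` is 0.6.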

NOUN

30163 bg-pos/NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (21455; 71%), Definite=Ind (18466; 61%).
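A co-occurrence list like the one above can be derived by counting every other feature=value pair on tokens that have Gender. This is a minimal sketch over hypothetical FEATS strings for NOUN tokens; the data and variable names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical FEATS strings for four NOUN tokens (not from the treebank).
noun_feats = [
    "Definite=Ind|Gender=Fem|Number=Sing",
    "Definite=Def|Gender=Fem|Number=Sing",
    "Definite=Ind|Gender=Masc|Number=Sing",
    "Definite=Ind|Gender=Neut|Number=Sing",
]

cooc = Counter()       # counts of other feature=value pairs
tokens_with_gender = 0

for feats in noun_feats:
    pairs = dict(p.split("=", 1) for p in feats.split("|"))
    if "Gender" not in pairs:
        continue
    tokens_with_gender += 1
    for name, value in pairs.items():
        if name != "Gender":
            cooc[f"{name}={value}"] += 1

# Most frequent co-occurring values with their share of Gender-bearing tokens.
top = [(fv, n, n / tokens_with_gender) for fv, n in cooc.most_common(2)]
print(top)
```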

NOUN tokens may have the following values of Gender:

Paradigm глава: Masc, Fem
Definite=Def|Number=Sing: Fem главата
Definite=Def|Number=Plur: Fem главите
Definite=Ind|Number=Sing: Masc глава; Fem глава
Definite=Ind|Number=Plur: Fem глави

Gender seems to be a lexical feature of NOUN. 100% of lemmas (5259) occur with only one value of Gender.
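The "lexical feature" test amounts to asking what share of lemmas is ever seen with a single Gender value. A minimal sketch, assuming a hypothetical list of (lemma, Gender) observations rather than real treebank counts:

```python
from collections import defaultdict

# Hypothetical (lemma, Gender) observations for NOUN tokens.
observations = [
    ("маса", "Fem"),
    ("маса", "Fem"),
    ("замък", "Masc"),
    ("дете", "Neut"),
    ("глава", "Fem"),
    ("глава", "Masc"),
]

# Collect the set of Gender values seen for each lemma.
values_by_lemma = defaultdict(set)
for lemma, gender in observations:
    values_by_lemma[lemma].add(gender)

# Lemmas that only ever occur with one Gender value.
single_valued = [lemma for lemma, values in values_by_lemma.items() if len(values) == 1]
share_single = len(single_valued) / len(values_by_lemma)
```

In this sample three of the four lemmas have a single value, so `share_single` is 0.75. A share close to 100% suggests that gender is fixed per lemma rather than inflected.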

ADJ

8547 bg-pos/ADJ tokens (70% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (8547; 100%), Degree=Pos (8086; 95%), Voice=EMPTY (7783; 91%), Aspect=EMPTY (7783; 91%), VerbForm=EMPTY (7783; 91%), Definite=Ind (4678; 55%).

ADJ tokens may have the following values of Gender:

Paradigm нов: Masc, Fem, Neut
Case=Voc|Degree=Pos: Masc Нови
Definite=Def|Degree=Pos: Masc новия, новият; Fem новата; Neut новото
Definite=Def|Degree=Sup: Masc най-новият; Fem най-новата; Neut Най-новото
Definite=Ind|Degree=Pos: Masc нов; Fem нова; Neut ново

PROPN

7546 bg-pos/PROPN tokens (99% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (7432; 98%), Definite=Ind (7281; 96%).

PROPN tokens may have the following values of Gender:

Paradigm бел: Masc, Fem, Neut
Masc БЕЛ; Fem БЕЛ; Neut БЕЛ

Gender seems to be a lexical feature of PROPN. 99% of lemmas (2712) occur with only one value of Gender.

PRON

2934 bg-pos/PRON tokens (32% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Poss=EMPTY (2934; 100%), Number=Sing (2934; 100%), Reflex=EMPTY (2934; 100%), Case=Nom (2028; 69%), PronType=Prs (1609; 55%), Person=3 (1609; 55%).

PRON tokens may have the following values of Gender:

Paradigm аз: Masc, Fem, Neut
Case=Acc: Masc го, него; Fem я, нея; Neut го, него
Case=Dat: Masc му, нему; Fem й; Neut му
Case=Nom: Masc той; Fem тя; Neut то
Fem й

VERB

1663 bg-pos/VERB tokens (11% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (1663; 100%), VerbForm=Part (1663; 100%), Number=Sing (1663; 100%), Definite=Ind (1662; 100%), Mood=EMPTY (1650; 99%), Aspect=Perf (1248; 75%), Voice=Act (1030; 62%), Tense=Past (870; 52%).

VERB tokens may have the following values of Gender:

Paradigm мога: Masc, Fem, Neut
Tense=Imp: Masc можел; Fem можела; Neut можело
Tense=Past: Masc могъл; Fem могла; Neut могло

DET

1525 bg-pos/DET tokens (71% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (1525; 100%), Person=EMPTY (1293; 85%), Poss=EMPTY (1220; 80%), Definite=EMPTY (987; 65%), Case=EMPTY (912; 60%).

DET tokens may have the following values of Gender:

Paradigm този: Masc, Fem, Neut
Case=Nom: Fem тази, тая, онази, тeзи; Neut това, онова, туй
Masc този, тоя, оня, онзи

NUM

464 bg-pos/NUM tokens (25% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (464; 100%), Definite=Ind (403; 87%), Number=Plur (254; 55%).

NUM tokens may have the following values of Gender:

Paradigm два: Masc, Fem, Neut
Definite=Def: Masc двата; Fem двете; Neut двете
Definite=Ind: Masc два, 2; Fem две, 2; Neut две

AUX

154 bg-pos/AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Person=EMPTY (154; 100%), Tense=EMPTY (154; 100%), VerbForm=Part (154; 100%), Number=Sing (154; 100%), Mood=Ind (154; 100%), Voice=Act (154; 100%), Aspect=Imp (154; 100%).

AUX tokens may have the following values of Gender:

Paradigm съм: Masc, Fem, Neut
Masc бил; Fem била; Neut било

ADP

1 bg-pos/ADP token (0% of all ADP tokens) has a non-empty value of Gender.

ADP tokens may have the following values of Gender:

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (7143; 70%), NOUN –[nmod]–> PROPN (1608; 55%), PROPN –[flat]–> PROPN (1404; 95%), NOUN –[det]–> DET (1209; 69%), PROPN –[conj]–> PROPN (391; 72%), ADJ –[nsubj]–> NOUN (248; 73%), ADJ –[conj]–> ADJ (219; 97%), PROPN –[amod]–> ADJ (216; 83%), PROPN –[nmod]–> PROPN (215; 70%), PROPN –[nmod]–> NOUN (208; 67%).
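Agreement counts of this kind can be obtained by comparing the Gender of each token with that of its head. The sketch below uses a hand-made phrase (hypothetical, not from the treebank) and assumes the percentage is taken over all instances of the same parent/relation/child pattern; the page does not state the denominator, so that choice is an assumption.

```python
from collections import Counter

# Hand-made CoNLL-U rows for "тази нова маса на детето" (hypothetical sample).
ROWS = [
    ("1", "тази", "този", "DET", "_", "Gender=Fem|Number=Sing|PronType=Dem", "3", "det", "_", "_"),
    ("2", "нова", "нов", "ADJ", "_", "Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing", "3", "amod", "_", "_"),
    ("3", "маса", "маса", "NOUN", "_", "Definite=Ind|Gender=Fem|Number=Sing", "0", "root", "_", "_"),
    ("4", "на", "на", "ADP", "_", "_", "5", "case", "_", "_"),
    ("5", "детето", "дете", "NOUN", "_", "Definite=Def|Gender=Neut|Number=Sing", "3", "nmod", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def parse(conllu):
    """Return {token_id: {upos, gender, head, deprel}} for one sentence."""
    tokens = {}
    for line in conllu.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        feats = {} if cols[5] == "_" else dict(f.split("=", 1) for f in cols[5].split("|"))
        tokens[int(cols[0])] = {
            "upos": cols[3],
            "gender": feats.get("Gender"),
            "head": int(cols[6]),
            "deprel": cols[7],
        }
    return tokens


tokens = parse(SAMPLE)
agree = Counter()  # pattern -> instances where parent and child share a Gender value
total = Counter()  # pattern -> all instances

for tok in tokens.values():
    parent = tokens.get(tok["head"])
    if parent is None:  # head 0 is the artificial root
        continue
    pattern = (parent["upos"], tok["deprel"], tok["upos"])
    total[pattern] += 1
    if tok["gender"] is not None and tok["gender"] == parent["gender"]:
        agree[pattern] += 1

for pattern, n in agree.most_common():
    print(pattern, n, n / total[pattern])
```

In the sample, the determiner and the adjective agree with their head noun, while the nmod dependent (neuter) does not agree with its feminine head.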

