
This page still pertains to UD version 1.

Gender: gender

In Latvian, gender is marked on nouns (NOUN, PROPN), adjectives (ADJ), some numerals (NUM), some participles (VERB with VerbForm=Part or VerbForm=Conv) and some pronouns (PRON and DET).
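The part-of-speech restriction described above can be sketched as a small consistency check. This is only an illustration: the function names `gender_allowed` and `unexpected_gender` are hypothetical, and the rule set is read directly off the sentence above rather than taken from any official validator.

```python
# UPOS tags that may carry Gender according to the description above.
# "May" is deliberate: only some numerals and pronouns actually have it.
NOMINAL_TAGS = {"NOUN", "PROPN", "ADJ", "NUM", "PRON", "DET"}
# Verb forms for which a VERB token may carry Gender.
GENDERED_VERB_FORMS = {"Part", "Conv"}


def gender_allowed(upos, feats):
    """Return True if a token with this UPOS and feature dict may have Gender."""
    if upos in NOMINAL_TAGS:
        return True
    if upos == "VERB":
        return feats.get("VerbForm") in GENDERED_VERB_FORMS
    return False


def unexpected_gender(tokens):
    """Return the (form, upos, feats) tokens that have Gender where it is not expected."""
    return [
        (form, upos, feats)
        for form, upos, feats in tokens
        if "Gender" in feats and not gender_allowed(upos, feats)
    ]


# Toy tokens invented for illustration; the last one is deliberately inconsistent.
tokens = [
    ("skaista", "ADJ", {"Gender": "Fem", "Number": "Sing"}),
    ("bijusi", "VERB", {"Gender": "Fem", "VerbForm": "Part"}),
    ("ir", "VERB", {"VerbForm": "Fin", "Gender": "Fem"}),
]
flagged = unexpected_gender(tokens)
```

Here `flagged` contains only the finite verb, since finite verbs are not in the list of categories given above.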

Values used: Masc, Fem

Values not used: Com, Neut

Traditional dictionaries describe some nouns, e.g., bende, aitasgalva, as having common gender, but in the Latvian Treebank these are annotated as masculine or feminine based on context and/or on the singular dative form, which differs depending on the contextual gender.


Treebank Statistics (UD_Latvian)

This feature is universal. It occurs with 2 different values: Fem, Masc.

19615 tokens (44%) have a non-empty value of Gender. 8802 types (74%) occur at least once with a non-empty value of Gender. 4804 lemmas (69%) occur at least once with a non-empty value of Gender. The feature is used with 7 part-of-speech tags: lv-pos/NOUN (11525; 26% instances), lv-pos/ADJ (2244; 5% instances), lv-pos/PRON (1692; 4% instances), lv-pos/PROPN (1466; 3% instances), lv-pos/VERB (1400; 3% instances), lv-pos/DET (1008; 2% instances), lv-pos/NUM (280; 1% instances).
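Counts of this kind can be reproduced from a CoNLL-U file with a few lines of standard-library Python. The following is a minimal sketch, not the script that generated the figures above; the helper names and the three-token sample are invented for illustration.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def iter_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each ordinary token line."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_counts_by_upos(conllu_text):
    """Return (total tokens per UPOS, tokens with non-empty Gender per UPOS)."""
    total = Counter()
    with_gender = Counter()
    for _form, _lemma, upos, feats in iter_tokens(conllu_text):
        total[upos] += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return total, with_gender


# Invented sample, not taken from UD_Latvian.
SAMPLE = "\n".join([
    "# text = skaista aita .",
    "1\tskaista\tskaists\tADJ\t_\tCase=Nom|Degree=Pos|Gender=Fem|Number=Sing\t2\tamod\t_\t_",
    "2\taita\taita\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_",
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])

total, with_gender = gender_counts_by_upos(SAMPLE)
for upos in sorted(total):
    print(upos, with_gender[upos], "of", total[upos])
```

To run it on a real treebank, read the file contents into a string and pass that string to `gender_counts_by_upos`.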

NOUN

11525 lv-pos/NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (8098; 70%).

NOUN tokens may have the following values of Gender:

Paradigm slepkava (Masc, Fem):
Number=Sing: slepkava
Number=Plur: slepkavas

Gender seems to be a lexical feature of NOUN. 100% of lemmas (2769) occur with only one value of Gender.
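The "lexical feature" test amounts to asking what share of lemmas is ever seen with only one Gender value. A minimal sketch of that computation follows; the function name `single_gender_share` and the (lemma, Gender) pairs are invented for illustration and do not reflect actual treebank counts.

```python
from collections import defaultdict


def single_gender_share(lemma_gender_pairs):
    """Return the fraction of lemmas that occur with exactly one Gender value."""
    values = defaultdict(set)
    for lemma, gender in lemma_gender_pairs:
        values[lemma].add(gender)
    if not values:
        return 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single / len(values)


# Toy observations: one (lemma, Gender) pair per token occurrence.
pairs = [
    ("aita", "Fem"),
    ("aita", "Fem"),
    ("galds", "Masc"),
    ("slepkava", "Masc"),
    ("slepkava", "Fem"),
]
share = single_gender_share(pairs)
```

With this toy input, two of the three lemmas have a single value, so `share` is two thirds. A share close to 1 is what motivates calling the feature lexical for a given part of speech.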

ADJ

2244 lv-pos/ADJ tokens (87% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: NumType=EMPTY (2174; 97%), Degree=Pos (2047; 91%), Number=Sing (1520; 68%).

ADJ tokens may have the following values of Gender:

Paradigm skaists (Masc, Fem):
Case=Acc|Degree=Pos|Number=Plur: skaistus
Case=Nom|Degree=Pos|Number=Sing: skaists (Masc), skaista (Fem)
Case=Nom|Degree=Pos|Number=Plur: skaisti
Case=Nom|Degree=Cmp|Number=Sing: skaistāks

Gender seems to be a lexical feature of ADJ. 99% of lemmas (787) occur with only one value of Gender.

PRON

1692 lv-pos/PRON tokens (56% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1389; 82%), Person=EMPTY (1044; 62%), Case=Nom (896; 53%).

PRON tokens may have the following values of Gender:

Paradigm viņa (Masc, Fem):
Case=Acc|Number=Sing: viņu
Case=Acc|Number=Plur: viņas
Case=Dat|Number=Sing: viņai
Case=Dat|Number=Plur: viņām
Case=Gen|Number=Sing: viņa (Masc), viņas (Fem)
Case=Loc|Number=Sing: viņā
Case=Nom|Number=Sing: viņa
Case=Nom|Number=Plur: viņas

PROPN

1466 lv-pos/PROPN tokens (81% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Abbr=EMPTY (1466; 100%), Number=Sing (1418; 97%).

PROPN tokens may have the following values of Gender:

Paradigm Seisums (Masc, Fem):
Case=Gen: Seisuma
Case=Nom: Seisuma

Gender seems to be a lexical feature of PROPN. 99% of lemmas (508) occur with only one value of Gender.

VERB

1400 lv-pos/VERB tokens (19% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (1400; 100%), Evident=EMPTY (1400; 100%), Polarity=EMPTY (1400; 100%), Mood=EMPTY (1400; 100%), VerbForm=Part (1381; 99%), Degree=Pos (1380; 99%), Reflex=EMPTY (1292; 92%), Voice=EMPTY (1176; 84%), Aspect=Perf (1175; 84%), Tense=Past (1175; 84%).

VERB tokens may have the following values of Gender:

Paradigm būt (Masc, Fem):
Aspect=Imp|Case=Gen|Definite=Def|Degree=Pos|Number=Plur|Tense=Pres|VerbForm=Part|Voice=Pass: esošo
Aspect=Imp|Case=Loc|Definite=Ind|Degree=Pos|Number=Plur|Tense=Pres|VerbForm=Part|Voice=Pass: esošās
Aspect=Imp|Case=Nom|Definite=Def|Degree=Pos|Number=Sing|Tense=Pres|VerbForm=Part|Voice=Pass: esošais (Masc), esošā (Fem)
Aspect=Perf|Case=Acc|Definite=Def|Degree=Pos|Number=Sing|Tense=Past|VerbForm=Part: bijušo
Aspect=Perf|Case=Dat|Definite=Def|Degree=Pos|Number=Plur|Tense=Past|VerbForm=Part: bijušajām
Aspect=Perf|Case=Nom|Definite=Ind|Degree=Pos|Number=Sing|Tense=Past|VerbForm=Part: bijis (Masc), bijusi (Fem)
Aspect=Perf|Case=Nom|Definite=Ind|Degree=Pos|Number=Plur|Tense=Past|VerbForm=Part: bijuši (Masc), bijušas (Fem)
Case=Nom|Definite=Ind|Number=Plur|VerbForm=Conv|Voice=Pass: būdami

DET

1008 lv-pos/DET tokens (100% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Poss=EMPTY (815; 81%), Number=Sing (693; 69%).

DET tokens may have the following values of Gender:

Paradigm sava (Masc, Fem):
Case=Acc|Number=Sing: savu (Masc), savu (Fem)
Case=Acc|Number=Plur: savas
Case=Dat|Number=Sing: savai
Case=Dat|Number=Plur: savām
Case=Gen|Number=Sing: savas
Case=Gen|Number=Plur: savu
Case=Loc|Number=Sing: savā
Case=Loc|Number=Plur: savās
Case=Nom|Number=Sing: sava
Case=Nom|Number=Plur: savas

NUM

280 lv-pos/NUM tokens (44% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (279; 100%), Number=Sing (147; 53%).

NUM tokens may have the following values of Gender:

Paradigm viens (Masc, Fem):
Case=Acc: vienu (Masc), vienu (Fem)
Case=Dat: vienam
Case=Gen: viena
Case=Loc: vienā
Case=Nom: viens

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[nmod]–> NOUN (1717; 52%), NOUN –[amod]–> ADJ (1576; 85%), NOUN –[det]–> DET (916; 95%), NOUN –[conj]–> NOUN (435; 62%), NOUN –[amod]–> VERB (397; 95%), PROPN –[flat:name]–> PROPN (228; 97%), NOUN –[nummod]–> NUM (181; 55%), VERB –[nsubj:pass]–> NOUN (179; 97%), NOUN –[acl]–> NOUN (167; 54%), PROPN –[nmod]–> NOUN (145; 78%).
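Agreement counts like these can be approximated by walking every dependency edge and comparing the Gender of parent and child. The sketch below is one possible reading, not the script behind the figures above: it assumes the percentage is taken over edges where both nodes carry Gender, and the helper names and the two sample sentences are invented.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict ('_' means no features)."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_sentences(conllu_text):
    """Yield one {token_id: token_dict} mapping per sentence."""
    sentence = {}
    for line in conllu_text.splitlines() + [""]:
        if not line.strip():
            if sentence:
                yield sentence
                sentence = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence[int(cols[0])] = {
            "upos": cols[3],
            "feats": parse_feats(cols[5]),
            "head": int(cols[6]) if cols[6].isdigit() else 0,
            "deprel": cols[7],
        }


def gender_agreement(conllu_text):
    """Count edges per (parent UPOS, deprel, child UPOS): agreeing vs. both gendered."""
    agree, both = Counter(), Counter()
    for sentence in read_sentences(conllu_text):
        for child in sentence.values():
            parent = sentence.get(child["head"])  # None for the root (head 0)
            if parent is None:
                continue
            child_gender = child["feats"].get("Gender")
            parent_gender = parent["feats"].get("Gender")
            if child_gender is None or parent_gender is None:
                continue
            key = (parent["upos"], child["deprel"], child["upos"])
            both[key] += 1
            if child_gender == parent_gender:
                agree[key] += 1
    return agree, both


# Invented sample sentences, not taken from UD_Latvian.
SAMPLE = "\n".join([
    "1\tsava\tsavs\tDET\t_\tCase=Nom|Gender=Fem|Number=Sing\t3\tdet\t_\t_",
    "2\tskaista\tskaists\tADJ\t_\tCase=Nom|Gender=Fem|Number=Sing\t3\tamod\t_\t_",
    "3\taita\taita\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_",
    "",
    "1\tgalda\tgalds\tNOUN\t_\tCase=Gen|Gender=Masc|Number=Sing\t2\tnmod\t_\t_",
    "2\tkāja\tkāja\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_",
])

agree, both = gender_agreement(SAMPLE)
for key in sorted(both):
    print(key, agree[key], "of", both[key])
```

In the toy data the det and amod edges agree while the nmod edge does not, which matches the general pattern in the list above: modifiers that inflect for gender agree with their head far more often than a noun that modifies another noun.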

