Gender: gender
Latvian features gender for nouns (NOUN, PROPN), adjectives (ADJ), some numerals (NUM), some participles (VERB with VerbForm=Part or VerbForm=Conv) and some pronouns (PRON and DET).
Values used:
Masc (masculine gender)
Fem (feminine gender)
Values not used:
Neut (neuter gender)
Com (common gender)
Traditional dictionaries describe some nouns, e.g., bende and aitasgalva, as having common gender, but in the Latvian Treebank these are annotated as masculine or feminine based on context and/or on the singular dative form, which differs depending on the contextual gender.
Treebank Statistics (UD_Latvian)
This feature is universal. It occurs with 2 different values: Fem, Masc.
19615 tokens (44%) have a non-empty value of Gender.
8802 types (74%) occur at least once with a non-empty value of Gender.
4804 lemmas (69%) occur at least once with a non-empty value of Gender.
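Counts like the token coverage above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the helper names `parse_feats` and `gender_coverage` are hypothetical, and `SAMPLE` is a hand-written fragment for illustration, not a sentence from UD_Latvian.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_coverage(conllu_text):
    """Return (tokens_with_gender, total_tokens) for a CoNLL-U string."""
    with_gender = total = 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        total += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender += 1
    return with_gender, total


# Hand-written illustrative fragment (assumption: not taken from the treebank).
SAMPLE = (
    "1\tšī\tšis\tDET\t_\tCase=Nom|Gender=Fem|Number=Sing\t3\tdet\t_\t_\n"
    "2\tlielā\tliels\tADJ\t_\tCase=Nom|Degree=Pos|Gender=Fem|Number=Sing\t3\tamod\t_\t_\n"
    "3\tvalsts\tvalsts\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t4\tnsubj\t_\t_\n"
    "4\tir\tbūt\tVERB\t_\tMood=Ind|Tense=Pres\t0\troot\t_\t_\n"
    "5\t.\t.\tPUNCT\t_\t_\t4\tpunct\t_\t_\n"
)

with_gender, total = gender_coverage(SAMPLE)
print(f"{with_gender} of {total} tokens ({with_gender / total:.0%}) have Gender")
```

Type and lemma coverage could be computed the same way by collecting the distinct values of the FORM and LEMMA columns instead of counting tokens.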
The feature is used with 7 part-of-speech tags: lv-pos/NOUN (11525; 26% instances), lv-pos/ADJ (2244; 5% instances), lv-pos/PRON (1692; 4% instances), lv-pos/PROPN (1466; 3% instances), lv-pos/VERB (1400; 3% instances), lv-pos/DET (1008; 2% instances), lv-pos/NUM (280; 1% instances).
NOUN
11525 lv-pos/NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (8098; 70%).
NOUN tokens may have the following values of Gender:
Fem (6032; 52% of non-empty Gender): valsts, izglītības, finanšu, pasaules, bibliotēkas, meitene, padomes, dienas, grāmatas, pašvaldības
Masc (5493; 48% of non-empty Gender): gada, gadā, darba, atkritumu, darbinieku, latu, nagu, laikā, skaitu, gadu
EMPTY (10): kino, Sanī, foto, Cukini, alibi, auto
Paradigm slepkava | Masc | Fem |
---|---|---|
Number=Sing | slepkava | |
Number=Plur | slepkavas |
Gender seems to be a lexical feature of NOUN. 100% of lemmas (2769) occur with only one value of Gender.
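The lemma-level observation above can be checked with a sketch along the following lines. It assumes the percentage is taken over lemmas that occur at least once with a non-empty Gender; the function names are hypothetical and the rows in `SAMPLE` are hand-written for illustration (the two slepkava rows mimic a noun annotated with either gender depending on context).

```python
from collections import defaultdict


def gender_values_by_lemma(conllu_text, upos):
    """Map each lemma with the given UPOS tag to the set of Gender values it occurs with."""
    values = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges and empty nodes, and other parts of speech.
        if "-" in cols[0] or "." in cols[0] or cols[3] != upos:
            continue
        for pair in cols[5].split("|"):
            if pair.startswith("Gender="):
                values[cols[2]].add(pair.split("=", 1)[1])
    return values


def share_single_valued(values):
    """Fraction of lemmas that occur with exactly one Gender value."""
    if not values:
        return 0.0
    single = sum(1 for genders in values.values() if len(genders) == 1)
    return single / len(values)


# Hand-written illustrative rows (assumption: not taken from the treebank).
SAMPLE = (
    "1\tvalsts\tvalsts\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_\n"
    "2\tgada\tgads\tNOUN\t_\tCase=Gen|Gender=Masc|Number=Sing\t1\tnmod\t_\t_\n"
    "3\tslepkavam\tslepkava\tNOUN\t_\tCase=Dat|Gender=Masc|Number=Sing\t1\tnmod\t_\t_\n"
    "4\tslepkavai\tslepkava\tNOUN\t_\tCase=Dat|Gender=Fem|Number=Sing\t1\tnmod\t_\t_\n"
)

noun_values = gender_values_by_lemma(SAMPLE, "NOUN")
for lemma, genders in sorted(noun_values.items()):
    print(lemma, sorted(genders))
print(f"single-valued lemmas: {share_single_valued(noun_values):.0%}")
```

A lemma that shows up with both values, as slepkava does in this toy sample, is what keeps the share below 100%.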
ADJ
2244 lv-pos/ADJ tokens (87% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: NumType=EMPTY (2174; 97%), Degree=Pos (2047; 91%), Number=Sing (1520; 68%).
ADJ tokens may have the following values of Gender:
Fem (1066; 48% of non-empty Gender): nacionālās, liela, otrās, vispārējās, jaunu, lielas, sabiedrisko, jaunās, lielu, Nacionālā
Masc (1178; 52% of non-empty Gender): pedagoģisko, iespējams, liels, nepieciešams, galvenais, nepieciešamo, dažādu, lielā, jauna, vienīgais
EMPTY (325): 2012., 1., 2., 3., 2010., 2011., 2007., 4., 2013., 7.
Paradigm skaists | Masc | Fem |
---|---|---|
Case=Acc|Degree=Pos|Number=Plur | skaistus | |
Case=Nom|Degree=Pos|Number=Sing | skaists | skaista |
Case=Nom|Degree=Pos|Number=Plur | skaisti | |
Case=Nom|Degree=Cmp|Number=Sing | skaistāks |
Gender seems to be a lexical feature of ADJ. 99% of lemmas (787) occur with only one value of Gender.
PRON
1692 lv-pos/PRON tokens (56% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1389; 82%), Person=EMPTY (1044; 62%), Case=Nom (896; 53%).
PRON tokens may have the following values of Gender:
Fem (614; 36% of non-empty Gender): viņa, tā, viņai, viņas, tās, viņu, to, pati, tajā, kuras
Masc (1078; 64% of non-empty Gender): tas, viņš, to, viņi, tam, viņa, viņu, viss, viņam, kurš
EMPTY (1343): es, kas, man, ko, mēs, tu, mani, mums, tev, sevi
Paradigm viņa | Masc | Fem |
---|---|---|
Case=Acc|Number=Sing | viņu | |
Case=Acc|Number=Plur | viņas | |
Case=Dat|Number=Sing | viņai | |
Case=Dat|Number=Plur | viņām | |
Case=Gen|Number=Sing | viņa | viņas |
Case=Loc|Number=Sing | viņā | |
Case=Nom|Number=Sing | viņa | |
Case=Nom|Number=Plur | viņas |
PROPN
1466 lv-pos/PROPN tokens (81% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Abbr=EMPTY (1466; 100%), Number=Sing (1418; 97%).
PROPN tokens may have the following values of Gender:
Fem (859; 59% of non-empty Gender): Latvijas, Sofija, Eiropas, Latvijā, Rīgas, Jelgavas, Sanī, Sofijai, Rīga, Rīgā
Masc (607; 41% of non-empty Gender): Andris, Vilks, Ģirts, Kuplais, Latvenergo, Grūtupa, Ziedonis, Jānis, Andra, Balvis
EMPTY (336): SIA, ZAAO, Pillar, IKP, UNESCO, DUS, LETA, ST, EEK, ES
Paradigm Seisums | Masc | Fem |
---|---|---|
Case=Gen | Seisuma | |
Case=Nom | Seisuma |
Gender seems to be a lexical feature of PROPN. 99% of lemmas (508) occur with only one value of Gender.
VERB
1400 lv-pos/VERB tokens (19% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (1400; 100%), Evident=EMPTY (1400; 100%), Polarity=EMPTY (1400; 100%), Mood=EMPTY (1400; 100%), VerbForm=Part (1381; 99%), Degree=Pos (1380; 99%), Reflex=EMPTY (1292; 92%), Voice=EMPTY (1176; 84%), Aspect=Perf (1175; 84%), Tense=Past (1175; 84%).
VERB tokens may have the following values of Gender:
Fem (586; 42% of non-empty Gender): cēlusies, kļuvusi, bijusi, izlietojusi, radusies, sākusi, adresēta, atklājusi, celta, iesaistītajām
Masc (814; 58% of non-empty Gender): bijis, saistīts, rakstīts, pagājušā, ziņots, dibināts, pagājušajā, saņēmis, zināms, Protams
EMPTY (5799): ir, bija, nav, var, būs, nebija, varētu, būtu, esmu, tiek
Paradigm būt | Masc | Fem |
---|---|---|
Aspect=Imp|Case=Gen|Definite=Def|Degree=Pos|Number=Plur|Tense=Pres|VerbForm=Part|Voice=Pass | esošo | |
Aspect=Imp|Case=Loc|Definite=Ind|Degree=Pos|Number=Plur|Tense=Pres|VerbForm=Part|Voice=Pass | esošās | |
Aspect=Imp|Case=Nom|Definite=Def|Degree=Pos|Number=Sing|Tense=Pres|VerbForm=Part|Voice=Pass | esošais | esošā |
Aspect=Perf|Case=Acc|Definite=Def|Degree=Pos|Number=Sing|Tense=Past|VerbForm=Part | bijušo | |
Aspect=Perf|Case=Dat|Definite=Def|Degree=Pos|Number=Plur|Tense=Past|VerbForm=Part | bijušajām | |
Aspect=Perf|Case=Nom|Definite=Ind|Degree=Pos|Number=Sing|Tense=Past|VerbForm=Part | bijis | bijusi |
Aspect=Perf|Case=Nom|Definite=Ind|Degree=Pos|Number=Plur|Tense=Past|VerbForm=Part | bijuši | bijušas |
Case=Nom|Definite=Ind|Number=Plur|VerbForm=Conv|Voice=Pass | būdami |
DET
1008 lv-pos/DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Poss=EMPTY (815; 81%), Number=Sing (693; 69%).
DET tokens may have the following values of Gender:
Fem (473; 47% of non-empty Gender): šīs, šo, savu, šī, savā, savas, tās, visas, šai, šajā
Masc (535; 53% of non-empty Gender): savu, šo, to, mans, tā, kādu, šis, visu, kāda, šajā
EMPTY (3): Mani, kā, mana
Paradigm sava | Masc | Fem |
---|---|---|
Case=Acc|Number=Sing | savu | savu |
Case=Acc|Number=Plur | savas | |
Case=Dat|Number=Sing | savai | |
Case=Dat|Number=Plur | savām | |
Case=Gen|Number=Sing | savas | |
Case=Gen|Number=Plur | savu | |
Case=Loc|Number=Sing | savā | |
Case=Loc|Number=Plur | savās | |
Case=Nom|Number=Sing | sava | |
Case=Nom|Number=Plur | savas |
NUM
280 lv-pos/NUM tokens (44% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (279; 100%), Number=Sing (147; 53%).
NUM tokens may have the following values of Gender:
Fem (125; 45% of non-empty Gender): viena, divas, trīs, vienu, vienā, vienai, divām, otra, otras, vienas
Masc (155; 55% of non-empty Gender): viens, vienu, trīs, vienam, divi, vienā, divus, tūkstošiem, piecos, diviem
EMPTY (360): desmit, 25, 3, 2, 2007, 4, 80, 987, 5, 20
Paradigm viens | Masc | Fem |
---|---|---|
Case=Acc | vienu | vienu |
Case=Dat | vienam | |
Case=Gen | viena | |
Case=Loc | vienā | |
Case=Nom | viens |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[nmod]–> NOUN (1717; 52%),
NOUN –[amod]–> ADJ (1576; 85%),
NOUN –[det]–> DET (916; 95%),
NOUN –[conj]–> NOUN (435; 62%),
NOUN –[amod]–> VERB (397; 95%),
PROPN –[flat:name]–> PROPN (228; 97%),
NOUN –[nummod]–> NUM (181; 55%),
VERB –[nsubj:pass]–> NOUN (179; 97%),
NOUN –[acl]–> NOUN (167; 54%),
PROPN –[nmod]–> NOUN (145; 78%).
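An agreement table of this kind could be approximated as sketched below. This is an illustration under stated assumptions, not the procedure used to produce the figures above: it assumes that only parent-child pairs in which both nodes carry Gender are compared, and that the percentage is the share of such pairs that agree. The function names are hypothetical and `SAMPLE` is a hand-written two-sentence fragment, not treebank data.

```python
from collections import Counter


def read_sentences(conllu_text):
    """Yield one dict per sentence: token id -> (upos, gender, head, deprel)."""
    sentence = {}
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges and empty nodes.
        if "-" in cols[0] or "." in cols[0]:
            continue
        gender = None
        for pair in cols[5].split("|"):
            if pair.startswith("Gender="):
                gender = pair.split("=", 1)[1]
        sentence[cols[0]] = (cols[3], gender, cols[6], cols[7])
    if sentence:
        yield sentence


def gender_agreement(conllu_text):
    """Count agreeing and total pairs per (parent UPOS, relation, child UPOS)."""
    agree, total = Counter(), Counter()
    for sentence in read_sentences(conllu_text):
        for upos, gender, head, deprel in sentence.values():
            parent = sentence.get(head)  # None for the root (head "0")
            if parent is None or gender is None or parent[1] is None:
                continue
            key = (parent[0], deprel, upos)
            total[key] += 1
            if gender == parent[1]:
                agree[key] += 1
    return agree, total


# Hand-written illustrative fragment (assumption: not taken from the treebank).
SAMPLE = (
    "1\tšī\tšis\tDET\t_\tCase=Nom|Gender=Fem|Number=Sing\t3\tdet\t_\t_\n"
    "2\tlielā\tliels\tADJ\t_\tCase=Nom|Degree=Pos|Gender=Fem|Number=Sing\t3\tamod\t_\t_\n"
    "3\tvalsts\tvalsts\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t4\tnsubj\t_\t_\n"
    "4\tir\tbūt\tVERB\t_\tMood=Ind|Tense=Pres\t0\troot\t_\t_\n"
    "\n"
    "1\tgada\tgads\tNOUN\t_\tCase=Gen|Gender=Masc|Number=Sing\t2\tnmod\t_\t_\n"
    "2\tgrāmatas\tgrāmata\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Plur\t0\troot\t_\t_\n"
)

agree, total = gender_agreement(SAMPLE)
for (parent, rel, child), n in total.most_common():
    print(f"{parent} -[{rel}]-> {child}: {agree[(parent, rel, child)]} of {n} agree")
```

In the toy sample the det and amod pairs agree while the nmod pair does not, which mirrors the pattern in the list: modifiers that inflect for agreement score high, whereas noun-noun relations such as nmod hover near chance.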