Gender: gender
Latvian features gender for nouns (NOUN, PROPN), adjectives (ADJ), some numerals (NUM), some participles (VERB with VerbForm=Part or VerbForm=Conv) and some pronouns (PRON and DET).
Values used:
Masc (masculine gender)
Fem (feminine gender)
Values not used:
Neut (neuter gender)
Com (common gender)
Traditional dictionaries describe some nouns, e.g., bende and aitasgalva, as having common gender, but in the Latvian Treebank these are annotated as masculine or feminine based on context and/or the singular dative form, which differs depending on the contextual gender.
Treebank Statistics (UD_Latvian)
This feature is universal.
It occurs with 2 different values: Fem, Masc.
19615 tokens (44%) have a non-empty value of Gender.
8802 types (74%) occur at least once with a non-empty value of Gender.
4804 lemmas (69%) occur at least once with a non-empty value of Gender.
The feature is used with 7 part-of-speech tags: lv-pos/NOUN (11525; 26% of instances), lv-pos/ADJ (2244; 5% of instances), lv-pos/PRON (1692; 4% of instances), lv-pos/PROPN (1466; 3% of instances), lv-pos/VERB (1400; 3% of instances), lv-pos/DET (1008; 2% of instances), lv-pos/NUM (280; 1% of instances).
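Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function names `parse_feats` and `gender_counts_by_upos` are hypothetical, and the three-token sample is hand-made for illustration rather than taken from the treebank.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts_by_upos(conllu_text):
    """Count tokens per UPOS tag: all tokens, and tokens with a Gender value."""
    total, with_gender = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos, feats = cols[3], parse_feats(cols[5])
        total[upos] += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return total, with_gender


# Hand-made sample (hypothetical annotation, for illustration only).
SAMPLE = "\n".join([
    "# text = meitene lasa grāmatu",
    "1\tmeitene\tmeitene\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tnsubj\t_\t_",
    "2\tlasa\tlasīt\tVERB\t_\tMood=Ind|Tense=Pres|VerbForm=Fin\t0\troot\t_\t_",
    "3\tgrāmatu\tgrāmata\tNOUN\t_\tCase=Acc|Gender=Fem|Number=Sing\t2\tobj\t_\t_",
])

total, with_gender = gender_counts_by_upos(SAMPLE)
for upos in sorted(total):
    print(upos, with_gender[upos], "of", total[upos])
```

Run on a full treebank file, the two counters give the per-tag numerators and denominators behind figures such as "100% of all NOUN tokens".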
NOUN
11525 lv-pos/NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (8098; 70%).
NOUN tokens may have the following values of Gender:
Fem (6032; 52% of non-empty Gender): valsts, izglītības, finanšu, pasaules, bibliotēkas, meitene, padomes, dienas, grāmatas, pašvaldības
Masc (5493; 48% of non-empty Gender): gada, gadā, darba, atkritumu, darbinieku, latu, nagu, laikā, skaitu, gadu
EMPTY (10): kino, Sanī, foto, Cukini, alibi, auto
| Paradigm slepkava | Masc | Fem |
|---|---|---|
| Number=Sing | slepkava | |
| Number=Plur | slepkavas | |
Gender seems to be a lexical feature of NOUN. 100% of lemmas (2769) occur with only one value of Gender.
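The "lexical feature" check amounts to asking, for each lemma, how many distinct Gender values it is seen with. A minimal sketch of that computation follows, assuming the tokens have already been reduced to (lemma, gender) pairs for a single part of speech. The function name `single_gender_share` and the sample pairs are hypothetical.

```python
from collections import defaultdict


def single_gender_share(tokens):
    """tokens: iterable of (lemma, gender) pairs; gender is 'Masc', 'Fem' or None.

    Returns (number of lemmas seen with a Gender value,
             share of those lemmas seen with exactly one value).
    """
    values = defaultdict(set)
    for lemma, gender in tokens:
        if gender is not None:
            values[lemma].add(gender)
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return len(values), single / len(values)


# Hypothetical pairs for illustration; 'kino' carries no Gender value.
TOKENS = [
    ("valsts", "Fem"),
    ("valsts", "Fem"),
    ("gads", "Masc"),
    ("slepkava", "Masc"),
    ("slepkava", "Fem"),
    ("kino", None),
]

lemma_count, share = single_gender_share(TOKENS)
print(lemma_count, round(share, 2))
```

A share close to 1 suggests that Gender is fixed per lemma, while lemmas annotated by context (as described for common-gender nouns) lower it.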
ADJ
2244 lv-pos/ADJ tokens (87% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: NumType=EMPTY (2174; 97%), Degree=Pos (2047; 91%), Number=Sing (1520; 68%).
ADJ tokens may have the following values of Gender:
Fem (1066; 48% of non-empty Gender): nacionālās, liela, otrās, vispārējās, jaunu, lielas, sabiedrisko, jaunās, lielu, Nacionālā
Masc (1178; 52% of non-empty Gender): pedagoģisko, iespējams, liels, nepieciešams, galvenais, nepieciešamo, dažādu, lielā, jauna, vienīgais
EMPTY (325): 2012., 1., 2., 3., 2010., 2011., 2007., 4., 2013., 7.
| Paradigm skaists | Masc | Fem |
|---|---|---|
| Case=Acc\|Degree=Pos\|Number=Plur | skaistus | |
| Case=Nom\|Degree=Pos\|Number=Sing | skaists | skaista |
| Case=Nom\|Degree=Pos\|Number=Plur | skaisti | |
| Case=Nom\|Degree=Cmp\|Number=Sing | skaistāks | |
Gender seems to be a lexical feature of ADJ. 99% of lemmas (787) occur with only one value of Gender.
PRON
1692 lv-pos/PRON tokens (56% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1389; 82%), Person=EMPTY (1044; 62%), Case=Nom (896; 53%).
PRON tokens may have the following values of Gender:
Fem (614; 36% of non-empty Gender): viņa, tā, viņai, viņas, tās, viņu, to, pati, tajā, kuras
Masc (1078; 64% of non-empty Gender): tas, viņš, to, viņi, tam, viņa, viņu, viss, viņam, kurš
EMPTY (1343): es, kas, man, ko, mēs, tu, mani, mums, tev, sevi
| Paradigm viņa | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | viņu | |
| Case=Acc\|Number=Plur | viņas | |
| Case=Dat\|Number=Sing | viņai | |
| Case=Dat\|Number=Plur | viņām | |
| Case=Gen\|Number=Sing | viņa | viņas |
| Case=Loc\|Number=Sing | viņā | |
| Case=Nom\|Number=Sing | viņa | |
| Case=Nom\|Number=Plur | viņas | |
PROPN
1466 lv-pos/PROPN tokens (81% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Abbr=EMPTY (1466; 100%), Number=Sing (1418; 97%).
PROPN tokens may have the following values of Gender:
Fem (859; 59% of non-empty Gender): Latvijas, Sofija, Eiropas, Latvijā, Rīgas, Jelgavas, Sanī, Sofijai, Rīga, Rīgā
Masc (607; 41% of non-empty Gender): Andris, Vilks, Ģirts, Kuplais, Latvenergo, Grūtupa, Ziedonis, Jānis, Andra, Balvis
EMPTY (336): SIA, ZAAO, Pillar, IKP, UNESCO, DUS, LETA, ST, EEK, ES
| Paradigm Seisums | Masc | Fem |
|---|---|---|
| Case=Gen | Seisuma | |
| Case=Nom | Seisuma | |
Gender seems to be a lexical feature of PROPN. 99% of lemmas (508) occur with only one value of Gender.
VERB
1400 lv-pos/VERB tokens (19% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (1400; 100%), Evident=EMPTY (1400; 100%), Polarity=EMPTY (1400; 100%), Mood=EMPTY (1400; 100%), VerbForm=Part (1381; 99%), Degree=Pos (1380; 99%), Reflex=EMPTY (1292; 92%), Voice=EMPTY (1176; 84%), Aspect=Perf (1175; 84%), Tense=Past (1175; 84%).
VERB tokens may have the following values of Gender:
Fem (586; 42% of non-empty Gender): cēlusies, kļuvusi, bijusi, izlietojusi, radusies, sākusi, adresēta, atklājusi, celta, iesaistītajām
Masc (814; 58% of non-empty Gender): bijis, saistīts, rakstīts, pagājušā, ziņots, dibināts, pagājušajā, saņēmis, zināms, Protams
EMPTY (5799): ir, bija, nav, var, būs, nebija, varētu, būtu, esmu, tiek
| Paradigm būt | Masc | Fem |
|---|---|---|
| Aspect=Imp\|Case=Gen\|Definite=Def\|Degree=Pos\|Number=Plur\|Tense=Pres\|VerbForm=Part\|Voice=Pass | esošo | |
| Aspect=Imp\|Case=Loc\|Definite=Ind\|Degree=Pos\|Number=Plur\|Tense=Pres\|VerbForm=Part\|Voice=Pass | esošās | |
| Aspect=Imp\|Case=Nom\|Definite=Def\|Degree=Pos\|Number=Sing\|Tense=Pres\|VerbForm=Part\|Voice=Pass | esošais | esošā |
| Aspect=Perf\|Case=Acc\|Definite=Def\|Degree=Pos\|Number=Sing\|Tense=Past\|VerbForm=Part | bijušo | |
| Aspect=Perf\|Case=Dat\|Definite=Def\|Degree=Pos\|Number=Plur\|Tense=Past\|VerbForm=Part | bijušajām | |
| Aspect=Perf\|Case=Nom\|Definite=Ind\|Degree=Pos\|Number=Sing\|Tense=Past\|VerbForm=Part | bijis | bijusi |
| Aspect=Perf\|Case=Nom\|Definite=Ind\|Degree=Pos\|Number=Plur\|Tense=Past\|VerbForm=Part | bijuši | bijušas |
| Case=Nom\|Definite=Ind\|Number=Plur\|VerbForm=Conv\|Voice=Pass | būdami | |
DET
1008 lv-pos/DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Poss=EMPTY (815; 81%), Number=Sing (693; 69%).
DET tokens may have the following values of Gender:
Fem (473; 47% of non-empty Gender): šīs, šo, savu, šī, savā, savas, tās, visas, šai, šajā
Masc (535; 53% of non-empty Gender): savu, šo, to, mans, tā, kādu, šis, visu, kāda, šajā
EMPTY (3): Mani, kā, mana
| Paradigm sava | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | savu | savu |
| Case=Acc\|Number=Plur | savas | |
| Case=Dat\|Number=Sing | savai | |
| Case=Dat\|Number=Plur | savām | |
| Case=Gen\|Number=Sing | savas | |
| Case=Gen\|Number=Plur | savu | |
| Case=Loc\|Number=Sing | savā | |
| Case=Loc\|Number=Plur | savās | |
| Case=Nom\|Number=Sing | sava | |
| Case=Nom\|Number=Plur | savas | |
NUM
280 lv-pos/NUM tokens (44% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (279; 100%), Number=Sing (147; 53%).
NUM tokens may have the following values of Gender:
Fem (125; 45% of non-empty Gender): viena, divas, trīs, vienu, vienā, vienai, divām, otra, otras, vienas
Masc (155; 55% of non-empty Gender): viens, vienu, trīs, vienam, divi, vienā, divus, tūkstošiem, piecos, diviem
EMPTY (360): desmit, 25, 3, 2, 2007, 4, 80, 987, 5, 20
| Paradigm viens | Masc | Fem |
|---|---|---|
| Case=Acc | vienu | vienu |
| Case=Dat | vienam | |
| Case=Gen | viena | |
| Case=Loc | vienā | |
| Case=Nom | viens | |
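Paradigm tables like the ones in this section can be assembled by grouping the attested forms of one lemma by their remaining features and spreading them over one column per Gender value. The sketch below assumes tokens are given as (form, lemma, features) triples. The function name `paradigm` is hypothetical, and the sample reuses only forms of viens listed above.

```python
from collections import defaultdict


def paradigm(tokens, lemma):
    """Group forms of one lemma by their non-Gender features.

    tokens: iterable of (form, lemma, feats) where feats is a dict.
    Returns {feature string without Gender: {gender value: form}}.
    """
    table = defaultdict(dict)
    for form, tok_lemma, feats in tokens:
        if tok_lemma != lemma or "Gender" not in feats:
            continue
        rest = "|".join(
            f"{key}={value}" for key, value in sorted(feats.items()) if key != "Gender"
        )
        # Keep the first attested form for each (features, gender) cell.
        table[rest].setdefault(feats["Gender"], form.lower())
    return dict(table)


# Sample triples for illustration (simplified feature sets).
SAMPLE = [
    ("viens", "viens", {"Case": "Nom", "Gender": "Masc"}),
    ("vienu", "viens", {"Case": "Acc", "Gender": "Masc"}),
    ("vienu", "viens", {"Case": "Acc", "Gender": "Fem"}),
    ("desmit", "desmit", {}),
]

rows = paradigm(SAMPLE, "viens")
print("| Paradigm viens | Masc | Fem |")
print("|---|---|---|")
for feats in sorted(rows):
    cell = rows[feats]
    print(f"| {feats} | {cell.get('Masc', '')} | {cell.get('Fem', '')} |")
```

An empty cell then simply means that no token with that feature combination and Gender value was attested, not that the form does not exist in the language.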
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[nmod]–> NOUN (1717; 52%),
NOUN –[amod]–> ADJ (1576; 85%),
NOUN –[det]–> DET (916; 95%),
NOUN –[conj]–> NOUN (435; 62%),
NOUN –[amod]–> VERB (397; 95%),
PROPN –[flat:name]–> PROPN (228; 97%),
NOUN –[nummod]–> NUM (181; 55%),
VERB –[nsubj:pass]–> NOUN (179; 97%),
NOUN –[acl]–> NOUN (167; 54%),
PROPN –[nmod]–> NOUN (145; 78%).
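Agreement figures of this kind can be computed per (parent tag, relation, child tag) triple. The sketch below is one possible reading: this section does not state which denominator the percentages use, so the code assumes it is the number of parent-child pairs in which both nodes carry a Gender value. The function name `gender_agreement`, the token dictionaries, and the sample sentence are hypothetical.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count Gender agreement between each child and its parent.

    sentence: list of dicts with keys id, head, upos, deprel, gender
    (gender is 'Masc', 'Fem' or None).
    Returns {(parent_upos, deprel, child_upos): (agreeing, comparable)}.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agreeing, comparable = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        comparable[key] += 1
        if child["gender"] == parent["gender"]:
            agreeing[key] += 1
    return {key: (agreeing[key], comparable[key]) for key in comparable}


# Hand-made sentence fragment (hypothetical annotation, for illustration only).
SENTENCE = [
    {"id": 1, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "NOUN", "deprel": "nmod", "gender": "Masc"},
]

for key, (agree, total) in sorted(gender_agreement(SENTENCE).items()):
    print(key, agree, "of", total)
```

Summing these per-sentence counts over the whole treebank would give relation-level totals comparable to the list above, subject to the denominator assumption.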