This page still pertains to UD version 1.

Gender: gender

Gender is usually a lexical feature of nouns and an inflectional feature of other parts of speech (pronouns, adjectives, determiners, numerals, verbs) that mark agreement with nouns.
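In CoNLL-U files the feature is stored in the FEATS column as a `Gender=...` pair among other `Name=Value` pairs separated by `|`. The sketch below shows one way to read such a string and compare the Gender of a noun with that of its determiner. The helper name `parse_feats` and the two FEATS strings are illustrative assumptions, not taken from the treebank.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":  # "_" marks an empty FEATS column
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS values for a noun and the determiner attached to it.
noun_feats = parse_feats("Gender=Masc|Number=Sing")
det_feats = parse_feats("Definite=Def|Gender=Masc|Number=Sing|PronType=Art")

# Agreement means both words carry Gender and the values are identical.
agrees = (
    "Gender" in noun_feats
    and noun_feats["Gender"] == det_feats.get("Gender")
)
```

A word without the feature simply has no `Gender` key in the resulting dictionary, which keeps "no value" distinct from the language-specific value `Unsp`.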

Masc: masculine gender

Nouns denoting male persons are masculine. Other nouns may also be grammatically masculine, without any relation to sex.

Examples

Fem: feminine gender

Nouns denoting female persons are feminine. Other nouns may also be grammatically feminine, without any relation to sex.

Examples

Unsp: unspecified

Unsp is used to tag words that can be either masculine or feminine when the context is not sufficient to determine their gender.

Examples


Treebank Statistics (UD_Portuguese)

This feature is universal, but the value Unsp is language-specific. It occurs with 3 different values: Fem, Masc, Unsp.

103964 tokens (48%) have a non-empty value of Gender. 18368 types (73%) occur at least once with a non-empty value of Gender. 14111 lemmas (80%) occur at least once with a non-empty value of Gender. The feature is used with 14 part-of-speech tags: pt-pos/NOUN (38465; 18% instances), pt-pos/DET (32137; 15% instances), pt-pos/PROPN (11173; 5% instances), pt-pos/ADJ (11139; 5% instances), pt-pos/PRON (6900; 3% instances), pt-pos/VERB (3435; 2% instances), pt-pos/SYM (384; 0% instances), pt-pos/NUM (137; 0% instances), pt-pos/ADP (131; 0% instances), pt-pos/ADV (25; 0% instances), pt-pos/AUX (22; 0% instances), pt-pos/X (12; 0% instances), pt-pos/INTJ (3; 0% instances), pt-pos/PART (1; 0% instances).
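Per-tag counts of this kind can be recomputed from a CoNLL-U file by inspecting the UPOS and FEATS columns of each token line. The following is a minimal sketch under stated assumptions, not the script that generated this page: the function name `gender_coverage` and the three-token sample sentence are hypothetical.

```python
from collections import Counter


def gender_coverage(conllu_text):
    """Return {UPOS: (tokens_with_gender, all_tokens)} for CoNLL-U text."""
    total = Counter()
    with_gender = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences, "#" lines are comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        upos, feats = cols[3], cols[5]
        total[upos] += 1
        if any(f.startswith("Gender=") for f in feats.split("|")):
            with_gender[upos] += 1
    return {pos: (with_gender[pos], total[pos]) for pos in total}


# Hypothetical annotation of "A casa caiu", for illustration only.
ROWS = [
    ["1", "A", "o", "DET", "_",
     "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"],
    ["2", "casa", "casa", "NOUN", "_", "Gender=Fem|Number=Sing",
     "3", "nsubj", "_", "_"],
    ["3", "caiu", "cair", "VERB", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
     "0", "root", "_", "_"],
]
SAMPLE = "# text = A casa caiu\n" + "\n".join("\t".join(r) for r in ROWS)

coverage = gender_coverage(SAMPLE)
```

Dividing the first number of each pair by the second gives the share of tokens of that tag carrying the feature.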

NOUN

38465 pt-pos/NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (27296; 71%).

NOUN tokens may have the following values of Gender:

Paradigm presidente, by Gender value (Masc, Fem, Unsp)
Number=Sing: presidente (Masc); presidente (Fem); Presidente (Unsp)
Number=Plur: presidentes

Gender seems to be a lexical feature of NOUN. 97% of lemmas (6234) occur with only one value of Gender.
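A lemma-level figure like this one can be checked by collecting the set of Gender values observed with each lemma. The sketch below assumes that (lemma, Gender) pairs have already been extracted for a single part of speech; the name `single_gender_share` and the sample pairs are hypothetical and not drawn from the treebank.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (lemmas_with_one_gender_value, all_lemmas) for (lemma, gender) pairs."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hypothetical observations: "presidente" occurs with two Gender values.
pairs = [
    ("casa", "Fem"),
    ("casa", "Fem"),
    ("livro", "Masc"),
    ("presidente", "Masc"),
    ("presidente", "Fem"),
]
single, total = single_gender_share(pairs)
share = single / total
```

A share close to 1 suggests the feature is fixed per lemma (lexical) rather than varying across the forms of a lemma (inflectional).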

DET

32137 pt-pos/DET tokens (96% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (28441; 88%), Number=Sing (25307; 79%), Definite=Def (25256; 79%).

DET tokens may have the following values of Gender:

Paradigm muito, by Gender value (Masc, Fem, Unsp)
Number=Sing: muito, mais (Masc); mais, muita (Fem)
Number=Plur: muitos, mais (Masc); muitas, mais (Fem); mais (Unsp)
Number=Unsp: mais

PROPN

11173 pt-pos/PROPN tokens (62% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (10778; 96%).

PROPN tokens may have the following values of Gender:

Paradigm São, by Gender value (Masc, Fem, Unsp)
_: SÃO
Number=Sing: São, SÃO (Masc); São (Fem); São (Unsp)

Gender seems to be a lexical feature of PROPN. 94% of lemmas (4288) occur with only one value of Gender.

ADJ

11139 pt-pos/ADJ tokens (99% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (7947; 71%).

ADJ tokens may have the following values of Gender:

Paradigm grande, by Gender value (Masc, Fem, Unsp)
Number=Sing: maior, grande, máximo (Masc); maior, grande, máxima (Fem)
Number=Plur: grandes, maiores, máximos (Masc); grandes, maiores (Fem); grandes (Unsp)

PRON

6900 pt-pos/PRON tokens (100% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (4848; 70%), Person=EMPTY (4610; 67%), Case=EMPTY (4483; 65%).

PRON tokens may have the following values of Gender:

Paradigm que, by Gender value (Masc, Fem, Unsp)
Definite=Def|Number=Sing|PronType=Art: que
Number=Sing|PronType=Dem: que
Number=Sing|PronType=Ind: que; que
Number=Sing|PronType=Int: que (Masc); que (Fem); que (Unsp)
Number=Sing|PronType=Rel: que (Masc); que, qu (Fem); que (Unsp)
Number=Plur|PronType=Ind: que
Number=Plur|PronType=Int: que; que
Number=Plur|PronType=Rel: que (Masc); que (Fem); que (Unsp)
Number=Unsp|PronType=Ind: que
Number=Unsp|PronType=Rel: que
PronType=Rel: que

VERB

3435 pt-pos/VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Tense=EMPTY (3435; 100%), Person=EMPTY (3434; 100%), Mood=EMPTY (3434; 100%), VerbForm=Part (3432; 100%), Number=Sing (2254; 66%).

VERB tokens may have the following values of Gender:

Paradigm ter, by Gender value (Masc, Fem)
Number=Sing: tido (Masc)
Number=Sing|Voice=Pass: tido (Masc); tida (Fem)
Number=Plur: tidas (Fem)

SYM

384 pt-pos/SYM tokens (99% of all SYM tokens) have a non-empty value of Gender.

The most frequent other feature values with which SYM and Gender co-occurred: Number=Plur (376; 98%).

SYM tokens may have the following values of Gender:

NUM

137 pt-pos/NUM tokens (3% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Mult (128; 93%).

NUM tokens may have the following values of Gender:

ADP

131 pt-pos/ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.

ADP tokens may have the following values of Gender:

ADV

25 pt-pos/ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADV and Gender co-occurred: Polarity=Neg (22; 88%).

ADV tokens may have the following values of Gender:

AUX

22 pt-pos/AUX tokens (0% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: VerbForm=Part (22; 100%), Tense=EMPTY (22; 100%), Person=EMPTY (22; 100%), Mood=EMPTY (22; 100%), Number=Sing (16; 73%).

AUX tokens may have the following values of Gender:

Gender seems to be a lexical feature of AUX. 100% of lemmas (13) occur with only one value of Gender.

X

12 pt-pos/X tokens (9% of all X tokens) have a non-empty value of Gender.

The most frequent other feature values with which X and Gender co-occurred: Number=Sing (11; 92%).

X tokens may have the following values of Gender:

Gender seems to be a lexical feature of X. 100% of lemmas (11) occur with only one value of Gender.

INTJ

3 pt-pos/INTJ tokens (7% of all INTJ tokens) have a non-empty value of Gender.

INTJ tokens may have the following values of Gender:

PART

1 pt-pos/PART token (20% of all PART tokens) has a non-empty value of Gender.

The most frequent other feature values with which PART and Gender co-occurred: Number=Sing (1; 100%).

PART tokens may have the following values of Gender:

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (25282; 95%), NOUN –[amod]–> ADJ (7987; 99%), PROPN –[det]–> DET (4201; 80%), NOUN –[acl]–> VERB (1497; 66%), NOUN –[conj]–> NOUN (1250; 61%), NOUN –[appos]–> PROPN (1129; 88%), PROPN –[conj]–> PROPN (753; 76%), VERB –[nsubj:pass]–> NOUN (618; 95%), ADJ –[det]–> DET (503; 92%), ADJ –[nsubj]–> NOUN (440; 96%).
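Agreement counts per relation can be approximated by walking each dependency tree and comparing the Gender of every child with that of its parent. The sketch below assumes that the percentage is computed over parent-child pairs in which both nodes carry Gender; that reading, the function names, and the sample tree are assumptions, not a description of how this page was generated.

```python
from collections import Counter


def gender_of(feats):
    """Return the Gender value in a FEATS string, or None if absent."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return pair.split("=", 1)[1]
    return None


def agreement_by_relation(sentence):
    """sentence: list of (id, upos, feats, head, deprel) tuples for one tree.

    Returns {(parent_upos, deprel, child_upos): (agreeing_pairs, pairs_with_gender)}.
    """
    by_id = {tok[0]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for _tid, upos, feats, head, deprel in sentence:
        parent = by_id.get(head)
        if parent is None:
            continue  # the root has head 0 and therefore no parent token
        g_child, g_parent = gender_of(feats), gender_of(parent[2])
        if g_child is None or g_parent is None:
            continue  # only pairs where both nodes have Gender are compared
        key = (parent[1], deprel, upos)
        both[key] += 1
        if g_child == g_parent:
            agree[key] += 1
    return {key: (agree[key], both[key]) for key in both}


# Hypothetical tree for "a casa grande", for illustration only.
tree = [
    (1, "DET", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", 2, "det"),
    (2, "NOUN", "Gender=Fem|Number=Sing", 0, "root"),
    (3, "ADJ", "Gender=Fem|Number=Sing", 2, "amod"),
]
stats = agreement_by_relation(tree)
```

Summing these per-sentence results over a whole treebank and sorting by the first number would give a ranking comparable in shape to the list above.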


Treebank Statistics (UD_Portuguese-BR)

This feature is universal. It occurs with 2 different values: Fem, Masc.

18999 tokens (7%) have a non-empty value of Gender. 4 types (0%) occur at least once with a non-empty value of Gender. 1 lemma (20%) occurs at least once with a non-empty value of Gender. The feature is used with 1 part-of-speech tag: pt-pos/DET (18999; 7% instances).

DET

18999 pt-pos/DET tokens (45% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (18999; 100%), Definite=Def (18999; 100%), Number=Sing (15867; 84%).

DET tokens may have the following values of Gender:

Paradigm o, by Gender value (Masc, Fem)
Number=Sing: o (Masc); a (Fem)
Number=Plur: os (Masc); as (Fem)

Gender in other languages: [am] [ar] [bg] [bxr] [ca] [ckb] [cop] [cs] [cu] [da] [de] [el] [en] [es] [et] [eu] [fa] [fo] [fr] [ga] [gl] [got] [grc] [he] [hi] [hr] [hu] [id] [it] [ja] [kk] [kmr] [ko] [la] [lv] [mr] [nl] [no] [pl] [pt] [ro] [ru] [sa] [sk] [sla] [sl] [so] [sr] [sv] [swl] [ta] [tr] [u] [ug] [uk] [ur] [vi] [yue] [zh]