
This page still pertains to UD version 1.

NumType: numeral type


Some languages (especially Slavic ones) have a complex system of numerals. For example, in the school grammar of Czech, "numeral" is a main part of speech: it includes almost every word that involves counting, and it has various subtypes. It also includes interrogative, relative, indefinite and demonstrative words referring to numbers (words like kolik / how many, tolik / so many, několik / some, a few), so such a word may have a non-empty value of PronType at the same time. (In English, these words are called quantifiers and are considered a subgroup of determiners.)

In this respect, Bulgarian behaves like Czech.

From a syntactic point of view, some numeral types behave like adjectives and some behave like adverbs. We tag them u-pos/ADJ and u-pos/ADV, respectively. Thus the NumType feature applies to several different parts of speech.

Card: cardinal number or corresponding interrogative / relative / indefinite / demonstrative word

Note that in some Indo-European languages there is a fuzzy borderline between numerals and nouns for thousand, million and billion.

Examples

Ord: ordinal number or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adjective.

Examples

Mult: multiplicative numeral or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adverb.

Examples

Frac: fraction

This is a subtype of cardinal numbers, occasionally distinguished in corpora. It may denote a whole fraction or just the denominator of a fraction. In Bulgarian, the numerator is a cardinal numeral and the denominator is an ordinal numeral.

Examples


Treebank Statistics (UD_Bulgarian)

This feature is universal. It occurs with 2 different values: Card, Ord.

3259 tokens (2%) have a non-empty value of NumType. 688 types (3%) occur at least once with a non-empty value of NumType. 523 lemmas (4%) occur at least once with a non-empty value of NumType. The feature is used with 3 part-of-speech tags: bg-pos/NUM (1883; 1% of instances), bg-pos/ADJ (818; 1% of instances), bg-pos/ADV (558; 0% of instances).
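Counts like the ones above can be reproduced from a treebank file. The following is a minimal sketch, assuming the standard 10-column, tab-separated CoNLL-U layout (UPOS in column 4, FEATS in column 6). The function names and the toy rows in `SAMPLE` are hypothetical and are not taken from UD_Bulgarian.

```python
from collections import Counter


def parse_feats(feats: str) -> dict[str, str]:
    """Turn a FEATS string such as 'Definite=Ind|NumType=Card' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def numtype_by_upos(conllu_text: str) -> Counter:
    """Count tokens with a non-empty NumType, keyed by (UPOS, NumType value)."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        feats = parse_feats(cols[5])
        if "NumType" in feats:
            counts[(cols[3], feats["NumType"])] += 1
    return counts


# Hypothetical toy rows, for illustration only.
SAMPLE = "\n".join(
    "\t".join(row)
    for row in [
        ["1", "2000", "2000", "NUM", "_", "Definite=Ind|Number=Plur|NumType=Card", "2", "nummod", "_", "_"],
        ["2", "items", "item", "NOUN", "_", "Number=Plur", "0", "root", "_", "_"],
        ["3", "first", "first", "ADJ", "_", "Degree=Pos|NumType=Ord", "2", "amod", "_", "_"],
    ]
)

for (upos, value), n in sorted(numtype_by_upos(SAMPLE).items()):
    print(upos, value, n)
```

Keying the counter on the (UPOS, value) pair keeps the per-tag and per-value totals recoverable from a single pass over the file.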

NUM

1883 bg-pos/NUM tokens (100% of all NUM tokens) have a non-empty value of NumType.

The most frequent other feature values with which NUM and NumType co-occurred: Definite=Ind (1748; 93%), Number=Plur (1664; 88%), Gender=EMPTY (1419; 75%).

NUM tokens may have the following values of NumType:

Paradigm of 2000 (value columns: Card, Ord)
_: 2000
Definite=Ind|Number=Plur: 2000

NumType seems to be a lexical feature of NUM. 100% of lemmas (384) occur with only one value of NumType.
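The "lexical feature" observation amounts to checking how many lemmas ever occur with more than one value. A minimal sketch of that check follows, assuming the input is a list of (lemma, NumType value) pairs already extracted from the treebank. The function name and the toy pairs are hypothetical.

```python
from collections import defaultdict


def single_value_share(pairs):
    """Return the share of lemmas that occur with exactly one feature value."""
    values_by_lemma = defaultdict(set)
    for lemma, value in pairs:
        values_by_lemma[lemma].add(value)
    if not values_by_lemma:
        return 0.0
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single / len(values_by_lemma)


# Hypothetical toy data: lemma "a" is seen with two values, lemma "b" with one.
mixed = [("a", "Card"), ("a", "Ord"), ("b", "Card")]
print(single_value_share(mixed))
```

A share of 1.0 corresponds to the "100% of lemmas" wording used on this page.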

ADJ

818 bg-pos/ADJ tokens (7% of all ADJ tokens) have a non-empty value of NumType.

The most frequent other feature values with which ADJ and NumType co-occurred: Voice=EMPTY (818; 100%), VerbForm=EMPTY (818; 100%), Aspect=EMPTY (818; 100%), Degree=Pos (818; 100%), Number=Sing (764; 93%), Definite=Ind (601; 73%).

ADJ tokens may have the following values of NumType:

NumType seems to be a lexical feature of ADJ. 100% of lemmas (157) occur with only one value of NumType.

ADV

558 bg-pos/ADV tokens (9% of all ADV tokens) have a non-empty value of NumType.

The most frequent other feature values with which ADV and NumType co-occurred: PronType=EMPTY (451; 81%), Degree=Pos (418; 75%).

ADV tokens may have the following values of NumType:

Relations with Agreement in NumType

The 10 most frequent relations where parent and child node agree in NumType: NUM –[flat]–> NUM (45; 100%), NUM –[nmod]–> NUM (30; 100%), NUM –[conj]–> NUM (24; 100%), ADJ –[conj]–> ADJ (12; 86%).
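Agreement figures of this kind could be computed roughly as sketched below. This assumes that the percentage is the share of agreeing pairs among parent-child pairs in which both nodes carry NumType. That reading is an assumption, since the page does not state how the percentage is computed. The function names and the toy sentence are hypothetical.

```python
from collections import Counter


def numtype_of(feats: str):
    """Return the NumType value from a CoNLL-U FEATS string, or None if absent."""
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        if name == "NumType":
            return value
    return None


def agreement_counts(sentence_rows):
    """Count NumType agreement per (parent UPOS, relation, child UPOS).

    sentence_rows: the 10-column CoNLL-U token rows of one sentence,
    each given as a list of strings.
    """
    by_id = {row[0]: row for row in sentence_rows}
    agree, total = Counter(), Counter()
    for child in sentence_rows:
        parent = by_id.get(child[6])  # HEAD column; "0" (root) has no row
        if parent is None:
            continue
        child_value = numtype_of(child[5])
        parent_value = numtype_of(parent[5])
        if child_value is None or parent_value is None:
            continue  # assumption: only pairs where both nodes have NumType count
        key = (parent[3], child[7], child[3])
        total[key] += 1
        if child_value == parent_value:
            agree[key] += 1
    return agree, total


# Hypothetical toy sentence, for illustration only.
rows = [
    ["1", "two", "two", "NUM", "_", "NumType=Card", "0", "root", "_", "_"],
    ["2", "thousand", "thousand", "NUM", "_", "NumType=Card", "1", "flat", "_", "_"],
    ["3", "third", "third", "ADJ", "_", "NumType=Ord", "1", "conj", "_", "_"],
]
agree, total = agreement_counts(rows)
for key in total:
    print(key, agree[key], total[key])
```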

