NumType
: numeral type
Values: | Card | Frac | Mult | Ord | Sets |
Czech has a complex system of numerals. In the school grammar of Czech, the main part of speech is “numeral”; it covers almost everything that involves counting, and it has various subtypes. It also includes interrogative, relative, indefinite and demonstrative quantifiers (words like kolik “how many”, tolik “so many”, několik “several”), so such words may have a non-empty value of PronType at the same time.
From the syntactic point of view, some numeral types behave like adjectives and some behave like adverbs. We tag them cs-pos/ADJ and cs-pos/ADV respectively. Thus the NumType feature applies to several different parts of speech:
- cs-pos/NUM: cardinal numerals
- cs-pos/DET: quantifiers
- cs-pos/ADJ: adjectival ordinal and some generic numerals
- cs-pos/ADV: adverbial (e.g. ordinal and multiplicative) numerals
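The mapping above can be turned into a simple consistency check. The following is a minimal sketch, assuming tokens come from a CoNLL-U file where features are written as `Name=Value` pairs joined by `|` and multiple values of one feature are joined by commas. The names `NUMTYPE_UPOS`, `parse_feats` and `numtype_fits_upos` are hypothetical helpers introduced here for illustration; the set of tags is taken only from the list on this page, and individual treebanks may contain a few tokens outside it.

```python
# UPOS tags that this page lists as carrying NumType in Czech (assumption:
# taken from the list above, not from any validator).
NUMTYPE_UPOS = {"NUM", "DET", "ADJ", "ADV"}


def parse_feats(feats):
    """Parse a CoNLL-U FEATS string such as 'Case=Nom|NumType=Card'."""
    if not feats or feats == "_":
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        # Several values of one feature are joined by commas.
        parsed[name] = value.split(",")
    return parsed


def numtype_fits_upos(upos, feats):
    """True if the token has no NumType, or has it on one of the listed tags."""
    return "NumType" not in parse_feats(feats) or upos in NUMTYPE_UPOS
```

A token that fails the check is not necessarily an error; it is only a candidate for manual review.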
Card
: cardinal number or corresponding interrogative / relative / indefinite / demonstrative word
Examples
- jeden, dva, tři “one, two, three”
- kolik “how many”
- několik “several”, mnoho “many”, málo “few”
- tolik “so many”
- čtvero, patero, desatero (specific forms of four, five, ten; they are morphologically, syntactically and stylistically distinct from the default forms čtyři, pět, deset)
Ord
: ordinal number or corresponding interrogative / relative / indefinite / demonstrative word
This is a subtype of adjective or adverb.
Adjectival examples
- první “first”; druhý “second”, třetí “third”
- kolikátý lit. how manieth “which rank”
- několikátý “some rank”
- tolikátý “this/that rank”
Adverbial examples
- poprvé “for the first time”; podruhé “for the second time”; potřetí “for the third time”
- pokolikáté “for which time”
- poněkolikáté “for x-th time”
- potolikáté “it has been so many times”
Mult
: multiplicative numeral or corresponding interrogative / relative / indefinite / demonstrative word
This is a subtype of adjective or adverb.
Examples
- dvojí, trojí, čtverý (twofold, threefold, fourfold; these are morphologically and syntactically adjectives)
- jednou “once”; dvakrát “twice”; třikrát “three times”
- kolikrát “how many times”
- několikrát “several times”
- tolikrát “so many times”
Frac
: fraction
This is a subtype of cardinal numbers. It may denote a fraction or just the denominator of the fraction.
Examples
- půl / polovina “half”; třetina “one third”; čtvrt / čtvrtina “quarter”
Sets
: number of sets of things; collective numeral
Morphologically distinct class of numerals used to count sets of things, or nouns that are pluralia tantum.
Examples
- dvoje / troje boty “two / three [pairs of] shoes”; as opposed to normal cardinal numbers: dvě / tři boty “two / three shoes”
Treebank Statistics (UD_Czech)
This feature is universal.
It occurs with 5 different values: Card, Frac, Mult, Ord, Sets.
Some words have combined values of the feature; 1 combination has been observed: Mult|Sets.
43673 tokens (3%) have a non-empty value of NumType.
3812 types (3%) occur at least once with a non-empty value of NumType.
3381 lemmas (6%) occur at least once with a non-empty value of NumType.
The feature is used with 4 part-of-speech tags: cs-pos/NUM (36842; 3% instances), cs-pos/ADJ (4420; 0% instances), cs-pos/DET (1626; 0% instances), cs-pos/ADV (785; 0% instances).
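Per-tag counts of this kind can be approximated from any CoNLL-U file. The sketch below is not the script that produced these statistics; it only illustrates the idea. The helper names `row` and `numtype_by_upos` are hypothetical, and `SAMPLE` is a toy fragment of unrelated tokens (only ID, FORM, LEMMA, UPOS and FEATS are filled in), not a real treebank sentence.

```python
from collections import Counter


def row(idx, form, lemma, upos, feats):
    """Build a 10-column CoNLL-U line; columns not needed here are left as '_'."""
    return "\t".join([str(idx), form, lemma, upos, "_", feats, "_", "_", "_", "_"])


# Toy data for illustration only.
SAMPLE = "\n".join([
    "# toy fragment",
    row(1, "dva", "dva", "NUM", "NumType=Card"),
    row(2, "několik", "několik", "DET", "NumType=Card|PronType=Ind"),
    row(3, "první", "první", "ADJ", "NumType=Ord"),
    row(4, "poprvé", "poprvé", "ADV", "NumType=Ord"),
    row(5, "dům", "dům", "NOUN", "_"),
])


def numtype_by_upos(conllu):
    """Count tokens with a non-empty NumType, grouped by UPOS tag."""
    counts = Counter()
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos, feats = cols[3], cols[5]
        if any(pair.startswith("NumType=") for pair in feats.split("|")):
            counts[upos] += 1
    return counts
```

Calling `numtype_by_upos(SAMPLE)` counts one token each for NUM, DET, ADJ and ADV and ignores the noun, which has no NumType.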
NUM
36842 cs-pos/NUM tokens (100% of all NUM tokens) have a non-empty value of NumType.
The most frequent other feature values with which NUM and NumType co-occurred: Gender=EMPTY (32665; 89%), NumValue=EMPTY (29763; 81%), Case=EMPTY (26598; 72%), Number=EMPTY (26574; 72%), NumForm=Digit (26226; 71%).
NUM tokens may have the following values of NumType:
- Card (36547; 99% of non-empty NumType): 1, 2, 3, dva, tři, 4, jeden, 6, dvě, 5
- Frac (295; 1% of non-empty NumType): třetiny, třetinu, třetina, čtvrtinu, třetině, čtvrtina, desetinu, čtvrtiny, pětinu, pětina
NumType seems to be a lexical feature of NUM. 100% of lemmas (3253) occur only with one value of NumType.
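The “lexical feature” observation can be checked by collecting, for each lemma, the set of NumType values it occurs with. The sketch below shows one plausible reading of that statistic; how the official figures are computed is not stated on this page, so treat the definition as an assumption. The function names are hypothetical, and a combined value such as `Mult,Sets` is deliberately kept as a single value, as in the listings on this page.

```python
from collections import defaultdict


def numtype_values_per_lemma(tokens):
    """Map each lemma to the set of NumType values it occurs with.

    `tokens` is an iterable of (lemma, feats) pairs, where `feats` is a
    CoNLL-U FEATS string. Lemmas that never carry NumType are left out.
    """
    seen = defaultdict(set)
    for lemma, feats in tokens:
        for pair in feats.split("|"):
            name, _, value = pair.partition("=")
            if name == "NumType":
                # 'Mult,Sets' is kept as one combined value.
                seen[lemma].add(value)
    return dict(seen)


def share_single_valued(tokens):
    """Share of NumType-bearing lemmas that occur with exactly one value."""
    per_lemma = numtype_values_per_lemma(tokens)
    if not per_lemma:
        return 0.0
    single = sum(1 for values in per_lemma.values() if len(values) == 1)
    return single / len(per_lemma)
```

A share of 1.0 corresponds to the “100% of lemmas” statement: no lemma switches between NumType values across its occurrences.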
ADJ
4420 cs-pos/ADJ tokens (3% of all ADJ tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADJ and NumType co-occurred: Degree=EMPTY (4420; 100%), Polarity=EMPTY (4420; 100%), Number=Sing (3733; 84%), Animacy=EMPTY (2856; 65%).
ADJ tokens may have the following values of NumType:
- Mult,Sets (63; 1% of non-empty NumType): dvojí, obojí, dvojím, dvojího, dvoje, obojím, trojí, oboje, dvojími, obé
- Ord (4330; 98% of non-empty NumType): první, druhé, prvním, třetí, druhý, prvních, druhou, prvního, druhá, druhém
- Sets (27; 1% of non-empty NumType): jedny, jedni, jedněch, jedněm, jedněmi
- EMPTY (164378): další, české, nové, státní, poslední, dalších, vlastní, možné, jiné, každý
NumType seems to be a lexical feature of ADJ. 100% of lemmas (62) occur only with one value of NumType.
DET
1626 cs-pos/DET tokens (3% of all DET tokens) have a non-empty value of NumType.
The most frequent other feature values with which DET and NumType co-occurred: Person=EMPTY (1626; 100%), Number[psor]=EMPTY (1626; 100%), Poss=EMPTY (1626; 100%), Animacy=EMPTY (1622; 100%), Number=EMPTY (1610; 99%), Gender=EMPTY (1610; 99%), PronType=Ind (1365; 84%).
DET tokens may have the following values of NumType:
- Card (1624; 100% of non-empty NumType): několik, několika, mnoho, mnoha, kolik, tolik, málo, moc, mála, tolika
- Ord (2; 0% of non-empty NumType): několikáté, několikátý
- EMPTY (47144): to, které, který, jeho, která, jejich, své, tím, kteří, tom
NumType seems to be a lexical feature of DET. 100% of lemmas (16) occur only with one value of NumType.
ADV
785 cs-pos/ADV tokens (1% of all ADV tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADV and NumType co-occurred: Polarity=EMPTY (785; 100%), Degree=EMPTY (785; 100%), PronType=EMPTY (673; 86%).
ADV tokens may have the following values of NumType:
- Mult (496; 63% of non-empty NumType): dvakrát, jednou, třikrát, několikrát, desetkrát, pětkrát, čtyřikrát, nejednou, šestkrát, mnohokrát
- Ord (289; 37% of non-empty NumType): poprvé, podruhé, potřetí, počtvrté, podvanácté, pošesté, POPÁTÉ, Popáté, Pošestnácté, podesáté
- EMPTY (69717): tak, už, také, jak, včera, ještě, již, dnes, více, tedy
NumType seems to be a lexical feature of ADV. 100% of lemmas (52) occur only with one value of NumType.
Relations with Agreement in NumType
The 10 most frequent relations where parent and child node agree in NumType:
NUM –[conj]–> NUM (2906; 100%),
NUM –[compound]–> NUM (2589; 100%),
NUM –[orphan]–> NUM (75; 100%),
ADJ –[conj]–> ADJ (60; 53%),
NUM –[dep]–> NUM (39; 100%),
NUM –[det:nummod]–> DET (14; 100%),
ADJ –[orphan]–> ADJ (10; 71%),
DET –[conj]–> DET (5; 83%),
NUM –[conj]–> DET (3; 60%),
DET –[appos]–> NUM (3; 100%).
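Agreement counts like those above can be derived from dependency trees by comparing the NumType of each node with that of its parent. The following is a minimal sketch under stated assumptions: a sentence is given as a list of `(id, upos, feats, head, deprel)` tuples with integer `id` and `head`, and the denominator is taken to be the pairs in which both nodes carry NumType. The page does not say how the percentages are defined, so that choice is a guess, and the function names are hypothetical.

```python
from collections import Counter


def numtype(feats):
    """Return the NumType value from a FEATS string, or None if absent."""
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        if name == "NumType":
            return value
    return None


def agreement_counts(sentence):
    """Count relations whose parent and child share a NumType value.

    Returns two Counters keyed by (parent_upos, deprel, child_upos):
    `agree` for pairs with equal NumType, `both` for all pairs in which
    parent and child each have a non-empty NumType.
    """
    by_id = {tok[0]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for _tok_id, upos, feats, head, deprel in sentence:
        parent = by_id.get(head)
        if parent is None:  # head 0 (root) or a head outside the sentence
            continue
        child_nt, parent_nt = numtype(feats), numtype(parent[2])
        if child_nt is None or parent_nt is None:
            continue
        key = (parent[1], deprel, upos)
        both[key] += 1
        if child_nt == parent_nt:
            agree[key] += 1
    return agree, both
```

Dividing `agree[key]` by `both[key]` gives a percentage in the style of the list above, under the stated assumption about the denominator.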
Treebank Statistics (UD_Czech-CAC)
This feature is universal.
It occurs with 5 different values: Card, Frac, Mult, Ord, Sets.
Some words have combined values of the feature; 1 combination has been observed: Mult|Sets.
8863 tokens (2%) have a non-empty value of NumType.
342 types (1%) occur at least once with a non-empty value of NumType.
138 lemmas (0%) occur at least once with a non-empty value of NumType.
The feature is used with 4 part-of-speech tags: cs-pos/NUM (7204; 1% instances), cs-pos/ADJ (852; 0% instances), cs-pos/DET (642; 0% instances), cs-pos/ADV (165; 0% instances).
NUM
7204 cs-pos/NUM tokens (100% of all NUM tokens) have a non-empty value of NumType.
The most frequent other feature values with which NUM and NumType co-occurred: Gender=EMPTY (6031; 84%), NumValue=EMPTY (5274; 73%), Number=EMPTY (4784; 66%), Case=EMPTY (4784; 66%), NumForm=Digit (4784; 66%).
NUM tokens may have the following values of NumType:
- Card (7149; 99% of non-empty NumType): #, dvou, jeden, dvě, tři, dva, obou, jednoho, jedné, jedním
- Frac (55; 1% of non-empty NumType): třetina, třetinu, třetiny, čtvrtiny, dvanáctinu, třetinou, třetině, desetin, desetinu, dvanáctina
NumType seems to be a lexical feature of NUM. 100% of lemmas (59) occur only with one value of NumType.
ADJ
852 cs-pos/ADJ tokens (1% of all ADJ tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADJ and NumType co-occurred: Degree=EMPTY (852; 100%), Polarity=EMPTY (852; 100%), Number=Sing (600; 70%), Animacy=EMPTY (558; 65%).
ADJ tokens may have the following values of NumType:
- Mult,Sets (33; 4% of non-empty NumType): dvojí, dvojím, obojí, trojí, dvojího, trojím, dvojímu, oboje, obojího, obojím
- Ord (809; 95% of non-empty NumType): první, prvním, třetí, prvních, prvního, šedesátých, třetího, třicátých, dvacátých, páté
- Sets (10; 1% of non-empty NumType): jedněch, jedni, jedny
- EMPTY (72326): další, pracovní, jednotlivých, základní, nové, možno, socialistické, různých, dalších, každý
NumType seems to be a lexical feature of ADJ. 100% of lemmas (39) occur only with one value of NumType.
DET
642 cs-pos/DET tokens (3% of all DET tokens) have a non-empty value of NumType.
The most frequent other feature values with which DET and NumType co-occurred: Person=EMPTY (642; 100%), Number[psor]=EMPTY (642; 100%), Poss=EMPTY (642; 100%), Animacy=EMPTY (640; 100%), Number=EMPTY (628; 98%), Gender=EMPTY (628; 98%), PronType=Ind (552; 86%).
DET tokens may have the following values of NumType:
- Card (639; 100% of non-empty NumType): několik, mnoho, několika, mnoha, kolik, tolik, málo, mála, nejeden, nejednom
- Ord (3; 0% of non-empty NumType): Kolikátý, kolikátá, kolikátém
- EMPTY (18399): to, které, jejich, jeho, který, která, tím, této, své, těchto
ADV
165 cs-pos/ADV tokens (1% of all ADV tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADV and NumType co-occurred: Polarity=EMPTY (165; 100%), Degree=EMPTY (165; 100%), PronType=EMPTY (122; 74%).
ADV tokens may have the following values of NumType:
- Mult (116; 70% of non-empty NumType): dvakrát, nejednou, několikrát, třikrát, mnohokrát, kolikrát, desetkrát, stokrát, čtyřikrát, dvanáctkrát
- Ord (49; 30% of non-empty NumType): poprvé, podruhé, potřetí, potřinácté
- EMPTY (26788): tak, také, jak, již, už, ještě, kde, tedy, pak, velmi
NumType seems to be a lexical feature of ADV. 100% of lemmas (33) occur only with one value of NumType.
Relations with Agreement in NumType
The 10 most frequent relations where parent and child node agree in NumType:
NUM –[conj]–> NUM (310; 100%),
NUM –[compound]–> NUM (42; 100%),
NUM –[orphan]–> NUM (16; 100%),
ADV –[conj]–> ADV (6; 55%),
NUM –[det:nummod]–> DET (5; 100%),
NUM –[appos]–> DET (1; 100%).
Treebank Statistics (UD_Czech-CLTT)
This feature is universal.
It occurs with 3 different values: Card, Mult, Ord.
355 tokens (1%) have a non-empty value of NumType.
99 types (3%) occur at least once with a non-empty value of NumType.
85 lemmas (4%) occur at least once with a non-empty value of NumType.
The feature is used with 4 part-of-speech tags: cs-pos/NUM (310; 1% instances), cs-pos/ADJ (31; 0% instances), cs-pos/ADV (13; 0% instances), cs-pos/PRON (1; 0% instances).
NUM
310 cs-pos/NUM tokens (100% of all NUM tokens) have a non-empty value of NumType.
The most frequent other feature values with which NUM and NumType co-occurred: Gender=EMPTY (278; 90%), NumValue=EMPTY (272; 88%), NumForm=Roman (264; 85%), Number=EMPTY (264; 85%), Case=EMPTY (264; 85%).
NUM tokens may have the following values of NumType:
- Card (310; 100% of non-empty NumType): 1, 3, 2, jeden, 4, 5, 41, 2004, 2008, 31
NumType seems to be a lexical feature of NUM. 100% of lemmas (76) occur only with one value of NumType.
ADJ
31 cs-pos/ADJ tokens (1% of all ADJ tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADJ and NumType co-occurred: Number=Sing (31; 100%), Degree=EMPTY (31; 100%), Polarity=EMPTY (31; 100%), Gender=Masc (16; 52%).
ADJ tokens may have the following values of NumType:
- Ord (31; 100% of non-empty NumType): prvním, prvnímu, prvního, první, TŘETÍ, PÁTÁ, druhé, páté, ČTVRTÁ, ŠESTÁ
- EMPTY (4624): účetní, účetních, účetního, konsolidované, finanční, účetním, konsolidující, povinny, obchodního, ostatní
ADV
13 cs-pos/ADV tokens (2% of all ADV tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADV and NumType co-occurred: Polarity=EMPTY (13; 100%), Degree=EMPTY (13; 100%).
ADV tokens may have the following values of NumType:
- Mult (2; 15% of non-empty NumType): jednou
- Ord (11; 85% of non-empty NumType): poprvé
- EMPTY (554): zejména, dále, popřípadě, například, pouze, jinak, též, kdy, tak, později
PRON
1 cs-pos/PRON token (0% of all PRON tokens) has a non-empty value of NumType.
The most frequent other feature values with which PRON and NumType co-occurred: Reflex=EMPTY (1; 100%), Variant=EMPTY (1; 100%), Case=Ins (1; 100%), PronType=Dem,Ind (1; 100%), Number=EMPTY (1; 100%), Gender=EMPTY (1; 100%).
PRON tokens may have the following values of NumType:
- Card (1; 100% of non-empty NumType): několika
- EMPTY (811): se, které, která, který, to, kterých, kterým, kterém, nichž, všech
Relations with Agreement in NumType
The 10 most frequent relations where parent and child node agree in NumType:
NUM –[conj]–> NUM (29; 100%),
NUM –[conj]–> PRON (1; 100%).