NumForm


This page still pertains to UD version 1.

`NumForm`: numeral form

NumForm is a lexical feature of numerals that marks whether the number is written as a word, in digits, or as a Roman numeral.

Word: number expressed as word

Examples

en “one”, dva “two”, tri “three”
enoj “one-fold”, dvoj “two-fold”, troj “three-fold”

Digit: number expressed using digits

Examples

1, 2, 3
1., 2., 3.

Roman: Roman numeral

Examples

I, II, III
I., II., III.
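
As a rough illustration of the three values, the surface shapes in the examples above can be told apart with a small heuristic. This is only a sketch under stated assumptions: the function name `guess_numform` and both regular expressions are hypothetical, and the treebank values themselves come from the JOS conversion, not from a surface test like this one.

```python
import re

# Hypothetical patterns: digits or Roman letters, optionally followed by the
# period seen in forms such as "1." or "I." in the examples.
DIGIT_RE = re.compile(r"^\d+\.?$")
ROMAN_RE = re.compile(r"^[IVXLCDM]+\.?$")

def guess_numform(form):
    """Guess a NumForm value from the surface form of a numeral token.

    Assumes the token is already known to be a numeral; anything that is
    neither digit-like nor Roman-like is treated as a word.
    """
    if DIGIT_RE.match(form):
        return "Digit"
    if ROMAN_RE.match(form):
        return "Roman"
    return "Word"

guesses = {form: guess_numform(form) for form in ["1.", "2000", "III", "IV.", "dva"]}
```

The heuristic will misclassify upper-case words made only of the letters I, V, X, L, C, D and M, so it should not be read as a replacement for the annotated value.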

Conversion from JOS

NumForm is assigned to all numerals that are converted to UD NUM. Numerals with Form=digit are converted to NumForm=Digit, numerals with Form=roman to NumForm=Roman, and numerals with Form=letter to NumForm=Word. Note, however, that (word) numerals converted to UD ADJ do not receive any NumForm value.
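
The conversion rule above amounts to a three-entry lookup plus one exception. A minimal sketch, assuming the JOS Form value and the target UD tag are available as plain strings (the names `JOS_FORM_TO_NUMFORM`, `numform_for`, `jos_form` and `upos` are hypothetical and not part of any actual conversion tool):

```python
# Mapping from the JOS Form attribute to UD NumForm, as described in the text.
JOS_FORM_TO_NUMFORM = {
    "digit": "Digit",
    "roman": "Roman",
    "letter": "Word",
}

def numform_for(jos_form, upos):
    """Return the NumForm value for a converted numeral, or None.

    Only numerals converted to UD NUM receive NumForm; numerals converted
    to UD ADJ get no value. Both argument names are hypothetical.
    """
    if upos != "NUM":
        return None
    return JOS_FORM_TO_NUMFORM.get(jos_form)
```

For example, `numform_for("letter", "NUM")` gives `"Word"`, while `numform_for("letter", "ADJ")` gives `None`.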

Treebank Statistics (UD_Slovenian)

This feature is language-specific. It occurs with 3 different values: Digit, Roman, Word.

1779 tokens (1%) have a non-empty value of NumForm. 534 types (2%) occur at least once with a non-empty value of NumForm. 469 lemmas (3%) occur at least once with a non-empty value of NumForm. The feature is used with 1 part-of-speech tag: NUM (1779; 1% of instances).

`NUM`

1779 NUM tokens (100% of all NUM tokens) have a non-empty value of NumForm.

The most frequent other feature values with which NUM and NumForm co-occurred: NumType=Card (1533; 86%), Gender=EMPTY (1335; 75%), Number=EMPTY (1098; 62%), Case=EMPTY (1098; 62%).

NUM tokens may have the following values of NumForm:

Digit (1079; 61% of non-empty NumForm): 15, 2000, 1., 10, 50, 20, 30, 18., 40, 20.
Roman (19; 1% of non-empty NumForm): I., II, II., I, III, IV., IX., V, V., X
Word (681; 38% of non-empty NumForm): eno, tri, dveh, dva, ena, tisoč, eden, štiri, dve, štirih
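
Counts like the ones above can be recomputed from a treebank in CoNLL-U format by reading the FEATS column of each token line. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and the three sample lines are invented for illustration rather than taken from UD_Slovenian.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a FEATS string such as 'NumForm=Digit|NumType=Card' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))

def count_numform(conllu_lines):
    """Count NumForm values over the token lines of a CoNLL-U file."""
    counts = Counter()
    for line in conllu_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # sentence boundary or comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        value = parse_feats(cols[5]).get("NumForm")
        if value is not None:
            counts[value] += 1
    return counts

# Invented sample lines (not real treebank data).
sample = [
    "# text = invented sample",
    "1\t15\t15\tNUM\t_\tNumForm=Digit|NumType=Card\t0\troot\t_\t_",
    "2\ttri\ttri\tNUM\t_\tCase=Nom|NumForm=Word|NumType=Card\t1\tconj\t_\t_",
    "3\tII\tII\tNUM\t_\tNumForm=Roman|NumType=Card\t1\tconj\t_\t_",
]
counts = count_numform(sample)
```

With a real file, `count_numform(open(path, encoding="utf-8"))` would be the natural call, where `path` points to a local copy of the treebank.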

NumForm seems to be a lexical feature of NUM. 100% of lemmas (469) occur with only one value of NumForm.
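
The lexical-feature observation can be checked by collecting, for each lemma, the set of NumForm values it occurs with. A small sketch, assuming the input is an iterable of (lemma, NumForm) pairs that have already been extracted from the treebank; the function name and the sample pairs are hypothetical:

```python
from collections import defaultdict

def lemmas_with_single_value(pairs):
    """Return (lemmas with exactly one NumForm value, total lemmas)."""
    values_by_lemma = defaultdict(set)
    for lemma, value in pairs:
        values_by_lemma[lemma].add(value)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)

# Invented pairs for illustration only.
pairs = [("dva", "Word"), ("dva", "Word"), ("2", "Digit"), ("I", "Roman")]
single, total = lemmas_with_single_value(pairs)
```

If `single` equals `total`, every lemma occurs with only one value, which is the situation the statistic above reports.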

Relations with Agreement in `NumForm`

The most frequent relations in which the parent and child node agree in NumForm: NUM –[conj]–> NUM (86; 98%), NUM –[flat]–> NUM (19; 66%), NUM –[nmod]–> NUM (2; 100%).
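
Agreement figures of this kind can be approximated by pairing each token with its head and comparing their NumForm values. The page does not state exactly how the percentages are computed, so the sketch below assumes that only parent-child pairs in which both nodes carry NumForm are counted. The token dictionaries, their keys, and the function name are hypothetical.

```python
from collections import Counter

def agreement_by_relation(tokens):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing pairs, all pairs).

    `tokens` is a list of dicts for one sentence with the keys
    id, head, upos, deprel and numform (None when the feature is absent).
    """
    by_id = {tok["id"]: tok for tok in tokens}
    agree, total = Counter(), Counter()
    for child in tokens:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["numform"] is None or parent["numform"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if parent["numform"] == child["numform"]:
            agree[key] += 1
    return {key: (agree[key], total[key]) for key in total}

# Invented sentence for illustration only.
sentence = [
    {"id": 1, "head": 0, "upos": "NUM", "deprel": "root", "numform": "Digit"},
    {"id": 2, "head": 1, "upos": "NUM", "deprel": "conj", "numform": "Digit"},
    {"id": 3, "head": 1, "upos": "NUM", "deprel": "conj", "numform": "Word"},
]
stats = agreement_by_relation(sentence)
```

Summing the per-sentence results over a whole treebank and dividing the first number by the second would give a percentage comparable in spirit to those listed above.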

Treebank Statistics (UD_Slovenian-SST)

This feature is language-specific. It occurs with 1 value: Word.

350 tokens (2%) have a non-empty value of NumForm. 71 types (2%) occur at least once with a non-empty value of NumForm. 47 lemmas (2%) occur at least once with a non-empty value of NumForm. The feature is used with 1 part-of-speech tag: NUM (350; 2% of instances).

`NUM`

350 NUM tokens (100% of all NUM tokens) have a non-empty value of NumForm.

The most frequent other feature values with which NUM and NumForm co-occurred: NumType=Card (349; 100%), Number=Plur (201; 57%), Case=Acc (177; 51%).

NUM tokens may have the following values of NumForm:

Word (350; 100% of non-empty NumForm): eno, dva, en, ena, tri, dvajset, pet, tisoč, dve, enega

NumForm seems to be a lexical feature of NUM. 100% of lemmas (47) occur with only one value of NumForm.

Relations with Agreement in `NumForm`

The most frequent relations in which the parent and child node agree in NumForm: NUM –[flat]–> NUM (33; 100%), NUM –[conj]–> NUM (23; 100%), NUM –[fixed]–> NUM (4; 100%), NUM –[nummod]–> NUM (1; 100%), NUM –[reparandum]–> NUM (1; 100%).

NumForm in other languages: [ar] [ca] [cs] [es] [et] [la] [nl] [pt] [ro] [sl] [ta]
