This page still pertains to UD version 1.

Tokenization

The low-level tokenization of the Russian UD treebanks generally follows the RNC standard.

Some special cases worth mentioning:

The Russian UD treebanks do not contain multiword tokens. (The UD_Russian-Syntagrus treebank v.1.3 and v.1.4 contained multitokens following the Syntagrus standard.)
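The absence of multiword tokens can be checked mechanically. The following is a minimal sketch, assuming the treebank is in CoNLL-U format, where a multiword token occupies a line whose ID column is a range such as `3-4`. The function name `find_multiword_tokens` and the sample sentence are illustrative only and are not taken from any treebank.

```python
import re

# In CoNLL-U, a multiword-token line has a range ID (e.g. "3-4") in column 1.
RANGE_ID = re.compile(r"^\d+-\d+$")


def find_multiword_tokens(conllu_text):
    """Return (line_number, id, form) for every multiword-token line.

    Hypothetical helper for illustration; not part of any UD tool.
    """
    hits = []
    for line_number, line in enumerate(conllu_text.splitlines(), start=1):
        # Skip sentence separators (blank lines) and comment lines.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if RANGE_ID.match(columns[0]):
            form = columns[1] if len(columns) > 1 else ""
            hits.append((line_number, columns[0], form))
    return hits


# Made-up three-token sentence with ordinary integer IDs only.
sample = (
    "1\tОн\tон\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tспит\tспать\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)
print(find_multiword_tokens(sample))
```

The sample has no range IDs, so the returned list is empty. Running the same function over a whole treebank file and getting an empty list would be consistent with the statement above.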

Pronouns and adverbs

Verb forms, analytical grammatical forms, negation