This page still pertains to UD version 1.

PROPN: proper noun

Definition

A proper noun is a noun that is the name of a specific individual, place, or object. Ukrainian proper nouns are always written with an initial uppercase letter. Note that, unlike in English, the names of days of the week, months, and languages, as well as adjectives derived from geographical names, are not capitalized and are not considered proper nouns.

Single-word named entities should be tagged PROPN even if they originate from a common noun (Заєць, Бук) or an adjective (Довгополий, Масна). Even if they were originally adjectives and inflect according to adjectival paradigms, they behave syntactically as nouns. For instance, Масна (the feminine form of the surname Масний) is originally the feminine form of the adjective масний “fatty”, but as an anthroponym it is a noun. It denotes a concrete person (rather than a property of somebody or something), and its gender is limited to feminine and masculine (while adjectives have forms in all three genders).

Personal names are typically treated as a sequence of proper nouns (one or more given names and one or more surnames). If the name contains prepositions, conjunctions or particles (as in foreign names), these are tagged ADP, CCONJ and DET, respectively.

Ukrainian (and other Slavic) multi-word named entities have internal syntactic structure, which is preserved in the annotation. The headword is always a noun, and other nouns may be involved. These nouns are tagged either PROPN or NOUN, and possible ambiguities must be resolved case by case. Modifying adjectives are never tagged PROPN. Even if an adjective is the first word of a multi-word name, and thus starts with an uppercase letter, it is still tagged ADJ. Similarly, function words in named entities retain their normal tags. These rules are less strict for foreign named entities, where the original part of speech is opaque to a Ukrainian speaker.

Examples


Treebank Statistics (UD_Ukrainian)

There are 182 PROPN lemmas (5%), 234 PROPN types (4%) and 363 PROPN tokens (3%). Out of 17 observed tags, the rank of PROPN is: 5 in number of lemmas, 5 in number of types and 10 in number of tokens.
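Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page. The two sample sentences are made up for illustration (they are not taken from UD_Ukrainian), the helper names `token_rows` and `count_upos` are hypothetical, and types are assumed to be case-sensitive word forms.

```python
# Toy CoNLL-U rows (hypothetical, not taken from UD_Ukrainian).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
SAMPLE = """\
1\tДжон\tДжон\tPROPN\t_\tCase=Nom\t2\tnsubj\t_\t_
2\tбачить\tбачити\tVERB\t_\t_\t0\troot\t_\t_
3\tУкраїну\tУкраїна\tPROPN\t_\tCase=Acc\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

1\tМарія\tМарія\tPROPN\t_\tCase=Nom\t2\tnsubj\t_\t_
2\tзнає\tзнати\tVERB\t_\t_\t0\troot\t_\t_
3\tДжона\tДжон\tPROPN\t_\tCase=Acc\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def token_rows(conllu_text):
    """Yield the column values of each regular token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # sentence boundary or comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        yield cols


def count_upos(conllu_text, upos):
    """Return (distinct lemmas, distinct word forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for cols in token_rows(conllu_text):
        if cols[3] == upos:
            types.add(cols[1])   # FORM, compared case-sensitively (assumption)
            lemmas.add(cols[2])  # LEMMA
            tokens += 1
    return len(lemmas), len(types), tokens


print(count_upos(SAMPLE, "PROPN"))  # (3, 4, 4)
```

In the toy data the lemma Джон appears in two forms (Джон, Джона), so the type count exceeds the lemma count.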

The 10 most frequent PROPN lemmas: Україна, Пуня, Джон, Львів, Щербачов, Валентин, Франківськ, Василівна, Марія, Михась

The 10 most frequent PROPN types: України, Пуня, Щербачов, Валентин, Василівна, Джон, Марія, Міра, Франківську, Львова

The 10 most frequent ambiguous lemmas: Красна (ADJ 1, PROPN 1), Поляна (NOUN 2, PROPN 1)

The 10 most frequent ambiguous types: б (AUX 8, PART 2, PROPN 1), В (ADP 16, PROPN 2), І (CCONJ 13, PART 4, PROPN 1), А (CCONJ 18, PART 1, PROPN 1), З (ADP 17, PROPN 1), Красної (PROPN 1, ADJ 1), Поляни (PROPN 1, NOUN 1)

Morphology

The form / lemma ratio of PROPN is 1.285714 (the average of all parts of speech is 1.380843).
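The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given in the treebank statistics. A quick check, assuming that this is how the figure was derived:

```python
# Figures taken from the treebank statistics on this page.
propn_types = 234
propn_lemmas = 182

# Assumption: "form / lemma ratio" means distinct word forms per lemma.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # 1.285714
```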

The 1st highest number of forms (4) was observed with the lemma “Джон”: Джон, Джона, Джонові, Джоном.

The 2nd highest number of forms (4) was observed with the lemma “Каракєджба”: Каракєджба, Каракєджби, Каракєджбу, Каракєджбі.

The 3rd highest number of forms (4) was observed with the lemma “Пуня”: Пунею, Пуню, Пуня, Пуні.

PROPN occurs with 8 features: uk-feat/Animacy (356; 98% instances), uk-feat/Case (347; 96% instances), uk-feat/Gender (343; 94% instances), uk-feat/NameType (218; 60% instances), uk-feat/Abbr (23; 6% instances), uk-feat/Number (7; 2% instances), uk-feat/Foreign (2; 1% instances), uk-feat/Style (2; 1% instances)

PROPN occurs with 20 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, NameType=Giv, NameType=Pat, NameType=Sur, Number=Plur, Number=Ptan, Style=Odd

PROPN occurs with 57 feature combinations. The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc|NameType=Giv (58 tokens). Examples: Пуня, Валентин, Джон, Михась, Лаціс, Тарас, Фацик, Адріян, Влодко, Мікі
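The three feature statistics above (features, feature-value pairs, and feature combinations) differ only in what is counted. The sketch below illustrates the distinction on a handful of hypothetical FEATS strings. The values are drawn from the inventory listed above, but the tokens themselves are invented, and `feature_stats` is an illustrative helper, not part of any UD tool.

```python
from collections import Counter

# Hypothetical FEATS values (CoNLL-U column 6) of five PROPN tokens.
propn_feats = [
    "Animacy=Anim|Case=Nom|Gender=Masc|NameType=Giv",
    "Animacy=Anim|Case=Nom|Gender=Masc|NameType=Giv",
    "Animacy=Anim|Case=Gen|Gender=Fem|NameType=Sur",
    "Animacy=Inan|Case=Gen|Gender=Fem",
    "_",  # token without any features
]


def feature_stats(feats_list):
    """Count tokens per feature, per feature-value pair, and per combination."""
    features = Counter()  # feature name -> number of tokens carrying it
    pairs = Counter()     # "Feature=Value" -> number of tokens
    combos = Counter()    # full FEATS string -> number of tokens
    for feats in feats_list:
        combos[feats] += 1
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


features, pairs, combos = feature_stats(propn_feats)
print(len(features), len(pairs), len(combos))  # 4 8 4
print(combos.most_common(1)[0])
# ('Animacy=Anim|Case=Nom|Gender=Masc|NameType=Giv', 2)
```

Whether the empty combination "_" is included in the combination count on this page is not stated; the sketch counts it.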

Relations

PROPN nodes are attached to their parents using 14 different relations: uk-dep/nsubj (81; 22% instances), uk-dep/obl (70; 19% instances), uk-dep/appos (59; 16% instances), uk-dep/nmod (59; 16% instances), uk-dep/flat:name (51; 14% instances), uk-dep/conj (15; 4% instances), uk-dep/obj (7; 2% instances), uk-dep/root (7; 2% instances), uk-dep/vocative (5; 1% instances), uk-dep/advcl (4; 1% instances), uk-dep/compound (2; 1% instances), uk-dep/acl (1; 0% instances), uk-dep/dislocated (1; 0% instances), uk-dep/iobj (1; 0% instances)
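As a sanity check, the relation counts above sum to the 363 PROPN tokens reported earlier, and the percentages follow from rounding each share to the nearest whole number. A short sketch using only the figures on this page:

```python
# Counts of incoming relations of PROPN nodes, copied from the list above.
relation_counts = {
    "nsubj": 81, "obl": 70, "appos": 59, "nmod": 59, "flat:name": 51,
    "conj": 15, "obj": 7, "root": 7, "vocative": 5, "advcl": 4,
    "compound": 2, "acl": 1, "dislocated": 1, "iobj": 1,
}

total = sum(relation_counts.values())
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)  # 363
print(percentages["nsubj"], percentages["flat:name"], percentages["acl"])  # 22 14 0
```

This also explains the "0% instances" entries: a single occurrence out of 363 is about 0.3%, which rounds down to zero.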

Parents of PROPN nodes belong to 11 different parts of speech: VERB (138; 38% instances), NOUN (131; 36% instances), PROPN (61; 17% instances), ADJ (9; 2% instances), ROOT (7; 2% instances), ADV (6; 2% instances), PRON (4; 1% instances), AUX (3; 1% instances), NUM (2; 1% instances), DET (1; 0% instances), INTJ (1; 0% instances)

169 (47%) PROPN nodes are leaves.

128 (35%) PROPN nodes have one child.

37 (10%) PROPN nodes have two children.

29 (8%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 6.
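The child degree of a node is the number of tokens whose HEAD points to it. The sketch below computes the degree distribution for one toy sentence. The tree is hypothetical and reduced to (ID, UPOS, HEAD) triples, and `child_degree_histogram` is an illustrative helper name.

```python
from collections import Counter

# One hypothetical sentence as (ID, UPOS, HEAD) triples; HEAD 0 marks the root.
sentence = [
    (1, "ADP", 2),    # preposition attached to the PROPN at position 2
    (2, "PROPN", 4),  # has two children: tokens 1 and 3
    (3, "PROPN", 2),  # e.g. a second name part attached to token 2; a leaf
    (4, "VERB", 0),
    (5, "PROPN", 4),  # a leaf
    (6, "PUNCT", 4),
]


def child_degree_histogram(tokens, upos="PROPN"):
    """Map child degree -> number of nodes with that degree, for one UPOS tag."""
    children = Counter(head for _tok_id, _tok_upos, head in tokens)
    return Counter(
        children[tok_id]  # 0 for tokens that never occur as a head
        for tok_id, tok_upos, _head in tokens
        if tok_upos == upos
    )


hist = child_degree_histogram(sentence)
print(sorted(hist.items()))  # [(0, 2), (2, 1)]
print(max(hist))             # 2
```

For a whole treebank, the histograms of the individual sentences would be summed, since token IDs restart in every sentence.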

Children of PROPN nodes are attached using 21 different relations: uk-dep/case (95; 31% instances), uk-dep/punct (76; 25% instances), uk-dep/flat:name (49; 16% instances), uk-dep/conj (17; 6% instances), uk-dep/amod (16; 5% instances), uk-dep/cc (12; 4% instances), uk-dep/appos (9; 3% instances), uk-dep/discourse (7; 2% instances), uk-dep/acl (5; 2% instances), uk-dep/mark (4; 1% instances), uk-dep/advmod (3; 1% instances), uk-dep/compound (2; 1% instances), uk-dep/det (2; 1% instances), uk-dep/nsubj (2; 1% instances), uk-dep/nummod (2; 1% instances), uk-dep/parataxis (2; 1% instances), uk-dep/cop (1; 0% instances), uk-dep/flat (1; 0% instances), uk-dep/nummod:gov (1; 0% instances), uk-dep/obj (1; 0% instances), uk-dep/obl (1; 0% instances)

Children of PROPN nodes belong to 14 different parts of speech: ADP (95; 31% instances), PUNCT (76; 25% instances), PROPN (61; 20% instances), ADJ (14; 5% instances), NOUN (14; 5% instances), CCONJ (12; 4% instances), VERB (11; 4% instances), PART (8; 3% instances), DET (4; 1% instances), NUM (4; 1% instances), SCONJ (4; 1% instances), ADV (2; 1% instances), PRON (2; 1% instances), AUX (1; 0% instances)

