PROPN: proper noun
Definition
A proper noun is a noun that is the name of a specific individual, place, or object. Ukrainian proper nouns are always written with an initial uppercase letter. Note that the names of the days of the week, months, and languages, as well as adjectives derived from geographical names, are not capitalized (unlike in English) and are not considered proper nouns.
Single-word named entities should be tagged PROPN even if they originate from a common noun (Заєць, Бук) or an adjective (Довгополий, Масна). Even if they were originally adjectives and inflect according to adjectival paradigms, they behave syntactically as nouns. For instance, Масна (the feminine version of the surname Масний) is originally the feminine form of the adjective масний “fatty”, but as an anthroponymic name it is a noun. It denotes a concrete person (rather than a property of somebody or something), and its gender is limited to feminine and masculine (while adjectives have forms in all three genders).
Personal names are typically treated as a sequence of proper nouns (one or more given names and one or more surnames). If the name contains prepositions, conjunctions or particles (foreign names), these are tagged as ADP, CCONJ and DET, respectively.
Ukrainian (and other Slavic) multi-word named entities have internal syntactic structure, which is preserved in the annotation. The headword is always a noun, and other nouns may be involved. They are tagged either PROPN or NOUN, and possible ambiguities must be resolved individually. Modifying adjectives are never tagged PROPN. Even if an adjective is the first word of a multi-word name, and thus starts with an uppercase letter, it is still tagged ADJ. Similarly, function words in named entities retain their normal tags. These rules are less strict for foreign named entities, where the original part of speech is opaque to a Ukrainian speaker.
Examples
- Франкфурт/PROPN на/ADP Майні/PROPN is a city. Франкфурт is the head, and the на Майні part refers to the river flowing through the city, distinguishing it from other Frankfurts.
- Організація/NOUN об’єднаних/ADJ націй/NOUN “United Nations Organization” consists of three words, none of which is a proper noun. However, the acronym ООН “UN” is a single-token name and is tagged PROPN.
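The tagging decisions in these examples can be checked mechanically. The following is a minimal sketch, assuming a simple `word/TAG` notation for annotated phrases; the helper names `parse_tagged` and `propn_words` are hypothetical and not part of any UD tool.

```python
def parse_tagged(text):
    """Split a 'word/TAG word/TAG' string into (word, tag) pairs."""
    pairs = []
    for item in text.split():
        # rpartition splits on the last slash, so the tag is always the final part
        word, _, tag = item.rpartition("/")
        pairs.append((word, tag))
    return pairs


def propn_words(pairs):
    """Return only the words tagged PROPN."""
    return [word for word, tag in pairs if tag == "PROPN"]


# The two multi-word names from the examples, plus the single-token acronym
frankfurt = parse_tagged("Франкфурт/PROPN на/ADP Майні/PROPN")
un = parse_tagged("Організація/NOUN об’єднаних/ADJ націй/NOUN")
acronym = parse_tagged("ООН/PROPN")

print(propn_words(frankfurt))  # ['Франкфурт', 'Майні']
print(propn_words(un))         # []
print(propn_words(acronym))    # ['ООН']
```

The function word на keeps its ADP tag inside the name, and the full name of the organization contains no PROPN token at all, while its acronym does.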
Treebank Statistics (UD_Ukrainian)
There are 182 PROPN lemmas (5%), 234 PROPN types (4%) and 363 PROPN tokens (3%).
Out of 17 observed tags, the rank of PROPN is: 5 in number of lemmas, 5 in number of types and 10 in number of tokens.
The 10 most frequent PROPN lemmas: Україна, Пуня, Джон, Львів, Щербачов, Валентин, Франківськ, Василівна, Марія, Михась
The 10 most frequent PROPN types: України, Пуня, Щербачов, Валентин, Василівна, Джон, Марія, Міра, Франківську, Львова
The 10 most frequent ambiguous lemmas: Красна (ADJ 1, PROPN 1), Поляна (NOUN 2, PROPN 1)
The 10 most frequent ambiguous types: б (AUX 8, PART 2, PROPN 1), В (ADP 16, PROPN 2), І (CCONJ 13, PART 4, PROPN 1), А (CCONJ 18, PART 1, PROPN 1), З (ADP 17, PROPN 1), Красної (PROPN 1, ADJ 1), Поляни (PROPN 1, NOUN 1)
- б
- В
- ADP 16: В деяких місцевостях це були справжні розкішні притони для номенклатури .
- PROPN 2: Тижнів через три , як книжка з’явиться на вітринах книгарень , я гадаю зробити ще публічну лекцію при громаді академічних службовців — з відчитанням кількох оповідань в цілості — щось на зразок того читання , яке улаштував С . О . Єфремов півтора місяця тому на пошанування В . Стефаника .
- І
- А
- CCONJ 18: А з братом Юрком ми придумали нову розвагу — бокс лежачи .
- PART 1: Зрештою , також маю гранати , цілих дві , — і всі одночасно без слів згадали , як дарували їй у вересні на іменини « репанку » - РГД , і вона мусила те згадати , бо схлипнула , чи то засміялась нервово : — Ой Божечку , а я ж для вас усіх гостинці маю на Різдво наготовані ! — і вже викидала в западаючу темінь зі свого наплечника якісь тремтячі згортки : рукавиці , шкарпетки ? — щось біле спурхнуло й спало долі крилом , зачерпнувши жару , вона підхопила , стріпуючи : — А це для вас , друже командир , вберете ?
- PROPN 1: Проте Марія Василівна чулася в тому віці , коли вже можна дозволити собі також багато іншого : і А ) випити чарочку коньяку , який прогріває горло під час занять , і Б ) намалюватися помадою непристойно червоного кольору , і В ) допустити певні думки під час приповільнення кроку біля побуткомбінату , де за прозорими вікнами в підвальному приміщенні — чоловіча качалка …
- З
- Красної
- Поляни
Morphology
The form / lemma ratio of PROPN is 1.285714 (the average of all parts of speech is 1.380843).
The 1st highest number of forms (4) was observed with the lemma “Джон”: Джон, Джона, Джонові, Джоном.
The 2nd highest number of forms (4) was observed with the lemma “Каракєджба”: Каракєджба, Каракєджби, Каракєджбу, Каракєджбі.
The 3rd highest number of forms (4) was observed with the lemma “Пуня”: Пунею, Пуню, Пуня, Пуні.
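The reported form / lemma ratio appears to be the number of distinct PROPN types divided by the number of PROPN lemmas. That reading is an assumption, but it reproduces the reported value from the counts in the treebank statistics (234 types, 182 lemmas). A minimal sketch:

```python
# Counts taken from the treebank statistics for PROPN
propn_lemmas = 182
propn_types = 234

# Assumed definition: distinct word forms (types) per lemma
ratio = propn_types / propn_lemmas

print(f"{ratio:.6f}")  # 1.285714
```

A ratio below the all-tag average of 1.380843 suggests that, in this treebank, a proper-noun lemma is attested in fewer distinct inflected forms than a typical lemma.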
PROPN occurs with 8 features: uk-feat/Animacy (356; 98% instances), uk-feat/Case (347; 96% instances), uk-feat/Gender (343; 94% instances), uk-feat/NameType (218; 60% instances), uk-feat/Abbr (23; 6% instances), uk-feat/Number (7; 2% instances), uk-feat/Foreign (2; 1% instances), uk-feat/Style (2; 1% instances)
PROPN occurs with 20 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, NameType=Giv, NameType=Pat, NameType=Sur, Number=Plur, Number=Ptan, Style=Odd
PROPN occurs with 57 feature combinations.
The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc|NameType=Giv (58 tokens).
Examples: Пуня, Валентин, Джон, Михась, Лаціс, Тарас, Фацик, Адріян, Влодко, Мікі
Relations
PROPN nodes are attached to their parents using 14 different relations: uk-dep/nsubj (81; 22% instances), uk-dep/obl (70; 19% instances), uk-dep/appos (59; 16% instances), uk-dep/nmod (59; 16% instances), uk-dep/flat:name (51; 14% instances), uk-dep/conj (15; 4% instances), uk-dep/obj (7; 2% instances), uk-dep/root (7; 2% instances), uk-dep/vocative (5; 1% instances), uk-dep/advcl (4; 1% instances), uk-dep/compound (2; 1% instances), uk-dep/acl (1; 0% instances), uk-dep/dislocated (1; 0% instances), uk-dep/iobj (1; 0% instances)
Parents of PROPN nodes belong to 11 different parts of speech: VERB (138; 38% instances), NOUN (131; 36% instances), PROPN (61; 17% instances), ADJ (9; 2% instances), ROOT (7; 2% instances), ADV (6; 2% instances), PRON (4; 1% instances), AUX (3; 1% instances), NUM (2; 1% instances), DET (1; 0% instances), INTJ (1; 0% instances)
169 (47%) PROPN nodes are leaves.
128 (35%) PROPN nodes have one child.
37 (10%) PROPN nodes have two children.
29 (8%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 6.
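As a consistency check, the child-degree counts above should add up to the 363 PROPN tokens, and the rounded percentages should follow from them. A minimal sketch, assuming the percentages are plain shares of the PROPN token total rounded to whole numbers:

```python
# Child-degree counts for PROPN nodes, as reported above
child_counts = {"leaf": 169, "one child": 128, "two children": 37, "three or more": 29}

# The counts should cover every PROPN token exactly once
total = sum(child_counts.values())

# Share of each group, rounded to a whole percent
shares = {label: round(100 * count / total) for label, count in child_counts.items()}

print(total)   # 363
print(shares)  # {'leaf': 47, 'one child': 35, 'two children': 10, 'three or more': 8}
```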
Children of PROPN nodes are attached using 21 different relations: uk-dep/case (95; 31% instances), uk-dep/punct (76; 25% instances), uk-dep/flat:name (49; 16% instances), uk-dep/conj (17; 6% instances), uk-dep/amod (16; 5% instances), uk-dep/cc (12; 4% instances), uk-dep/appos (9; 3% instances), uk-dep/discourse (7; 2% instances), uk-dep/acl (5; 2% instances), uk-dep/mark (4; 1% instances), uk-dep/advmod (3; 1% instances), uk-dep/compound (2; 1% instances), uk-dep/det (2; 1% instances), uk-dep/nsubj (2; 1% instances), uk-dep/nummod (2; 1% instances), uk-dep/parataxis (2; 1% instances), uk-dep/cop (1; 0% instances), uk-dep/flat (1; 0% instances), uk-dep/nummod:gov (1; 0% instances), uk-dep/obj (1; 0% instances), uk-dep/obl (1; 0% instances)
Children of PROPN nodes belong to 14 different parts of speech: ADP (95; 31% instances), PUNCT (76; 25% instances), PROPN (61; 20% instances), ADJ (14; 5% instances), NOUN (14; 5% instances), CCONJ (12; 4% instances), VERB (11; 4% instances), PART (8; 3% instances), DET (4; 1% instances), NUM (4; 1% instances), SCONJ (4; 1% instances), ADV (2; 1% instances), PRON (2; 1% instances), AUX (1; 0% instances)