PROPN: proper noun
Definition
A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object. Institutional names often contain regular words that belong to other parts of speech, such as the nouns 公司 / gōngsī “company” and 大學 / dàxué “university”. Those words should be segmented as their own tokens and still tagged with their native part of speech; only the proper nouns in such complex names should be tagged PROPN.
Examples
- 孔子 / kǒngzǐ “Confucius”
- 亞洲 / yàzhōu “Asia”
- 威威 木材 公司 / wēiwēi mùcái gōngsī “Weiwei Timber Company”
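As a rough illustration of the segmentation rule in the definition, the third example could be represented as (form, tag) pairs. The NOUN tags on 木材 and 公司 are an assumption drawn from the definition (regular words keep their native part of speech); the `tokens` list and the `proper_nouns` helper are hypothetical names, not part of any UD tool.

```python
# Hypothetical (form, UPOS) pairs for 威威 木材 公司 "Weiwei Timber Company".
# Assumption: 木材 and 公司 are regular nouns, so they keep the NOUN tag.
tokens = [("威威", "PROPN"), ("木材", "NOUN"), ("公司", "NOUN")]


def proper_nouns(tagged):
    """Return the forms that carry the PROPN tag."""
    return [form for form, upos in tagged if upos == "PROPN"]


print(proper_nouns(tokens))
```

Only the name itself is returned; the institutional words stay ordinary nouns.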
Treebank Statistics (UD_Chinese)
There are 4561 PROPN lemmas (22%), 4561 PROPN types (22%) and 9735 PROPN tokens (9%).
Out of 15 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 5 in number of tokens.
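Counts of this kind can be reproduced from a treebank file in CoNLL-U format (ten tab-separated columns, with FORM, LEMMA and UPOS in columns 2 to 4). The following is a minimal sketch, not the script that generated this page. It assumes that a "type" is a distinct word form, and `iter_words`, `count_tag` and `SAMPLE` are hypothetical names; the sample sentences are toy data, not treebank content.

```python
def iter_words(conllu_text):
    """Yield (form, lemma, upos) for each word line of CoNLL-U text."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank separator or comment line
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3]


def count_tag(conllu_text, tag="PROPN"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, forms, tokens = set(), set(), 0
    for form, lemma, upos in iter_words(conllu_text):
        if upos == tag:
            lemmas.add(lemma)
            forms.add(form)
            tokens += 1
    return len(lemmas), len(forms), tokens


# Toy sample, not taken from the treebank.
SAMPLE = (
    "# text = toy example 1\n"
    "1\t中國\t中國\tPROPN\t_\t_\t3\tnmod\t_\t_\n"
    "2\t美國\t美國\tPROPN\t_\t_\t1\tconj\t_\t_\n"
    "3\t公司\t公司\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# text = toy example 2\n"
    "1\t中國\t中國\tPROPN\t_\t_\t0\troot\t_\t_\n"
)

print(count_tag(SAMPLE))  # (2, 2, 3)
```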
The 10 most frequent PROPN lemmas: 中國, 美國, 日本, 香港, 李, 中華, 英國, 美, 台灣, 王
The 10 most frequent PROPN types: 中國, 美國, 日本, 香港, 李, 中華, 英國, 美, 台灣, 王
The 10 most frequent ambiguous lemmas: 美 (PROPN 61, PART 1), 王 (PROPN 55, PART 19, NOUN 5), 日 (NOUN 348, PROPN 50, PART 6, NUM 2), 英 (PROPN 50, NOUN 1), 張 (PROPN 41, NOUN 16), 中 (ADP 344, NOUN 44, PROPN 40, VERB 3, PART 2), 林 (PROPN 23, PART 4, NOUN 1), 港 (PROPN 23, PART 12), 周 (PROPN 21, NOUN 5, PART 1), 清 (PROPN 20, NOUN 2, PART 2)
The 10 most frequent ambiguous types: 美 (PROPN 61, PART 1), 王 (PROPN 55, PART 19, NOUN 5), 日 (NOUN 348, PROPN 50, PART 6, NUM 2), 英 (PROPN 50, NOUN 1), 張 (PROPN 41, NOUN 16), 中 (ADP 344, NOUN 44, PROPN 40, VERB 3, PART 2), 林 (PROPN 23, PART 4, NOUN 1), 港 (PROPN 23, PART 12), 周 (PROPN 21, NOUN 5, PART 1), 清 (PROPN 20, NOUN 2, PART 2)
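The ambiguity lists above pair each lemma with its per-tag counts and appear to be ordered by PROPN frequency. A minimal sketch of that bookkeeping follows, assuming (lemma, tag) pairs have already been extracted from the treebank. `ambiguous_lemmas` is a hypothetical helper name, and the demo rows reuse the counts reported above for 美 and 英 plus one invented unambiguous lemma.

```python
from collections import Counter, defaultdict


def ambiguous_lemmas(rows, tag="PROPN"):
    """Return (lemma, tag counts) for lemmas seen with `tag` and another tag.

    `rows` is an iterable of (lemma, upos) pairs. The result is sorted by
    the count of `tag`, highest first.
    """
    by_lemma = defaultdict(Counter)
    for lemma, upos in rows:
        by_lemma[lemma][upos] += 1
    ambiguous = [
        (lemma, counts)
        for lemma, counts in by_lemma.items()
        if tag in counts and len(counts) > 1
    ]
    return sorted(ambiguous, key=lambda item: -item[1][tag])


# Counts for 美 and 英 come from the list above; 孔子 x3 is invented filler.
rows = (
    [("英", "PROPN")] * 50 + [("英", "NOUN")]
    + [("美", "PROPN")] * 61 + [("美", "PART")]
    + [("孔子", "PROPN")] * 3
)

for lemma, counts in ambiguous_lemmas(rows):
    print(lemma, dict(counts))
```

Lemmas that only ever occur as PROPN, such as the filler entry here, are dropped from the result.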
Morphology
The form / lemma ratio of PROPN is 1.000000 (the average of all parts of speech is 1.000284).
The 1st highest number of forms (1) was observed with the lemma “14572”: 14572.
The 2nd highest number of forms (1) was observed with the lemma “360”: 360.
The 3rd highest number of forms (1) was observed with the lemma “Casey”: Casey.
PROPN does not occur with any features.
Relations
PROPN nodes are attached to their parents using 23 different relations: zh-dep/nmod (4303; 44% instances), zh-dep/case:suff (1793; 18% instances), zh-dep/nsubj (1588; 16% instances), zh-dep/obj (721; 7% instances), zh-dep/det (469; 5% instances), zh-dep/conj (461; 5% instances), zh-dep/obl (200; 2% instances), zh-dep/nsubj:pass (61; 1% instances), zh-dep/appos (33; 0% instances), zh-dep/dep (25; 0% instances), zh-dep/root (25; 0% instances), zh-dep/iobj (13; 0% instances), zh-dep/advmod (12; 0% instances), zh-dep/ccomp (10; 0% instances), zh-dep/nummod (6; 0% instances), zh-dep/nmod:tmod (5; 0% instances), zh-dep/flat:foreign (3; 0% instances), zh-dep/dislocated (2; 0% instances), zh-dep/acl (1; 0% instances), zh-dep/acl:relcl (1; 0% instances), zh-dep/case:pref (1; 0% instances), zh-dep/vocative (1; 0% instances), zh-dep/xcomp (1; 0% instances)
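The percentages in the relation list are each count's share of the 9735 PROPN tokens, rounded to a whole percent, which is why relations with only a handful of instances show 0%. A small worked check is sketched below, using the five largest counts from the list; `percent_share` is a hypothetical helper name.

```python
# Counts copied from the relation list above (five most frequent relations).
relation_counts = {
    "nmod": 4303,
    "case:suff": 1793,
    "nsubj": 1588,
    "obj": 721,
    "det": 469,
}
TOTAL_PROPN_TOKENS = 9735  # token count reported in the treebank statistics


def percent_share(count, total):
    """Share of `count` in `total`, rounded to a whole percent."""
    return round(100 * count / total)


shares = {rel: percent_share(n, TOTAL_PROPN_TOKENS) for rel, n in relation_counts.items()}
print(shares)  # {'nmod': 44, 'case:suff': 18, 'nsubj': 16, 'obj': 7, 'det': 5}
```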
Parents of PROPN nodes belong to 12 different parts of speech: NOUN (2903; 30% instances), PART (2516; 26% instances), VERB (2446; 25% instances), PROPN (1741; 18% instances), ADJ (46; 0% instances), ROOT (25; 0% instances), ADP (22; 0% instances), X (15; 0% instances), NUM (13; 0% instances), PRON (5; 0% instances), ADV (2; 0% instances), SYM (1; 0% instances)
7144 (73%) PROPN nodes are leaves.
1546 (16%) PROPN nodes have one child.
647 (7%) PROPN nodes have two children.
398 (4%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 19.
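The leaf and child-degree figures above can be derived from the HEAD column of a treebank: a node's child degree is the number of tokens in the same sentence whose head is that node. The sketch below assumes sentences have already been parsed into (id, tag, head) tuples. `child_degree_buckets` is a hypothetical helper name and the two sentences are toy data.

```python
from collections import Counter


def child_degree_buckets(sentences, tag="PROPN"):
    """Count nodes with `tag` by number of children: 0, 1, 2 or "3+".

    `sentences` is a list of sentences, each a list of
    (token id, upos, head id) tuples; head 0 marks the root.
    """
    buckets = Counter()
    for sentence in sentences:
        # How many tokens name each id as their head.
        n_children = Counter(head for _id, _upos, head in sentence)
        for token_id, upos, _head in sentence:
            if upos == tag:
                degree = n_children[token_id]
                buckets[degree if degree < 3 else "3+"] += 1
    return buckets


# Toy sentences, not taken from the treebank.
sentences = [
    [(1, "PROPN", 3), (2, "PART", 1), (3, "NOUN", 0)],   # PROPN with one child
    [(1, "PROPN", 2), (2, "VERB", 0), (3, "PROPN", 2)],  # two PROPN leaves
]

buckets = child_degree_buckets(sentences)
print(buckets[0], buckets[1], buckets[2], buckets["3+"])  # 2 1 0 0
```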
Children of PROPN nodes are attached using 26 different relations: zh-dep/nmod (1395; 31% instances), zh-dep/punct (728; 16% instances), zh-dep/appos (572; 13% instances), zh-dep/conj (501; 11% instances), zh-dep/case:dec (468; 10% instances), zh-dep/case (208; 5% instances), zh-dep/cc (171; 4% instances), zh-dep/acl (118; 3% instances), zh-dep/det (69; 2% instances), zh-dep/acl:relcl (54; 1% instances), zh-dep/cop (45; 1% instances), zh-dep/dep (42; 1% instances), zh-dep/nsubj (42; 1% instances), zh-dep/case:pref (40; 1% instances), zh-dep/nummod (14; 0% instances), zh-dep/clf (13; 0% instances), zh-dep/amod (11; 0% instances), zh-dep/dislocated (8; 0% instances), zh-dep/advmod (6; 0% instances), zh-dep/case:suff (5; 0% instances), zh-dep/csubj (5; 0% instances), zh-dep/nmod:tmod (5; 0% instances), zh-dep/flat:foreign (2; 0% instances), zh-dep/mark (2; 0% instances), zh-dep/ccomp (1; 0% instances), zh-dep/mark:relcl (1; 0% instances)
Children of PROPN nodes belong to 15 different parts of speech: PROPN (1741; 38% instances), PUNCT (727; 16% instances), NOUN (716; 16% instances), PART (654; 14% instances), ADP (240; 5% instances), CCONJ (171; 4% instances), VERB (92; 2% instances), X (76; 2% instances), AUX (45; 1% instances), DET (17; 0% instances), NUM (14; 0% instances), ADJ (12; 0% instances), PRON (12; 0% instances), ADV (7; 0% instances), SYM (2; 0% instances)