
This page still pertains to UD version 1.

PROPN: proper noun

Definition

A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object. Names for the inhabitants of a place (such as Les Américains “The Americans”) should be tagged as NOUN, although this is not yet done consistently in the French data.

Examples


Treebank Statistics (UD_French)

There are 16333 PROPN lemmas (46%), 16335 PROPN types (35%) and 29915 PROPN tokens (8%). Out of 17 observed tags, the rank of PROPN is: 1 in number of lemmas, 1 in number of types and 6 in number of tokens.
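As a rough illustration of how such lemma, type and token counts can be obtained, the sketch below counts them for one tag in a CoNLL-U fragment. The sample sentence and the function name `count_tag` are hypothetical, and treating types as case-sensitive word forms is an assumption (it matches the way forms such as COTE and Côte are listed separately further down).

```python
# Hypothetical CoNLL-U fragment: 10 tab-separated columns per token.
SAMPLE = (
    "1\tJean\tJean\tPROPN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tvisite\tvisiter\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tParis\tParis\tPROPN\t_\t_\t2\tobj\t_\t_\n"
    "4\tet\tet\tCCONJ\t_\t_\t5\tcc\t_\t_\n"
    "5\tPARIS\tParis\tPROPN\t_\t_\t3\tconj\t_\t_\n"
)


def count_tag(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column, kept case-sensitive
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


print(count_tag(SAMPLE, "PROPN"))  # (2, 3, 3)
```

In this toy sample, Paris and PARIS count as two types but share one lemma.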

The 10 most frequent PROPN lemmas: France, Paris, Europe, États-Unis, Jean, Maroc, Espagne, la, New, York

The 10 most frequent PROPN types: France, Paris, Europe, États-Unis, Jean, Maroc, Espagne, la, New, York

The 10 most frequent ambiguous lemmas: la (PROPN 3, ADV 1, NOUN 1, DET 1), New (PROPN 62, X 1), York (PROPN 61, X 1), le (DET 42233, PRON 859, PROPN 3), The (PROPN 42, X 1), sud (NOUN 122, ADJ 2, PROPN 1), saint (NOUN 30, PROPN 9, ADJ 8), de (ADP 30816, PROPN 24), ONU (PROPN 29, X 1), El (PROPN 25, X 1)

The 10 most frequent ambiguous types: la (DET 9506, PRON 108, PROPN 3, ADV 1, NOUN 1), New (PROPN 62, ADJ 2, X 1), York (PROPN 61, X 1), Nord (PROPN 51, NOUN 11), le (DET 13505, PRON 280, PROPN 3), The (PROPN 42, DET 42, X 1), sud (NOUN 93, ADJ 2, PROPN 1), Grand (PROPN 36, ADJ 11), saint (PROPN 9, NOUN 8, ADJ 4), de (ADP 25926, DET 419, PROPN 24, ADV 1)
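An ambiguous lemma, as the lists above suggest, is one that occurs with PROPN and with at least one other tag. A minimal sketch of that check, assuming the data has already been reduced to (lemma, tag) pairs, follows. The pairs and the function name `ambiguous_lemmas` are invented for illustration and do not reproduce the treebank counts.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, UPOS) pairs; not taken from the treebank.
PAIRS = [
    ("New", "PROPN"), ("New", "PROPN"), ("New", "X"),
    ("France", "PROPN"),
    ("sud", "NOUN"), ("sud", "PROPN"),
    ("le", "DET"),
]


def ambiguous_lemmas(pairs, upos):
    """Map each lemma seen with `upos` and another tag to its tag counts."""
    by_lemma = defaultdict(Counter)
    for lemma, tag in pairs:
        by_lemma[lemma][tag] += 1
    return {
        lemma: dict(tags.most_common())
        for lemma, tags in by_lemma.items()
        if upos in tags and len(tags) > 1
    }


print(ambiguous_lemmas(PAIRS, "PROPN"))
```

The same function works for ambiguous types if word forms are passed in place of lemmas.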

Morphology

The form / lemma ratio of PROPN is 1.000122 (the average of all parts of speech is 1.305637).
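The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas quoted at the top of this section (16335 and 16333). The short check below assumes that this is how the figure was derived; the function name is hypothetical.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_types / n_lemmas


# Counts quoted above for PROPN in UD_French.
print(f"{form_lemma_ratio(16335, 16333):.6f}")  # 1.000122
```

A value this close to 1 indicates that proper nouns almost never vary in form in this treebank.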

The three lemmas with the highest number of forms each have 2: “Côte” (COTE, Côte), “Ivoire” (IVOIRE, Ivoire) and “Jésus-Christ” (J.-C., Jésus-Christ).

PROPN occurs with 2 features: fr-feat/Gender (3; 0% instances), fr-feat/Number (3; 0% instances)

PROPN occurs with 3 feature-value pairs: Gender=Fem, Gender=Masc, Number=Sing

PROPN occurs with 3 feature combinations. The most frequent feature combination is _ (29912 tokens). Examples: France, Paris, Europe, États-Unis, Jean, Maroc, Espagne, la, New, York

Relations

PROPN nodes are attached to their parents using 27 different relations: fr-dep/nmod (8362; 28% instances), fr-dep/flat:name (6587; 22% instances), fr-dep/obl (3731; 12% instances), fr-dep/appos (3431; 11% instances), fr-dep/nsubj (3270; 11% instances), fr-dep/conj (2652; 9% instances), fr-dep/obj (674; 2% instances), fr-dep/amod (244; 1% instances), fr-dep/nsubj:pass (237; 1% instances), fr-dep/xcomp (180; 1% instances), fr-dep/root (172; 1% instances), fr-dep/det (149; 0% instances), fr-dep/compound (64; 0% instances), fr-dep/case (49; 0% instances), fr-dep/nummod (44; 0% instances), fr-dep/dep (21; 0% instances), fr-dep/nmod:poss (9; 0% instances), fr-dep/acl:relcl (8; 0% instances), fr-dep/parataxis (6; 0% instances), fr-dep/advmod (5; 0% instances), fr-dep/vocative (5; 0% instances), fr-dep/acl (4; 0% instances), fr-dep/advcl (4; 0% instances), fr-dep/ccomp (3; 0% instances), fr-dep/dislocated (2; 0% instances), fr-dep/cc (1; 0% instances), fr-dep/fixed (1; 0% instances)
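The percentages in this list appear to be each relation's count divided by the total number of PROPN tokens (29915), rounded to a whole percent. The sketch below reproduces the first three figures under that assumption; the helper name `shares` is hypothetical.

```python
# Counts of the three most frequent relations, as quoted above.
PARENT_RELATIONS = {"nmod": 8362, "flat:name": 6587, "obl": 3731}
TOTAL_PROPN_TOKENS = 29915


def shares(counts, total):
    """Return each count as a whole-number percentage of `total`."""
    return {rel: round(100 * n / total) for rel, n in counts.items()}


print(shares(PARENT_RELATIONS, TOTAL_PROPN_TOKENS))
# {'nmod': 28, 'flat:name': 22, 'obl': 12}
```

Relations shown as 0% are therefore not absent, only below half a percent of all PROPN tokens.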

Parents of PROPN nodes belong to 15 different parts of speech: NOUN (11331; 38% instances), PROPN (11059; 37% instances), VERB (6807; 23% instances), ADJ (258; 1% instances), ROOT (172; 1% instances), PRON (167; 1% instances), NUM (50; 0% instances), X (27; 0% instances), SYM (13; 0% instances), ADV (11; 0% instances), ADP (9; 0% instances), INTJ (4; 0% instances), PUNCT (4; 0% instances), DET (2; 0% instances), AUX (1; 0% instances)

9769 (33%) PROPN nodes are leaves.

8624 (29%) PROPN nodes have one child.

6677 (22%) PROPN nodes have two children.

4845 (16%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 34.
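The child degree of a node is the number of tokens whose HEAD column points to it. The following sketch shows one way to bucket PROPN nodes into leaves, one child, two children, and three or more. The sample sentence and the function name are hypothetical, and the input is assumed to be well-formed CoNLL-U with sentences separated by blank lines.

```python
from collections import Counter

# Hypothetical sentence: "Jean Dupont visite Paris ."
SAMPLE = "\n".join([
    "1\tJean\tJean\tPROPN\t_\t_\t3\tnsubj\t_\t_",
    "2\tDupont\tDupont\tPROPN\t_\t_\t1\tflat:name\t_\t_",
    "3\tvisite\tvisiter\tVERB\t_\t_\t0\troot\t_\t_",
    "4\tParis\tParis\tPROPN\t_\t_\t3\tobj\t_\t_",
    "5\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
])


def child_degree_histogram(conllu_text, upos):
    """Count nodes tagged `upos` by number of children (3 means 3 or more)."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in block.splitlines()
            if line and not line.startswith("#")
        ]
        # Keep only ordinary tokens (integer IDs).
        rows = [r for r in rows if r[0].isdigit()]
        # HEAD is column 7; count how often each ID is used as a head.
        n_children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:
                hist[min(n_children[r[0]], 3)] += 1
    return dict(hist)


print(child_degree_histogram(SAMPLE, "PROPN"))
```

Here Jean has one child (Dupont), while Dupont and Paris are leaves.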

Children of PROPN nodes are attached using 33 different relations: fr-dep/case (12494; 31% instances), fr-dep/flat:name (6766; 17% instances), fr-dep/det (5831; 14% instances), fr-dep/punct (4644; 11% instances), fr-dep/conj (2733; 7% instances), fr-dep/nmod (2188; 5% instances), fr-dep/cc (1484; 4% instances), fr-dep/appos (1353; 3% instances), fr-dep/amod (780; 2% instances), fr-dep/acl (513; 1% instances), fr-dep/acl:relcl (416; 1% instances), fr-dep/nummod (404; 1% instances), fr-dep/compound (178; 0% instances), fr-dep/advmod (167; 0% instances), fr-dep/cop (139; 0% instances), fr-dep/nsubj (138; 0% instances), fr-dep/nmod:poss (44; 0% instances), fr-dep/obl (40; 0% instances), fr-dep/expl (25; 0% instances), fr-dep/dep (19; 0% instances), fr-dep/mark (11; 0% instances), fr-dep/advcl (9; 0% instances), fr-dep/obj (7; 0% instances), fr-dep/parataxis (7; 0% instances), fr-dep/ccomp (5; 0% instances), fr-dep/fixed (3; 0% instances), fr-dep/aux (2; 0% instances), fr-dep/xcomp (2; 0% instances), fr-dep/discourse (1; 0% instances), fr-dep/dislocated (1; 0% instances), fr-dep/nsubj:pass (1; 0% instances), fr-dep/orphan (1; 0% instances), fr-dep/reparandum (1; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: ADP (12505; 31% instances), PROPN (11059; 27% instances), DET (5792; 14% instances), PUNCT (4684; 12% instances), NOUN (1907; 5% instances), CCONJ (1392; 3% instances), VERB (948; 2% instances), NUM (679; 2% instances), ADJ (641; 2% instances), ADV (228; 1% instances), PRON (152; 0% instances), X (146; 0% instances), AUX (141; 0% instances), SCONJ (61; 0% instances), SYM (47; 0% instances), PART (23; 0% instances), INTJ (2; 0% instances)


Treebank Statistics (UD_French-ParTUT)

There are 146 PROPN lemmas (6%), 146 PROPN types (4%) and 308 PROPN tokens (2%). Out of 16 observed tags, the rank of PROPN is: 5 in number of lemmas, 5 in number of types and 11 in number of tokens.

The 10 most frequent PROPN lemmas: Facebook, Pericles, Europe, CE, CEE, Koch, Commons, Creative, Galles, Européenne

The 10 most frequent PROPN types: Facebook, Pericles, Europe, CE, CEE, Koch, Commons, Creative, Galles, Européenne

The 10 most frequent ambiguous lemmas: CE (PROPN 9, PRON 4), PME (NOUN 1, PROPN 1)

The 10 most frequent ambiguous types: CE (PROPN 10, PRON 4), Conseil (NOUN 3, PROPN 1), Contrat (NOUN 2, PROPN 1), De (ADP 4, PROPN 1), En (ADP 17, PROPN 1), PME (NOUN 1, PROPN 1), Partage (NOUN 1, PROPN 1), États (NOUN 2, PROPN 1)

Morphology

The form / lemma ratio of PROPN is 1.000000 (the average of all parts of speech is 1.316828).

The lemma with the highest number of forms (2) is “De”: DE, De. The next two lemmas listed, “A5-0104” and “ADN”, each have a single form.

PROPN does not occur with any features.

Relations

PROPN nodes are attached to their parents using 14 different relations: fr-dep/nmod (148; 48% instances), fr-dep/flat (41; 13% instances), fr-dep/flat:name (34; 11% instances), fr-dep/obl (25; 8% instances), fr-dep/nsubj (22; 7% instances), fr-dep/conj (15; 5% instances), fr-dep/obj (6; 2% instances), fr-dep/appos (5; 2% instances), fr-dep/xcomp (4; 1% instances), fr-dep/nsubj:pass (3; 1% instances), fr-dep/root (2; 1% instances), fr-dep/amod (1; 0% instances), fr-dep/case (1; 0% instances), fr-dep/dislocated (1; 0% instances)

Parents of PROPN nodes belong to 9 different parts of speech: NOUN (175; 57% instances), PROPN (53; 17% instances), VERB (50; 16% instances), NUM (16; 5% instances), ADJ (6; 2% instances), PRON (4; 1% instances), ROOT (2; 1% instances), ADV (1; 0% instances), PUNCT (1; 0% instances)

142 (46%) PROPN nodes are leaves.

61 (20%) PROPN nodes have one child.

51 (17%) PROPN nodes have two children.

54 (18%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 12.

Children of PROPN nodes are attached using 16 different relations: fr-dep/case (114; 32% instances), fr-dep/det (69; 20% instances), fr-dep/punct (47; 13% instances), fr-dep/flat:name (34; 10% instances), fr-dep/conj (29; 8% instances), fr-dep/nmod (21; 6% instances), fr-dep/cc (13; 4% instances), fr-dep/nummod (10; 3% instances), fr-dep/amod (4; 1% instances), fr-dep/acl:relcl (3; 1% instances), fr-dep/acl (2; 1% instances), fr-dep/advmod (2; 1% instances), fr-dep/appos (2; 1% instances), fr-dep/compound (1; 0% instances), fr-dep/fixed (1; 0% instances), fr-dep/nmod:poss (1; 0% instances)

Children of PROPN nodes belong to 11 different parts of speech: ADP (114; 32% instances), DET (70; 20% instances), PROPN (53; 15% instances), PUNCT (47; 13% instances), NOUN (33; 9% instances), CCONJ (13; 4% instances), NUM (10; 3% instances), ADJ (4; 1% instances), VERB (4; 1% instances), PRON (3; 1% instances), ADV (2; 1% instances)


Treebank Statistics (UD_French-Sequoia)

There are 1053 PROPN lemmas (16%), 1060 PROPN types (12%) and 3025 PROPN tokens (5%). Out of 16 observed tags, the rank of PROPN is: 2 in number of lemmas, 4 in number of types and 7 in number of tokens.

The 10 most frequent PROPN lemmas: Aclasta, Angiox, Paris, commission, Jacques, Parlement, France, Union, RPR, Taïwan

The 10 most frequent PROPN types: Aclasta, Angiox, paris, commission, Jacques, Parlement, France, Union, RPR, Taïwan

The 10 most frequent ambiguous lemmas: commission (NOUN 55, PROPN 43), conseil (PROPN 27, NOUN 23), Reuters (PROPN 13, X 1), UE (PROPN 7, NOUN 1), société (NOUN 21, PROPN 1), banque (NOUN 8, PROPN 5), van (PROPN 1, X 1), Koweit (PROPN 4, NOUN 1), république (NOUN 11, PROPN 1), Beirut (PROPN 3, X 1)

The 10 most frequent ambiguous types: commission (NOUN 31, PROPN 1), conseil (NOUN 17, PROPN 1), Reuters (PROPN 13, X 1), Parti (PROPN 7, NOUN 3, VERB 1), UE (PROPN 7, NOUN 1), société (NOUN 15, PROPN 1), Guerre (NOUN 8, PROPN 5), Ligue (PROPN 5, NOUN 1), van (X 1, PROPN 1), Koweit (PROPN 4, NOUN 1)

Morphology

The form / lemma ratio of PROPN is 1.006648 (the average of all parts of speech is 1.389679).

The three lemmas with the highest number of forms each have 2: “Aclasta” (ACLASTA, Aclasta), “Angiox” (ANGIOX, Angiox) and “Courcelles-lès-Montbéliard” (COURCELLES-LES-MONTBELIARD, Courcelles-lès-Montbéliard).

PROPN occurs with 2 features: fr-feat/Number (1508; 50% instances), fr-feat/Gender (1369; 45% instances)

PROPN occurs with 4 feature-value pairs: Gender=Fem, Gender=Masc, Number=Plur, Number=Sing

PROPN occurs with 8 feature combinations. The most frequent feature combination is _ (1516 tokens). Examples: Aclasta, Angiox, RPR, Halphen, Jean-Claude, Thomson, Méry, Éric, Dumas, Schuller

Relations

PROPN nodes are attached to their parents using 16 different relations: fr-dep/nmod (1110; 37% instances), fr-dep/flat:name (752; 25% instances), fr-dep/obl (320; 11% instances), fr-dep/nsubj (262; 9% instances), fr-dep/conj (209; 7% instances), fr-dep/root (110; 4% instances), fr-dep/appos (83; 3% instances), fr-dep/obj (74; 2% instances), fr-dep/nsubj:pass (68; 2% instances), fr-dep/dep (19; 1% instances), fr-dep/dislocated (6; 0% instances), fr-dep/acl (5; 0% instances), fr-dep/orphan (3; 0% instances), fr-dep/acl:relcl (2; 0% instances), fr-dep/ccomp (1; 0% instances), fr-dep/det (1; 0% instances)

Parents of PROPN nodes belong to 8 different parts of speech: NOUN (1344; 44% instances), PROPN (842; 28% instances), VERB (671; 22% instances), ROOT (110; 4% instances), ADJ (36; 1% instances), PRON (16; 1% instances), X (4; 0% instances), NUM (2; 0% instances)

1050 (35%) PROPN nodes are leaves.

900 (30%) PROPN nodes have one child.

593 (20%) PROPN nodes have two children.

482 (16%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 13.

Children of PROPN nodes are attached using 22 different relations: fr-dep/case (1169; 29% instances), fr-dep/det (612; 15% instances), fr-dep/punct (603; 15% instances), fr-dep/flat:name (571; 14% instances), fr-dep/conj (290; 7% instances), fr-dep/nmod (209; 5% instances), fr-dep/amod (136; 3% instances), fr-dep/appos (116; 3% instances), fr-dep/cc (109; 3% instances), fr-dep/acl (60; 1% instances), fr-dep/dep (57; 1% instances), fr-dep/nummod (35; 1% instances), fr-dep/acl:relcl (31; 1% instances), fr-dep/advmod (15; 0% instances), fr-dep/cop (11; 0% instances), fr-dep/mark (11; 0% instances), fr-dep/orphan (10; 0% instances), fr-dep/nsubj (5; 0% instances), fr-dep/nmod:poss (2; 0% instances), fr-dep/expl (1; 0% instances), fr-dep/obl (1; 0% instances), fr-dep/vocative (1; 0% instances)

Children of PROPN nodes belong to 14 different parts of speech: ADP (1175; 29% instances), PROPN (842; 21% instances), DET (614; 15% instances), PUNCT (605; 15% instances), NOUN (352; 9% instances), ADJ (138; 3% instances), CCONJ (107; 3% instances), VERB (85; 2% instances), X (47; 1% instances), NUM (39; 1% instances), ADV (15; 0% instances), PRON (15; 0% instances), AUX (11; 0% instances), SCONJ (10; 0% instances)

