
This page still pertains to UD version 1.

DET: determiner

Definition

We follow the definition for DET proposed in the universal scheme.

Note, however, that numerals are currently not annotated consistently as NUM; some are tagged DET instead.

For demonstratives such as ce …-là and ce …-ci (as in cet homme-ci, cette femme-là “this man, that woman”), the first part of the determiner is annotated as DET, and the clitics ci and là, which are split from the noun, are tagged PART.
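The split described for demonstratives can be pictured as a short sequence of (form, tag) pairs. The following is only an illustrative sketch: the token boundaries (in particular keeping the hyphen on the clitic) and the helper name `demonstrative_parts` are assumptions, not taken from the treebank.

```python
# Hypothetical tagging of "cet homme-ci"; the exact token boundaries
# (e.g. whether the hyphen stays with the clitic) are an assumption.
tokens = [
    ("cet", "DET"),     # first part of the demonstrative
    ("homme", "NOUN"),
    ("-ci", "PART"),    # clitic split from the noun
]

def demonstrative_parts(tagged):
    """Return the (determiner, clitic) forms of a DET ... PART sequence."""
    det = next(form for form, upos in tagged if upos == "DET")
    clitic = next(form for form, upos in tagged if upos == "PART")
    return det, clitic

print(demonstrative_parts(tokens))
```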

Examples


Treebank Statistics (UD_French)

There are 34 DET lemmas (0%), 81 DET types (0%) and 60293 DET tokens (15%). Out of 17 observed tags, the rank of DET is: 12 in number of lemmas, 11 in number of types and 3 in number of tokens.

The 10 most frequent DET lemmas: le, un, son, ce, tout, plusieurs, quelque, certain, chaque, aucun

The 10 most frequent DET types: le, la, les, l’, un, une, des, son, sa, cette

The 10 most frequent ambiguous lemmas: le (DET 42233, PRON 859, PROPN 3), un (DET 9910, PRON 311, NUM 57, NOUN 2, PROPN 1), son (DET 4250, NOUN 19), ce (DET 2160, PRON 933, SCONJ 2, X 1), tout (DET 565, PRON 139, ADV 135, NOUN 17, ADJ 5), plusieurs (DET 247, PRON 12), quelque (DET 175, ADV 3), certain (DET 162, PRON 48, ADJ 30), aucun (DET 105, PRON 15), du (DET 76, PROPN 5)

The 10 most frequent ambiguous types: le (DET 13505, PRON 280, PROPN 3), la (DET 9506, PRON 108, PROPN 3, ADV 1, NOUN 1), les (DET 8551, PRON 128), l’ (DET 6061, PRON 259, PART 149, PROPN 2), un (DET 3842, PRON 182, NUM 57, PROPN 1), une (DET 3327, PRON 110, NUM 56, NOUN 3), des (DET 1668, ADP 1), son (DET 1361, NOUN 16, AUX 3), ce (DET 521, PRON 315, SCONJ 2, X 1), de (ADP 25926, DET 419, PROPN 24, ADV 1)
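Ambiguity lists of this kind can be approximated by counting, for each lemma, how often it occurs with each tag, and keeping the lemmas that occur with DET and with at least one other tag. The sketch below assumes the input has already been reduced to (lemma, tag) pairs; the function name `ambiguous_lemmas` and the toy data are hypothetical, and the script that generated this page may work differently.

```python
from collections import Counter, defaultdict

def ambiguous_lemmas(tagged_lemmas, upos="DET"):
    """Map each lemma seen with `upos` and another tag to its tag counts."""
    counts = defaultdict(Counter)
    for lemma, tag in tagged_lemmas:
        counts[lemma][tag] += 1
    return {
        lemma: tags
        for lemma, tags in counts.items()
        if upos in tags and len(tags) > 1
    }

# Toy data, not drawn from the treebank.
pairs = [
    ("le", "DET"), ("le", "DET"), ("le", "PRON"),
    ("chaque", "DET"),
    ("tout", "ADV"), ("tout", "DET"),
]
result = ambiguous_lemmas(pairs)
for lemma, tags in result.items():
    print(lemma, dict(tags))
```

The same function applies to word forms instead of lemmas if the pairs are built from the form column.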

Morphology

The form / lemma ratio of DET is 2.382353 (the average of all parts of speech is 1.305637).
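The reported ratio matches the number of DET types divided by the number of DET lemmas (81 / 34 ≈ 2.382353). A minimal sketch of that computation over CoNLL-U text follows; it assumes case-sensitive forms and the standard ten-column layout, and the function name `form_lemma_ratio` and the sample rows are hypothetical.

```python
def form_lemma_ratio(conllu_text, upos="DET"):
    """Distinct forms divided by distinct lemmas for one UPOS tag."""
    forms, lemmas = set(), set()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2), empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            forms.add(cols[1])
            lemmas.add(cols[2])
    return len(forms) / len(lemmas) if lemmas else 0.0

# Toy rows (form, lemma, tag), not drawn from the treebank.
rows = [("le", "le", "DET"), ("la", "le", "DET"), ("les", "le", "DET"),
        ("un", "un", "DET"), ("chat", "chat", "NOUN")]
sample = "\n".join(
    "\t".join([str(i), form, lemma, tag, "_", "_", "0", "dep", "_", "_"])
    for i, (form, lemma, tag) in enumerate(rows, start=1)
)

print(form_lemma_ratio(sample))   # 4 forms over 2 lemmas
print(round(81 / 34, 6))          # ratio from the counts reported above
```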

The 1st highest number of forms (16) was observed with the lemma “son”: leur, leurs, ma, mes, mon, nos, notre, sa, se, ses, son, sont, tes, ton, vos, votre.

The 2nd highest number of forms (6) was observed with the lemma “le”: l, l’, la, le, les, là.

The 3rd highest number of forms (6) was observed with the lemma “un”: d’, de, des, in, un, une.

DET occurs with 5 features: fr-feat/Number (60153; 100% instances), fr-feat/PronType (58663; 97% instances), fr-feat/Gender (57159; 95% instances), fr-feat/Definite (52226; 87% instances), fr-feat/Polarity (107; 0% instances)

DET occurs with 11 feature-value pairs: Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Polarity=Neg, PronType=Art, PronType=Dem, PronType=Neg, PronType=Prs

DET occurs with 46 feature combinations. The most frequent feature combination is Definite=Def|Gender=Masc|Number=Sing|PronType=Art (17533 tokens). Examples: le, l’, l

Relations

DET nodes are attached to their parents using 15 different relations: fr-dep/det (55664; 92% instances), fr-dep/nmod:poss (4253; 7% instances), fr-dep/fixed (250; 0% instances), fr-dep/flat:name (49; 0% instances), fr-dep/advmod (44; 0% instances), fr-dep/compound (9; 0% instances), fr-dep/conj (9; 0% instances), fr-dep/dep (5; 0% instances), fr-dep/case (3; 0% instances), fr-dep/nsubj (2; 0% instances), fr-dep/amod (1; 0% instances), fr-dep/mark (1; 0% instances), fr-dep/obj (1; 0% instances), fr-dep/obl (1; 0% instances), fr-dep/root (1; 0% instances)

Parents of DET nodes belong to 13 different parts of speech: NOUN (53367; 89% instances), PROPN (5792; 10% instances), ADJ (354; 1% instances), ADP (222; 0% instances), PRON (182; 0% instances), NUM (121; 0% instances), ADV (98; 0% instances), X (72; 0% instances), VERB (52; 0% instances), SYM (19; 0% instances), DET (12; 0% instances), PART (1; 0% instances), ROOT (1; 0% instances)

60184 (100%) DET nodes are leaves.

87 (0%) DET nodes have one child.

18 (0%) DET nodes have two children.

4 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 5.
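The leaf and child counts above amount to a histogram of how many dependents each DET node has. A minimal sketch, assuming each sentence is available as (id, head, tag) triples; the function name `child_degree_histogram` and the toy tree are hypothetical.

```python
from collections import Counter

def child_degree_histogram(sentence, upos="DET"):
    """Map number of children -> number of `upos` nodes with that many."""
    n_children = Counter(head for _, head, _ in sentence)
    return Counter(
        n_children[tok_id] for tok_id, _, tag in sentence if tag == upos
    )

# Toy tree: token 1 (DET) has one dependent, token 4 (DET) is a leaf.
sentence = [
    (1, 3, "DET"),
    (2, 1, "ADV"),
    (3, 0, "NOUN"),
    (4, 5, "DET"),
    (5, 3, "NOUN"),
]
hist = child_degree_histogram(sentence)
print(dict(hist))
print(max(hist))  # highest child degree among DET nodes
```

Summing such histograms over all sentences of a treebank gives the number of leaves (degree 0), the nodes with one or two children, and the highest degree.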

Children of DET nodes are attached using 16 different relations: fr-dep/fixed (75; 55% instances), fr-dep/punct (16; 12% instances), fr-dep/advmod (10; 7% instances), fr-dep/case (8; 6% instances), fr-dep/cc (8; 6% instances), fr-dep/conj (8; 6% instances), fr-dep/mark (2; 1% instances), fr-dep/nmod (2; 1% instances), fr-dep/acl (1; 1% instances), fr-dep/advcl (1; 1% instances), fr-dep/amod (1; 1% instances), fr-dep/appos (1; 1% instances), fr-dep/cop (1; 1% instances), fr-dep/goeswith (1; 1% instances), fr-dep/nsubj (1; 1% instances), fr-dep/reparandum (1; 1% instances)

Children of DET nodes belong to 13 different parts of speech: ADV (50; 36% instances), ADP (24; 18% instances), NOUN (17; 12% instances), PUNCT (16; 12% instances), DET (12; 9% instances), CCONJ (8; 6% instances), ADJ (2; 1% instances), PROPN (2; 1% instances), VERB (2; 1% instances), AUX (1; 1% instances), PRON (1; 1% instances), SCONJ (1; 1% instances), SYM (1; 1% instances)


Treebank Statistics (UD_French-ParTUT)

There are 28 DET lemmas (1%), 49 DET types (1%) and 2984 DET tokens (17%). Out of 16 observed tags, the rank of DET is: 9 in number of lemmas, 9 in number of types and 2 in number of tokens.

The 10 most frequent DET lemmas: le, un, ce, tout, son, leur, mon, votre, aucun, certain

The 10 most frequent DET types: les, le, la, l’, une, un, des, cette, ce, ces

The 10 most frequent ambiguous lemmas: le (DET 2093, PRON 36), un (DET 375, PRON 7, NUM 1), ce (DET 144, PRON 86), tout (DET 80, ADV 15, PRON 8), leur (DET 34, PRON 2), aucun (DET 17, PRON 1), certain (DET 17, ADJ 2, PRON 1), de (ADP 1530, DET 15, SCONJ 1), autre (ADJ 18, DET 10, PRON 4), chaque (DET 5, NOUN 1, ADJ 1)

The 10 most frequent ambiguous types: les (DET 568, PRON 4), le (DET 574, PRON 15), la (DET 498, PRON 2), l’ (DET 330, PRON 15), une (DET 136, PRON 4), un (DET 131, PRON 1, NUM 1), ce (PRON 49, DET 45), leur (DET 21, PRON 4), tous (DET 19, PRON 5), de (ADP 1234, DET 15, SCONJ 1)

Morphology

The form / lemma ratio of DET is 1.750000 (the average of all parts of speech is 1.316828).

The 1st highest number of forms (5) was observed with the lemma “le”: Des, l’, la, le, les.

The 2nd highest number of forms (5) was observed with the lemma “un”: de, des, du, un, une.

The 3rd highest number of forms (4) was observed with the lemma “ce”: ce, ces, cet, cette.

DET occurs with 5 features: fr-feat/PronType (2984; 100% instances), fr-feat/Number (2976; 100% instances), fr-feat/Definite (2478; 83% instances), fr-feat/Gender (1740; 58% instances), fr-feat/Person (2; 0% instances)

DET occurs with 13 feature-value pairs: Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Person=3, PronType=Art, PronType=Dem, PronType=Ind, PronType=Int, PronType=Prs, PronType=Tot

DET occurs with 43 feature combinations. The most frequent feature combination is Definite=Def|Number=Plur|PronType=Art (614 tokens). Examples: les, des

Relations

DET nodes are attached to their parents using 10 different relations: fr-dep/det (2781; 93% instances), fr-dep/nmod:poss (182; 6% instances), fr-dep/fixed (5; 0% instances), fr-dep/nsubj (5; 0% instances), fr-dep/obj (3; 0% instances), fr-dep/case (2; 0% instances), fr-dep/nmod (2; 0% instances), fr-dep/obl (2; 0% instances), fr-dep/csubj (1; 0% instances), fr-dep/expl (1; 0% instances)

Parents of DET nodes belong to 9 different parts of speech: NOUN (2827; 95% instances), PROPN (70; 2% instances), ADJ (25; 1% instances), ADV (25; 1% instances), VERB (18; 1% instances), PRON (12; 0% instances), DET (4; 0% instances), ADP (2; 0% instances), X (1; 0% instances)

2972 (100%) DET nodes are leaves.

6 (0%) DET nodes have one child.

2 (0%) DET nodes have two children.

4 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 3.

Children of DET nodes are attached using 9 different relations: fr-dep/case (6; 27% instances), fr-dep/nmod (6; 27% instances), fr-dep/det (4; 18% instances), fr-dep/advmod (1; 5% instances), fr-dep/conj (1; 5% instances), fr-dep/cop (1; 5% instances), fr-dep/mark (1; 5% instances), fr-dep/nsubj (1; 5% instances), fr-dep/punct (1; 5% instances)

Children of DET nodes belong to 8 different parts of speech: ADP (6; 27% instances), NOUN (6; 27% instances), DET (4; 18% instances), PRON (2; 9% instances), ADV (1; 5% instances), AUX (1; 5% instances), PUNCT (1; 5% instances), SCONJ (1; 5% instances)


Treebank Statistics (UD_French-Sequoia)

There are 41 DET lemmas (1%), 72 DET types (1%) and 8965 DET tokens (15%). Out of 16 observed tags, the rank of DET is: 11 in number of lemmas, 10 in number of types and 3 in number of tokens.

The 10 most frequent DET lemmas: le, un, son, ce, quelque, aucun, tout, plusieurs, quel, certain

The 10 most frequent DET types: le, les, la, l’, une, un, des, cette, ce, son

The 10 most frequent ambiguous lemmas: le (DET 6379, PRON 86), un (DET 1513, PRON 37), ce (DET 383, PRON 128), quelque (DET 36, ADV 3, ADJ 1), aucun (DET 35, PRON 2), tout (ADJ 85, DET 32, ADV 28, PRON 17), plusieurs (DET 31, PRON 3), quel (DET 18, ADJ 13), certain (DET 17, PRON 7, ADJ 6), différent (ADJ 8, DET 4)

The 10 most frequent ambiguous types: le (DET 1756, PRON 30), les (DET 1525, PRON 16), la (DET 1390, PRON 13), l’ (DET 1069, PRON 27), une (DET 537, PRON 4), un (DET 467, PRON 33), des (DET 299, ADP 1), ce (DET 101, PRON 60), de (ADP 3991, DET 55), d’ (ADP 878, DET 43)

Morphology

The form / lemma ratio of DET is 1.756098 (the average of all parts of speech is 1.389679).

The 1st highest number of forms (12) was observed with the lemma “son”: leur, leurs, ma, mes, mon, nos, notre, sa, ses, son, vos, votre.

The 2nd highest number of forms (5) was observed with the lemma “un”: d’, de, des, un, une.

The 3rd highest number of forms (4) was observed with the lemma “ce”: ce, ces, cet, cette.

DET occurs with 5 features: fr-feat/Number (8923; 100% instances), fr-feat/PronType (8343; 93% instances), fr-feat/Definite (7947; 89% instances), fr-feat/Gender (5091; 57% instances), fr-feat/Poss (440; 5% instances)

DET occurs with 10 feature-value pairs: Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Poss=Yes, PronType=Art, PronType=Dem, PronType=Int

DET occurs with 27 feature combinations. The most frequent feature combination is Definite=Def|Gender=Masc|Number=Sing|PronType=Art (2003 tokens). Examples: le, les, l’

Relations

DET nodes are attached to their parents using 10 different relations: fr-dep/det (8282; 92% instances), fr-dep/nmod:poss (440; 5% instances), fr-dep/fixed (201; 2% instances), fr-dep/nsubj (14; 0% instances), fr-dep/dep (9; 0% instances), fr-dep/advmod (5; 0% instances), fr-dep/conj (5; 0% instances), fr-dep/obj (5; 0% instances), fr-dep/nmod (2; 0% instances), fr-dep/obl (2; 0% instances)

Parents of DET nodes belong to 10 different parts of speech: NOUN (8031; 90% instances), PROPN (614; 7% instances), ADP (198; 2% instances), ADJ (48; 1% instances), PRON (32; 0% instances), VERB (22; 0% instances), DET (9; 0% instances), ADV (7; 0% instances), NUM (3; 0% instances), X (1; 0% instances)

8886 (99%) DET nodes are leaves.

62 (1%) DET nodes have one child.

13 (0%) DET nodes have two children.

4 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 4.

Children of DET nodes are attached using 11 different relations: fr-dep/fixed (57; 56% instances), fr-dep/case (13; 13% instances), fr-dep/nmod (8; 8% instances), fr-dep/dep (6; 6% instances), fr-dep/cc (5; 5% instances), fr-dep/conj (5; 5% instances), fr-dep/acl (2; 2% instances), fr-dep/advmod (2; 2% instances), fr-dep/advcl (1; 1% instances), fr-dep/appos (1; 1% instances), fr-dep/punct (1; 1% instances)

Children of DET nodes belong to 9 different parts of speech: ADJ (32; 32% instances), NOUN (20; 20% instances), ADP (13; 13% instances), PRON (12; 12% instances), DET (9; 9% instances), ADV (6; 6% instances), CCONJ (5; 5% instances), NUM (3; 3% instances), PUNCT (1; 1% instances)


DET in other languages: [am] [ar] [bg] [bxr] [ca] [ckb] [cop] [cs] [cu] [da] [de] [el] [en] [es] [et] [eu] [fa] [fi] [fo] [fr] [ga] [gl] [got] [grc] [he] [hi] [hr] [hu] [id] [it] [ja] [kk] [kmr] [ko] [la] [lv] [mr] [nl] [no] [pl] [pt] [ro] [ru] [sa] [sk] [sla] [sl] [so] [sr] [sv] [swl] [ta] [tr] [ug] [uk] [u] [urj] [ur] [vi] [yue] [zh]