PART: particle
Definition
Particles are function words that must be associated with another word or phrase to impart meaning and that do not satisfy definitions of other universal parts of speech (e.g. adpositions, coordinating conjunctions, subordinating conjunctions or auxiliary verbs). Czech particles are not inflected.
Note that response words such as ano, jo “yes” and ne “no” are considered particles in the PDT tagset, but they should be retagged as interjections under the UD standard. Also note that ne has two uses: one corresponds to English “no”, the other to “not”. Only the former should become an interjection; the latter stays a particle.
Examples
- Sentence modality: ať, kéž, nechť (“Let’s do it!” “If only I could do it over.” “May you have an enjoyable stay!”)
- jen “just, only”
- až “only, as late as, even, up to” Use case: až po stovky tisíc let “up to hundreds of thousands of years”
- asi “about, roughly, maybe”
Diffs
Prague Dependency Treebank
- li “if”: This is an enclitic morpheme that functions as a subordinating conjunction, but it always immediately follows the predicate of the subordinate clause. For example: Nebude-li pršet, nezmoknem. lit. Will-not-if rain, we-will-not-get-wet. “We will not get wet if it does not rain.” PDT tags the li morpheme as a particle, and the UD conversion currently keeps that tag, but it might be changed to SCONJ in future releases.
- At present the UD conversion of PDT keeps the PDT convention of tagging the response words (“yes, no”) as particles. Automatic conversion would not be straightforward because ne is sometimes used as a response particle/interjection (English “no”) and sometimes as a free negative morpheme (English “not”). These two usages would have to be distinguished and only the first one converted to an interjection.
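To make the difficulty concrete, here is a minimal sketch of how the two uses of ne might be told apart. The positional rule (a response word opens the sentence and is followed by punctuation or by nothing) and the name `retag_responses` are assumptions made for illustration; this is not the rule used by the PDT-to-UD conversion, and it would miss response words that occur later in a sentence.

```python
# Hypothetical helper: the positional rule below is an illustrative assumption,
# not the rule used in any UD conversion.
RESPONSE_WORDS = {"ano", "jo", "ne"}
PUNCT = {",", ".", "!", "?"}

def retag_responses(tokens):
    """Return a copy of (form, upos) pairs with response particles retagged as INTJ.

    A PART token counts as a response word only if it opens the sentence
    and is followed by punctuation or by nothing at all.
    """
    result = []
    for i, (form, upos) in enumerate(tokens):
        is_candidate = upos == "PART" and form.lower() in RESPONSE_WORDS
        stands_alone = i == 0 and (i + 1 == len(tokens) or tokens[i + 1][0] in PUNCT)
        result.append((form, "INTJ" if is_candidate and stands_alone else upos))
    return result

# Toy sentences written for this sketch, not taken from a treebank.
answer = [("Ne", "PART"), (",", "PUNCT"), ("děkuji", "VERB"), (".", "PUNCT")]
negation = [("Ne", "PART"), ("vždy", "ADV"), ("prší", "VERB"), (".", "PUNCT")]

print(retag_responses(answer)[0])    # ('Ne', 'INTJ')
print(retag_responses(negation)[0])  # ('Ne', 'PART')
```

A real conversion would more plausibly rely on the dependency structure than on surface position, which is part of why the text calls the task not straightforward.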
Treebank Statistics (UD_Czech)
There are 74 PART lemmas (0%), 75 PART types (0%) and 7177 PART tokens (1%).
Out of 17 observed tags, the rank of PART is: 8 in number of lemmas, 10 in number of types and 14 in number of tokens.
The 10 most frequent PART lemmas: jen, až, asi, li, ne, nejen, prý, jenom, ano, bohužel
The 10 most frequent PART types: jen, až, asi, li, ne, nejen, prý, jenom, ano, bohužel
The 10 most frequent ambiguous lemmas: jen (PART 2125, NOUN 22), až (PART 1232, CCONJ 570, SCONJ 116), li (PART 648, PROPN 5), nejen (PART 440, ADV 1), jenom (PART 190, ADV 1), ať (SCONJ 104, PART 64), pozor (PART 43, NOUN 23), ovšem (CCONJ 543, PART 36), to (PART 25, ADP 8), co (PRON 1652, ADV 201, SCONJ 184, PART 17)
The 10 most frequent ambiguous types: jen (PART 1998, NOUN 1), až (PART 1156, CCONJ 570, SCONJ 96), li (PART 648, PROPN 5), nejen (PART 417, ADV 1), jenom (PART 178, ADV 1), ať (SCONJ 87, PART 53), pozor (NOUN 16, PART 2), ovšem (CCONJ 483, PART 36), to (DET 5335, PART 23, ADP 3), co (PRON 1058, ADV 197, SCONJ 181, PART 7)
Morphology
The form / lemma ratio of PART is 1.013514 (the average of all parts of speech is 2.162583).
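The reported ratio is consistent with dividing the number of distinct PART forms by the number of distinct PART lemmas. The sketch below shows that computation on a few invented tokens; the function name `form_lemma_ratio` and the `(form, lemma, upos)` triple layout are assumptions for illustration, and the actual statistics script may count differently.

```python
def form_lemma_ratio(tokens, upos="PART"):
    """Distinct forms divided by distinct lemmas among tokens with the given tag.

    `tokens` is an iterable of (form, lemma, upos) triples.
    Raises ZeroDivisionError if no token carries the tag.
    """
    forms = {form for form, _, tag in tokens if tag == upos}
    lemmas = {lemma for _, lemma, tag in tokens if tag == upos}
    return len(forms) / len(lemmas)

# Toy data invented for this sketch: two spellings of one lemma plus one other particle.
toy = [
    ("nikoli", "nikoli", "PART"),
    ("nikoliv", "nikoli", "PART"),
    ("jen", "jen", "PART"),
    ("prší", "pršet", "VERB"),
]
print(form_lemma_ratio(toy))  # 3 forms / 2 lemmas = 1.5

# The reported figure matches the type and lemma counts given above (75 and 74).
print(f"{75 / 74:.6f}")  # 1.013514
```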
The 1st highest number of forms (2) was observed with the lemma “not”: not, t.
The 2nd highest number of forms (1) was observed with the lemma “Achtung”: Achtung.
The 3rd highest number of forms (1) was observed with the lemma “L”: L.
PART occurs with 3 features: cs-feat/Foreign (100; 1% instances), cs-feat/Style (11; 0% instances), cs-feat/NameType (7; 0% instances)
PART occurs with 5 feature-value pairs: Foreign=Yes, NameType=Com, NameType=Oth, NameType=Sur, Style=Coll
PART occurs with 7 feature combinations.
The most frequent feature combination is _ (7064 tokens).
Examples: jen, až, asi, li, ne, nejen, prý, jenom, ano, bohužel
Relations
PART nodes are attached to their parents using 22 different relations: cs-dep/advmod:emph (4753; 66% instances), cs-dep/advmod (825; 11% instances), cs-dep/mark (700; 10% instances), cs-dep/cc (276; 4% instances), cs-dep/dep (159; 2% instances), cs-dep/root (117; 2% instances), cs-dep/conj (100; 1% instances), cs-dep/nmod (83; 1% instances), cs-dep/flat:foreign (66; 1% instances), cs-dep/orphan (39; 1% instances), cs-dep/obj (23; 0% instances), cs-dep/nsubj (8; 0% instances), cs-dep/appos (7; 0% instances), cs-dep/case (5; 0% instances), cs-dep/acl (4; 0% instances), cs-dep/discourse (3; 0% instances), cs-dep/fixed (3; 0% instances), cs-dep/ccomp (2; 0% instances), cs-dep/advcl (1; 0% instances), cs-dep/flat (1; 0% instances), cs-dep/iobj (1; 0% instances), cs-dep/xcomp (1; 0% instances)
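A distribution like the one above can be tallied from the UPOS and DEPREL columns of a CoNLL-U file (columns 4 and 8). The following is a minimal sketch on invented rows; `relation_distribution` is a hypothetical name, and rounding percentages to whole numbers is an assumption based on how the figures are printed.

```python
from collections import Counter

def relation_distribution(rows, upos="PART"):
    """Return (deprel, count, rounded percent) triples for nodes with the given tag.

    `rows` is an iterable of (upos, deprel) pairs; results are sorted by frequency.
    """
    counts = Counter(deprel for tag, deprel in rows if tag == upos)
    total = sum(counts.values())
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]

# Toy rows invented for this sketch.
toy = [("PART", "advmod:emph")] * 3 + [("PART", "mark"), ("NOUN", "nsubj")]
print(relation_distribution(toy))
# [('advmod:emph', 3, 75), ('mark', 1, 25)]

# Sanity check against the figures above: 4753 of 7177 PART tokens.
print(round(100 * 4753 / 7177))  # 66
```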
Parents of PART nodes belong to 14 different parts of speech: NOUN (2739; 38% instances), VERB (1746; 24% instances), NUM (828; 12% instances), ADV (710; 10% instances), ADJ (514; 7% instances), DET (199; 3% instances), PROPN (160; 2% instances), ROOT (117; 2% instances), PRON (80; 1% instances), PART (50; 1% instances), CCONJ (19; 0% instances), SYM (10; 0% instances), INTJ (3; 0% instances), SCONJ (2; 0% instances)
6717 (94%) PART nodes are leaves.
163 (2%) PART nodes have one child.
190 (3%) PART nodes have two children.
107 (1%) PART nodes have three or more children.
The highest child degree of a PART node is 8.
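The child degree of a node is the number of tokens whose HEAD points to it, so a leaf has degree 0. A minimal sketch, assuming each sentence is given as `(id, upos, head)` triples; the name `child_degrees` and the toy trees are invented for illustration.

```python
from collections import Counter

def child_degrees(sentence, upos="PART"):
    """Map each node id with the given tag to its number of children.

    `sentence` is a list of (id, upos, head) triples for one tree; head 0 is the root.
    """
    n_children = Counter(head for _, _, head in sentence)
    return {node_id: n_children[node_id] for node_id, tag, _ in sentence if tag == upos}

# Toy trees invented for this sketch.
leaf_tree = [(1, "PART", 2), (2, "NUM", 3), (3, "NOUN", 0), (4, "PUNCT", 3)]
head_tree = [(1, "PART", 0), (2, "PUNCT", 1)]

print(child_degrees(leaf_tree))  # {1: 0}
print(child_degrees(head_tree))  # {1: 1}

# The four degree classes above add up to the PART token count.
print(6717 + 163 + 190 + 107)  # 7177
```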
Children of PART nodes are attached using 25 different relations: cs-dep/punct (391; 42% instances), cs-dep/conj (135; 14% instances), cs-dep/cc (92; 10% instances), cs-dep/orphan (55; 6% instances), cs-dep/mark (39; 4% instances), cs-dep/dep (38; 4% instances), cs-dep/advmod:emph (30; 3% instances), cs-dep/fixed (25; 3% instances), cs-dep/xcomp (19; 2% instances), cs-dep/amod (18; 2% instances), cs-dep/flat:foreign (18; 2% instances), cs-dep/case (16; 2% instances), cs-dep/cop (10; 1% instances), cs-dep/nmod (10; 1% instances), cs-dep/advmod (9; 1% instances), cs-dep/det (9; 1% instances), cs-dep/nsubj (8; 1% instances), cs-dep/aux (6; 1% instances), cs-dep/advcl (4; 0% instances), cs-dep/obl (3; 0% instances), cs-dep/appos (2; 0% instances), cs-dep/ccomp (2; 0% instances), cs-dep/acl (1; 0% instances), cs-dep/obj (1; 0% instances), cs-dep/vocative (1; 0% instances)
Children of PART nodes belong to 14 different parts of speech: PUNCT (391; 42% instances), NOUN (114; 12% instances), CCONJ (80; 8% instances), ADV (78; 8% instances), VERB (68; 7% instances), PART (50; 5% instances), SCONJ (44; 5% instances), ADJ (39; 4% instances), ADP (19; 2% instances), DET (18; 2% instances), PROPN (17; 2% instances), AUX (16; 2% instances), PRON (7; 1% instances), NUM (1; 0% instances)
Treebank Statistics (UD_Czech-CAC)
There are 40 PART lemmas (0%), 41 PART types (0%) and 3074 PART tokens (1%).
Out of 16 observed tags, the rank of PART is: 10 in number of lemmas, 12 in number of types and 15 in number of tokens.
The 10 most frequent PART lemmas: jen, li, až, nejen, asi, ovšem, ne, jenom, ať, prý
The 10 most frequent PART types: jen, li, až, nejen, asi, ovšem, ne, jenom, ať, prý
The 10 most frequent ambiguous lemmas: jen (PART 902, NOUN 1), li (PART 554, ADJ 1), až (PART 511, SCONJ 33, CCONJ 6), ovšem (PART 210, ADV 14, CCONJ 5), ať (SCONJ 43, PART 32), s (ADP 3748, PART 13), la (PART 2, ADJ 1), co (PRON 511, ADV 164, SCONJ 16, PART 2, ADJ 1), Le (ADJ 1, PART 1), copak (PRON 7, PART 1)
The 10 most frequent ambiguous types: jen (PART 848, NOUN 1), až (PART 496, SCONJ 29, CCONJ 6), ovšem (PART 189, ADV 12, CCONJ 5), ať (SCONJ 42, PART 27), s (ADP 3046, PART 13), to (DET 1862, PART 11), La (PART 3, ADJ 1), co (PRON 372, ADV 158, SCONJ 15, PART 1, ADJ 1), Copak (PRON 5, PART 1), fakt (NOUN 18, PART 1)
- La
- PART 3: Proto se čtveřice Vláďa , Jiří , Věra a Dana mohla vydat přes kanál La * .
- ADJ 1: President Československé socialistické republiky propůjčil mistru sportu , nadpraporčíku Františku Venclovskému vyznamenání Za statečnost , za prokázanou osobní odvahu a příkladnou bojovnost při plavbě kanálem La Manche .
- co
- PRON 372: A že co by na to řekly , když by šli společně darovat krev .
- ADV 158: V naší době se musíme udržet co nejdéle mladé .
- SCONJ 15: Bude tomu měsíc , co narukovali .
- PART 1: Ale co , zatím jsem fejeton ještě stačil díky psacímu stroji napsat .
- ADJ 1: Usilují přitom o vypracování jakési ontologie společenskosti , která současně z druhé strany má reflektovat a zpřítomňovat společenský charakter ontologie tkvící již v samotné podstatě člověka definovaného jakožto zoon politikon , jemuž odpovídá , že lidská existence je současně a vždy také co - existence .
Morphology
The form / lemma ratio of PART is 1.025000 (the average of all parts of speech is 2.180683).
The 1st highest number of forms (2) was observed with the lemma “das”: das, des.
The 2nd highest number of forms (1) was observed with the lemma “Al”: Al.
The 3rd highest number of forms (1) was observed with the lemma “La”: La.
PART occurs with 3 features: cs-feat/Foreign (13; 0% instances), cs-feat/NameType (4; 0% instances), cs-feat/Style (3; 0% instances)
PART occurs with 4 feature-value pairs: Foreign=Yes, NameType=Geo, NameType=Oth, Style=Coll
PART occurs with 5 feature combinations.
The most frequent feature combination is _ (3058 tokens).
Examples: jen, li, až, nejen, asi, ovšem, ne, jenom, ať, prý
Relations
PART nodes are attached to their parents using 16 different relations: cs-dep/advmod:emph (1516; 49% instances), cs-dep/mark (582; 19% instances), cs-dep/cc (448; 15% instances), cs-dep/advmod (293; 10% instances), cs-dep/case (129; 4% instances), cs-dep/root (24; 1% instances), cs-dep/conj (23; 1% instances), cs-dep/dep (23; 1% instances), cs-dep/flat:foreign (7; 0% instances), cs-dep/orphan (7; 0% instances), cs-dep/discourse (6; 0% instances), cs-dep/nmod (6; 0% instances), cs-dep/acl (5; 0% instances), cs-dep/fixed (3; 0% instances), cs-dep/advcl (1; 0% instances), cs-dep/obj (1; 0% instances)
Parents of PART nodes belong to 13 different parts of speech: NOUN (1086; 35% instances), VERB (820; 27% instances), NUM (346; 11% instances), ADJ (305; 10% instances), ADV (237; 8% instances), DET (100; 3% instances), PROPN (40; 1% instances), SYM (39; 1% instances), PRON (32; 1% instances), ROOT (24; 1% instances), SCONJ (21; 1% instances), PART (18; 1% instances), CCONJ (6; 0% instances)
2856 (93%) PART nodes are leaves.
167 (5%) PART nodes have one child.
26 (1%) PART nodes have two children.
25 (1%) PART nodes have three or more children.
The highest child degree of a PART node is 11.
Children of PART nodes are attached using 21 different relations: cs-dep/fixed (136; 40% instances), cs-dep/punct (62; 18% instances), cs-dep/cc (31; 9% instances), cs-dep/conj (17; 5% instances), cs-dep/advmod:emph (16; 5% instances), cs-dep/dep (12; 4% instances), cs-dep/cop (11; 3% instances), cs-dep/xcomp (11; 3% instances), cs-dep/nsubj (10; 3% instances), cs-dep/orphan (10; 3% instances), cs-dep/mark (5; 1% instances), cs-dep/advcl (2; 1% instances), cs-dep/advmod (2; 1% instances), cs-dep/aux (2; 1% instances), cs-dep/nummod (2; 1% instances), cs-dep/obj (2; 1% instances), cs-dep/obl (2; 1% instances), cs-dep/amod (1; 0% instances), cs-dep/case (1; 0% instances), cs-dep/nmod (1; 0% instances), cs-dep/parataxis (1; 0% instances)
Children of PART nodes belong to 13 different parts of speech: ADP (129; 38% instances), PUNCT (62; 18% instances), NOUN (27; 8% instances), ADV (25; 7% instances), VERB (21; 6% instances), PART (18; 5% instances), CCONJ (17; 5% instances), AUX (13; 4% instances), SCONJ (8; 2% instances), ADJ (6; 2% instances), DET (5; 1% instances), NUM (4; 1% instances), PRON (2; 1% instances)
Treebank Statistics (UD_Czech-CLTT)
There are 3 PART lemmas (0%), 3 PART types (0%) and 49 PART tokens (0%).
Out of 15 observed tags, the rank of PART is: 14 in number of lemmas, 14 in number of types and 14 in number of tokens.
The 10 most frequent PART lemmas: až, jen, nikoli
The 10 most frequent PART types: až, jen, nikoliv
The 10 most frequent ambiguous lemmas: až (PART 24, X 23, SCONJ 6, CCONJ 1)
The 10 most frequent ambiguous types: až (PART 24, X 23, SCONJ 6, CCONJ 1)
- až
- PART 24: (5) Ustanovení § 52 a 53 se použijí až v účetním období začínajícím 1 . ledna 2004 a později .
- X 23: Ustanovení písmen d) až h) se použijí i pro zahraniční fyzické osoby .
- SCONJ 6: (6) Ustanovení odstavců 1 až 5 se nepoužijí při změně právní formy a přeshraničním přemístění sídla .
- CCONJ 1: Účetní jednotka , která sestavuje výkaz zisku a ztráty v účelovém členění , není povinna dodržet členění v účtových skupinách 50 až 55 a 60 až 64 ; členění přizpůsobí výkazu s přihlédnutím k povinnosti uvedené v § 39 odst. 8 .
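Ambiguity lists of this kind can be derived by counting, for every form, how often it occurs with each tag, and keeping the forms that occur with PART and at least one other tag. The sketch below is a hedged illustration: `ambiguous_for` is a hypothetical name, the až counts are copied from the figures above, and the count for jen is made up.

```python
from collections import Counter, defaultdict

def ambiguous_for(rows, upos="PART"):
    """Return {form: Counter of tags} for forms seen with `upos` and at least one other tag.

    `rows` is an iterable of (form, upos) pairs.
    """
    tags = defaultdict(Counter)
    for form, tag in rows:
        tags[form][tag] += 1
    return {form: c for form, c in tags.items() if upos in c and len(c) > 1}

# až counts follow the CLTT figures above; the jen count is invented for this sketch.
rows = (
    [("až", "PART")] * 24
    + [("až", "X")] * 23
    + [("až", "SCONJ")] * 6
    + [("až", "CCONJ")]
    + [("jen", "PART")] * 2
)
result = ambiguous_for(rows)
print(list(result))                # ['až']
print(result["až"].most_common())
# [('PART', 24), ('X', 23), ('SCONJ', 6), ('CCONJ', 1)]
```

Counting lemmas instead of forms only requires passing `(lemma, upos)` pairs.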
Morphology
The form / lemma ratio of PART is 1.000000 (the average of all parts of speech is 1.685169).
The 1st highest number of forms (1) was observed with the lemma “až”: až.
The 2nd highest number of forms (1) was observed with the lemma “jen”: jen.
The 3rd highest number of forms (1) was observed with the lemma “nikoli”: nikoliv.
PART does not occur with any features.
Relations
PART nodes are attached to their parents using 2 different relations: cs-dep/advmod:emph (38; 78% instances), cs-dep/cc (11; 22% instances)
Parents of PART nodes belong to 4 different parts of speech: X (24; 49% instances), NOUN (21; 43% instances), NUM (3; 6% instances), ADV (1; 2% instances)
49 (100%) PART nodes are leaves.
The highest child degree of a PART node is 0.