Foreign: is this a foreign word?
Values: Yes
Boolean feature. It marks a genuinely foreign word (not a loan word) appearing inside native text, e.g. inside direct speech or in titles of books.
Note that Czech data (especially those from the PDT) often indicate the original part of speech of foreign words. Thus this feature may occur with any POS tag. If the original part of speech is not known, the feature will accompany the cs-pos/X tag.
Yes: it is foreign
Examples
- … nese jméno VLIW (Very Long Instruction Word – velmi dlouhé instrukční slovo)
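In CoNLL-U files the feature appears in the FEATS column (the sixth tab-separated field) as `Foreign=Yes`. The following is a minimal sketch of how one might test a token line for it. The function names `parse_feats` and `is_foreign` are hypothetical helpers, and the sample lines in the usage note are made up for illustration (they are not copied from the treebank).

```python
def parse_feats(feats: str) -> dict:
    """Parse a CoNLL-U FEATS string such as 'Case=Nom|Foreign=Yes' into a dict."""
    if not feats or feats == "_":
        return {}  # "_" means the token has no features
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def is_foreign(conllu_line: str) -> bool:
    """Return True if the token line carries Foreign=Yes in its FEATS column."""
    cols = conllu_line.rstrip("\n").split("\t")
    return parse_feats(cols[5]).get("Foreign") == "Yes"


# Made-up token lines, for illustration only (hypothetical annotation).
foreign_line = "3\tVLIW\tVLIW\tX\t_\tForeign=Yes\t2\tnmod\t_\t_"
native_line = "1\tnese\tnést\tVERB\t_\tMood=Ind|Number=Sing\t0\troot\t_\t_"
```

With these lines, `is_foreign(foreign_line)` is true and `is_foreign(native_line)` is false.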
Diffs
Prague Dependency Treebank
For proper nouns the borderline between foreign words and loan words is somewhat fuzzy, so e.g. the English personal name George is marked as foreign even though it would not normally be translated (except for names of rulers and saints, which would become Jiří).
Articles in foreign names (the, die, le) are tagged cs-pos/ADJ, not cs-pos/DET.
Treebank Statistics (UD_Czech)
This feature is language-specific.
It occurs with 1 value: Yes.
8256 tokens (1%) have a non-empty value of Foreign.
3352 types (3%) occur at least once with a non-empty value of Foreign.
3184 lemmas (6%) occur at least once with a non-empty value of Foreign.
The feature is used with 13 part-of-speech tags: cs-pos/PROPN (3218; 0% instances), cs-pos/ADJ (2432; 0% instances), cs-pos/NOUN (1596; 0% instances), cs-pos/ADP (524; 0% instances), cs-pos/ADV (102; 0% instances), cs-pos/VERB (102; 0% instances), cs-pos/PART (100; 0% instances), cs-pos/CCONJ (72; 0% instances), cs-pos/PRON (56; 0% instances), cs-pos/NUM (26; 0% instances), cs-pos/DET (17; 0% instances), cs-pos/INTJ (6; 0% instances), cs-pos/SCONJ (5; 0% instances).
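Per-tag counts like those above can be reproduced from a CoNLL-U file by grouping tokens with `Foreign=Yes` by their UPOS column. This is only a sketch: the function name `foreign_counts_by_upos` is hypothetical, the sample text is made up, and the script that generated the statistics on this page may count differently.

```python
from collections import Counter


def foreign_counts_by_upos(conllu_text: str) -> Counter:
    """Count tokens carrying Foreign=Yes, grouped by UPOS (column 4)."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        feats = cols[5].split("|") if cols[5] != "_" else []
        if "Foreign=Yes" in feats:
            counts[cols[3]] += 1
    return counts


# Made-up sample (hypothetical annotation, not taken from the treebank).
sample = "\n".join([
    "# text = New Times vychází",
    "1\tNew\tNew\tADJ\t_\tForeign=Yes\t2\tamod\t_\t_",
    "2\tTimes\tTimes\tPROPN\t_\tForeign=Yes\t3\tnsubj\t_\t_",
    "3\tvychází\tvycházet\tVERB\t_\tMood=Ind\t0\troot\t_\t_",
])
```

For the sample, `foreign_counts_by_upos(sample)` counts one ADJ and one PROPN token.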
PROPN
3218 cs-pos/PROPN tokens (4% of all PROPN tokens) have a non-empty value of Foreign.
The most frequent other feature values with which PROPN and Foreign co-occurred: Polarity=Pos (3218; 100%), Case=EMPTY (2514; 78%), Abbr=EMPTY (2374; 74%), NameType=Com (2178; 68%), Animacy=EMPTY (1933; 60%), Number=EMPTY (1875; 58%).
PROPN tokens may have the following values of Foreign:
- Yes (3218; 100% of non-empty Foreign): IRA, HZDS, Floyd, International, Nature, Science, Sinn, Fein, group, Times
Foreign seems to be a lexical feature of PROPN. 100% of lemmas (1297) occur only with one value of Foreign.
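The "lexical feature" remark can be read as a check that each lemma is consistent in its value of Foreign. The sketch below shows one possible way to compute such a share. It rests on assumptions: that only lemmas seen at least once with a non-empty value are considered, and that a missing feature counts as a value of its own (EMPTY). The exact definition used to generate these statistics is not stated here, and `lexical_share` is a hypothetical name.

```python
from collections import defaultdict


def lexical_share(tokens, upos="PROPN", feature="Foreign"):
    """Share of lemmas (of one UPOS) that occur with only one value of `feature`.

    tokens: iterable of (lemma, upos, feats_dict) triples.
    Only lemmas seen at least once with a non-empty value are considered;
    an absent feature is treated as the value "EMPTY" (an assumption).
    """
    values = defaultdict(set)
    for lemma, tag, feats in tokens:
        if tag == upos:
            values[lemma].add(feats.get(feature, "EMPTY"))
    marked = [vals for vals in values.values() if vals != {"EMPTY"}]
    if not marked:
        return 0.0
    single = sum(1 for vals in marked if len(vals) == 1)
    return single / len(marked)


# Made-up tokens for illustration (hypothetical annotation).
toy_tokens = [
    ("Times", "PROPN", {"Foreign": "Yes"}),
    ("Times", "PROPN", {"Foreign": "Yes"}),
    ("George", "PROPN", {"Foreign": "Yes"}),
    ("George", "PROPN", {}),   # same lemma, feature absent
    ("Praha", "PROPN", {}),    # never marked, so not considered
]
```

In the toy data, `Times` is consistent and `George` is not, so `lexical_share(toy_tokens)` gives 0.5.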
ADJ
2432 cs-pos/ADJ tokens (1% of all ADJ tokens) have a non-empty value of Foreign.
The most frequent other feature values with which ADJ and Foreign co-occurred: Polarity=Pos (2428; 100%), Degree=Pos (2419; 99%), Animacy=EMPTY (2338; 96%), Case=EMPTY (2311; 95%), Number=EMPTY (2223; 91%), Gender=EMPTY (2215; 91%).
ADJ tokens may have the following values of Foreign:
- Yes (2432; 100% of non-empty Foreign): New, the, open, US, Pink, la, Le, die, Deutsche, Czech
Foreign seems to be a lexical feature of ADJ. 100% of lemmas (927) occur only with one value of Foreign.
NOUN
1596 cs-pos/NOUN tokens (0% of all NOUN tokens) have a non-empty value of Foreign.
The most frequent other feature values with which NOUN and Foreign co-occurred: Polarity=Pos (1595; 100%), Case=EMPTY (1071; 67%), Animacy=EMPTY (886; 56%), Number=EMPTY (835; 52%).
NOUN tokens may have the following values of Foreign:
- Yes (1596; 100% of non-empty Foreign): play, managementu, management, CD, facto, s, neem, o, st, ECM
Foreign seems to be a lexical feature of NOUN. 100% of lemmas (849) occur only with one value of Foreign.
ADP
524 cs-pos/ADP tokens (0% of all ADP tokens) have a non-empty value of Foreign.
The most frequent other feature values with which ADP and Foreign co-occurred: AdpType=Prep (524; 100%), Case=EMPTY (303; 58%).
ADP tokens may have the following values of Foreign:
- Yes (524; 100% of non-empty Foreign): de, of, van, in, di, von, versus, ad, Pro, on
Foreign seems to be a lexical feature of ADP. 100% of lemmas (53) occur only with one value of Foreign.
VERB
102 cs-pos/VERB tokens (0% of all VERB tokens) have a non-empty value of Foreign.
The most frequent other feature values with which VERB and Foreign co-occurred: Aspect=EMPTY (102; 100%), Polarity=Pos (99; 97%), Gender=EMPTY (95; 93%), Person=EMPTY (56; 55%), Mood=EMPTY (54; 53%), Tense=EMPTY (52; 51%), Voice=EMPTY (52; 51%).
VERB tokens may have the following values of Foreign:
- Yes (102; 100% of non-empty Foreign): is, Be, est, transit, Can, Check, Come, Habent, Keep, Love
Foreign seems to be a lexical feature of VERB. 100% of lemmas (76) occur only with one value of Foreign.
ADV
102 cs-pos/ADV tokens (0% of all ADV tokens) have a non-empty value of Foreign.
The most frequent other feature values with which ADV and Foreign co-occurred: PronType=EMPTY (100; 98%), Polarity=EMPTY (96; 94%), Degree=EMPTY (96; 94%).
ADV tokens may have the following values of Foreign:
- Yes (102; 100% of non-empty Foreign): cca, priori, Today, Here, live, Only, Sic, dove, echt, ešte
Foreign seems to be a lexical feature of ADV. 100% of lemmas (65) occur only with one value of Foreign.
PART
100 cs-pos/PART tokens (1% of all PART tokens) have a non-empty value of Foreign.
PART tokens may have the following values of Foreign:
- Yes (100; 100% of non-empty Foreign): off, not, oui, t, So, ne, sorry, viva, non, Achtung
Foreign seems to be a lexical feature of PART. 100% of lemmas (23) occur only with one value of Foreign.
CCONJ
72 cs-pos/CCONJ tokens (0% of all CCONJ tokens) have a non-empty value of Foreign.
CCONJ tokens may have the following values of Foreign:
- Yes (72; 100% of non-empty Foreign): and, et, und, As, ma, or, e, n
PRON
56 cs-pos/PRON tokens (0% of all PRON tokens) have a non-empty value of Foreign.
The most frequent other feature values with which PRON and Foreign co-occurred: PrepCase=EMPTY (56; 100%), Reflex=EMPTY (55; 98%), Variant=EMPTY (55; 98%), Gender=EMPTY (40; 71%), PronType=Prs (37; 66%), Number=Sing (30; 54%).
PRON tokens may have the following values of Foreign:
- Yes (56; 100% of non-empty Foreign): it, All, I, You, Me, Some, Us, We, ja, who
Foreign seems to be a lexical feature of PRON. 100% of lemmas (23) occur only with one value of Foreign.
NUM
26 cs-pos/NUM tokens (0% of all NUM tokens) have a non-empty value of Foreign.
The most frequent other feature values with which NUM and Foreign co-occurred: NumType=Card (26; 100%), Gender=EMPTY (26; 100%), NumForm=Word (26; 100%), Case=EMPTY (23; 88%), NumValue=1,2,3 (22; 85%), Number=Plur (20; 77%).
NUM tokens may have the following values of Foreign:
- Yes (26; 100% of non-empty Foreign): Four, Twenty, Seven, one, Five, Six, Three, Tri, seděm
DET
17 cs-pos/DET tokens (0% of all DET tokens) have a non-empty value of Foreign.
The most frequent other feature values with which DET and Foreign co-occurred: Animacy=EMPTY (16; 94%), Number[psor]=EMPTY (13; 76%), Case=EMPTY (12; 71%), Gender=EMPTY (12; 71%), Person=EMPTY (11; 65%), Poss=EMPTY (9; 53%).
DET tokens may have the following values of Foreign:
- Yes (17; 100% of non-empty Foreign): My, That, This, Your, sua, C, Notre, These, ce, cui
Foreign seems to be a lexical feature of DET. 100% of lemmas (12) occur only with one value of Foreign.
INTJ
6 cs-pos/INTJ tokens (7% of all INTJ tokens) have a non-empty value of Foreign.
INTJ tokens may have the following values of Foreign:
- Yes (6; 100% of non-empty Foreign): O, propos, Bang, Boom, Crash
SCONJ
5 cs-pos/SCONJ tokens (0% of all SCONJ tokens) have a non-empty value of Foreign.
SCONJ tokens may have the following values of Foreign:
- Yes (5; 100% of non-empty Foreign): ak, ako, as, gdyž, kak
Relations with Agreement in Foreign
The 10 most frequent relations where parent and child node agree in Foreign:
PROPN –[flat:foreign]–> ADJ (846; 100%),
NOUN –[flat:foreign]–> ADJ (525; 100%),
PROPN –[flat:foreign]–> PROPN (240; 100%),
NOUN –[flat:foreign]–> NOUN (143; 99%),
ADJ –[flat:foreign]–> ADJ (126; 100%),
NOUN –[flat:foreign]–> ADP (114; 100%),
ADJ –[flat:foreign]–> PROPN (88; 100%),
NOUN –[flat:foreign]–> PART (50; 100%),
ADJ –[flat:foreign]–> NOUN (39; 100%),
NOUN –[flat:foreign]–> PROPN (26; 87%).
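Agreement counts of this kind can be approximated by walking each dependency edge and checking whether parent and child carry the same non-empty value of Foreign. The sketch below only counts agreeing edges and does not reproduce the percentages, whose exact denominator is not stated here. The token dictionaries, their keys, and the function name `agreeing_relations` are assumptions made for illustration, and the sample sentence is made up.

```python
from collections import Counter


def agreeing_relations(sentence):
    """Count (parent UPOS, deprel, child UPOS) triples that agree in Foreign.

    sentence: list of dicts with keys "id", "head", "upos", "deprel", "feats".
    An edge is counted when both nodes have the same non-empty Foreign value.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    counts = Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root (head 0) has no parent token
        parent_val = parent["feats"].get("Foreign")
        child_val = child["feats"].get("Foreign")
        if parent_val is not None and parent_val == child_val:
            counts[(parent["upos"], child["deprel"], child["upos"])] += 1
    return counts


# Made-up sentence fragment (hypothetical annotation).
toy_sentence = [
    {"id": 1, "head": 0, "upos": "PROPN", "deprel": "root",
     "feats": {"Foreign": "Yes"}},
    {"id": 2, "head": 1, "upos": "PROPN", "deprel": "flat:foreign",
     "feats": {"Foreign": "Yes"}},
    {"id": 3, "head": 1, "upos": "PUNCT", "deprel": "punct", "feats": {}},
]
```

For the toy fragment, the only agreeing edge is PROPN –[flat:foreign]–> PROPN, counted once.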
Treebank Statistics (UD_Czech-CAC)
This feature is language-specific.
It occurs with 1 value: Yes.
519 tokens (0%) have a non-empty value of Foreign.
383 types (1%) occur at least once with a non-empty value of Foreign.
372 lemmas (1%) occur at least once with a non-empty value of Foreign.
The feature is used with 10 part-of-speech tags: cs-pos/NOUN (255; 0% instances), cs-pos/ADJ (117; 0% instances), cs-pos/ADP (63; 0% instances), cs-pos/PROPN (37; 0% instances), cs-pos/PART (13; 0% instances), cs-pos/ADV (12; 0% instances), cs-pos/PRON (7; 0% instances), cs-pos/VERB (7; 0% instances), cs-pos/DET (5; 0% instances), cs-pos/CCONJ (3; 0% instances).
NOUN
255 cs-pos/NOUN tokens (0% of all NOUN tokens) have a non-empty value of Foreign.
The most frequent other feature values with which NOUN and Foreign co-occurred: Polarity=Pos (255; 100%), Animacy=EMPTY (176; 69%).
NOUN tokens may have the following values of Foreign:
- Yes (255; 100% of non-empty Foreign): luxe, vitro, generis, nepusto, pusto, excellence, homo, lege, peeling, Buch
Foreign seems to be a lexical feature of NOUN. 100% of lemmas (202) occur only with one value of Foreign.
ADJ
117 cs-pos/ADJ tokens (0% of all ADJ tokens) have a non-empty value of Foreign.
The most frequent other feature values with which ADJ and Foreign co-occurred: Polarity=Pos (117; 100%), Degree=Pos (113; 97%), Animacy=EMPTY (104; 89%), Case=EMPTY (83; 71%), Number=EMPTY (80; 68%), Gender=EMPTY (77; 66%).
ADJ tokens may have the following values of Foreign:
- Yes (117; 100% of non-empty Foreign): online, signifiant, super, la, Jazykovedným, New, Telephone, Tonkünstler, ferenda, fit
Foreign seems to be a lexical feature of ADJ. 100% of lemmas (96) occur only with one value of Foreign.
ADP
63 cs-pos/ADP tokens (0% of all ADP tokens) have a non-empty value of Foreign.
The most frequent other feature values with which ADP and Foreign co-occurred: AdpType=Prep (63; 100%).
ADP tokens may have the following values of Foreign:
- Yes (63; 100% of non-empty Foreign): de, in, a, ad, cross, of, par, Pro, ante, aus
Foreign seems to be a lexical feature of ADP. 100% of lemmas (20) occur only with one value of Foreign.
PROPN
37 cs-pos/PROPN tokens (0% of all PROPN tokens) have a non-empty value of Foreign.
The most frequent other feature values with which PROPN and Foreign co-occurred: Polarity=Pos (37; 100%), Abbr=EMPTY (36; 97%), Case=EMPTY (30; 81%), Number=EMPTY (24; 65%), Animacy=EMPTY (24; 65%).
PROPN tokens may have the following values of Foreign:
- Yes (37; 100% of non-empty Foreign): Combi, Kombi, Manche, Orchester, Bell, Böhmen, Corriere, Fruit, Gaudeamus, George
Foreign seems to be a lexical feature of PROPN. 100% of lemmas (32) occur only with one value of Foreign.
PART
13 cs-pos/PART tokens (0% of all PART tokens) have a non-empty value of Foreign.
PART tokens may have the following values of Foreign:
- Yes (13; 100% of non-empty Foreign): La, das, des, Le, el, non, quo, Al
ADV
12 cs-pos/ADV tokens (0% of all ADV tokens) have a non-empty value of Foreign.
The most frequent other feature values with which ADV and Foreign co-occurred: Polarity=EMPTY (12; 100%), Degree=EMPTY (12; 100%), PronType=EMPTY (12; 100%).
ADV tokens may have the following values of Foreign:
- Yes (12; 100% of non-empty Foreign): priori, quo, defacto, explicite, expost, innuce, ipsofacto, memoriam, theory
PRON
7 cs-pos/PRON tokens (0% of all PRON tokens) have a non-empty value of Foreign.
The most frequent other feature values with which PRON and Foreign co-occurred: PrepCase=EMPTY (7; 100%), Variant=EMPTY (7; 100%), Reflex=EMPTY (7; 100%), Number=Sing (5; 71%), Person=3 (4; 57%), Case=Loc (4; 57%), PronType=Prs (4; 57%).
PRON tokens may have the following values of Foreign:
- Yes (7; 100% of non-empty Foreign): eo, ipso, Tous, er, they
VERB
7 cs-pos/VERB tokens (0% of all VERB tokens) have a non-empty value of Foreign.
The most frequent other feature values with which VERB and Foreign co-occurred: Gender=EMPTY (7; 100%), Polarity=Pos (7; 100%), Aspect=EMPTY (7; 100%), Voice=EMPTY (4; 57%), VerbForm=Inf (4; 57%), Number=EMPTY (4; 57%), Person=EMPTY (4; 57%), Mood=EMPTY (4; 57%), Tense=EMPTY (4; 57%).
VERB tokens may have the following values of Foreign:
- Yes (7; 100% of non-empty Foreign): are, formo, movere, savoir, singen, singt, vivre
DET
5 cs-pos/DET tokens (0% of all DET tokens) have a non-empty value of Foreign.
The most frequent other feature values with which DET and Foreign co-occurred: Animacy=EMPTY (5; 100%), Number[psor]=EMPTY (5; 100%), Person=EMPTY (5; 100%), Number=Sing (4; 80%), Poss=Yes (3; 60%), Case=Gen (3; 60%), Gender=Masc,Neut (3; 60%), PronType=Prs (3; 60%).
DET tokens may have the following values of Foreign:
- Yes (5; 100% of non-empty Foreign): sui, hoc, quem
CCONJ
3 cs-pos/CCONJ tokens (0% of all CCONJ tokens) have a non-empty value of Foreign.
CCONJ tokens may have the following values of Foreign:
- Yes (3; 100% of non-empty Foreign): et, and
Relations with Agreement in Foreign
The 10 most frequent relations where parent and child node agree in Foreign:
NOUN –[flat:foreign]–> ADJ (35; 100%),
NOUN –[conj]–> NOUN (23; 59%),
NOUN –[flat:foreign]–> ADP (17; 100%),
ADJ –[flat:foreign]–> NOUN (14; 100%),
NOUN –[flat:foreign]–> NOUN (14; 82%),
NOUN –[case]–> ADP (11; 52%),
PROPN –[flat:foreign]–> ADJ (7; 100%),
ADJ –[conj]–> ADJ (7; 88%),
NOUN –[flat:foreign]–> PART (5; 100%),
PROPN –[flat:foreign]–> PROPN (5; 100%).