Form
: Form
One of the characteristics of Irish is its tendency for initial mutation to occur in certain circumstances. This is triggered by the preceding word and affects the spelling of nouns, adjectives and verbs. Nouns in Irish are divided into classes according to the way they are inflected to form the genitive singular. There are five such noun-classes or declensions. (The Christian Brothers, 1994)
Ecl
: eclipsis
This feature occurs when the initial consonant or vowel of a word is eclipsed by a prefixed consonant. The prefixed consonant is voiced when the initial consonant is voiceless (e.g. /tʲ/ → /dʲ/, /k/ → /g/) and nasal when the initial consonant is voiced (e.g. /dʲ/ → /nʲ/, /g/ → /ŋ/). Vowels are eclipsed by adding n- or t-.
Not every consonant can be eclipsed. The consonants that can be eclipsed in Irish are p, b, t, d, c, g and f.
Eclipsis will happen in a number of environments:
- after the plural possessive pronouns ár, bhur and a (“our”, “your (pl.)”, “their”)
- on singular count nouns following the numbers 7-10
- after the preposition i “in”
- on plural nouns in the genitive case after the definite article
- on singular nouns in the dative case after the definite article
- following certain clitics such as interrogative particles (an, nach), complementisers (go, nach) and relativisers (a, nach)
Examples
- a gcuid iarrachtaí “their efforts”
- seacht mbliana “seven years”
- i nGaeilge “in Irish”
- costas na n-oibreacha “cost of the works”
- ar an bhfocal “on the word”
- nach bhfaca sé “he didn’t see”
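
The spelling side of eclipsis can be sketched in a few lines of Python. This is only an illustration, not the procedure used to annotate the treebank: the function name `eclipse` and the prefix table are assumptions based on standard Irish orthography (bp, mb, dt, nd, gc, ng, bhf). Only the n- prefix for vowels is handled; the t- prefix is left out, and the grammatical environments listed above are not checked.

```python
# Hypothetical helper: orthographic eclipsis of a single word.
# Prefixes follow standard Irish spelling; the word itself is kept unchanged.
ECLIPSIS_PREFIX = {
    "p": "b",   # p -> bp
    "b": "m",   # b -> mb
    "t": "d",   # t -> dt
    "d": "n",   # d -> nd
    "c": "g",   # c -> gc
    "g": "n",   # g -> ng
    "f": "bh",  # f -> bhf
}
VOWELS = "aeiouáéíóú"


def eclipse(word: str) -> str:
    """Return the eclipsed spelling of `word`, or `word` itself if it cannot be eclipsed."""
    if not word:
        return word
    first = word[0].lower()
    if first in ECLIPSIS_PREFIX:
        # The prefix stays lowercase even before a capital: Gaeilge -> nGaeilge.
        return ECLIPSIS_PREFIX[first] + word
    if first in VOWELS:
        # Lowercase vowels take "n-"; capitalised vowels take "n" without a hyphen.
        return ("n" if word[0].isupper() else "n-") + word
    return word


for base in ["cuid", "bliana", "Gaeilge", "oibreacha", "focal"]:
    print(base, "->", eclipse(base))
```

The sample words are the unmutated forms of the examples above, so the output can be compared against that list directly.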
Emp
: emphatic
The emphatic form is a special form a word takes to mark emphasis in Irish.
Examples
- dom “to me”
- domsa “to me (emph)”
- a deirim “I say”
- a deirimse “I (emph) say”
Len
: lenition
Lenition is by far the most common type of initial mutation in the treebank. When a word is lenited, h is added immediately after its initial consonant.
These are some of the environments that trigger lenition:
- following the definite article (see Definite for specifics)
- following the vocative particle a (see PartType, Case)
- after certain adjectives (singular possessive pronouns, uile, aon, dhá, etc.)
- after certain simple prepositions (a, de, do, faoi, etc.)
- following the past tense of the copula is
- following preverbal particles in the past tense (níor, ar, etc.)
- verb forms in the past tense
Examples
- an fharraige “the sea”
- A Dhochtúir Van Helsing “Doctor Van Helsing”
- mo chuid oibre “my work”
- faoi cheist “under question”
- Ba mhaith liom “I would like”
- Níor chuir sin “that didn’t put”
- tháinig “came”
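
As a rough illustration of the spelling rule, the sketch below inserts h after a lenitable initial consonant. The function name `lenite`, the set of lenitable consonants (b, c, d, f, g, m, p, s, t) and the restriction on s before c, f, m, p, t, v are assumptions drawn from standard Irish orthography, not from the treebank documentation. The triggering environments listed above are not modelled.

```python
# Hypothetical helper: orthographic lenition of a single word.
LENITABLE = set("bcdfgmpst")
S_BLOCKERS = set("cfmptv")  # assumed: s is not lenited before these letters


def lenite(word: str) -> str:
    """Return the lenited spelling of `word`, or `word` itself if it cannot be lenited."""
    if len(word) < 2:
        return word
    first, second = word[0].lower(), word[1].lower()
    if first not in LENITABLE or second == "h":
        return word  # vowel, unlenitable consonant, or already lenited
    if first == "s" and second in S_BLOCKERS:
        return word
    # Keep the original capitalisation of the first letter: Dochtúir -> Dhochtúir.
    return word[0] + "h" + word[1:]


for base in ["farraige", "Dochtúir", "cuid", "ceist", "maith"]:
    print(base, "->", lenite(base))
```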
HPref
: h-prefix
When two vowels come together in Irish, an h-prefix is inserted before the second vowel in order to simplify pronunciation.
Examples
- go hálainn “lovely”
- na heisimirce “emigration”
- de h-Íde “from Íde”
- ní hamháin “not only”
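
A minimal sketch of the spelling change, assuming a hypothetical helper named `h_prefix`: it prepends h to a vowel-initial word and leaves other words alone. It does not check whether the preceding word actually ends in a vowel, and it does not produce the hyphenated variant seen in h-Íde.

```python
# Hypothetical helper: prefix h to a vowel-initial word.
VOWELS = "aeiouáéíóú"


def h_prefix(word: str) -> str:
    """Return `word` with an h-prefix if it starts with a vowel."""
    if word and word[0].lower() in VOWELS:
        return "h" + word
    return word


for base in ["álainn", "eisimirce", "amháin"]:
    print(base, "->", h_prefix(base))
```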
VF
: Vowel form
Vowel form is an indicator of spelling changes that occur in copular verbs when followed by a word that begins with a vowel or a lenited consonant.
Examples
- is copula, “is”
- ab ea iad “they are”
- gurbh é “it was”
- B’fhearr leis lit. “it was better for him”
Treebank Statistics (UD_Irish)
This feature is language-specific.
It occurs with 5 different values: Ecl, Emp, HPref, Len, VF.
1737 tokens (13%) have a non-empty value of Form.
950 types (25%) occur at least once with a non-empty value of Form.
692 lemmas (25%) occur at least once with a non-empty value of Form.
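
Counts of this kind can be reproduced from a CoNLL-U file by reading the FEATS column. The sketch below is an illustration only, not the script that generated these statistics: the function `form_counts`, the helper `row` and the three-token sample are hypothetical, and the parser handles just the ten tab-separated columns of plain token lines.

```python
from collections import Counter


def form_counts(conllu_text: str) -> Counter:
    """Count (UPOS, Form value) pairs over the token lines of a CoNLL-U string."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        upos, feats = cols[3], cols[5]
        if feats == "_":
            continue
        for feat in feats.split("|"):
            name, _, value = feat.partition("=")
            if name == "Form":
                counts[(upos, value)] += 1
    return counts


def row(idx, form, lemma, upos, feats):
    """Build one CoNLL-U token line; unused columns are filled with '_'."""
    return "\t".join([str(idx), form, lemma, upos, "_", feats, "_", "_", "_", "_"])


# Hypothetical sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# hypothetical sample",
    row(1, "i", "i", "ADP", "_"),
    row(2, "nGaeilge", "Gaeilge", "PROPN", "Form=Ecl|Number=Sing"),
    row(3, "bhí", "bí", "VERB", "Form=Len|Mood=Ind|Tense=Past"),
])

print(form_counts(SAMPLE))
```

Summing the counter gives the number of tokens with a non-empty Form value. Grouping by the first element of each key gives the per-tag breakdown.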
The feature is used with 12 part-of-speech tags: ga-pos/NOUN (926; 7% instances), ga-pos/VERB (650; 5% instances), ga-pos/ADJ (57; 0% instances), ga-pos/PROPN (45; 0% instances), ga-pos/AUX (25; 0% instances), ga-pos/NUM (20; 0% instances), ga-pos/DET (6; 0% instances), ga-pos/ADP (3; 0% instances), ga-pos/PRON (2; 0% instances), ga-pos/ADV (1; 0% instances), ga-pos/PART (1; 0% instances), ga-pos/X (1; 0% instances).
NOUN
926 ga-pos/NOUN tokens (25% of all NOUN tokens) have a non-empty value of Form.
The most frequent other feature values with which NOUN and Form co-occurred: Definite=EMPTY (926; 100%), VerbForm=EMPTY (731; 79%), Number=Sing (646; 70%), Case=NomAcc (634; 68%), Gender=Masc (474; 51%).
NOUN tokens may have the following values of Form:
- Ecl (243; 26% of non-empty Form): gceist, gcuid, gcás, ndóigh, bhfeidhm, gContae, gcomhairle, gcrích, mír, bhfad
- Emp (1; 0% of non-empty Form): liostasa
- HPref (22; 2% of non-empty Form): haigne, hordú, háit, halt, ham, hamanna, hathair, haturnae, heisimirce, heolaíocht
- Len (660; 71% of non-empty Form): bheith, chur, dhéanamh, chuid, thabhairt, chéile, fho-alt, chineál, fhios, bhaint
| Paradigm alt | Ecl | HPref | Len |
|---|---|---|---|
| | bhfo-alt | halt | fho-alt, alt |
VERB
650 ga-pos/VERB tokens (59% of all VERB tokens) have a non-empty value of Form.
The most frequent other feature values with which VERB and Form co-occurred: Voice=EMPTY (598; 92%), Mood=Ind (547; 84%), Tense=Past (382; 59%).
VERB tokens may have the following values of Form:
- Ecl (184; 28% of non-empty Form): bhfuil, raibh, mbeidh, mbeadh, mbíonn, ndeachaigh, bhfuair, dtiocfadh, mbaineann, bhfaca
- Emp (3; 0% of non-empty Form): Creidimidne, atáimse, deirimse
- Len (463; 71% of non-empty Form): bhí, bheidh, raibh, bhíonn, thug, tháinig, bheadh, bhaineann, chaith, bhain
| Paradigm bí | Ecl | Emp | Len |
|---|---|---|---|
| _ | mbíonn | | bhíonn |
| Mood=Cnd | mbeadh | | bheadh |
| Mood=Cnd\|Number=Sing\|Person=1 | mbeinn | | |
| Mood=Cnd\|Number=Sing\|Person=2 | | | bheifeá |
| Mood=Cnd\|Number=Plur\|Person=3 | | | bheidís |
| Mood=Imp\|Number=Sing\|Person=1\|Tense=Past | mbínn | | |
| Mood=Imp\|Tense=Past | mbíodh | | bhíodh |
| Mood=Ind\|Number=Sing\|Person=1\|PronType=Rel\|Tense=Pres | | atáimse | |
| Mood=Ind\|Number=Sing\|Person=1\|Tense=Past | rabhas | | bhíos |
| Mood=Ind\|Number=Plur\|Person=1\|Tense=Fut | | | bheimid |
| Mood=Ind\|Number=Plur\|Person=1\|Tense=Past | | | Bhíomar |
| Mood=Ind\|Number=Plur\|Person=1\|Tense=Pres | bhfuilimid | | |
| Mood=Ind\|Number=Plur\|Person=3\|Tense=Past | | | bhíodar |
| Mood=Ind\|Polarity=Neg\|Tense=Fut | mbeidh | | bheidh |
| Mood=Ind\|Polarity=Neg\|Tense=Fut\|Voice=Auto | mbeifear | | |
| Mood=Ind\|Polarity=Neg\|Tense=Past | raibh | | |
| Mood=Ind\|Polarity=Neg\|Tense=Past\|Voice=Auto | rabhthas | | |
| Mood=Ind\|Polarity=Neg\|Tense=Pres | bhfuil | | |
| Mood=Ind\|Tense=Fut | mbeidh | | bheidh |
| Mood=Ind\|Tense=Past | raibh | | bhí |
| Mood=Ind\|Tense=Pres | bhfuil, fuil | | |
| Polarity=Neg | | | bhíonn |
ADJ
57 ga-pos/ADJ tokens (8% of all ADJ tokens) have a non-empty value of Form.
The most frequent other feature values with which ADJ and Form co-occurred: VerbForm=EMPTY (56; 98%), Case=EMPTY (54; 95%), Number=EMPTY (54; 95%), Gender=EMPTY (54; 95%), Degree=Pos (52; 91%).
ADJ tokens may have the following values of Form:
- Ecl (1; 2% of non-empty Form): gcéanna
- HPref (13; 23% of non-empty Form): háirithe, hamháin, hiontach, han-luath, hiomlán, hálainn, héifeachtúil
- Len (43; 75% of non-empty Form): cheart, Bhriotáineach, chóir, mhaith, chultúrtha, chéanna, mhó, mhór, Bhig, Bhunaithe
| Paradigm céanna | Ecl | Len |
|---|---|---|
| | gcéanna | chéanna, céanna |
Form seems to be a lexical feature of ADJ. 97% of lemmas (31) occur with only one value of Form.
PROPN
45 ga-pos/PROPN tokens (9% of all PROPN tokens) have a non-empty value of Form.
The most frequent other feature values with which PROPN and Form co-occurred: Number=Sing (35; 78%), Gender=Masc (31; 69%), Case=Gen (30; 67%).
PROPN tokens may have the following values of Form:
- Ecl (10; 22% of non-empty Form): mBaile, Mí, gCuan, gCuideachtaí, nGaillimh, nGearmáin
- HPref (2; 4% of non-empty Form): h-Íde, mí
- Len (33; 73% of non-empty Form): Bhaile, Mháire, Chill, Chonamara, Ghaeilge, Bhríde, Cholm, Chonaill, Dhonncha, Edward
| Paradigm Baile | Ecl | Len |
|---|---|---|
| Number=Sing | | Bhaile |
| Number=Plur | mBaile | |
Form seems to be a lexical feature of PROPN. 94% of lemmas (30) occur with only one value of Form.
AUX
25 ga-pos/AUX tokens (11% of all AUX tokens) have a non-empty value of Form.
The most frequent other feature values with which AUX and Form co-occurred: VerbForm=Cop (25; 100%), PronType=EMPTY (22; 88%), Polarity=EMPTY (22; 88%), Tense=Past (18; 72%).
AUX tokens may have the following values of Form:
- VF (25; 100% of non-empty Form): b’, gurb, ab, gurbh, níorbh, mb’, nárbh
NUM
20 ga-pos/NUM tokens (11% of all NUM tokens) have a non-empty value of Form.
The most frequent other feature values with which NUM and Form co-occurred: NumType=Card (11; 55%).
NUM tokens may have the following values of Form:
- Ecl (3; 15% of non-empty Form): gceithre, gcéad
- HPref (2; 10% of non-empty Form): haon
- Len (15; 75% of non-empty Form): chéad, dhó, cheithre, sheacht, sheasca, thrí
| Paradigm céad | Ecl | Len |
|---|---|---|
| NumType=Card | | chéad |
| NumType=Ord | gcéad | chéad |
DET
6 ga-pos/DET tokens (0% of all DET tokens) have a non-empty value of Form.
The most frequent other feature values with which DET and Form co-occurred: Gender=EMPTY (6; 100%), Number=EMPTY (6; 100%), Case=EMPTY (6; 100%), Definite=Def (4; 67%), PronType=EMPTY (4; 67%).
DET tokens may have the following values of Form:
- Ecl (4; 67% of non-empty Form): ngach, gach
- HPref (2; 33% of non-empty Form): haon
ADP
3 ga-pos/ADP tokens (0% of all ADP tokens) have a non-empty value of Form.
The most frequent other feature values with which ADP and Form co-occurred: Gender=EMPTY (3; 100%), PronType=EMPTY (3; 100%), Number=Plur (2; 67%), Person=1 (2; 67%).
ADP tokens may have the following values of Form:
- Len (3; 100% of non-empty Form): dhom, dhíobh, dhúinn
PRON
2 ga-pos/PRON tokens (0% of all PRON tokens) have a non-empty value of Form.
The most frequent other feature values with which PRON and Form co-occurred: Gender=EMPTY (2; 100%).
PRON tokens may have the following values of Form:
- Len (2; 100% of non-empty Form): cheachtar, thusa
X
1 ga-pos/X tokens (1% of all X tokens) have a non-empty value of Form.
The most frequent other feature values with which X and Form co-occurred: Dialect=Munster (1; 100%), Abbr=EMPTY (1; 100%), PronType=EMPTY (1; 100%).
X tokens may have the following values of Form:
- Len (1; 100% of non-empty Form): chuireas
ADV
1 ga-pos/ADV tokens (0% of all ADV tokens) have a non-empty value of Form.
ADV tokens may have the following values of Form:
- Len (1; 100% of non-empty Form): Thuaidh
PART
1 ga-pos/PART tokens (0% of all PART tokens) have a non-empty value of Form.
The most frequent other feature values with which PART and Form co-occurred: PronType=Rel (1; 100%), PartType=Vb (1; 100%), Polarity=EMPTY (1; 100%).
PART tokens may have the following values of Form:
- Ecl (1; 100% of non-empty Form): n-a
Relations with Agreement in Form
The 10 most frequent relations where parent and child node agree in Form:
VERB –[conj]–> VERB (34; 61%),
PRON –[appos]–> NOUN (2; 100%),
ADJ –[advcl]–> ADJ (1; 100%),
X –[advcl]–> VERB (1; 100%),
ADJ –[ccomp]–> ADJ (1; 100%).