
This page still pertains to UD version 1.

Form: Form

A characteristic of Irish is initial mutation, which occurs in certain circumstances. It is triggered by the preceding word and affects the spelling of nouns, adjectives and verbs. Nouns in Irish are divided into classes according to the way they are inflected to form the genitive singular. There are five such noun classes, or declensions (The Christian Brothers, 1994).

Ecl : eclipsis

This feature occurs when the initial consonant or vowel of a word is eclipsed by a prefixed consonant. The prefixed consonant is either a voiced consonant in place of a voiceless one (e.g. /tʲ/ → /dʲ/, /k/ → /g/) or a nasal consonant in place of a voiced one (e.g. /dʲ/ → /nʲ/, /g/ → /ŋ/). Vowels are eclipsed by adding n- or t-.

Not every consonant can experience eclipsis. The consonants that can be eclipsed in Irish are: p, b, t, d, c, g and f.
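
The spelling side of eclipsis can be sketched in a few lines of Python. This is an illustration only, not the procedure used to annotate the treebank. The function name eclipse and the prefix table are assumptions based on standard Irish orthography, and only the n- prefix is modelled for vowels.

```python
# Prefix written before each of the seven eclipsable consonants
# (assumption: standard orthography, e.g. b -> mb, f -> bhf).
ECLIPSIS_PREFIX = {
    "p": "b", "b": "m", "t": "d", "d": "n", "c": "g", "g": "n", "f": "bh",
}
VOWELS = "aeiouáéíóú"


def eclipse(word: str) -> str:
    """Return the eclipsed spelling of word, or word unchanged if it cannot be eclipsed."""
    if not word:
        return word
    first = word[0]
    lower = first.lower()
    if lower in ECLIPSIS_PREFIX:
        # The prefix stays lower case even before a capital: Baile -> mBaile.
        return ECLIPSIS_PREFIX[lower] + word
    if lower in VOWELS:
        # Assumption: only n- is modelled; a hyphen is written before a lower-case vowel.
        return "n" + word if first.isupper() else "n-" + word
    return word


print(eclipse("Baile"))   # mBaile
print(eclipse("fuil"))    # bhfuil
print(eclipse("céanna"))  # gcéanna
```

The function covers spelling only. Whether eclipsis applies at all depends on the preceding word, which is not modelled here.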

Eclipsis will happen in a number of environments:

Examples

Emp : emphatic

The emphatic form is a special form a word takes to mark emphasis in Irish.

Examples

Len : lenition

Lenition is by far the most common type of initial mutation in the treebank. When a word is lenited, h is inserted immediately after its initial consonant.
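
As a rough sketch, the spelling change can be written as a small Python function. The name lenite, the set of lenitable consonants and the exception for s before c, f, m, p and t are assumptions taken from standard Irish grammar, not from this page.

```python
# Consonants that can take a following h (assumption: standard orthography).
LENITABLE = set("bcdfgmpst")


def lenite(word: str) -> str:
    """Insert h after the initial consonant, or return word unchanged."""
    if len(word) < 2:
        return word
    first, second = word[0], word[1].lower()
    if first.lower() not in LENITABLE:
        return word  # vowels and consonants such as l, n, r are not lenited
    if second == "h":
        return word  # already lenited
    if first.lower() == "s" and second in "cfmpt":
        return word  # assumption: sc-, sf-, sm-, sp-, st- resist lenition
    return first + "h" + word[1:]


print(lenite("Baile"))   # Bhaile
print(lenite("céanna"))  # chéanna
print(lenite("bíonn"))   # bhíonn
```

Context-dependent exceptions, which depend on the triggering word, are deliberately left out.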

These are some of the environments that trigger lenition:

Examples

HPref : h-prefix

When two vowels come together in Irish, a h-prefix is inserted before the second vowel in order to simplify pronunciation.
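
The spelling step can be sketched in Python as follows. The function name h_prefix and the vowel inventory (including the accented vowels) are assumptions for illustration, and the contexts that trigger the prefix are not modelled.

```python
VOWELS = "aeiouáéíóú"


def h_prefix(word: str) -> str:
    """Prefix h to a vowel-initial word; leave any other word unchanged."""
    if word and word[0].lower() in VOWELS:
        # The h stays lower case before a capital vowel: Éireann -> hÉireann.
        return "h" + word
    return word


print(h_prefix("alt"))      # halt
print(h_prefix("Éireann"))  # hÉireann
print(h_prefix("baile"))    # baile
```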

Examples

VF : Vowel form

Vowel form is an indicator of spelling changes that occur in copular verbs when followed by a word that begins with a vowel or a lenited consonant.

Examples


Treebank Statistics (UD_Irish)

This feature is language-specific. It occurs with 5 different values: Ecl, Emp, HPref, Len, VF.

1737 tokens (13%) have a non-empty value of Form. 950 types (25%) occur at least once with a non-empty value of Form. 692 lemmas (25%) occur at least once with a non-empty value of Form. The feature is used with 12 part-of-speech tags: ga-pos/NOUN (926; 7% instances), ga-pos/VERB (650; 5% instances), ga-pos/ADJ (57; 0% instances), ga-pos/PROPN (45; 0% instances), ga-pos/AUX (25; 0% instances), ga-pos/NUM (20; 0% instances), ga-pos/DET (6; 0% instances), ga-pos/ADP (3; 0% instances), ga-pos/PRON (2; 0% instances), ga-pos/ADV (1; 0% instances), ga-pos/PART (1; 0% instances), ga-pos/X (1; 0% instances).
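
Counts of this kind can be recomputed from a CoNLL-U file. The sketch below is not the script that generated this page. It assumes the standard ten-column CoNLL-U layout (UPOS in column 4, FEATS in column 6), and the three sample rows are invented for illustration, with only the columns relevant here filled in.

```python
from collections import Counter


def parse_feats(feats: str) -> dict:
    """Turn a FEATS string such as 'Form=Len|Number=Sing' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def form_counts(conllu_text: str):
    """Return (total tokens, tokens with Form per UPOS, tokens per Form value)."""
    by_pos, by_value, total = Counter(), Counter(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        total += 1
        form = parse_feats(cols[5]).get("Form")
        if form:
            by_pos[cols[3]] += 1
            by_value[form] += 1
    return total, by_pos, by_value


# Invented sample rows (illustrative only, not taken from the treebank).
rows = [
    ["1", "bhfuil", "bí", "VERB", "_", "Form=Ecl|Mood=Ind|Tense=Pres", "_", "_", "_", "_"],
    ["2", "Bhaile", "Baile", "PROPN", "_", "Form=Len|Number=Sing", "_", "_", "_", "_"],
    ["3", "an", "an", "DET", "_", "_", "_", "_", "_", "_"],
]
sample = "\n".join("\t".join(row) for row in rows)
total, by_pos, by_value = form_counts(sample)
print(total, dict(by_pos), dict(by_value))
```

Dividing each per-tag count by the total number of tokens gives percentages of the kind reported above.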

NOUN

926 ga-pos/NOUN tokens (25% of all NOUN tokens) have a non-empty value of Form.

The most frequent other feature values with which NOUN and Form co-occurred: Definite=EMPTY (926; 100%), VerbForm=EMPTY (731; 79%), Number=Sing (646; 70%), Case=NomAcc (634; 68%), Gender=Masc (474; 51%).

NOUN tokens may have the following values of Form:

Paradigm alt:
Ecl | HPref | Len
bhfo-alt | halt | fho-alt, alt

VERB

650 ga-pos/VERB tokens (59% of all VERB tokens) have a non-empty value of Form.

The most frequent other feature values with which VERB and Form co-occurred: Voice=EMPTY (598; 92%), Mood=Ind (547; 84%), Tense=Past (382; 59%).

VERB tokens may have the following values of Form:

Paradigm (forms listed in column order Ecl, Emp, Len; cells separated by semicolons):
_: mbíonn; bhíonn
Mood=Cnd: mbeadh; bheadh
Mood=Cnd|Number=Sing|Person=1: mbeinn
Mood=Cnd|Number=Sing|Person=2: bheifeá
Mood=Cnd|Number=Plur|Person=3: bheidís
Mood=Imp|Number=Sing|Person=1|Tense=Past: mbínn
Mood=Imp|Tense=Past: mbíodh; bhíodh
Mood=Ind|Number=Sing|Person=1|PronType=Rel|Tense=Pres: atáimse
Mood=Ind|Number=Sing|Person=1|Tense=Past: rabhas; bhíos
Mood=Ind|Number=Plur|Person=1|Tense=Fut: bheimid
Mood=Ind|Number=Plur|Person=1|Tense=Past: Bhíomar
Mood=Ind|Number=Plur|Person=1|Tense=Pres: bhfuilimid
Mood=Ind|Number=Plur|Person=3|Tense=Past: bhíodar
Mood=Ind|Polarity=Neg|Tense=Fut: mbeidh; bheidh
Mood=Ind|Polarity=Neg|Tense=Fut|Voice=Auto: mbeifear
Mood=Ind|Polarity=Neg|Tense=Past: raibh
Mood=Ind|Polarity=Neg|Tense=Past|Voice=Auto: rabhthas
Mood=Ind|Polarity=Neg|Tense=Pres: bhfuil
Mood=Ind|Tense=Fut: mbeidh; bheidh
Mood=Ind|Tense=Past: raibh; bhí
Mood=Ind|Tense=Pres: bhfuil, fuil
Polarity=Neg: bhíonn

ADJ

57 ga-pos/ADJ tokens (8% of all ADJ tokens) have a non-empty value of Form.

The most frequent other feature values with which ADJ and Form co-occurred: VerbForm=EMPTY (56; 98%), Case=EMPTY (54; 95%), Number=EMPTY (54; 95%), Gender=EMPTY (54; 95%), Degree=Pos (52; 91%).

ADJ tokens may have the following values of Form:

Paradigm céanna:
Ecl | Len
gcéanna | chéanna, céanna

Form seems to be a lexical feature of ADJ. 97% of lemmas (31) occur with only one value of Form.

PROPN

45 ga-pos/PROPN tokens (9% of all PROPN tokens) have a non-empty value of Form.

The most frequent other feature values with which PROPN and Form co-occurred: Number=Sing (35; 78%), Gender=Masc (31; 69%), Case=Gen (30; 67%).

PROPN tokens may have the following values of Form:

Paradigm Baile | Ecl | Len
Number=Sing | | Bhaile
Number=Plur | mBaile |

Form seems to be a lexical feature of PROPN. 94% of lemmas (30) occur with only one value of Form.

AUX

25 ga-pos/AUX tokens (11% of all AUX tokens) have a non-empty value of Form.

The most frequent other feature values with which AUX and Form co-occurred: VerbForm=Cop (25; 100%), PronType=EMPTY (22; 88%), Polarity=EMPTY (22; 88%), Tense=Past (18; 72%).

AUX tokens may have the following values of Form:

NUM

20 ga-pos/NUM tokens (11% of all NUM tokens) have a non-empty value of Form.

The most frequent other feature values with which NUM and Form co-occurred: NumType=Card (11; 55%).

NUM tokens may have the following values of Form:

Paradigm céad | Ecl | Len
NumType=Card | | chéad
NumType=Ord | gcéad | chéad

DET

6 ga-pos/DET tokens (0% of all DET tokens) have a non-empty value of Form.

The most frequent other feature values with which DET and Form co-occurred: Gender=EMPTY (6; 100%), Number=EMPTY (6; 100%), Case=EMPTY (6; 100%), Definite=Def (4; 67%), PronType=EMPTY (4; 67%).

DET tokens may have the following values of Form:

ADP

3 ga-pos/ADP tokens (0% of all ADP tokens) have a non-empty value of Form.

The most frequent other feature values with which ADP and Form co-occurred: Gender=EMPTY (3; 100%), PronType=EMPTY (3; 100%), Number=Plur (2; 67%), Person=1 (2; 67%).

ADP tokens may have the following values of Form:

PRON

2 ga-pos/PRON tokens (0% of all PRON tokens) have a non-empty value of Form.

The most frequent other feature values with which PRON and Form co-occurred: Gender=EMPTY (2; 100%).

PRON tokens may have the following values of Form:

X

1 ga-pos/X token (1% of all X tokens) has a non-empty value of Form.

The most frequent other feature values with which X and Form co-occurred: Dialect=Munster (1; 100%), Abbr=EMPTY (1; 100%), PronType=EMPTY (1; 100%).

X tokens may have the following values of Form:

ADV

1 ga-pos/ADV token (0% of all ADV tokens) has a non-empty value of Form.

ADV tokens may have the following values of Form:

PART

1 ga-pos/PART token (0% of all PART tokens) has a non-empty value of Form.

The most frequent other feature values with which PART and Form co-occurred: PronType=Rel (1; 100%), PartType=Vb (1; 100%), Polarity=EMPTY (1; 100%).

PART tokens may have the following values of Form:

Relations with Agreement in Form

The 10 most frequent relations where parent and child node agree in Form: VERB –[conj]–> VERB (34; 61%), PRON –[appos]–> NOUN (2; 100%), ADJ –[advcl]–> ADJ (1; 100%), X –[advcl]–> VERB (1; 100%), ADJ –[ccomp]–> ADJ (1; 100%).

