PROPN
: proper noun
Description
A proper noun is a noun that is the name of a specific individual, place, object or organisation. In Irish, proper nouns always have initial capitalisation.
Personal names are treated as a sequence of proper nouns. Note that some Irish names have name particles, such as Mac, Ó, Ní, etc., that form part of this sequence (e.g. Anne-Marie Nic Dhonncha).
Similarly, placenames can occur as a string of proper nouns (e.g. Baile Átha Cliath “Dublin”), as can organisations (e.g. an iris Irish Computer “the Irish Computer magazine”). Sometimes these strings have an internal structure containing other parts of speech, such as determiners (e.g. Parlaimint na hEorpa “the European Parliament”).
When initial mutation occurs with proper nouns in Irish, the letters added by the mutation are lowercase, while the base form retains its initial capitalisation (e.g. i mBaile Átha Cliath “in Dublin”). Similarly, some titles can have lower-case prefixes (e.g. an t-iar-Ghobharnóir “the former Governor”).
Note that days of the week and months of the year in Irish, while capitalised, are not marked as proper nouns but as common nouns instead.
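The capitalisation pattern described above (a lowercase prefix followed by a capitalised base form) can be illustrated with a small heuristic. This is only a sketch: the function name `split_lowercase_prefix` is hypothetical, and it simply cuts the token at its first uppercase letter. It does not recognise lenition (e.g. Bhaile), where the added h follows the capital letter.

```python
from typing import Tuple


def split_lowercase_prefix(token: str) -> Tuple[str, str]:
    """Split a token into (lowercase prefix, remainder starting at the first capital).

    Hypothetical helper for illustration only; not part of any UD tool.
    """
    for i, ch in enumerate(token):
        if ch.isupper():
            return token[:i], token[i:]
    # No capital letter found: treat the whole token as the base form.
    return "", token


if __name__ == "__main__":
    for word in ["mBaile", "hEorpa", "t-iar-Ghobharnóir", "Baile"]:
        print(word, split_lowercase_prefix(word))
```

A token such as Baile, which already starts with a capital, comes back with an empty prefix.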
Examples
- Lá ‘le Pádraig “St. Patrick’s Day”
- Co. Chiarraí “Co. Kerry”
- Eoraip “Europe”
- Michael D. Higgins
- Lucht Oibre “Labour Party”
Treebank Statistics (UD_Irish)
There are 353 PROPN lemmas (12%), 371 PROPN types (9%) and 474 PROPN tokens (3%).
Out of 16 observed tags, the rank of PROPN is: 2 in number of lemmas, 4 in number of types and 10 in number of tokens.
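Lemma, type and token counts of this kind can be derived from a treebank in CoNLL-U format, where each token line has ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch, not the script used to generate this page. The helper names `row` and `pos_counts` are hypothetical, and the two sample sentences and their annotation are invented for illustration rather than taken from UD_Irish.

```python
def row(idx, form, lemma, upos, head, deprel):
    """Build one 10-column CoNLL-U token line (unused columns set to '_')."""
    return "\t".join([str(idx), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"])


# Invented sample; the heads and relations are illustrative assumptions.
SAMPLE = "\n".join([
    "# text = i mBaile Átha Cliath",
    row(1, "i", "i", "ADP", 2, "case"),
    row(2, "mBaile", "Baile", "PROPN", 0, "root"),
    row(3, "Átha", "Átha", "PROPN", 2, "flat"),
    row(4, "Cliath", "Cliath", "PROPN", 2, "flat"),
    "",
    "# text = Baile Átha Cliath",
    row(1, "Baile", "Baile", "PROPN", 0, "root"),
    row(2, "Átha", "Átha", "PROPN", 1, "flat"),
    row(3, "Cliath", "Cliath", "PROPN", 1, "flat"),
])


def pos_counts(conllu_text, upos="PROPN"):
    """Return (distinct lemmas, distinct word forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


if __name__ == "__main__":
    print(pos_counts(SAMPLE))
```

In the sample, mBaile and Baile count as two types but share the single lemma Baile, which is why the type count exceeds the lemma count.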
The 10 most frequent PROPN lemmas: Gaeilge, Éire, Baile, Frainc, Seán, Átha, Cliath, Eoraip, Gaillimh, Éireannach
The 10 most frequent PROPN types: Gaeilge, Seán, Fraince, Átha, Éirinn, Cliath, Ghaeilge, mBaile, Bhaile, Chaoin
The 10 most frequent ambiguous lemmas: Cliath (PROPN 5, NOUN 1), Éireannach (PROPN 4, ADJ 1), Gearmáin (PROPN 3, NOUN 1), Eaglais (PROPN 2, NOUN 1), Muire (PROPN 2, NOUN 1), mí (NOUN 6, PROPN 2), Bean (PROPN 1, NOUN 1), Mór (ADJ 2, PROPN 1), gleann (NOUN 1, PROPN 1), inis (VERB 2, PROPN 1)
The 10 most frequent ambiguous types: Cliath (PROPN 5, NOUN 1), Eaglais (PROPN 2, NOUN 1), Fianna (PROPN 2, NOUN 1), mí (NOUN 2, PROPN 1), Ard (ADJ 2, PROPN 1), Bean (PROPN 1, NOUN 1), Muire (NOUN 1, PROPN 1), don (ADP 25, PROPN 1), Éireannach (ADJ 1, PROPN 1)
- Cliath
- PROPN 5: ) , Bhuail Éire iad dhá uair i gcluiche coimhlinteach , i mBaile Átha Cliath i 1965 agus i 1989 , 1-0 an dá uair .
- NOUN 1: TOGRA IONAID - Eolas a chur ar fáil ar fholúntais fostaíochta le FÁS ; soláthraítear cúrsaí chomh maith trí Bhoth Eolais ( WATIS ) , atá nasctha le hoifig SEIRBHÍSÍ CEANTAIR áitiúil FÁS agus le príomh-cheanncheathrú FÁS i mBaile Áth Cliath .
- Eaglais
- PROPN 2: Go deimhin úsáideadh dílseacht don Eaglais Bhunaithe mar ghléas le leatrom a dhéanamh ar Chaitlicigh agus le daoine áirithe a iompú in aghaidh na hEaglaise inar rugadh iad .
- NOUN 1: Ní hé amháin sin ach bhí de thoradh ar an dá bheart fadbhreathnaitheacha sin aige gur mhaolaigh ar naimhdeas na Fraince don Eaglais .
- Fianna
- mí
- Ard
- ADJ 2: Tá an teach a bhí ag an mhinistéir ansin , ‘ An Lios ‘ , an teach inar thug Dúbhglas de h-Íde an chuid ba mhó dá óige , ina sheasamh fós , timpeall céad slat isteach ón bhóthar idir Dún Gar agus Bealach an Doirín agus ceathrú mhíle ó shéipéal Eaglais na hÉireann a raibh athair Dhúbhglais chun a bheith ina reachtaire ann ar an Phortach Ard .
- PROPN 1: Orthu siúd bhí John Baptist Crozier , eaglaiseach a bhí ina Ardeaspag ar Ard Mhacha i ndeireadh a shaoil .
- Bean
- Muire
- don
- Éireannach
- ADJ 1: Chuimhneofaí ar an Reibiliúnaí Éireannach mar laoch-ghunnadóir , miongháire doiléir os cionn cóta trinse , ag stánadh amach as an stair ar an ngrianghrafadóir fadó , fadó .
- PROPN 1: Má tá boird stáit agus comhlachtai stáir in Éirinn ag dul ag fógairt mar ‘ phacáiste ‘ ar leith d’ Éirinn , má tá RTÉ ag dul i gcomhar le Sky maidir le dáileadh na mbealach teilifís Éireannach ar bhosca digiteach Sky ar fud na tíre ( cinneadh atá tar éis an-gheit a bhaint as oifig an Aire de Valera , deirtear linn ) níor cheart cur suas leis an neamhaird ar leith a dhéananna meáin na Sasasanach ar Éirinn de ghnáth .
Morphology
The form / lemma ratio of PROPN is 1.050992 (the average of all parts of speech is 1.393750).
The 1st highest number of forms (4) was observed with the lemma “Éire”: hÉireann, Éire, Éireann, Éirinn.
The 2nd highest number of forms (3) was observed with the lemma “Baile”: Baile, Bhaile, mBaile.
The 3rd highest number of forms (3) was observed with the lemma “Gaeilge”: GHAEILGE, Gaeilge, Ghaeilge.
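The form / lemma ratio reported above is consistent with dividing the number of PROPN types (371) by the number of PROPN lemmas (353) given in the treebank statistics. A short check, assuming that this is how the ratio is defined (the function name `form_lemma_ratio` is hypothetical):

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


if __name__ == "__main__":
    # Counts taken from the treebank statistics for PROPN.
    print(f"{form_lemma_ratio(371, 353):.6f}")  # prints 1.050992
```

A ratio close to 1 means that most proper-noun lemmas appear in only one surface form in the treebank.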
PROPN occurs with 7 features: ga-feat/Gender (448; 95% instances), ga-feat/Case (445; 94% instances), ga-feat/Number (428; 90% instances), ga-feat/Form (45; 9% instances), ga-feat/Definite (29; 6% instances), ga-feat/Abbr (1; 0% instances), ga-feat/NounType (1; 0% instances)
PROPN occurs with 14 feature-value pairs: Abbr=Yes, Case=Dat, Case=Gen, Case=NomAcc, Case=Voc, Definite=Def, Form=Ecl, Form=HPref, Form=Len, Gender=Fem, Gender=Masc, NounType=Weak, Number=Plur, Number=Sing
PROPN occurs with 31 feature combinations.
The most frequent feature combination is Case=NomAcc|Gender=Masc|Number=Sing (265 tokens).
Examples: Seán, Naomh, Pádraig, Baile, Bhlascaoid, Briain, Chorcaí, Dhúbhglais, Dochartaigh, Eoin
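A feature combination such as the one above is stored in the FEATS column of a CoNLL-U file as `Name=Value` pairs joined by `|`, with `_` standing for an empty feature set. A minimal sketch of reading such a string into a dictionary (the function name `parse_feats` is hypothetical):

```python
def parse_feats(feats: str) -> dict:
    """Turn a CoNLL-U FEATS string into a {feature: value} dictionary."""
    if feats == "_":
        return {}  # '_' marks a token without morphological features
    return dict(pair.split("=", 1) for pair in feats.split("|"))


if __name__ == "__main__":
    print(parse_feats("Case=NomAcc|Gender=Masc|Number=Sing"))
```

Counting how often each distinct string occurs on PROPN tokens would give the feature-combination frequencies listed above.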
Relations
PROPN nodes are attached to their parents using 16 different relations: ga-dep/compound (167; 35% instances), ga-dep/nsubj (60; 13% instances), ga-dep/nmod (59; 12% instances), ga-dep/flat:name (56; 12% instances), ga-dep/obl (45; 9% instances), ga-dep/conj (19; 4% instances), ga-dep/flat (17; 4% instances), ga-dep/appos (14; 3% instances), ga-dep/root (11; 2% instances), ga-dep/obj (8; 2% instances), ga-dep/xcomp:pred (7; 1% instances), ga-dep/vocative (6; 1% instances), ga-dep/advmod (2; 0% instances), ga-dep/amod (1; 0% instances), ga-dep/case (1; 0% instances), ga-dep/det (1; 0% instances)
Parents of PROPN nodes belong to 13 different parts of speech: NOUN (188; 40% instances), PROPN (122; 26% instances), VERB (110; 23% instances), PART (13; 3% instances), ROOT (11; 2% instances), ADJ (10; 2% instances), ADP (8; 2% instances), X (5; 1% instances), PRON (3; 1% instances), AUX (1; 0% instances), CCONJ (1; 0% instances), NUM (1; 0% instances), PUNCT (1; 0% instances)
191 (40%) PROPN nodes are leaves.
159 (34%) PROPN nodes have one child.
82 (17%) PROPN nodes have two children.
42 (9%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 6.
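The child-degree figures above add up to the total number of PROPN tokens (191 + 159 + 82 + 42 = 474). The following sketch shows how such a distribution can be computed from the HEAD column of a dependency tree. The function name `child_degrees` is hypothetical, and the sample sentence with its heads is invented for illustration rather than taken from UD_Irish.

```python
from collections import Counter


def child_degrees(rows, upos="PROPN"):
    """Map child count -> number of nodes with that tag having that many children.

    rows: list of (token_id, upos_tag, head_id) tuples for a single sentence.
    """
    # Number of dependents per head id; a Counter returns 0 for ids with none.
    n_children = Counter(head for _, _, head in rows)
    return Counter(n_children[tok_id] for tok_id, tag, _ in rows if tag == upos)


if __name__ == "__main__":
    # Invented analysis of "i mBaile Átha Cliath": token 2 heads tokens 1, 3 and 4.
    sentence = [(1, "ADP", 2), (2, "PROPN", 0), (3, "PROPN", 2), (4, "PROPN", 2)]
    print(dict(child_degrees(sentence)))
```

In this sample, two PROPN nodes are leaves and one has three children.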
Children of PROPN nodes are attached using 25 different relations: ga-dep/case (94; 20% instances), ga-dep/det (78; 17% instances), ga-dep/compound (64; 14% instances), ga-dep/flat:name (61; 13% instances), ga-dep/punct (46; 10% instances), ga-dep/nmod (19; 4% instances), ga-dep/conj (18; 4% instances), ga-dep/flat (17; 4% instances), ga-dep/appos (13; 3% instances), ga-dep/amod (12; 3% instances), ga-dep/cc (11; 2% instances), ga-dep/advmod (5; 1% instances), ga-dep/mark (5; 1% instances), ga-dep/case:voc (4; 1% instances), ga-dep/cop (3; 1% instances), ga-dep/csubj:cleft (3; 1% instances), ga-dep/ccomp (2; 0% instances), ga-dep/nsubj (2; 0% instances), ga-dep/nummod (2; 0% instances), ga-dep/acl:relcl (1; 0% instances), ga-dep/advcl (1; 0% instances), ga-dep/mark:prt (1; 0% instances), ga-dep/obj (1; 0% instances), ga-dep/obl:prep (1; 0% instances), ga-dep/vocative (1; 0% instances)
Children of PROPN nodes belong to 15 different parts of speech: PROPN (122; 26% instances), ADP (98; 21% instances), DET (77; 17% instances), PUNCT (46; 10% instances), NOUN (45; 10% instances), PART (21; 5% instances), CCONJ (16; 3% instances), ADJ (11; 2% instances), ADV (6; 1% instances), PRON (6; 1% instances), VERB (6; 1% instances), X (5; 1% instances), AUX (3; 1% instances), NUM (2; 0% instances), SCONJ (1; 0% instances)