Orthography

L-Urćographie

3.1 History of Written Tunisian

The history of Tunisian begins with the Roman conquest of Carthage in 146 BC. With the annexation of northern Africa into the Roman Empire came the introduction of Latin as the language of administration and learning. Classical Latin became the dominant written language in the region, though Vulgar Latin was the dominant language amongst the general population. The Crisis of the Third Century brought about a sudden decline in literacy in the region; it is from this era that we begin to see the first written traces of North African Latin in the form of misspellings in graffiti and occasional use of words of Berber origin. There is very little written evidence from the era of Vandal occupation in the fifth and early sixth centuries. When the Eastern Roman Empire reconquered the province in 533, Latin would presumably have returned to its former social status, though few texts seem to have survived.

The Arab conquest of Tunisia in 698 AD brought about significant changes to the region. Though Latin continued to be used in religious contexts and was widely used by the educated elite alongside Arabic (particularly due to the sizable Christian population and persistent trade relations with Europe), knowledge of formal Latin amongst the lower classes was virtually nonexistent, and they would write in their own Romance vernacular using Arabic script. Thus, uniquely amongst the Romance languages, the first texts that are clearly written in Old Tunisian were written using the Arabic alphabet rather than the Roman one.

In the 12th century, Christian crusades resulted in the establishment of several Christian city-states along the Tunisian coastline. Though these states were governed by the Normans, Middle Tunisian was the language of daily life, and a deliberate effort to expunge traces of Muslim rule led to the re-emergence of Latin script. The spelling of this time was heavily influenced by Middle French. The unification of Tunisia under a Christian king in the 16th century began a new era of Middle Tunisian as a prestige language, and educated writers, usually being multilingual, introduced a number of French and Italian spelling conventions into the language.

Tunisian spelling was first codified formally in the early 18th century by L-Ucdimie Regale yt Tenès, the Royal Academy of Tunisia (then spelled Lu Acadimie Regale de Tenes), which produced a standard heavily influenced by Latin and French pronunciations. In 1803, a major reform resulted in a new spelling system with a much closer correspondence to the spoken language, particularly with respect to vowels. A number of minor changes have taken place since then, but the modern orthography is essentially the same as the one created by the 1803 reform.

3.2 Modern Tunisian Alphabet

The modern Tunisian alphabet consists of 27 letters: A B C Ć Ç D E F G H Ħ I J L M N O P Q R S T U V Y Z Ź. Tunisian also makes use of a number of letters that are not part of the alphabet: Á, É, È, Í, Ó, Ú; these are considered variants of the basic letterform they derive from, so that a dictionary would include words beginning with both A and Á in the 'A' section.
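The dictionary convention described above can be sketched as a sort key. This is only an illustrative sketch: the names ALPHABET, FOLD, and collation_key are hypothetical, the tie-breaking between a word and its accented twin is not specified in the text, and characters outside the 27-letter alphabet are simply assumed to sort last.

```python
# Alphabet order as listed in the text (27 letters, lowercase).
ALPHABET = "abcćçdefghħijlmnopqrstuvyzź"

# Accented vowels are filed under their base letter (assumption: è folds to e,
# like é, since both are described as variants of the basic letterform).
FOLD = str.maketrans("áéèíóú", "aeeiou")

# Position of each letter in the alphabet.
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def collation_key(word):
    """Return a list of alphabet positions usable as a sort key."""
    folded = word.lower().translate(FOLD)
    # Unknown characters (hyphens, foreign letters) sort after ź in this sketch.
    return [RANK.get(ch, len(ALPHABET)) for ch in folded]


words = ["ázulí", "baç", "araq", "accé", "áźe", "cude", "ćéatr", "çeld", "dec"]
ordered = sorted(words, key=collation_key)
print(ordered)
```

With this key, words in á are interleaved with words in a, while ć and ç sort as separate letters after c.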

3.3 Spelling and Orthographic Conventions

3.3.1 Representation of Vowels

Tunisian vowel representation is for the most part quite regular. The letters a, i, o, and u correspond to the short vowels /a i o u/. An acute accent marks a long vowel, so á, é, í, ó, and ú represent /aː eː iː oː uː/. However, given that the contrast between /a/ and /aː/ has been neutralized in non-word-final position in most dialects, the distribution of orthographic a and á is mostly historic and by and large must be memorized.

The representation of /e/ and /ə/ is more complicated. The acute form é always represents /eː/ (since there is no long schwa), while the grave accent è marks short /e/ unambiguously. However, not all /e/ is spelled with è, meaning that e can represent both /e/ and /ə/. The following rules describe the vast majority of cases:

  1. /eː/ is represented with é.
  2. /e/ in a monosyllabic word or word-initially is spelled e.
  3. /e/ in all other circumstances is spelled è.
  4. /ə/ is spelled e.
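The four rules above can be written out as a small decision function. This is a minimal sketch covering only the regular cases; the function name spell_e_vowel and its parameters are hypothetical, and the caller is assumed to know whether the vowel is word-initial and whether the word is monosyllabic.

```python
def spell_e_vowel(phoneme, word_initial=False, monosyllabic=False):
    """Choose the regular spelling for /eː/, /e/, or /ə/."""
    if phoneme == "eː":
        return "é"                      # rule 1
    if phoneme == "e":
        if monosyllabic or word_initial:
            return "e"                  # rule 2
        return "è"                      # rule 3
    if phoneme == "ə":
        return "e"                      # rule 4
    raise ValueError(f"not an e-type vowel: {phoneme!r}")


# Tenès /təˈnes/: schwa, then a short /e/ in a polysyllabic word.
print(spell_e_vowel("ə"), spell_e_vowel("e"))
# dec /ˈdek/: short /e/ in a monosyllable.
print(spell_e_vowel("e", monosyllabic=True))
```

Because rules 2 and 4 both yield e, the mapping cannot be reversed without knowing the pronunciation, which is the ambiguity the paragraph above describes.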

Before /m n r l/, where the length contrast in vowels is neutralized, the use of acute accents is mostly historic and cannot be predicted: óm /oːm/ “person” and nóm /noːm/ “no”, but dom /doːm/ “house” and com /koːm/ “like, as”.

The letter y is more complicated. It can represent both /i/ and /ə/, although the latter is only found in clitics. Historically, the letter y represented a reduced vowel no longer present in Tunisian that eventually merged with /i/. This includes historical epenthetic vowels (ystle /ˈistlə/ “star” from Latin stēlla) or a former diphthong or vowel + /n/ sequence that collapsed in unstressed environments (pysá /piˈsaː/ “weigh” from Latin pensāre). This later evolved into its modern function of indicating a ‘mutating /i/’, that is, an /i/ that is prone to mutating under certain morphosyntactic conditions (cf. l-ustle /ˈlustlə/ “the star”, peis /ˈpejs/ “I weigh”). Consequently, modern Tunisian orthography has many instances of y that do not descend from this one-time reduced vowel but that appear in words where these same alternations occur, as in ysme /ˈismə/ “name” ~ l-usme /ˈlusmə/ “the name”, in contrast to istat /ˈistat/ “summer” ~ l-istat /ˈlistat/ “the summer”. Note that y always represents short /i/; there is no corresponding long vowel glyph *ý used in Tunisian. All non-mutating instances of /i(ː)/ are typically spelled with i and í.
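The contrast between mutating y and stable i after the definite article can be illustrated with a short sketch. It covers only the alternation shown in the examples above (ystle, ysme, istat); the function name with_definite_article is hypothetical, and nouns beginning with any other letter are deliberately rejected because the text does not describe them here.

```python
def with_definite_article(noun):
    """Attach the article l- to a noun beginning with y or i (sketch only)."""
    first = noun[0]
    if first == "y":
        # Mutating /i/: the vowel surfaces as u after the article.
        return "l-u" + noun[1:]
    if first in ("i", "í"):
        # Non-mutating /i/: the vowel is unchanged.
        return "l-" + noun
    raise ValueError("only y- and i-initial nouns are covered by this sketch")


for noun in ["ystle", "ysme", "istat"]:
    print(noun, "->", with_definite_article(noun))
```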

In diphthongs, the glide /j/ is usually represented with i: ai ei oi ui ia ie io iu /aj ej oj uj ja je jo ju/; morpheme-finally, y may be used instead, as in tçay [ˈtʃaj] “tea”. The glide /w/ is represented by o, except if the nucleus is /o/, in which case u is used: ao eo io ou oa oe oi uo /aw ew iw ow wa we wi wo/. The spellings io and oi are ambiguous.
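The two lists of diphthong spellings above can be merged into a single lookup table, which makes the ambiguity of io and oi easy to see. This is a sketch only; the names readings and ambiguous are hypothetical, and the morpheme-final y spelling is left out.

```python
from collections import defaultdict

# Spellings and pronunciations exactly as listed in the text.
J_SPELLINGS = "ai ei oi ui ia ie io iu".split()
J_SOUNDS = "aj ej oj uj ja je jo ju".split()
W_SPELLINGS = "ao eo io ou oa oe oi uo".split()
W_SOUNDS = "aw ew iw ow wa we wi wo".split()

# Map each spelling to every pronunciation it can stand for.
readings = defaultdict(list)
for spelling, sound in zip(J_SPELLINGS + W_SPELLINGS, J_SOUNDS + W_SOUNDS):
    readings[spelling].append(sound)

# A spelling is ambiguous when it appears in both lists.
ambiguous = sorted(s for s, sounds in readings.items() if len(sounds) > 1)
print(ambiguous)
print(readings["io"], readings["oi"])
```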

An acute accent on a vowel located after another vowel serves as a diaeresis, indicating that the two vowels should be pronounced as separate syllables and not as a diphthong. It does not necessarily mean that the accented vowel is long: a sequence such as aí may be read /aji/ or /ajiː/.

In many foreign loans, and especially in scientific or international vocabulary, the spelling of vowels may be far more conservative than the pronunciation, with the spelling more closely reflecting the foreign form. In such cases it is not possible to predict the pronunciation from the spelling alone: bióloggie /biˈjoːlgə/ “biology”, telefone /təlˈfonə/ “telephone”, gèneral /ˈʒeːndraːl/ “general (n.)”.

3.3.2 Representation of Consonants

The following table summarizes the spelling of consonant sounds in regular environments.

Spelling | Context | Pronunciation | Examples
---------|---------|---------------|---------
b |  | /b/ | bivey /ˈbivej/ “to drink”; republicce /rəˈpublikə/ “republic”; promb /ˈproːm/ “lead”
c | before e, é, è, i, í, y | /s/ | císer /ˈsiːseːr/ “caesar”; revoltciun /rəˈvolsjuːn/ “revolution”; pièce /ˈpjesə/ “coin”
c | everywhere else | /k/ | cude /ˈkudə/ “tail”; catracte /ˈkaːtraktə/ “waterfall”; dec /ˈdek/ “ten”
cc | intervocalically, before e, é, è, i, í, y | /k/ | accé /aˈkeː/ “maple”; fremicce /frəˈmikə/ “ant”; iccest /ˈikəst/ “this”
ch | word-initially or post-consonantally, before e, é, è, i, í, y | /k/ | chyntá /kinˈtaː/ “to sing”; ymche /ˈimkə/ “female friend”; musche /ˈmuskə/ “music”
ć |  | /θ/ | ćéatr /ˈθjaːtr/ “theater”; praće /ˈpraːθə/ “plaza”; oareć /ˈwaːrəθ/ “heir”
ç |  | /ʃ/ | çeld /ˈʃeːld/ “hot”; uçí /uˈʃiː/ “neighbor”; baç /ˈbaʃ/ “in order to”
çs |  | /ʃtʃ/, /ʃs/, /s/ | protepièçs /ˈprotəˈpjes/ “wallet”; riçs /ˈriʃs/ “feathers”; roçs /ˈroʃtʃ/ “mouths”
d | morpheme-finally after n | ∅ (silent) | grend /ˈgreːn/ “large”; vèstminds /visˈmiːnz/ “clothing”; prefond /prəˈfoːn/ “deep”
d | everywhere else | /d/ | duve /ˈduvə/ “where”; redic /rəˈdik/ “root”; frid /frid/ “large”
dħ |  | /dˤ/ | balúdħ /baˈluːdˤ/ “oak”; dħale /ˈdˤaːlə/ “rib”; medħéq /məˈdˤeːq/ “strait”
f |  | /f/ | fèmne /ˈfeːmnə/ “woman”; farefit /ˈfaːrəfit/ “butterfly”; cnif /ˈknif/ “knife”
g | before e, é, è, i, í, y | /ʒ/ | gèneral /ˈʒeːndraːl/ “general”; gèntil /ˈʒeːntiːl/ “gentle”; viage /ˈvjaːʒə/ “trip, voyage”
g | everywhere else | /g/ | glios /ˈgljos/ “good”; égal /ˈeːgaːl/ “equal”; loug /ˈlowg/ “long”
gg | intervocalically, before e, é, è, i, í, y | /g/ | leigge /ˈlejgə/ “tongue, language”; pragge /ˈpraːgə/ “coastline”; fraggey /ˈfraːgej/ “to break”
gh | word-initially or post-consonantally, before e, é, è, i, í, y | /g/ | gheómetrie /gəˈjoːmtrə/ “geometry”; silghe /ˈsiːlgə/ “chard”; ghez /ˈgez/ “gas, petrol”
gn |  | /nj/ | signe /ˈsinjə/ “sign”; magnific /ˈmaːnjifik/ “magnificent”; bégne /ˈbenjə/ “bath”
h |  | /x/ | húte /ˈxuːtə/ “whale”; fahil /ˈfaːxiːl/ “easy”; haoh /ˈxawx/ “peach tree”
j |  | /ʒ/ | jazire /ˈʒaːzirə/ “island”; mejèst /məˈʒest/ “teacher”; haj /ˈxaʒ/ “pilgrimage”
l |  | /l/ | libr /ˈlibr/ “book”; eld /ˈeːld/ “other”; foil /ˈfojl/ “leaf”
m |  | /m/ | márçs /ˈmaʃs/ “stairs”; ysme /ˈismə/ “name”; prim /ˈprim/ “first”
n |  | /n/ | nóv /ˈnoːv/ “nine”; Tenès /təˈnes/ “Tunisia”; óccian /ˈoːkjaːn/ “ocean”
n | irregularly | /ŋ/ | caun /ˈkawŋ/ “meat”; qoun /ˈqowŋ/ “horn”; rèstaurant /ˈresturaːŋ/ “restaurant”
p |  | /p/ | pac /ˈpak/ “peace”; cupr /ˈkupr/ “copper”; lup /ˈlup/ “jackal, wolf”
q |  | /q/ | qoun /ˈqowŋ/ “horn”; faqs /ˈfaːqs/ “cucumber”; araq /ˈaːraq/ “liquor”
r |  | /r/ | rost /ˈrost/ “mouth”; yrio /ˈiriw/ “river”; catr /ˈkaːtr/ “four”
s |  | /s/ | /ˈseː/ “evening”; pesc /ˈpesk/ “fish”; tres /ˈtres/ “three”
sħ |  | /sˤ/ | sħahre /ˈsˤaːxrə/ “desert”; sħabats /ˈsˤaːbats/ “shoes”; Sħén /ˈsˤeːn/ “China”
t |  | /t/ | temp /ˈteːmp/ “weather”; lètre /ˈletrə/ “letter”; lutot /ˈlutut/ “everything”
tħ |  | /tˤ/ | tħaib /ˈtˤajb/ “ripe”; qutħú /quˈtˤuː/ “cotton”; tħube /ˈtˤubə/ “brick”
v |  | /v/ | vioe /ˈviwə/ “village”; nive /ˈnivə/ “snow, ice”; brev /ˈbrev/ “brief”
z |  | /z/ | zaitúnay /ˈzajtuːnaj/ “olive tree”; ázulí /azuˈliː/ “blue”; laoz /ˈlawz/ “almond tree”
ź |  | /ð/ | źabut /ˈðaːbut/ “armpit”; áźe /ˈaːðə/ “member”; meź /ˈmeð/ “middle”
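The context-dependent rows for c and g in the table can be summarized in code. This is a rough sketch of the regular pattern only: the names FRONT, VOWEL_LETTERS, read_single, and spell_hard are hypothetical, "intervocalically" is approximated as "preceded by a vowel letter", and loanword exceptions are ignored.

```python
FRONT = set("eéèiíy")                  # letters that trigger the soft reading
VOWEL_LETTERS = set("aáeéèiíoóuúy")


def read_single(letter, next_letter):
    """Read a single c or g: soft before a front vowel letter, hard elsewhere."""
    soft = {"c": "s", "g": "ʒ"}
    hard = {"c": "k", "g": "g"}
    return soft[letter] if next_letter in FRONT else hard[letter]


def spell_hard(sound, prev_letter, next_letter):
    """Spell hard /k/ or /g/; pass None at a word edge."""
    base = {"k": "c", "g": "g"}[sound]
    if next_letter not in FRONT:
        return base                    # cude, dec, glios, loug
    if prev_letter in VOWEL_LETTERS:
        return base * 2                # fremicce, leigge
    return base + "h"                  # chyntá, musche, silghe


print(read_single("c", "u"), read_single("c", "í"))
print(spell_hard("k", "i", "e"), spell_hard("k", None, "y"), spell_hard("g", "l", "e"))
```

The doubled and h-digraph spellings exist only to block the soft reading, so they never need to appear before a, o, u, or a consonant in this regular pattern.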

3.3.3 Foreign Loans

Foreign loans, or at least ones originating from other languages that use the Latin alphabet, tend to only partially assimilate to Tunisian spelling conventions. This is manifested in two main ways.

First, Tunisian will preserve certain non-native spelling conventions. For instance, /ʒ/ is usually spelled j in native words, but in loans from languages such as English, French, and Italian where g + a front vowel represents /ʒ/ or /dʒ/, this spelling will generally be maintained: garage “garage” [gʌ.ˈraː.ʒə], not *garaj or *garaje. The same principle applies to the pronunciation of c as /s/ before front vowels, which does not occur in any native morphemes. Note that if the conditioning environment for a particular pronunciation disappears, the spelling may revert to regular Tunisian rules, as in the plural garajs “garages” [gʌ.ˈraːʒz]; the spelling *garages would imply an additional vowel /ə/ that is not there, while *garags could only imply a hard /g/ rather than /ʒ/. Examples include:

Second, as a corollary to the above, Tunisian spelling will frequently reflect vowel quality and quantity as they were in the original language and not in modern Tunisian. This manifests itself in the form of orthographic vowels that are not pronounced at all or that are pronounced differently than they are spelled; in the latter case, this is usually a case of o being pronounced /u/, è being pronounced /i/, or any vowel being pronounced /ə/:

This is not absolute, however, and there are many clear loanwords that do reflect the vowel changes as well: pulicie /puˈliːsjə/ “police” (which helpfully has u instead of o, but which fails to indicate the fact that this word was borrowed with a long /iː/).

Vowel length in loanwords can also be rather erratic. When borrowing from Arabic, Tunisian generally maintains the original vowel length as seen in spoken Tunisian Arabic. When borrowing from European languages, however, vowels are almost always borrowed short except before coda /m n r l/ (perhaps a side-effect of these consonants' high sonority) and sporadically to force a specific stress assignment:

When discussing loanwords in Tunisian, it is also important to keep in mind that for much of its history Tunisian was in closer contact with Arabic than with any Western European language, and consequently many internationalisms entered the language via Arabic. This accounts for many cases where Tunisian at first glance appears to have gone against the above generalities and assimilated loanwords beyond immediate recognition, such as Ambricce /ˈaːm.bri.kə/ “America” or qaoe /ˈqawə/ “coffee”, words that tend to be very consistent (and very different) in continental European languages.

1) The suffix ciun /sjuːn/ “-tion” could perhaps be described as a hyperforeignism. The pronunciation with /s/, which displaced the native pronunciation with /t/, is of French origin; since /s/ is commonly spelled c before front vowels in French loans and French t very rarely represents /s/ outside of this particular suffix, the c spelling appears to have been adopted erroneously in Tunisian to preserve the supposed French form.