Aethodian language

From MicroWiki, the free micronational encyclopædia

Theodian
ʼə́ plə fyɔ́ tya
Created by: Miles B Huff
Regulated by: The Theodian government (via the Jury of Linguistics)
Spoken in: Theodia
Total speakers: 0
Family: Language isolate (Theodic)
Element origins: Mixed
Influenced by: Esperanto, Lingála, Lojban, Tok Pisin; ergonomics, relativism
Writing system(s): Theodic
Type: Engelang

Theodian is the primary official language of the Technocratic Republic of Theodia. It is a micronational engelang/explang (a type of constructed language, or 'conlang'). It is being designed by linguist Miles Bradley Huff (known in the micronational sphere as "Swena"), and is a formal language that will eventually—once the language is operational—be governed by the Theodian Jury of Linguistics. Informal varieties are not stigmatized; the formal variety exists mainly to mitigate the communicational difficulties posed by linguistic change-over-time and dialectal variation, especially when it comes to formal documents.

The Theodian language is ceaselessly changing, so please note that this article was current as of April 2021. Work began on the language in early 2010, although precursors were created as early as late 2009, with the original writing system dating back to 2005.

Note that, although Theodic, the script in which Theodian is conventionally written, is a vertical script, it has generally been written horizontally in this article, so as to better fit English typesetting constraints.

History

Theodian began as Þeodspraask[IPA], which was itself a continuation of Libraskë[IPA], an earlier conlang designed ad hoc by Miles. Etymologically speaking, "Þeod-spraask" meant "nation's language". It utilized a complicated runic writing system, and was lexically and grammatically influenced primarily by Old English and other Germanic languages. Miles developed this early form of Theodian throughout the year 2010. The language originally had a very relativistic bent, and was about optimizing the way Theodians thought.

The language went through several name-changes, from the original Þeodspraask, to Þeodspråxa[IPA], to Þeûdspråx[IPA], to Cavdzut Sprago[IPA], to BlabyJodi[IPA], to the modern ʼЭbləFiodia[IPA]; the English word for it similarly changed, from the original Theodspraask, to Theodspraughsk[IPA], to the modern Theodian. The language was, at first, something of a "kitchen sink": an inelegant concoction of large numbers of arbitrary linguistic features that lacked reasons for inclusion.

Its original article, which was over 80 KiB of text (a record not surpassed by this article until February 2016), was one of a handful of "Good Articles" on the original MicroWiki. The article was mangled by the switches to Referata and MicroWiki.org.uk (the ancestors of the current MicroWiki), and lost "Good Article" status some time later.

Theodspraask's original alphabet, the Milic fuþark, dates back to 2005, when Miles learned the Anglo-Saxon Fuþorc. Theodspraughsk was one of the first official languages of the Runic Union, and for a short time adopted the Union's official runic script, before returning to Milic runes. Over time, these runes gradually simplified into something resembling their current forms; and Theodian moved on to a novel, fully featural alphabet: Theodic.

The lexicon of Theodian has been completely redone several times throughout its history, and Miles has attributed this to the difficulty of finding objective ways to derive words. Modern Theodian uses a variety of derivational methodologies in a tier system, built largely on a substrate of international scientific vocabulary taken from Interlingua.

Development

There are two general kinds of Theodian: Formal, which is developed by Theodia's Linguistics Jury; and Informal, which is Theodian as it is actually spoken. This follows Theodia's general cultural paradigm, where a single "high" culture is developed by the technocracy; and "low" cultures are allowed to develop and thrive as they will.

Formal Theodian serves a number of important purposes in Theodian society: it serves as a neutral and standardized lingua franca; its clear definitions are especially important for legal documents and the like; and its being managed by the Linguistics Jury facilitates the adoption of recent scientific advances into the language and its idioms.

Informal varieties of Theodian have no official governance, and their use is not stigmatized. It is expected that there will eventually be a sort of creolization, with formal Theodian as the acrolect.

(Under the country's current transitional government, the functions of the Linguistics Jury are performed by Swena, and development is not well-structured.)

Phonology

Phonology at a glance

Isochrony: Regular pitch‑accent
Vowel-system: S9

Number of consonants: 15
Number of monophthongs: 9
Number of diphthongs: 4

Airstreams: 2
Suprasegmentals: 5 (via allophony)
Tones: 1
Note: Phonological analysis of Theodian generally concentrates on, or uses as a point of reference, Formal Theodian as specified in the latest legislation release. Considerable variation may exist in dialects of Informal Theodian.

Phonemes

The tables below show the phoneme-inventory of Formal Theodian. The symbols are from the IPA (International Phonetic Alphabet).

Vowels
Monophthongs
          Front    Central    Back
Close     i        ɨ          u
Mid       e        ə          o
Open      ɛ        a          ɔ
Axes: F1 runs from low (close) to high (open); F2 runs from high (front) to low (back).
Diphthongs
          a-stem    o-stem
i-end     a͡i̯       ɔ͡i̯
u-end     a͡u̯       ɔ͡u̯

As can be seen above, Theodian has an S9 vowel-system (ie, a 9-point "square" vowel-system). On top of these 9 qualities, the language also has, via allophony: nasalization, rhotacization, pharyngealization, creaky voicing, and breathy voicing. All of Theodian's diphthongs are closing.

Design notes and history

The vowels needed to be a square number in order to evenly fit into a height/backness chart; this meant, practically, either 4 or 9 vowels. 4 gave only limited room for case prefixes, so 9 was chosen. Theodian's vowels are the 9 most common and simple qualities in the world's languages, per the data at PHOIBLE. The average frequency of a Theodian monophthong among the world's languages is about 53%.

Originally, there were only 8 vowels (/ɨ/ was omitted), so that the total number of letters in the Theodian alphabet would be 24, which is evenly divisible by 12, the numerical base used by the language. An extra vowel (/ɨ/) was added in December 2020, thus bringing the total to 9. This was done to avoid waste on the keyboard: Theodian uses a chorded keyboard layout, in which each hand types a different axis of the height/backness and place/manner charts, meaning that /ɨ/ would be typeable anyway.

Consonants
              Labial      Coronal        Dorsal      Laryngeal
              Bi LD LL    De Al PA AP    Pa Ve Uv    Ph EG Gl
Nasal         m           n              ŋ
Plosive       p           t              k           ʔ
Fricative     f           s              x           h
Approximant   w           l              j, w        ʕ
Trill
Semivowels
          Front    Central    Back
Close     j                   w
Mid                (l)
Open               (ʕ)
  • Note that this table shows allophony via extended boxes; but only phonemes have been explicitly written.

Theodian's consonants are positioned fairly regularly along the place-manner axes, with approximately 4 consonants per manner and place, the only exceptions being the nasals and laryngeals (which only have three), and the dorsals (which could be said to have 5, due to /w/). Notable is the lack of a voiced or aspirated set among the obstruents. The approximants could all be considered semivowels in some sense, with [w] being equivalent to [u̯], [l] being close to [ɚ̯], [j] being equivalent to [i̯], and [ʕ] being equivalent to [ɑ̯̈].

Design notes and history

The consonants were selected after the orthography was first prototyped, and so it was known that they would need to fit into a 4x4 grid (more details about the orthography can be found below). As such, the 4 most common manners and the 4 fundamental places were selected. Initially, the consonants were going to be completely regular; but this resulted in a rather strange phoneme-inventory; so they were revised such that each individual consonant would be the most common consonant cross-linguistically in that particular broad category (ie, [f] is the most common fricative among the labial places), using the data available at PHOIBLE. The average frequency of a Theodian consonant among the world's languages is about 60%.

Comparison to other languages

Theodian's vowel inventory is present in a few natural languages, such as Masai and Thai. There is, however, no known natural language with Theodian's exact consonant-inventory. In terms of individual features, Theodian happens to share a little in common with Danish, having a nearly identical R-sound (/ʕ/) as well as some creaky voicing in vowels, among other similarities.

Per the data available at PHOIBLE (which has information on the phoneme-inventories of around 2000 languages), the average number of consonants per language is just under 24, and the average number of vowels is about 10.5. Therefore, compared to most documented languages, Theodian has a smallish consonant-inventory (with 15 consonants) and a medium-sized vowel-inventory (with 9 vowels). If only vowel qualities are counted (so, excluding suprasegmentals), then Theodian, relative to the most common vowel-inventory (/a/, /e/, /i/, /o/, /u/), has a pretty big vowel-inventory.

The average frequency of a Theodian phoneme (excluding diphthongs) among the world's languages is about 57%, with consonants at around 60%, and monophthongs at around 53%.
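
The overall figure is consistent with a count-weighted mean of the two class averages. The following is a minimal Python check, assuming the rounded values quoted above (15 consonants at about 60%, 9 monophthongs at about 53%); the function name is purely illustrative.

```python
# Rounded class averages quoted in the text (percent of languages in PHOIBLE).
consonants, consonant_avg = 15, 60.0
monophthongs, monophthong_avg = 9, 53.0


def weighted_mean(counts, means):
    """Mean of per-class means, weighted by the number of phonemes in each class."""
    total = sum(counts)
    return sum(c * m for c, m in zip(counts, means)) / total


overall = weighted_mean([consonants, monophthongs], [consonant_avg, monophthong_avg])
print(round(overall))  # 57
```

Because the inputs are themselves rounded, this only shows that the three quoted percentages agree with one another; it does not reproduce the underlying PHOIBLE calculation.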

Allophony

Formal Theodian's allophonic processes can be described with 11 rules, 3 of which are collections of other rules (when expanded, there are 17 rules in total). As with all allophonic transformations, they are ordered; but there are, of course, times when the order does not necessarily matter, because the following rules neither (1) are fed/bled by nor (2) counterfeed/counterbleed the preceding rule(s). Where feeds and bleeds occur or could occur is indicated after the name of each rule.

In general, allophony, in Theodian, serves to reduce the amount of articulation necessary to pronounce words. As such, most of these rules are assimilative in nature, as well as entirely optional. The optionality of each rule is indicated below. As with other languages, careful speech exhibits less allophony than casual speech.

  • Theodian consonants change depending upon what vocoids follow them. These changes are regular, and are as follows:
    1. (optional) Coronal/dorsal retraction: (2 rules in 1)
      • Before [+back] vocoids: [+coronal] consonants receive [-anterior], and [-approximant +dorsal] consonants receive [+back].
        A similar but less-exact way of saying this, is that [t], [s], [n], [l], [k], [x], [ŋ] become retracted ([t̠], [s̠], [n̠], [l̠], [k̠], [x̠], [ŋ̠], respectively) before [w], [u], [o], [ɔ].
    2. (optional) Coronal/dorsal advancement: (2 rules in 1) (feeds coronal laminalization)
      • Before [+front] vocoids: [+coronal] and [-approximant +dorsal] consonants receive [+front].
        A similar but less-exact way of saying this, is that [t], [s], [n], [l], [k], [x], [ŋ] become fronted ([t̟], [s̟], [n̟], [l̟], [k̟], [x̟], [ŋ̟], respectively) before [j], [i], [e], [ɛ].
    3. (optional) Coronal laminalization: (feeds coronal palatalization)
      • Before [+front +tense] vocoids: [-approximant +coronal] consonants receive [+distributed].
        A similar but less-exact way of saying this, is that [t̟], [s̟], [n̟] become palato-alveolar ([ȶ̟], [ʃ], [ȵ̟], respectively) before [j], [i], [e].
    4. (optional) Coronal palatalization:
      • Before [+front +tense +close] vocoids: [-approximant +coronal] consonants receive [+dorsal].
        A similar but less-exact way of saying this, is that [ȶ̟], [ʃ], [ȵ̟] become alveolo-palatal ([ȶ], [ɕ], [ȵ], respectively) before [j], [i].
        As such, ⟨fjɔ tja⟩, which is phonemically /ˈfjɔ.tja/, is phonetically [ɸɥɔ́.ȡa].
  • Theodian obstruents vary somewhat, both freely and allophonically:
    1. (optional) Obstruent voicing variation: (feeds obstruent voicing-assimilation)
      • Obstruent voicing varies freely; however, there are some tendencies:
        • Obstruents at the beginning of a word or clitic are typically voiceless.
        • Obstruents elsewhere are typically voiced.
    2. (mandatory) Obstruent voicing-assimilation:
      • Obstruents must be [αVoice] with any neighbouring obstruent in the same syllable.
  • Some Theodian consonants, when they occur in the coda, recursively merge with the rest of their syllable. These changes are regular, and are as follows:
    1. (mandatory) Coda-reduction: (5 rules in 1) (bleeds coronal tapping)
      • //σʕ// ➔ //σ̴// : //_$// (ie, [ʕ] applies uvularization/pharyngealization to the syllable).
      • //σl// ➔ //σ˞// : //_$// (ie, [l] applies rhotacization to the syllable). Note that this effect ceases propagation through the syllable upon hitting a phone which is [+consonant -vowel].
      • //σN// ➔ //σ̃// : //_$// (ie, nasals apply nasalization to the syllable). Note that this effect ceases propagation through the syllable upon hitting a stop.
      • //σʔ// ➔ //σ̰// : //_$// (ie, [ʔ] applies creaky voicing to the syllable). Note that this effect ceases propagation through the syllable upon hitting a voiceless sound.
      • //σh// ➔ //σ̤// : //_$// (ie, [h] applies breathy voicing to the syllable). Note that this effect ceases propagation through the syllable upon hitting a voiceless sound.
  • Some Theodian consonants change intervocalically. These changes are regular, and are as follows:
    1. (optional) Coronal tapping:
      • [+coronal -strident] ➔ tap : //V_V//
        [t], [n], [l] become tapped ([ɾ], [ɾ̃], [ɺ], respectively) when between two vowels, even between words.
  • At least one of Theodian's phonemes is rather unstable in its realization. More details can be found below:
    1. (optional) Rhotic variation:
      • [ʕ]'s place is generally somewhat unstable, being anywhere between fully uvular and fully pharyngeal. Its manner is also fairly unstable, being variously realized as a trill, a tap, a fricative, and an approximant. The approximant realization is most common in isolation, whereas the others are most common after an obstruent.
        Example: ⟨fran sɛ⟩, /ˈfʕan.sɛ/, [ˈfʁ̝aⁿ.zɛ], which means 'France'.
  • Theodian phones change depending on the roundedness of nearby sounds. These changes are regular, and are as follows:
    1. (optional) Consonant rounding-assimilation: (applies recursively within a word)
      • [+consonant] ➔ [+round] : //_[+syllabic +round]//
        A similar but less-exact way of saying this, is that [f], [j], and all other consonants become rounded ([ɸ], [ɥ], etc, respectively) before a rounded sound in the same syllable.
        Hence, ⟨fjɔ tja⟩, which is phonemically /ˈfjɔ.tja/, is phonetically [ɸɥɔ́.ȡa].
    2. (optional) Vowel rounding-assimilation:
      • [-consonant] ➔ [+round] : //[-syllabic +round]_//
        A similar but less-exact way of saying this, is that [i], [e], [ɛ], [a], [ə] become rounded ([y], [ø], [œ], [ɶ], [ɵ], respectively) after [w].
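
The feeding order among the first four rules (retraction, advancement, laminalization, palatalization) can be sketched in Python. This is only an illustration under stated assumptions: it treats every optional rule as applied, it uses the less-exact phone lists quoted above rather than distinctive features, and the names `realize`, `LAMINAL`, and `PALATAL` are hypothetical.

```python
ADV, RET = "\u031f", "\u0320"  # IPA diacritics: advanced (plus below), retracted (minus below)

BACK = {"w", "u", "o", "ɔ"}        # triggers for rule 1 (retraction)
FRONT = {"j", "i", "e", "ɛ"}       # triggers for rule 2 (advancement)
FRONT_TENSE = {"j", "i", "e"}      # triggers for rule 3 (laminalization)
FRONT_TENSE_CLOSE = {"j", "i"}     # triggers for rule 4 (palatalization)

TARGETS = {"t", "s", "n", "l", "k", "x", "ŋ"}  # coronals and non-approximant dorsals
LAMINAL = {"t" + ADV: "ȶ" + ADV, "s" + ADV: "ʃ", "n" + ADV: "ȵ" + ADV}
PALATAL = {"ȶ" + ADV: "ȶ", "ʃ": "ɕ", "ȵ" + ADV: "ȵ"}


def realize(consonant, following):
    """Apply rules 1-4 in order to one consonant, given the vocoid that follows it."""
    phone = consonant
    if phone in TARGETS and following in BACK:
        phone += RET                       # rule 1
    if phone in TARGETS and following in FRONT:
        phone += ADV                       # rule 2 (feeds rule 3)
    if following in FRONT_TENSE:
        phone = LAMINAL.get(phone, phone)  # rule 3 (feeds rule 4)
    if following in FRONT_TENSE_CLOSE:
        phone = PALATAL.get(phone, phone)  # rule 4
    return phone


for vowel in ["u", "ɛ", "e", "i", "a"]:
    print("s +", vowel, "->", realize("s", vowel))
```

Because each rule only sees the output of the one before it, /s/ reaches [ɕ] before /i/ only by passing through the advanced and laminal stages, which is what "feeds" means in the rule headings.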
Conventions

As is the case with many language-specific adaptations of the IPA, Theodian's adaptation follows certain conventions:

  • /ʕ/ is often transcribed as ⟨r⟩, as it acts as a rhotic in Theodian.

Design notes

Although it might seem that having no unnecessary allophony would be ideal in any language seeking to be "optimal", there are various reasons for its inclusion in Theodian, foremost among them being ease of pronunciation. Most of the above rules result in place-assimilation, which reduces the total amount of articulation; one results in a simplification of a consonant-cluster (coronal palatalization); one is a simple lenition (coronal tapping); a couple are there to allow for free variation (obstruent voicing-variation, rhotic variation), and thereby less precise pronunciation; one is more or less unavoidable (obstruent voicing-assimilation); one dramatically simplifies codas (coda-reduction); and two are largely present only for the sake of logical regularity (coronal/dorsal retraction, coronal/dorsal advancement).

There are three rules (coronal laminalization, consonant rounding-assimilation, vowel rounding-assimilation) which are partially artistic in nature (Miles liked the sounds they bring in, but couldn't justify making them phonemic), but all three are assimilative, and so do help simplify articulation somewhat.

Phonotactics

Theodian has the following phonotactical rules:

General
  • Every syllable has an onset and a nucleus, and may have a coda.
  • The onset may be occupied by up to 2 consonant phonemes, and may not be empty.
  • The nucleus may be occupied by any single phonemic vowel-quality, and may not be empty.
  • The second quality in a diphthong patterns as a consonant.
  • The coda may be occupied by up to 2 consonant-phonemes, and may be empty.

By implication of the above constraints:

  • Hiatus is not permitted.
  • The least complicated Theodian syllable is //CV//.
  • The most complicated Theodian syllable is //CCVCC//.
Restrictions on αFeatures
  • With the exception of /s/, consonants may not cluster with a consonant of a comparable place of articulation (ie, dorsals can't cluster with dorsals).
  • Consonants may not cluster with a consonant of a comparable manner of articulation (ie, fricatives can't cluster with fricatives).
Miscellaneous constraints
  • With the exception of /s/, syllables follow the sonority hierarchy, with consonants further from the nucleus having to be less sonorant than consonants closer to the nucleus; but with the onset's sonority of course independent from that of the coda.
  • Neither /ʔ/ nor /h/ may occur in any complex onset.
  • All possible complex syllable-onsets take the form of: //sPFNA// (where P ≡ Plosive, F ≡ Fricative, N ≡ Nasal, and A ≡ Approximant); where any individual component of such rules may be left out, and where no other rule(s) would be violated.
For example, //Nf// is not allowed, because it would violate the sonority hierarchy; //sf// is not allowed, because it would violate the restriction on αFeatures; and //sPA// is not allowed, because it would violate the limit of 2 consonants per onset. As such, it is very important to apply the above constraints to this formula, in order to divine what is, and what is not, a valid syllable onset in Theodian.
  • All possible syllable codas are an ordered subset of the following set of elements: {A, N, L}; where A ≡ Approximant, N ≡ Nasal, and L ≡ Laryngeal.
For example, /ʔh/ is not allowed, because {L, L} is not a subset of {A, N, L}; /ʔʕ/ is not allowed, because it violates the sonority-hierarchy; and //ʕNʔ// is not allowed, because it would violate the limit of 2 consonants per coda.
/ʕʔ/, /ʕ/, and //∅// (the empty coda), however, are, for example, allowed, because they are subsets of {A, N, L}, follow the sonority-hierarchy, and have no greater than 2 phonemes.
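
The coda constraint lends itself to a small checker. The Python sketch below makes two assumptions that the text does not spell out: /ʕ/ is classed as an approximant for coda purposes, and "laryngeal" covers only /ʔ/ and /h/. It checks just the ordering and length limits, not the restrictions on αFeatures; `is_valid_coda` is a hypothetical name.

```python
# Assumed class membership for codas: A (approximant), N (nasal), L (laryngeal), in that order.
CODA_RANK = {}
for rank, members in enumerate(["wljʕ", "mnŋ", "ʔh"]):
    for phoneme in members:
        CODA_RANK[phoneme] = rank


def is_valid_coda(coda):
    """True if `coda` (a string of phonemes) is an ordered subset of {A, N, L}, at most 2 long."""
    if len(coda) > 2:
        return False
    if any(p not in CODA_RANK for p in coda):
        return False
    ranks = [CODA_RANK[p] for p in coda]
    # Strictly increasing ranks: each class is used at most once, in A-N-L order.
    return all(a < b for a, b in zip(ranks, ranks[1:]))


for coda in ["ʕʔ", "ʕ", "", "ʔh", "ʔʕ", "ʕnʔ"]:
    print(repr(coda), is_valid_coda(coda))
```

Under these assumptions the checker agrees with each of the examples given above.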

Design notes

These rules were designed to give Theodian the ability to approximate cluster-rich languages, while also remaining relatively easy to pronounce, itself. The rime, in particular, is seen by Miles as particularly neat, as while phonologically the nucleus and coda are distinct, phonetically the coda is merged into the nucleus. This allows the language to approximate final consonants, while still (potentially) remaining comparable in articulatory ease to a language with no final consonants.

Isochrony & prosody

The forms of isochrony and accent used in Theodian are, respectively, syllable-timing and pitch-accent. This means that all syllables have effectively equivalent lengths and amplitudes, and that stress is indicated via pitch alone. Theodian's pitch-accent is regular, applying a high tone (˥) to the first syllable of every lexeme (thus effectively making every lexeme a trochee), and applying a neutral tone (˧) to all other syllables. So, the word /təˈxju.ʕi/ would sound like [tʰə˧.ʝu˥.ʕi˧], as /ˈxju/ is the first syllable of a lexeme, and the other syllables are not.
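
Because the accent is fully regular, tone assignment can be sketched mechanically. The Python snippet below is an illustration only: it assumes the input is a phonemic transcription in which "ˈ" marks the lexeme-initial syllable and "." separates the others, and it adds tone letters without applying any allophony (so it will not produce the aspiration or voicing shown in the phonetic form above). `mark_pitch` is a hypothetical name.

```python
HIGH, NEUTRAL = "˥", "˧"


def mark_pitch(phonemic):
    """Append a tone letter to each syllable of a phonemic transcription.

    Syllables are separated by "." or by the accent mark "ˈ"; the syllable
    that follows "ˈ" is treated as lexeme-initial and receives the high tone.
    """
    syllables = []
    current, accented = "", False
    for ch in phonemic.strip("/"):
        if ch in ".ˈ":
            if current:
                syllables.append(current + (HIGH if accented else NEUTRAL))
            current, accented = "", ch == "ˈ"
        else:
            current += ch
    if current:
        syllables.append(current + (HIGH if accented else NEUTRAL))
    return ".".join(syllables)


print(mark_pitch("/təˈxju.ʕi/"))  # tə˧.xju˥.ʕi˧
```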

Syllable-timing was selected to avoid the vowel-reduction of stress-timing, as vowel-qualities are integral to Theodian's grammar. Lexemes were chosen to be trochees, so as to provide a clear contrast between them and any surrounding lexemes or clitics. By happenstance, the ends of words are already distinguished from the beginnings of others by aspiration, as all Theodian words start with obstruents when properly inflected; so Theodian's isochrony does not need to distinguish them. Pitch-accent was selected to indicate prominence, because using vowel-length for this purpose would, given the trochaic nature of the language, have likely devolved into a distinction based upon amplitude (since trochees tend towards accent by amplitude, and iambs to accent by length), and thence into a stress-timed system (which, again, is strongly associated with vowel-reduction processes, something which would dismantle Theodian's agglutinative grammar). Coincidentally, the average number of tones in each language is approximately 1 (per the data at PHOIBLE), which yet again places Theodian into a sort of phonemic middle-ground. Part of the selection of pitch-accent was inspired by the pitch-modulation of GLaDOS, from Portal; but it would not have been chosen if not for the reasons above.

Theodian diphthongs are all falling diphthongs, which means that the first quality in each diphthong is more prominent than the second. This was chosen to be the case, as this is the norm for closing diphthongs.

Intonation is not widely used distinctively, as the grammar is used to indicate things such as questions and sarcasm.

Prosodic stress is realized by lengthening and amplifying the syllable.

Orthography

Aethodic

Created by: Miles B Huff
Regulated by: The Aethodian government (via the Jury of Linguistics)
Family: a priori
Type: Featural alphabet
Influenced by: Roman alphabet, Runes, Hangul, Bimodular numerals; ergonomics
Language(s): Aethodian

Writing direction: BTB-?LtR
Number of styles: 7
Number of characters: 25
Number of diacritics: 3
For more information regarding the script as a whole, please see: Theodic script

Theodian's orthography is a language-specific adaptation of basic Theodic, and everything that implies. This means that the language, except where otherwise specified, uses a vertical writing-direction (BtBtB-LtR between syllables and LtR within syllables); multiple 'styles', with support for everything from telegraphs to blind people; etc. Theodian, however, has a particular advantage to being written in Theodic, just as Korean does with Hangul: The script was custom-made for it; and, in Theodian's case, the language was also custom-made for the script.

Differences from the standard

Theodian uses all 25 letters provided by the basic Theodic script. As well, the underline diacritic is, in Theodian, used to indicate aspiration, instead of secondary stress or tone.

Letters

 #  Letter  IPA   Name    Name (IPA)  ASCII  Cyrillic  Greek  Hangul  Roman
 1  ʼ       /ʔ/   ʼáuʼ    /ˈʔauʔ/     `      ӏ         ᾿              ʼ
 2  x       /ⁿ/   ʼáux    /ˈʔauⁿ/     x      н         ν              n
 3  p       /p/   páiʼ    /ˈpaiʔ/     p      п         π              p
 4  r       /ʕ/   ráu     /ˈʕau/      r      р         ρ              r
 5  k       /k/   kɔ́uʼ    /ˈkɔuʔ/     k      к         κ              k
 6  m       /m/   máix    /ˈmaiⁿ/     m      м         μ              m
 7  h       /h/   háuh    /ˈhauh/     h      г                        h
 8  j       /j/   jɔ́u     /ˈjɔu/      j      ь         ι              y
 9  f       /f/   fáih    /ˈfaih/     f      ф         φ              f
10  ŋ       /ŋ/   ŋɔ́ux    /ˈŋɔuⁿ/     q      ң         γγ             ŋ
11  c       /x/   cɔ́uh    /ˈxɔuh/     c      х         χ              kh
12  w       /w/   wái     /ˈwai/      w      в         β              w
13  t       /t/   tɔ́iʼ    /ˈtɔiʔ/     t      т         τ              t
14  n       /n/   nɔ́ix    /ˈnɔiⁿ/     n      н         ν              n
15  s       /s/   sɔ́ih    /ˈsɔih/     s      с         σ              s
16  l       /l/   lɔ́i     /ˈlɔi/      l      л         λ              l
17  ə       /ə/   tə́      /ˈtə/       y      ъ         α              ə
18  a       /a/   sá      /ˈsa/       a      а         αα             a
19  ɛ       /ɛ/   cɛ́      /ˈxɛ/       e      э         ε              ɛ
20  e       /e/   ké      /ˈke/       ee     е         εε             e
21  i       /i/   ŋí      /ˈŋi/       i      и         η              i
22  ɨ       /ɨ/   nɨ́      /ˈnɨ/       yy     ы         υ              ı
23  u       /u/   mú      /ˈmu/       u      у         ου             u
24  o       /o/   pó      /ˈpo/       oo     оо        ο              o
25  ɔ       /ɔ/   fɔ́      /ˈfɔ/       o      o         ω              ɔ
Note: Transliterated text has its spacing restructured to be between lexeme-boundaries.

Design notes and history

Theodian uses all of the letters from the basic Theodic script, giving it a total of 16 consonants and 9 vowels.

At first, the letters' phonic values were very strictly constructed from their features; but the place axis was later broadened (alveolar -> coronal, for example), and the most cross-linguistically widespread phone in each place-manner combination was chosen for each box in the graph (see the Phonology section above for more info).

The names of the letters were originally based on the ICAO spelling alphabet, in the interest of increasing the ease by which one might spell in the presence of interference; but using these had the potential to result in confusion in the event that the Theodic and Roman alphabets ever be mixed, and ICAO spellings do not support several of Theodic's letters.

As a result, a novel naming system was derived. In it, the name of every letter, apart from ⟨x⟩ and the vowels (due to their inability to occur in onsets in Theodian), begins with the sound it represents, and the aforementioned exceptions begin with ⟨ʼ⟩ (this was later changed for vowels). For their nuclei, vowels receive themselves, and consonants receive one of four diphthongs, according to their places: /au/ for laryngeals, /ɔu/ for dorsals, /ɔi/ for coronals, and /ai/ for labials. For their codas, vowels and approximants receive nothing, while the remaining letters receive one of three codas, according to their manners: /ʔ/ for plosives, /h/ for fricatives, and /ⁿ/ for nasals. Consequently, every consonantal name (apart from ⟨x⟩'s, which lacks redundancy in manner) is fully redundant, with the consonant's features encoded not only by the consonant itself, but also by all following letters: the nucleus encodes place, and the coda encodes manner.
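
The consonant-naming scheme is regular enough to generate mechanically. The Python sketch below is a reading of the paragraph above, not an official algorithm: the function and dictionary names are hypothetical, and treating ⟨x⟩ (/ⁿ/) as a laryngeal nasal with onset /ʔ/ is an assumption inferred from its listed name /ˈʔauⁿ/.

```python
NUCLEUS_BY_PLACE = {"laryngeal": "au", "dorsal": "ɔu", "coronal": "ɔi", "labial": "ai"}
CODA_BY_MANNER = {"plosive": "ʔ", "fricative": "h", "nasal": "ⁿ", "approximant": ""}


def consonant_name(sound, place, manner, onset=None):
    """Build the phonemic name of a consonant letter from its place and manner.

    `onset` overrides the first sound for letters that cannot begin a syllable.
    """
    onset = sound if onset is None else onset
    return "ˈ" + onset + NUCLEUS_BY_PLACE[place] + CODA_BY_MANNER[manner]


print(consonant_name("p", "labial", "plosive"))              # ˈpaiʔ
print(consonant_name("x", "dorsal", "fricative"))            # ˈxɔuh
print(consonant_name("ʕ", "laryngeal", "approximant"))       # ˈʕau
print(consonant_name("ⁿ", "laryngeal", "nasal", onset="ʔ"))  # ˈʔauⁿ
```

The outputs match the IPA names listed in the letter table, which suggests that the table follows this scheme throughout.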

Diphthongs were chosen for the nuclei of consonants in order to differentiate them from the vowels, which use monophthongs for their nuclei. The codas were designed such that they would match the manners of their corresponding letters' sounds. Consonantal approximants were not given a coda, as doing so would have required breaking the phonotactic preference against having matching approximants in the onset and coda. The diphthongs themselves were also chosen for phonotactic reasons: the laryngeals couldn't use a diphthong starting with /ɔ/, since this sound cannot occur after a /ʕ/; the dorsals couldn't use a diphthong ending in /i̯/, since this sound matches /j/; and the labials couldn't use a diphthong ending in /u̯/, since this sound matches /w/. The remaining diphthongs were decided per the following reasons: the alveolars' diphthongs were chosen to end in /i/, since /u/ is closer to /ɫ/, a cross-linguistically somewhat common degeneration of /l/; and the labials' were chosen to start with /a/, since /ɔ/ would have forced these sounds into allophonic forms.

Originally, the diphthongs were /ɔi/ for laryngeals, /ai/ for dorsals, /au/ for coronals, and /ɔu/ for labials; but this forced more sounds into allophony than was desired, and violated newer revisions to the phonotactics.

The phonetic features of the vowel letters are also doubly encoded in their names: first with a consonant, and then with the vowel itself. Nasals represent close vowels (nasals are technically a kind of held stop), plosives represent mid vowels, and fricatives represent open vowels (fricatives are stops that don't stop). Labials represent back vowels (since they're all rounded), coronals represent central vowels, and velars represent front vowels (since palatalization, a phenomenon associated with front vowels, is velar). Vowels do not have any consonants in their codas; this is to preserve the clarity of their qualities. To minimize overlap with the consonant names, approximants (which also have no consonants in their names' codas) are not used in vowel names. These vowel names were added in December 2020, much later than the consonants' names. Prior to this, all vowel names were a glottal stop followed by the vowel in question.
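
The vowel names follow from a two-way lookup: height picks the manner of the onset consonant and backness picks its place. The Python sketch below is an illustrative reading of the paragraph above; all names in it are hypothetical, and the height and backness labels are the three-way classes used in this section.

```python
MANNER_BY_HEIGHT = {"close": "nasal", "mid": "plosive", "open": "fricative"}
PLACE_BY_BACKNESS = {"front": "velar", "central": "coronal", "back": "labial"}

# Onset consonant for each (place, manner) pair, taken from the consonant inventory.
ONSET = {
    ("velar", "nasal"): "ŋ", ("velar", "plosive"): "k", ("velar", "fricative"): "x",
    ("coronal", "nasal"): "n", ("coronal", "plosive"): "t", ("coronal", "fricative"): "s",
    ("labial", "nasal"): "m", ("labial", "plosive"): "p", ("labial", "fricative"): "f",
}


def vowel_name(vowel, height, backness):
    """Build the phonemic name of a vowel letter: encoding consonant plus the vowel itself."""
    onset = ONSET[(PLACE_BY_BACKNESS[backness], MANNER_BY_HEIGHT[height])]
    return "ˈ" + onset + vowel


print(vowel_name("ə", "mid", "central"))  # ˈtə
print(vowel_name("i", "close", "front"))  # ˈŋi
print(vowel_name("ɔ", "open", "back"))    # ˈfɔ
```

These results agree with the IPA names given for the vowel letters in the table above.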

The letters are ordered such that consonants which double as numbers come first. The remaining four consonants were then appended, in the interest of compatibility with hexadecimal; they were generated akin to the dozenal numbers. The vowels were then ordered in a circular direction, starting from ⟨ə⟩ (/ə/) (which is used throughout the language as a "default" vowel, and should therefore come first), and working up through the unrounded vowels and down through the rounded vowels until ⟨ɔ⟩ (/ɔ/) is reached. To make this order as useful as possible, many schematic portions of Theodian have been designed such that their alphabetical order matches their conceptual order (eg: color-terms, when alphabetized, become ordered by luminance, hue, and chroma).

Numerals

Theodian's numerals' values are, apart from ⟨w⟩'s (which is '0'), actually identical to their alphabetical indices; and are therefore an example of an alphabetic numeral system. Although they can be inferred above, the table below re-enumerates the forms of these numerals, as well as their equivalents in western Arabic numerals, for your convenience.

Theodic  ʼ  x  p  r  k  m  h  j  f  ŋ  c  w  t  n  s  l
Arabic   1  2  3  4  5  6  7  8  9  A  B  0  C  D  E  F
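
As an illustration of how such alphabetic digits could be read positionally, the Python sketch below converts a string of romanized Theodic numerals into an integer. Two things are assumptions rather than facts from this article: that the most significant digit is written first, and the helper name `numeral_value`. The default base of twelve follows the text; base 16 uses the four auxiliary digits.

```python
# Position in this string equals the digit's value: w is 0, then the 15 other
# consonant letters in alphabetical order (U+02BC is the letter ʼ).
DIGITS = "w\u02bcxprkmhjfŋctnsl"


def numeral_value(numeral, base=12):
    """Interpret `numeral` as a positional number, most significant digit first (assumed)."""
    value = 0
    for glyph in numeral:
        digit = DIGITS.index(glyph)  # raises ValueError for non-numeral letters
        if digit >= base:
            raise ValueError(f"digit {glyph!r} is not valid in base {base}")
        value = value * base + digit
    return value


print(numeral_value("\u02bcw"))      # 12 (one dozen)
print(numeral_value("cc"))           # 143
print(numeral_value("l", base=16))   # 15
```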

Design notes and history

The system of numerals has changed somewhat over the years. This section not only explains the reasoning behind the system, but also the reasoning behind its changes. It will begin with the system as it was originally.

In March 2015, Miles discovered "Bimodular" numerals, and later applied the thought behind them to the basic shapes of Theodic consonants in order to derive a set of positional numerals which were internally both linguistically and graphically meaningful. The numerals may appear, at first glance, to be in a random order; but this is not the case, as they are actually a mixture of half-quaternary placement of manner, and ternary placement of place. This means that, every 2 and every 4 numerals, a pattern is repeated in the numeral's manner, ie the top half of the numeral; and that, every 3 numerals, a pattern is repeated in the numeral's place, ie the bottom half of the numeral. This, essentially, is just generating numerals based on the prime factorization of the base (twelve, in this case), which would be (2*2)*3; but using sounds to do so. Because the Theodic script is so schematic, this results in numerals that have three layers of patterns both phonetically and orthographically.

Although some may worry that having letters and numbers be identical to each other may cause confusion, Miles considered the facts that (1) the Theodian language has the positional system heavily baked-in; (2) letters are frequently used for enumeration even where separate, dedicated numerals do exist; and (3) combining the two means fewer symbols that need to be created, learned, and typed.

The sequence of [+Voice] and [-Voice] features (in Theodian, only plosives and fricatives can be devoiced) was chosen for evens and odds, respectively. This, in addition to the phonetic contrast, causes a graphical distinction between rounder (bowl/hook) and flatter (bar/stem) letters. ⟨ ⟩ and ⟨ ⟩ were chosen to represent higher values (ie, more plural values) over ⟨⟩ and ⟨ ⟩, as the former are more sonorous, and a lower sonority was taken by Miles to align more closely with a lower value. Place, which doesn't have sonority, was treated identically, due to graphical similarity.

The sequence of ⟨⟩, ⟨ ⟩, ⟨ ⟩, ⟨ ⟩ was originally chosen for manner, in accordance with the above principles. The sequence of ⟨ ⟩, ⟨ ⟩, ⟨ ⟩ was chosen for place, so that non-multiples of 3 would start in a sound produced by the tongue, while multiples of 3 would start in a sound produced by the lips; which provides a great contrast, both phonetically and orthographically, between the two conditions. The graphically simpler sequence of ⟨⟩, ⟨ ⟩, ⟨ ⟩ was not originally chosen, because it would have generated ⟨ ⟩, which, in the numerical-positional case system in use in Theodian at the time, would have had no way of being pronounced, as this system required that each numeral's name begin with its associated phoneme—but ⟨ ⟩ is not allowed in syllable-onsets.

Resultantly, the key to deciphering the relations between the numerals, is to look for the rounded bits. Every time you see a ⟨ ⟩ or a ⟨ ⟩ in a numeral, it's even. Every time you see a ⟨ ⟩ in a numeral, it's divisible by 4. Every time you see a ⟨ ⟩ in a numeral, it's divisible by 3. The system was then extended to be capable of mostly (⟨ ⟩ was still a problem) supporting hexadecimal notation, by adding the four remaining consonants in order of place.

In late 2016, numerical-positional case was discarded and, as a result, the names the symbols had as letters came to be used for them as numbers too. After this, there was no longer any reason to use ⟨ ⟩ and ⟨ ⟩ instead of ⟨⟩ and ⟨ ⟩. As such, to simplify digits and maximize the graphemical distance between those numbers which are, and those which aren't, multiples of 3, ⟨⟩, ⟨ ⟩, and ⟨ ⟩ were adopted for the bottom combinant portions. The top combinant portions were largely kept the same, but with the caveat that the first 6 numerals would use a sequence of ⟨⟩, ⟨ ⟩, ⟨⟩, ⟨ ⟩; while the latter 6 numerals would use a sequence of ⟨ ⟩, ⟨ ⟩, ⟨ ⟩, ⟨ ⟩. This not only simplifies most of the digits (especially 1, 3, and 5), but also makes it such that the first 6 numerals effectively fit graphically in any number they are a multiple of. This is simply an unintended side-effect of the intentionally encoded meanings, which are shown exclusively via either place or manner. The numerals 5 and 7 received the place opposite that which they were supposed to receive. That was done so that 5 would remain graphically separate from 1, and 7 from 11. Although this resulted in non-labials not all matching in place on each side of the 6, it did result in all odds on either side being either plosives (if less than 6) or fricatives (if greater than 6). Due to this change, the auxiliary hexadecimal numerals all ended up being coronal, the place with perhaps the most visually complex symbol.

Romanization

Although there is an official Theodian:Roman transliteration-scheme which is suitable for general use, there is also an official Romanization scheme, which is designed to work more comfortably for people used to orthographies based on the Roman alphabet. This Romanized version of Theodian is the one that is most typically used by foreign publications, whereas the Roman transliteration is mostly only used educationally, and the ASCII transliteration between speakers of Theodian; this is largely because the transliterations are so different from most Roman-based orthographies that they can be rather opaque to those unaccustomed to them.

The goal of the official Romanization scheme is to create an orthography that can be read by as many Roman alphabet readers as possible. In this light, certain letters with inconsistent cross-language pronunciations, like 'z', are largely eschewed; and in order to make the results seem more familiar, several of the Roman script's quirks have been incorporated.

The standard way to Romanize Theodian from an ASCII transliteration is as follows (note that the steps should be done in order):

Romanization instructions
  1. All spaces are removed.
  2. A space is placed between words, and also between lexemes and PoS clitics.
  3. All letters are decapitalized.
  4. ⟨q⟩ becomes ⟨gn⟩ at the beginning of a word or lexeme, ⟨n⟩ before a ⟨k⟩ or ⟨j⟩, and ⟨ng⟩ elsewhere.
  5. ⟨`⟩ becomes ⟨q⟩ word-finally.
  6. ⟨`⟩ is deleted word-initially.
  7. ⟨`⟩ becomes ⟨ʼ⟩.
  8. The codas ⟨xh⟩, ⟨rh⟩, and ⟨lh⟩ respectively become ⟨hx⟩, ⟨hr⟩, and ⟨hl⟩.
  9. ⟨x⟩ becomes ⟨n⟩.
  10. ⟨n⟩ before ⟨p⟩ or ⟨f⟩ becomes ⟨m⟩.
  11. In word-final codas, ⟨n⟩ becomes ⟨m⟩ after ⟨o⟩ or ⟨u⟩.
  12. ⟨m⟩ before ⟨q⟩ becomes ⟨n⟩.
  13. ⟨yy⟩ becomes ⟨ı⟩.
  14. ⟨y⟩ becomes ⟨ə⟩.
  15. ⟨j⟩, when at the beginning of a word/lexeme becomes ⟨y⟩.
  16. ⟨j⟩ becomes ⟨i⟩.
  17. Any monosyllabic open-class lexeme whose syllable nucleus is ⟨i⟩ adds a ⟨y⟩ to its end.
  18. ⟨i⟩, when at the end of a word/lexeme or when between two other vowels, becomes ⟨y⟩.
  19. ⟨yy⟩ becomes ⟨yi⟩.
  20. ⟨ai⟩ becomes ⟨aiy⟩.
  21. ⟨w⟩ becomes ⟨u⟩.
  22. ⟨o⟩ becomes ⟨å⟩.
  23. ⟨åi⟩ becomes ⟨oy⟩.
  24. ⟨au⟩ becomes ⟨ao⟩.
  25. ⟨åu⟩ becomes ⟨ou⟩.
  26. ⟨cj⟩ becomes ⟨hj⟩.
  27. ⟨c⟩ becomes ⟨kh⟩.
  28. ⟨si⟩ and ⟨se⟩ respectively become ⟨ši⟩ and ⟨še⟩.
  29. ⟨ši⟩ becomes ⟨š⟩ when the ⟨i⟩ is immediately followed by a vowel.
  30. ⟨e⟩ becomes ⟨ä⟩.
  31. ⟨ää⟩ becomes ⟨e⟩.
  32. ⟨e⟩, when at the end of a syllable, becomes ⟨ey⟩.
  33. ⟨ə⟩ becomes ⟨ë⟩.
  34. All consonants except the beginnings of words and lexemes become voiced (voiceless consonants without voiced equivalents: ʼ q f h s š).
  35. ⟨dš⟩ becomes ⟨dj⟩.
  36. ⟨g⟩ becomes ⟨k⟩ when before ⟨i⟩, ⟨e⟩, ⟨y⟩, ⟨ue⟩, or ⟨ui⟩.
  37. The first letter of each sentence, as well as all names and other proper nouns, is capitalized.
  38. The first vowel in each lexeme receives a combining acute.
  39. The rest of the letters stay the same.
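Because later steps consume the output of earlier ones, the order of the list matters. The following is a minimal sketch of that idea using only steps 21 to 25; it is not a full Romanizer, it ignores every context-sensitive step, and the sample inputs are made-up letter strings rather than attested Theodian words.

```python
# Minimal sketch of ordered rewriting, using only steps 21-25 of the list above.
# The sample inputs are made-up letter strings, not attested Theodian words.

VOWEL_STEPS = [
    ("w", "u"),    # step 21
    ("o", "å"),    # step 22
    ("åi", "oy"),  # step 23
    ("au", "ao"),  # step 24
    ("åu", "ou"),  # step 25
]


def apply_steps(text: str, steps=VOWEL_STEPS) -> str:
    """Apply each (old, new) replacement to the whole string, in list order."""
    for old, new in steps:
        text = text.replace(old, new)
    return text


print(apply_steps("taw"))  # w -> u gives "tau", then au -> ao
print(apply_steps("soi"))  # o -> å gives "såi", then åi -> oy
```

Running step 24 before step 22 would give a different result, since the ⟨o⟩ produced by ⟨au⟩ → ⟨ao⟩ would then be rewritten to ⟨å⟩.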

Anglicization

Although the official Romanization scheme renders Roman text that should be legible to readers of most languages, there is also an additional Anglicization scheme for Theodian. This exists largely to render Theodian names in an ASCII- and English-friendly format.

The standard way to Anglicize Theodian from a Romanization is as follows (note that the steps should be done in order):

Anglicization instructions
  1. All letters are decapitalized.
  2. All acute accents are removed.
  3. ⟨h⟩ is deleted when in the coda.
  4. ⟨ʼ⟩ and ⟨q⟩ are deleted in the onset.
  5. ⟨ʼ⟩ becomes ⟨q⟩.
  6. ⟨dj⟩ becomes ⟨j⟩.
  7. ⟨tš⟩ becomes ⟨ch⟩.
  8. ⟨š⟩ becomes ⟨sh⟩.
  9. ⟨u⟩ becomes ⟨w⟩ when it is part of a syllable onset.
  10. ⟨u⟩ becomes ⟨oo⟩ when stressed.
  11. ⟨ı⟩ becomes ⟨i⟩.
  12. ⟨ey⟩ becomes ⟨ay⟩ in the coda.
  13. ⟨ey⟩ becomes ⟨ai⟩.
  14. ⟨aiy⟩ becomes ⟨ei⟩.
  15. ⟨iy⟩ becomes ⟨ee⟩.
  16. ⟨i⟩ becomes ⟨ie⟩ when stressed.
  17. ⟨å⟩ becomes ⟨o⟩.
  18. ⟨är⟩ becomes ⟨air⟩.
  19. ⟨ä⟩ becomes ⟨e⟩.
  20. ⟨ë⟩ becomes ⟨a⟩ when it is the sole coda of an unstressed syllable.
  21. ⟨ë⟩ becomes ⟨e⟩.
  22. ⟨ks⟩ and ⟨gs⟩ become ⟨x⟩.
  23. ⟨k⟩ becomes ⟨c⟩ except before ⟨e⟩ and ⟨i⟩.
  24. All consonants except the beginnings of words and lexemes become voiced (voiceless consonants without voiced equivalents: h s q).
  25. ⟨q⟩ becomes ⟨t⟩.
  26. The first letter of each sentence, as well as all names and other proper nouns, is capitalized.
  27. The rest of the letters stay the same.

Grammar

Grammar at a glance
Morphological typology: Agglutinative
Morphosyntactic alignment: Nom‑acc (pro-drop)
Head direction: Pragmatic head‑initial
Constituent order: Pragmatic V1
Adpositional order: Pragmatic PMT

Summary

Theodian is an agglutinative language, having easily identifiable and separable morphemes and a high morpheme-to-word ratio: around 3:1.

Every open-class word consists of a prefixal clitic containing two pieces of information: the word's part of speech, and an inflection indicating the thematic role it has or agrees with. Following this comes any reasonable number of lexemes, after each of which may come any reasonable number of derivational suffixes.

In several cases, individual lexemes can be broken down into morphemes (perhaps most notably, the colour vocabulary); but these operate outside the typical morphemical clockwork. In addition to this, many morphemes (especially schematic ones) can be further broken down into a number of fuzzy phonesthemes (for example, the relativizing lexeme "Ru`" can be broken into three phonesthemes: /ʕ/, representing a bend; /u/, representing focus; and /ʔ/ representing breaking something up; thus creating a collective meaning somewhere along the lines of "bending and breaking the focus").

Please know that the grammar of Theodian is as yet incomplete and, although relatively robust, cannot yet express all that a language must. The features discussed in this article are those which are most stable and most likely to make it into the first formal release of the language; however, even they are not set in stone, and are quite liable to change at any given time.

Miscellanea

Word-order

Theodian, being a highly inflected language, has very free word order, with word-order tendencies rather than word-order rules: in phrases that contain a verb, the verb usually comes first; and in phrases with wh-questions, the wh-word usually comes first (wh-fronting). Generally, though, words are written in order of salience, and spoken in order of convenience (ie, whatever comes to mind first).

Theodian's word-order is so free that it (theoretically) permits extreme amounts of scrambling. An example: "QUAL-ACC-Very RFR-ACC-Egg RFR-NOM-Me DESC-ACC-Green RFR-VB-Eat DESC-VB-Previous". This means "I ate very green eggs". In practice, such scrambling is fairly rare outside of poetry.
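One way to picture this freedom is to treat a clause as an unordered collection of inflected words. The sketch below is an illustration only: it parses the article's gloss notation (not real Theodian text), the helper names are hypothetical, and the sample is a shortened form of the sentence above ("I ate green eggs") in two orders chosen for the example. It ignores relative clauses and the fixed position of polar connectives.

```python
from collections import Counter
from typing import NamedTuple


class Word(NamedTuple):
    pos: str     # part of speech, e.g. "RFR"
    case: str    # thematic-role inflection, e.g. "ACC"
    lexeme: str  # bare lexeme, e.g. "Egg"


def parse_gloss(gloss: str) -> list[Word]:
    """Split a gloss such as 'RFR-VB-Eat' into (pos, case, lexeme) triples."""
    words = []
    for token in gloss.split():
        pos, case, lexeme = token.split("-", 2)
        words.append(Word(pos, case, lexeme))
    return words


def same_content(a: str, b: str) -> bool:
    """Two glosses carry the same inflected words if they match as multisets."""
    return Counter(parse_gloss(a)) == Counter(parse_gloss(b))


scrambled = "RFR-ACC-Egg RFR-NOM-Me DESC-ACC-Green RFR-VB-Eat DESC-VB-Previous"
verb_first = "RFR-VB-Eat DESC-VB-Previous RFR-NOM-Me RFR-ACC-Egg DESC-ACC-Green"
print(same_content(scrambled, verb_first))
```

Since every word carries its own part of speech and case, the comparison never needs to look at position.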

Valency

Theodian is pragmatically pro-drop. In this case, that means any argument or adjunct to a verb may be omitted so long as the verb remains at least intransitive (a valency of 1). As such, passivization can effectively be achieved by simply deleting the original subject, with no promotion of the object required. An example: "RFR-VB-Eat RFR-ACC-Egg". This means "Eggs are eaten". Accordingly, all verbs are ambitransitive and subject to diathesis alternation. For the sake of simplicity, though, all Theodian sentences are traditionally considered to be active voice; although others have analyzed the language as being predominantly middle-voice.[1]

Additionally, Theodian seemingly allows multiple phrases to fill the same nodes (this can be analyzed with n-ary branching or empty nodes). The way that this works is that all items of the same inflection in the same complementizer clause are treated identically by the grammar. Some examples:

  • "RFR-VB-Eat RFR-VB-Enjoy RFR-ACC-Egg RFR-ACC-Ham DESC-ACC-Green."
    "Both eggs and ham, both of which are green, are both eaten and enjoyed."
  • "RFR-VB-Eat RFR-VB-Enjoy RFR-ACC-Egg RFR-ACC-REL RFR-NOM-Ham DESC-NOM-Green REL."
    "Both eggs and green ham are both eaten and enjoyed."

In both of these examples, eggs and ham are not only being eaten, but also enjoyed. Additionally, in the first example, both the eggs and the ham are green; but in the second example, only the ham is green. The embedding of 'ham' and 'green' into a relative clause insulates the scope of 'green', as it can only apply to identically inflected items in the same level.
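The scoping rule can be sketched as a small program over the gloss notation. This is a hedged illustration, not a parser for Theodian: it assumes well-formed glosses in which an inflected `REL` opens a relative clause and a bare `REL` closes it, and the function name is hypothetical.

```python
def descriptor_scope(gloss: str) -> dict[str, list[str]]:
    """Map each referrer lexeme to the descriptors that share its case and clause."""
    clause_ids = [0]  # stack of clause ids; 0 is the main clause
    next_id = 1
    referrers = []    # (clause_id, case, lexeme)
    descriptors = []  # (clause_id, case, lexeme)
    for token in gloss.rstrip(".").split():
        if token == "REL":        # bare REL closes the current relative clause
            clause_ids.pop()
            continue
        pos, case, lexeme = token.split("-", 2)
        if lexeme == "REL":       # inflected REL opens a new relative clause
            clause_ids.append(next_id)
            next_id += 1
        elif pos == "RFR" and case != "VB":
            referrers.append((clause_ids[-1], case, lexeme))
        elif pos == "DESC":
            descriptors.append((clause_ids[-1], case, lexeme))
    return {
        lexeme: [d for (c, k, d) in descriptors if (c, k) == (clause, case)]
        for (clause, case, lexeme) in referrers
    }


first = "RFR-VB-Eat RFR-VB-Enjoy RFR-ACC-Egg RFR-ACC-Ham DESC-ACC-Green."
second = "RFR-VB-Eat RFR-VB-Enjoy RFR-ACC-Egg RFR-ACC-REL RFR-NOM-Ham DESC-NOM-Green REL."
print(descriptor_scope(first))
print(descriptor_scope(second))
```

In the first gloss 'Green' attaches to both 'Egg' and 'Ham'; in the second it attaches only to 'Ham', because the relative clause forms its own level.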

Morphosyntactic alignment

Theodian is officially a nominative-accusative language with variable valency. However, because Theodian cases primarily inflect for thematic relation and because Theodian can be analyzed as being predominantly middle-voice, it is also possible to analyze the language as fluid-S.

Theodian was originally going to be tripartite. However, this required the use of a superfluous case, which ultimately made it simpler to merge A with either S or O. Merging A with S (nominative-accusative alignment) is cross-linguistically more common, and Theodian prefers agentive intransitive verbs to patientive ones anyway; so this is what the language switched to.

Copulas

Theodian copulas are normal verbs, and their predicates are inflected like normal objects. This takes the most advantage of the language's standard grammatical paradigms. Debate is still on-going regarding how general the copula will be.

Compounds

Compound words are constructed head-initially, by stacking bare lexemes onto the end of another lexeme. All of these lexemes are interpreted as being of the same PoS as the clitic attached to their headmost lexeme; and compound-words' subclasses are determined ad-hoc by their constituents' meanings. Where it is needed to express a compounded concept which would require constituents of differing PoS's, like 'skyscraper', a phrase consisting of these ideas can be assembled, and then placed into a relativizing lexeme, like so: RFR-NOM-REL RFR-VB-Scrape RFR-ACC-Sky REL.

Compounding does not influence the placement of stress within the compounded words.

Reduplication

Reduplication, as a distinct grammatical phenomenon, does not exist in Theodian. Seemingly reduplicated words exist; but they function identically to any other compound word, and an adjectival phrase would be capable of expressing the same meaning. An example of a seemingly reduplicative phrase in Theodian, is "milk-milk", which means "real milk", as opposed to various non-animal-milks. Essentially, the reference of the expression is tightened, moving it closer to its prototypical meaning. This is especially useful for de-innuendoizing words, and is, coïncidentally, similar to how English often handles reduplication.

Preclitics

Theodian uses clitics to inflect for part of speech ("PoS") and thematic role ("case"). These clitics must occur immediately before their head, and may not be at all estranged from it. They do not affect stress. These clitics may stack, thus allowing for subderivations of PoS's without the need for a true relative clause. As an example: "RFR-NOM-RFR-VB-Run" means 'a running', because it takes the bare lexeme 'run', makes it a verb, and then makes that verb a noun.

Parts of speech

Theodian has a robust part of speech (PoS) system with a number of unique traits:

  • Nouns and verbs use the same PoS morpheme. Adjectives and verbal adverbs do, too. This minimizes morpheme duplication, requiring half as many PoSes and "case" inflections compared to keeping them separate, as was done in older versions of Theodian.
  • Preclitics are used instead of another positionality, so that how a word is being used is evident before the word is actually used — this is intended to help avoid garden-path parses, and was originally inspired by the widespread use of prefixes in Lingála.
  • A word's part of speech is largely dynamic, and can be changed by simply changing its PoS morpheme; so that "ʼǝ tir" means "a shot", and "ʼe tir" means "to shoot". This was inspired by Esperanto.
  • PoS clitics can be compounded to create "complex" parts of speech, as in "ʼǝ ʼe tir", "a shooting".
  • In common speech, compounded PoS morphemes are lenited, so "ʼǝ ʼe tir" becomes "ʼǝhe tir". This is intended to increase speakers' ease of pronunciation by avoiding an excessive number of stops, and to help show the continuity of successive PoS clitics.
  • Every lexeme has one inherent part of speech, which is recorded alongside its dictionary definition. When deriving from a lexeme by using complex parts of speech, the PoS morpheme for the lexeme's inherent part of speech can be dropped, so long as it would have been used with the default "case". For example, "ʼe tir" is actually an abbreviation of "ʼehǝ tir", as the lexeme "tir" is inherently a noun.

Simple

The primary parts of speech and their associated meanings, preclitics, and openness are as follows:

PoS name | Abbr | Morpheme (Theodic) | Morpheme (IPA) | Open?
Referrer | RFR | ʼ / h | ʔ/h | Open
Descriptor | DESC | k / c | k/x | Open
Qualitator | QUAL | t / s | t/s | Open
Determiner | DET | p / f | p/f | Closed
  • A PoS clitic is only "simple" if it corresponds to the lexeme's inherent part of speech. If not, it's "complex".

Complex

The following chart shows how adding a secondary PoS clitic changes the grammatical functions of a word:

Secondary | Primary | Function
DESC | RFR-NOM | Noun adjunct / Possessive
RFR-NOM | DESC | Adnoun ("poor" -> "the poor")
RFR-NOM | RFR-VB | Gerund
RFR-VB | RFR-NOM | To do/make the Referrer
RFR-VB | DESC | To become the Descriptor
  • Referrers are special — nominal and verbal referrers are grammatically different parts of speech, even though they share the same PoS inflection; so definitions of hybrid PoSes must specify which kind of Referrer they're using.

Case

All PoS clitics are inflected for thematic role — "case", for the sake of convenience. This inflection occurs immediately after the part of speech morpheme, and is always a vowel.

Theodian does not have prepositions, but rather expresses more complex relations by means of diction (example: "RFR-INS-Center DESC-INS-Road" instead of "Through the center of the road"). Of particular interest are four words (translatable to "interior", "surface", "adjacency", and "state") which, when combined with the relational cases, allow 16 different physical relations to be expressed.

It is important to note that cases in Theodian are primarily concerned with thematic role, rather than syntax. Because of this, any sentence can theoretically take any number of referrers in any case, so long as the phrase is semantically coherent. As well, Theodian lacks a passive (as well as most other forms of verbal voice), as verbs are not inherently required to have an adjunct of any particular case; so, instead of "it was hit", Theodian can simply delete the subject without any other changes to the sentence: "RFR-VB-Hit RFR-ACC-It". Copulas are treated as normal verbs, and their predicates are in the accusative.

All possible "case" inflections are as follows:

Action Case Abbr Morpheme Thematic relation(s)
Instrumental INS i i way, manner, instrument, equipment
Verbal VB e e action
Evidential EVI ɛ ɛ evidence
Actors Case Abbr Morpheme Thematic relation(s)
Accusative ACC ɨ ɨ patient, theme
Nominative NOM ə ə agent, experiencer, force
Causal CAU a a purpose, cause
Obliques Case Abbr Morpheme Thematic relation(s)
Dative DAT u u destination (time or space), recipient, benefactor
Locative LOC o o location (time or space), invocation
Ablative ABL ɔ ɔ origin (time or space), beneficiary
  • Front-vowel cases relate directly to the action being performed or the sentence as a whole, central-vowel cases are actors, and back-vowel cases are oblique.
  • Close-vowel cases are targets, mid-vowel cases are generic, and open-vowel cases are origins.
  • Nominative nouns are the "unmarked default" and are represented by a Schwa. For a time, the default was the verb; but this was changed to better accommodate the gavagai instinct, in which people assume unmarked things to be nouns; and also to make the most common situation (a nominative noun) the default.
  • Theodians typically conceive of time as a fourth spatial dimension, so the ablative, locative, and dative cases are temporal as well as 3D-physical (ie, start-time, current time, end-time).
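The two notes on vowel backness and height describe a three-by-three grid, which can be written out as a lookup table. This is a sketch for orientation only; the labels follow the article's own grouping (so ⟨ɛ⟩ and ⟨ɔ⟩ are filed under "open" as the article does), and the function names are hypothetical.

```python
# Sketch of the case grid: backness selects the group, height selects the role.
# Labels follow the article's grouping; function names are hypothetical.

CASE_GRID = {
    ("front", "close"): ("INS", "i"),
    ("front", "mid"): ("VB", "e"),
    ("front", "open"): ("EVI", "ɛ"),
    ("central", "close"): ("ACC", "ɨ"),
    ("central", "mid"): ("NOM", "ə"),
    ("central", "open"): ("CAU", "a"),
    ("back", "close"): ("DAT", "u"),
    ("back", "mid"): ("LOC", "o"),
    ("back", "open"): ("ABL", "ɔ"),
}


def case_for(backness: str, height: str) -> str:
    """Return the case abbreviation at a grid position."""
    return CASE_GRID[(backness, height)][0]


def position_of(abbr: str) -> tuple[str, str]:
    """Return the (backness, height) position of a case abbreviation."""
    for position, (name, _vowel) in CASE_GRID.items():
        if name == abbr:
            return position
    raise KeyError(abbr)


print(case_for("central", "mid"))  # the unmarked default case
print(position_of("DAT"))
```

Read this way, the dative is the "target" member of the oblique group and the ablative is its "origin" member, matching the thematic relations listed in the table.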

Logical constructions

Connectives

The Theodian language makes use of only the following three non-polar logical connectives: and (defined as taking the minimal truth-value), or (defined as taking the maximal truth-value), and iff (returns 'true' if the values are identical (alternatively defined as taking the product of all truth-values, but this definition is base-specific)). These connectives, rather than appearing among the items they connect, are placed anywhere in the sentence (typically at the beginning), and are inflected for PoS and case/mood, thus marking any matching elements as being a part of a particular logical phrase. As well, all connectives in Theodian are effectively unary, and the item they affect can be thought of as a set.

For example, the sentence "Se and I like eggs or fish" can be grammatically translated as "DET-AND-RFR-NOM DET-OR-RFR-ACC RFR-IND-Like RFR-NOM-Se RFR-NOM-Me RFR-ACC-Eggs RFR-ACC-Fish", and logically written as ∧{Se, I} Likes ∨{Eggs, Fish}. Accordingly, infinitely many items may be included within these sets without any association to the right or left.
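The definitions of "and" and "or" as minimum and maximum can be tried out directly on truth-values. The sketch below assumes the balanced-ternary encoding -1, 0, 1 for false, unknown, and true, and uses hypothetical function names; "iff" is left out because the text does not say what it returns for non-identical values.

```python
# Sketch: connectives as operations on a whole set of truth-values.
# Encoding assumed here: -1 = false, 0 = unknown, 1 = true.

FALSE, UNKNOWN, TRUE = -1, 0, 1


def conj(*items: int) -> int:
    """AND over a set of truth-values: the minimal value."""
    return min(items)


def disj(*items: int) -> int:
    """OR over a set of truth-values: the maximal value."""
    return max(items)


# Any number of items can sit in one set, with no left or right grouping.
flat = conj(TRUE, TRUE, UNKNOWN, TRUE)

# Embedding a set inside another gives the same result for the same connective.
nested = conj(TRUE, conj(TRUE, UNKNOWN), TRUE)

print(flat, nested, disj(FALSE, UNKNOWN, TRUE))
```

Because minimum and maximum do not depend on grouping or order, a set of any size can be evaluated in one step.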

The default connective is conjunctive, so conjunctive statements do not need to be explicitly marked; so, "DET-AND-RFR-NOM RFR-NOM-Eggs RFR-NOM-Fish" and "RFR-NOM-Eggs RFR-NOM-Fish" are equivalent. As well, given the nature of its unary operatorial system, Theodian logical statements have neither antecedents nor predicates. Accordingly, one-way implications are not expressed as single operators, but are instead expressed either via logical equivalencies (eg, p ← q becomes ∨{p, ¬q}), or via lexical expressions (eg, "p if q" becomes "p, causes q").

Sets can, of course, be embedded within one another, and via the same way that anything else in Theodian is embedded: with the recursive lexemes. As an example, the English sentence "Barbados, Trinidad and Tobago, and Guyana" (which can be a bit confusing in English, especially without the Oxford comma) is, in Theodian grammar: "RFR-NOM-Barbados RFR-NOM-REL RFR-NOM-Trinidad RFR-NOM-Tobago REL RFR-NOM-Guyana", logically symbolized as: ∧{Barbados, ∧{Trinidad, Tobago}, Guyana}. Although the way that Theodian embeds can make this particular logical notation somewhat counter-intuitive, the more exact representation (in this case: {∧(Barbados, {∧(Trinidad, Tobago)}, Guyana)}) is comparatively notationally cumbersome, so the simplified notation used here is typically preferred. The system has also been interpreted n-arily (eg, Barbados ∧ (Trinidad ∧ Tobago) ∧ Guyana), but this doesn't reflect the way the system is spoken quite as clearly.

The similarities between Theodian's logical notation and "Polish" notation are purely coincidental, as Miles actually came up with these ideas completely independently, only learning about Polish notation a year after having devised the basis of the above system. Advantages of this "Theodian" notation include, but are not limited to, the following: (1) arbitrary associations are unneeded, (2) all connectives have equivalent valencies, and (3) Set-Theory is brought into the realm of conventional formal logic. Having the connectives typically appear early in a phrase, rather than appearing late, was chosen to be the case, so that people, having heard a set and assuming it to be conjunctive, aren't then shocked by the sudden revelation that the set was actually disjunctive. Note that not even by convention do connectives need to occur at the beginning of a phrase; rather, they need only occur before their relevant set.

Polarity

Theodian makes use of a three-valued logic (specifically balanced ternary), rather than the much more common two-valued logic. This means that not only are there words and grammatical forms for 'yes' and 'no', but also for 'maybe', which makes expression of ambiguity rather convenient and which should theoretically train Theodian-speakers to picture reality in balanced ternary rather than in binary.

(Although English and other languages can express ternary logic, it's the default in Theodian, and very well-catered-to in the language. It's more that English can express it, but Theodian must.)

The available polarities in Theodian are: positive (leaves the truth-values unchanged), unknown (sets all truth-values to 'unknown'), and negative (inverts all truth-values). The default polarity is positive, so, just like in English, 'indeed' et al do not need to be stated before positive verbs, adjectives, etc.

Polar connectives, rather than modifying everything of a particular grammatical persuasion, only modify the immediately postceding morpheme, and present one of the few examples of word-order inflexibility in Theodian.

Double negatives

In Theodian, double negatives do not make a sentence more negative, but rather cancel each other out, as is proper in logic (and, coïncidentally, in most standard forms of English).
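The three polarities can be modelled as functions on balanced-ternary truth-values. This is a sketch under two assumptions: the encoding -1, 0, 1 for false, unknown, and true, and the reading of "inverts" as a sign flip (which leaves 'unknown' unchanged). The function names are hypothetical.

```python
# Sketch: polarities as functions on balanced-ternary truth-values.
# Encoding assumed here: -1 = false, 0 = unknown, 1 = true.

FALSE, UNKNOWN, TRUE = -1, 0, 1


def positive(value: int) -> int:
    """Leave the truth-value unchanged."""
    return value


def unknown(value: int) -> int:
    """Set the truth-value to 'unknown', whatever it was."""
    return UNKNOWN


def negative(value: int) -> int:
    """Invert the truth-value (assumed here to be a sign flip)."""
    return -value


for v in (FALSE, UNKNOWN, TRUE):
    # A double negative cancels out and returns the original value.
    assert negative(negative(v)) == v
    print(v, positive(v), unknown(v), negative(v))
```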

Mathematical constructions

As with logical operators, mathematical operators are, as much as possible, made into set operations. This works well for operators with the commutative property (such as addition and multiplication), but poses some problems for operators without it (such as subtraction and division). Additionally, the total number of operators is deliberately kept small, but not so much so that things become particularly cumbersome.

The basic operators are as follows:

  • Negatives are indicated with an operator akin to, but separate from, logical NOT, which we will call "NEG". For example: "negative two" becomes "DET-NEG RFR-NOM-Two".
  • Addition is performed with an operator akin to, but separate from, logical AND, which we will call "PLUS". For example: "two plus three" becomes "DET-PLUS-RFR-NOM RFR-NOM-Two RFR-NOM-Three".
  • Multiplication is performed with a multiplicative operator. The operator itself will be referred-to as "MULT", for the sake of this article. Due to the nature of the language, adjectives can also be used for multiplication if the PLUS operator is present, but this isn't as scalable as a dedicated MULT operator.
  • Exponentiation is not yet defined in its realization.
  • Equality is expressed as it is everywhere else in Theodian: with a copula.

These operators are then combined to create additional operations, as follows:

  • Subtraction: "two minus five equals negative three" can be realized as "DET-PLUS-RFR-TOP RFR-TOP-Two DET-NEG RFR-TOP-Five DET-NEG RFR-NOM-Three"
  • Division: This requires exponentiation, and so cannot yet be expressed.
  • Radicals: This requires exponentiation, and so cannot yet be expressed.

Although this can result in somewhat wordier versions of smaller math problems than English, complex problems become far easier to express, and mathematics is able to use the language's normal grammar without virtually creating its own subset of it. Even so, a reduced form of the language does exist for math, whereby case inflections and the like are left out:

  • 2 - 5 = -3 ("two minus five equals negative three") becomes + 2 -5 = -3 (PLUS two NEG five BE NEG three), with each item being only one syllable.

Here are some examples of more complex problems in English vs Theodian:

  • 5 * (3 - 1). EN: "Five times parentheses three minus one". TH: "DET-MULT-RFR-NOM RFR-NOM-Five RFR-NOM-REL DET-PLUS-RFR-NOM RFR-NOM-Three DET-NEG RFR-NOM-One REL".
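The operator scheme can be sketched as a tiny evaluator in which PLUS and MULT take a whole set of operands and NEG takes one. The nested-tuple representation and the `evaluate` function are assumptions made for this illustration; subtraction is written as PLUS over a negated operand, following the subtraction pattern given earlier.

```python
from math import prod


def evaluate(expr):
    """Evaluate a nested (operator, operand, ...) tuple; plain numbers are returned as-is."""
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    if op == "PLUS":
        return sum(values)    # addition over the whole set
    if op == "MULT":
        return prod(values)   # multiplication over the whole set
    if op == "NEG":
        (value,) = values     # NEG applies to exactly one operand
        return -value
    raise ValueError(f"unknown operator: {op}")


# 2 - 5, written as PLUS{2, NEG 5}
print(evaluate(("PLUS", 2, ("NEG", 5))))

# 5 * (3 - 1), written as MULT{5, PLUS{3, NEG 1}}
print(evaluate(("MULT", 5, ("PLUS", 3, ("NEG", 1)))))
```

Since only commutative operators take sets, the order of operands inside a set never changes the result.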

Suffixes

(TODO)

Lexicon

A preliminary preview of the lexicon is available here: User:Swena/TheodianWordlist. Where discrepancies exist between it and this article, this article takes precedence.
This section mostly covers only those portions of the lexicon that are generated schematically.

Generation

The Theodian lexicon is a unique mixture of a priori and a posteriori elements, and of international-auxiliary and philosophical derivational practices. It is generated in phases, each with several tiers containing several sources. Within each phase, each higher tier overrides anything from a lower tier with the same definition; the same goes for sub-tier sources. The composite of preceding tiers is fed to the next tier; and the composite of preceding phases is fed to the next phase.

  • Phase 1: Source material
    (The purpose of this phase, is to assemble a collection of source material that can be used to programmatically generate a complete lexicon.)
    (Compound words and the like are deferred to phase 2.)
    • Tier 1: a posteriori
      (All a posteriori lexemes are stored with an IPA transcription, and not in their final, Theodianized form. This transcription is then used to automatically generate phonologically-valid Theodian lexemes on the basis of Optimality; these are what get fed into the next phase.)
    • Tier 2: a priori
  • Phase 2: Filtering
    (This phase fits the compiled source material from phase 1 into predefined semantic "slots".)
    • Tier 1: Semantic primitives
      • Source 1: Until such a time as a coherent list of semantic primes can be generated, the whole of phase 1 will be accepted here as-is.
    • Tier 2: Compound words
      (The semantic primitives created in tier 1 are fit into pre-defined compound-word patterns to generate new, non-primitive lexemes.)
      • Source 1: Compound words widely-used in informal Theodian
      • Source 2: All of Interlingua's compound words.
      • Source 3: International scientific vocabulary
      • Source 4: Compound Wanderwörter
    • Tier 3: Canned phrases
      (Every language has certain canned phrases that are used with frequency.)

Colours

Theodian colours are derived systematically from an HSL model in the order of LHC (where 'L' and 'C' stand, respectively, for 'luminance' and 'chroma'). This order was somewhat derived from the typical development of colour-terminology in languages, such that lightness vs darkness is inflected for first. The hue was chosen to be represented by a vowel due to the large number of both items. Chroma was chosen for the syllable-coda so that it could 'colour' the quality of the nucleus.

Connectives

(TODO)

Directions

(TODO)

Numbers

Theodian numbers, as touched upon earlier, use a modified "bimodular" method to derive twelve (10 in base twelve) Theodic consonants, each of which is taken to represent a different integer. Each number's name is identical to that of the letter to which its numeral corresponds.

There are no multiplicative words for "hundred", "thousand", "million", etc. Instead, a reduced form of scientific notation, where only myriadic exponents are used, is preferred. Even though Theodian does not make use of lexical multipliers, the language, like Chinese, is myriad-based. This is so that subitization is maximally taken advantage of (humans instinctively, accurately, and nearly immediately identify the number of objects in a group of such objects so long as the objects number four or fewer). To indicate where each myriad begins, the first of every four numbers receives stress. When an incomplete myriad is encountered (as in 10 0000), the first number in that incomplete myriad receives stress, and the other myriads are treated normally (alternatively, 0s can be prepended until there is a complete myriad: 0010 0000).

The radix point is indicated by a stressed morpheme (/ˈwa/).

Additionally, and in line with the language's preference for head-initiality, numbers in Theodian are little-endian.
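A rough sketch of how these rules might combine is given below. It rests on several assumptions that the text does not settle: digits are base twelve, groups of four are counted from the least significant digit, the first spoken digit of each group takes the stress mark, and `X` and `E` are placeholder symbols for ten and eleven. The function names are hypothetical.

```python
# Sketch only: little-endian base-twelve digits, grouped in fours, stress on
# the first digit of each group. Grouping direction and symbols are assumptions.

DIGITS = "0123456789XE"  # X = ten, E = eleven (placeholder symbols)


def little_endian_digits(n: int, base: int = 12) -> list[int]:
    """Return the digits of n, least significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, remainder = divmod(n, base)
        digits.append(remainder)
        if n == 0:
            return digits


def spoken_groups(n: int) -> str:
    """Group little-endian digits in fours and mark each group's first digit."""
    digits = little_endian_digits(n)
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    return " ".join("ˈ" + "".join(DIGITS[d] for d in group) for group in groups)


print(little_endian_digits(23))  # 23 = 1 * 12 + 11
print(spoken_groups(12 ** 5))    # one full group plus an incomplete one
```

Under these assumptions an incomplete group can only occur at the most significant end, and it is stressed on its first digit like any other group.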

History

Originally, the language was going to inflect each number's position within its myriad (ones, tens, hundreds, thousands); but it was decided that this reserved too large a portion of the lexicon to be worthwhile.

The language was also going to have a complete set of multipliers (single-syllable words for "hundred", "thousand", "million", etc); but these were later forgone for the sake of simplicity, and to reserve the entire human subitization range for actual digits.

There used to be two additional separators: 'wɛ́' (/ˈwɛ/, further forward in the mouth), which was used for multipliers left of the radix point; and 'wɔ́' (/ˈwɔ/, further back in the mouth), which was used for multipliers right of the radix point. At this time, all numbers had /ə/ as their sole vowel. This left /a/ as the only remaining option for the radix point. The additional separators were later dropped from the language (stress was deemed sufficient to demarcate myriads), and the numbers later became much more unique, leaving 'wá' (/ˈwa/) as the sole remnant of this phase of development.

Inflectional pronouns

Theodian has a unique set of pronouns that can refer to anything in the current shell by its part of speech and case/mood inflection. This is done by simply using that PoS+case/mood as a lexeme. For example: RFR-VB-Lacquer RFR-NOM-Gorilla RFR-ACC-RFR-NOM; or, in Theodian: ʼə lá-kə ʼe kó-ri-la ʼi hé; in romanized Theodian: "ʼə lakə ʼe korila ʼi he"; and in English: "The gorilla lacquered itself".

This not only creates a robust system of pronouns for all parts of speech, but also eliminates the need for special reflexive pronouns, and effectively fulfills the purposes of obviative and proximate pronouns.

Personal pronouns

Theodian personal pronouns are derivable from a simple combinatorial system.

It should be noted that, although some pronouns may appear to be dual (eg, "  ") or plural (" ʼú ja "), and although it is certainly the case that some pronouns are inferentially singular (eg, " ʼú "); no pronoun actually states the exact number of items in its anaphor(s).

Personal pronouns
Persons Theodic IPA
1   ʼú /ʔú/   
 2  ʼí /ʔí/   
  3 ʼá /ʔá/   
12  /wí/   
1 3 /wá/   
 23 /já/   
123 ʼú ya /ʔú.ja/

Design notes and history

Theodian's personal pronouns were, to some extent, inspired by those of Tok Pisin, and use combinations of pronominal primaries in order to create a coherent system of pronouns. The sounds chosen to represent each primary were not arbitrary. To maximise the acoustic differences between these sounds and to allow them to be replaced with phonemic semivowels when they were combined, only the sounds /u/, /i/, and /a/ were used. /u/ was chosen for the 1st person because the tongue points roughly towards the speaker, /i/ was chosen for the 2nd person because the tongue points towards the interlocutor, and /a/ was chosen for the 3rd person because it was the remaining sound. The sounds were then combined, and ordered from 1st person to 3rd person. Where necessary to match phonotactical constraints, /u/ and /i/ were replaced with their associated semivowels, and glottal stops were added to the fronts of words that would otherwise have started with a vowel.
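The combination procedure can be sketched as a short generator. The syllable-building rule used here (work from the last vowel backwards, turning a preceding /u/ or /i/ into a glide onset, and placing stress on the first syllable) is a reconstruction chosen because it reproduces the IPA column of the table above; it is an assumption, not a documented algorithm, and the names are hypothetical.

```python
# Sketch: derive pronoun forms from sets of persons.
# The syllable-building rule is a reconstruction that matches the table above.

PRIMARY = {1: "u", 2: "i", 3: "a"}
GLIDE = {"u": "w", "i": "j"}
STRESSED = {"u": "ú", "i": "í", "a": "á"}


def pronoun(persons) -> str:
    """Return an IPA-like form for a non-empty collection of persons (1, 2, 3)."""
    vowels = [PRIMARY[p] for p in sorted(set(persons))]
    if not vowels:
        raise ValueError("at least one person is required")
    syllables = []
    while vowels:
        nucleus = vowels.pop()
        if vowels and vowels[-1] in GLIDE:
            onset = GLIDE[vowels.pop()]  # a preceding /u/ or /i/ becomes a semivowel
        else:
            onset = "ʔ"                  # otherwise a glottal stop fills the onset
        syllables.append(onset + nucleus)
    syllables.reverse()
    first = syllables[0]
    syllables[0] = first[:-1] + STRESSED[first[-1]]  # stress the first syllable
    return ".".join(syllables)


for persons in ([1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3]):
    print(persons, pronoun(persons))
```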

Pragmatics

Like all languages, Theodian has certain pragmatic maxims, which differ somewhat between its channels of communication (spoken, written, and signed).

All
  • Repetition of words is acceptable; it's more important to be accurate than florid.
  • (TODO)
Spoken
  • (TODO)
Written
  • Grammatical parallelism is important
  • (TODO)
Signed
  • (TODO)

Examples

Theodic ʼə jú-raʼ lé-kih kə ré-pu-pliʼ fjɔ́-tja
Romanization ë yurat-lekih kë Reybublit-Fiådia
English a legislative jury of the Republic of Theodia
Theodic ʼe tír ʼə sól-daʼ ʼı sá-kiʼ .
Romanization Ey tir ë soldat ı sakit.
English A soldier is shooting an arrow.

Potential benefits

Theodian has been carefully designed from the beginning with the perhaps dubious goal of improving cognition. While early versions of the language sought to "optimize" thought by grammatically obliging speakers to communicate and think in certain ways (depending heavily on linguistic relativism), later and current versions instead aim to be as ergonomic as possible for the user -- to get out of the way of the fluent speaker. Part of this goal is easing the burden on learners and giving them something useful in return. This section details many of the language's intended benefits relative to other languages.

Schematic lexemes

Theodian's schematically-derived lexemes are designed to provide extralinguistic benefits to speakers by embedding various models of reality directly into the language. Theodian's colour terminology, for example, teaches learners the HSL colour model and some general principles of colour theory.

Theodian's schematic lexemes should also generally be easier to learn than the usual non-schematic lexemes, since learners need only learn the rules behind their generation, rather than memorize pages of arbitrary lexemes by rote (as is the norm for other languages). For example: the 16 consonants of the Theodic script are fundamentally composed of 4 simple shapes drawn atop each other on a place-manner chart. This exposes learners to the basics of articulatory phonetics right in the alphabet, and all with just 4 shapes and 2 rules.

Schematic lexemes exist in Theodian for everything from formal logic to the primary flavours detected by the tongue, and the Linguistics Jury of Theodia is expected to expand this schematicity to more and more domains over time. Also, as with the rest of the language, Formal Theodian's existing schematic lexemes are constantly being revised in response to new advancements in human understanding.

This feature is intended to be a major draw, as baking these models into the language simplifies quite a few things in day-to-day life. For example, Theodians know that a number is divisible by 3 if it ends in a digit whose name starts with a lip sound -- no math required. Theodians know that cyan and blue are next to each other on the colour wheel, because their vowels are very similar and they're next to each other in the alphabet -- no study of colour theory required. This helps to reduce the distance between the educated and uneducated in society, and reduces the amount that must be taught in school for an individual to be a functional member of a modern society.

Alphabetical order

Conceptually related lexemes are designed to come in a sensible order when alphabetized. For example, in English, "high/medium/low" sorts as "high/low/medium", whereas in Theodian it sorts as "low/medium/high". This means that you can use alphabetical order for pretty much anything, and things will almost always end up in a useful order, with minimal forethought.

Also, because compound words are head-first, concepts sort from big to small: English "greenhouse" is instead "housegreen", and gets sorted with other types of houses, not with other types of greens. This is particularly useful in indices and in programming: "customerName" and "businessName" become "nameCustomer" and "nameBusiness", which makes them sort together and makes the variables line up better in monospaced fonts. Theodian's extremely free word order also, of course, allows for deviations from this norm, meaning that a programmer can grammatically get away with both "customerName" and "nameCustomer".
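The sorting behaviour described above is easy to check with a short Python sketch. The helper head_first and the sample identifiers other than "customerName" and "businessName" are hypothetical, introduced only for illustration; the helper simply reverses the parts of a camelCase name and says nothing about how Theodian itself forms compounds.

```python
import re


def head_first(name):
    """Reverse the parts of a camelCase identifier (illustrative helper)."""
    parts = re.findall(r"[A-Z]?[a-z]+", name)
    parts.reverse()
    return parts[0].lower() + "".join(p.capitalize() for p in parts[1:])


# English scale words do not alphabetize in scale order.
english_levels = sorted(["high", "medium", "low"])

# Head-last (English-style) names group by modifier when sorted...
head_last = sorted(
    ["customerName", "businessName", "customerAddress", "businessAddress"]
)

# ...whereas head-first names group by the thing being named.
head_first_names = sorted(head_first(n) for n in head_last)

print(english_levels)
print(head_last)
print(head_first_names)
```

In the head-first list, both "address" variables sort together and both "name" variables sort together, which is the grouping the paragraph above describes.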

Minimal unintended ambiguity

Theodian grammar is designed to be extremely clear and explicit, to the extent that part-of-speech tagging isn't even required for computers to process the language. All aspects of the language are tuned to avoid many of the unintentional ambiguities of other languages. Fundamentally, it is designed to be a grammar that doesn't get in the way of what people are trying to say.

This feature comes at the cost of having to provide a preposition alongside every word. The trade-off is that simple expressions can be a bit long compared to other languages, while more complex expressions can often be quite a bit shorter.

Scientific vocabulary

International scientific vocabulary (ISV) is the basis of most words in Theodian. This provides speakers with clear benefits: learning anything in the sciences will require far less memorization of vocabulary, since the scientific words are identical to the common words; and discussions between laypeople and academics should hopefully be more on the same page. Additionally, the ubiquity of international scientific vocabulary among the world's languages makes it easier for Theodian speakers to learn a number of languages, particularly in Europe.

For learners, however, these benefits are less clear, since learning Theodian means learning a vast quantity of ISV, and that can be quite the time investment, one that could have been spent learning other, more widely spoken languages. While the use of ISV as the backbone of the lexicon should still reduce learning times relative to using, e.g., randomly generated words, it doesn't make the language instantaneous to learn. People educated in the sciences will face a lower time cost than laypeople.

International vocabulary

To keep on top of global linguistic trends and maximize lexical similarities with as many languages as possible, the Jury has instituted a preference for Wanderwörter and other translingual expressions. This should make it a little easier for Theodian speakers to learn almost any language, and vice versa.

Poetic freedom

Theodian's extremely free word-order lends itself extraordinarily well to poetry; but the language's isochrony makes it difficult to play with stress in the way that one can in, say, English.

Syntax research

Theodian syntax is capable of extreme amounts of scrambling, combines verbs and nouns into the same inflectional part of speech, and bases its case inflections on thematic role rather than literal case. The first of these, especially, heavily pushes the limits of what is typically considered possible in many constituency models of grammar, such as X-Bar Theory, but is permitted in most dependency models of grammar. The extent to which native speakers of Theodian perform scrambling, given the complete freedom of word order offered by the language, would provide a unique test of some phrase-structure theories.

External links

See also

References

  1. Hagler, Jason (2020)