Authentic Dialog & Spoken Grammar

Creating authentic dialog & speech transcriptions is difficult, especially for beginner levels. Simplifying natural speech while still giving learners authentic language input is hard in any language, but there are some additional challenges for material development, specific to Tibetan, that we should be aware of:

  1. Diversity. The Tibetic language family is large & diverse. What we are accustomed to thinking of as “languages” correlates more strongly with the concept of ཕ་སྐད་ or ཡུལ་སྐད་ (mother tongue or local language) than with བོད་སྐད་ (Tibetan language), which is a much broader term. While there is no official Standard Language for teaching how to speak Tibetan as a Second Language, many classrooms and textbooks default to the Lhasa variety. For many years, even experts would claim that the language spoken in the diaspora is Lhasa Central Tibetan; now, we recognize that diaspora speech varieties are unique. Because these are the speech communities our students communicate with, our materials need to reflect the Zhichag, or Diaspora, varieties of Tibetan.
  2. Diglossia. The second challenge we face is the diglossic nature of Tibetan. Unlike the speech varieties, which lack a clear standard (many speech varieties are recognized, and seen as valid), there are standards for how Tibetan is written. The written variety is mainly a historical variety of Lhasa Central Tibetan (7th–11th c.). The spoken language has changed significantly over the last 1,000 years or so; especially recognizable is how pronunciations have changed (gnyis གཉིས་ to nyi ཉི་, as in gnyis-bcu གཉིས་བཅུ་ to nyi-shu ཉི་ཤུ་). However, Standard Literary Tibetan is extremely conservative in retaining its historical spellings & grammar.

These challenges influence the way speakers speak, especially in formal settings, when they are asked to speak slowly or clearly, or when they know they will be recorded and/or have their speech transcribed. Because Standard Literary Tibetan and Lhasa Central Tibetan are prestigious, they are seen as “more correct” than other varieties of Tibetan. Community attitudes about “correct” Tibetan can then lead speakers to self-edit, or self-correct, their language so that it conforms to these views. The effect is speech that is less authentic, or artificially grammaticized.

Specifically, we can observe the following features of common speech being erased, ignored, or de-emphasized in textbooks, careful speech, & the classroom. It is important to say that, despite common conceptions surrounding language change, none of these features is a sign of loss, degradation, or ungrammatical speech (in fact, they are all common and ordinary changes that happen across all languages):

  • The Discourse Marker (-འ་). This may take the form of emphasis (རེད་ད།) or tag question (རེད་བ།); this grammatical marker is exceedingly common in natural speech, affecting around 50% of sentences. We should expect a similar frequency in an authentic dialog or transcript.

  • The Leveled Verb Form. Traditionally, verbs in Tibetan were inflected to give tense information (eg བལྟས ལྟ བལྟ ལྟོས); in modern speech, they have evolved to give this information via helping verb constructions (eg ལྟ་ཡིན། ལྟ་གི་ཡོད། ལྟ་གི་ཡིན། ལྟ་ད།). Linguists call this change ‘morphological leveling’, and it is a perfectly normal grammatical evolution. In Zhichag Tibetan, this change affects practically 100% of verbs.

  • The Common Adposition (གི་). There are five traditional spellings for the adposition known as the “connective” in Tibetan: ཀྱི་ / གྱི་ / གི་ / ཡི་ / -འི་. In many cases, including verbal constructions (though it is debated whether the verbal usage is a true འབྲེལ་སྒྲ་ or not), this adposition is realized as “གི་”, including after words that end in vowel sounds (eg བྱེད་གི་ཡོད། , འགྲོ་གི་འདུག , and འདི་གི་).

  • The Optional Marker (unmarked ལ་ / གི་ / གིས་). Many of Tibetan’s grammatical elements are optional; this is true in everyday speech, but it is also true in literature, especially poetry. In pedagogical settings, however, there is a tendency to over-grammaticalize speech by adding the maximal number of grammatical markers possible. In authentic dialog, they are truly optional in many contexts (and this is not “ungrammatical”).

  • The Modern Loanword. The varieties of Tibetan spoken in India, Nepal, and the West use many loanwords from languages like English, Hindi, and Nepali. This is a natural result of language contact. Historically, Tibetan has absorbed loanwords from many of its neighbors and trading partners, including words like མོག་མོག་ momo (Chinese), ཕྱུ་པ་ chupa (Arabic), ཇ་ ja (Chinese), པི་ཝང་ piwang (Khotanese), དེབ་ deb (Persian), ཨེམ་ཆི་ emchi (Mongolian), among others. That Tibetan continues to borrow today is a reflection of this very same process. Loanwords are especially common for new items and foods in the informal register, or technical terms in more formal settings. They appear at a rate of about 0.7% of words in everyday Zhichag speech.

  • The Filler Word (ཨ་ནི་ / འ་). The connective and filler word “ཨ་ནི་” or “ཨ་ནས་” (ani) is the second-most frequent word in everyday speech after the verb “རེད་” (ray). It is more frequent than the preposition “ལ་” (la) and the demonstrative pronoun “དེ་” (de). In speech, a filler word or hesitation marker is a sound or word that participants in a conversation use to signal that they are pausing to think but are not finished speaking. It is important for learners to learn these words in order to properly indicate their pauses, and to give listeners appropriate communicative cues.

  • The Helping Verb བསྡད་. The verb བསྡད་ (bsdad) is used as a helper verb to indicate present continuous actions. In everyday speech, it is nearly as frequent as the present / habitual and past tenses. Some common verbs it is used with include: བསྡད་, འགྲོ་, and ཟ་ (eg ཨ་ནི་བུ་མོ་དེ་ཁ་ལག་ཟ་བསྡད་ཞག, ནང་ལོགས་ལ་བསྡད་བསྡད་འདུག, ཁོ་ཕྱི་ལོགས་ལ་འགྲོ་བསྡད་འདུག་ག, སྒྲུང་བཤད་བསྡད་ཡོད།).

  • The Post-verb Subject/Object. While Tibetan is a ‘verb-final’ language, with an S-O-V sentence structure, it is also not uncommon to find the subject or object of a sentence clarified, or restated, after the verb. For example: ནང་ལོགས་ལ་བསྡད་བསྡད་འདུག ཨ་མ་ལགས་དང་པ་ལགས། ; བུ་གཅིག་གིས་ར། མེ་ཏོག་ལ་ཆུ་བླུག་བསྡད་ཞག བུ་གཅིག་གིས་ར། ; ཁོ་རང་ཆེན་པོ་ཆགས་རེད་ད་ར། བུ་དེ།

  • The Superfluous Honorific. Honorifics are seen as a prestigious element of formal or refined speech. There is an especial tendency to over-emphasize them in pedagogical contexts, like the language-learning classroom. However, they are relatively rare in authentic, everyday speech.

  • The Possessive Marker (ཁ་གྱི / ག་ཅི་). There is a speech-only marker for possessives pronounced “kaji”. We have documented the following spellings: ག་ཅི་, ཁ་གྱི་, and རྒ་རྗི་. Examples: དེབ་དེ་ངའི་ག་ཅི་རེད། “That’s my book”, ལྷམ་དེ་ཁོ་ག་ཅི་རེད། “Those are his shoes”, ངའི་ཁ་གྱི་གྲོགས་པོ་་་ “My friend…”, འདི་ངའི་ཁ་གྱི་རེད། “This is mine”.
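
For anyone who wants to check figures like the ones above (how often a word such as ཨ་ནི་ or བསྡད་ occurs, or what share of sentences carry a given marker) against their own transcripts, here is a minimal sketch in Python. It is only an illustration, not the method used to produce the numbers in this post. It assumes the transcript has already been segmented into words by hand or by a tokenizer; the function names and the toy data are hypothetical.

```python
from collections import Counter

TSHEG = "\u0f0b"  # syllable delimiter (་)
SHAD = "\u0f0d"   # clause delimiter (།)

# Words to track (hypothetical picks, taken from the examples in this post).
ANI = "ཨ་ནི་"
SDAD = "བསྡད་"


def normalize(token):
    """Strip surrounding tsheg, shad, and spaces so 'ཡོད་' and 'ཡོད།' count as one word."""
    return token.strip(TSHEG + SHAD + " ")


def word_rates(sentences):
    """Map each word to its share of all tokens in the pre-segmented sentences."""
    counts = Counter(normalize(tok) for sent in sentences for tok in sent)
    counts.pop("", None)  # ignore tokens that were only punctuation
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def sentence_share(sentences, markers):
    """Share of sentences that contain at least one of the given tokens."""
    wanted = {normalize(m) for m in markers}
    hits = sum(1 for sent in sentences if wanted & {normalize(t) for t in sent})
    return hits / len(sentences)


# Toy data: two example sentences from this post, segmented by hand (an assumption).
sentences = [
    [ANI, "བུ་མོ་", "དེ་", "ཁ་ལག་", "ཟ་", SDAD, "ཞག"],
    ["སྒྲུང་", "བཤད་", SDAD, "ཡོད།"],
]

rates = word_rates(sentences)
print("share of tokens that are bsdad:", rates[normalize(SDAD)])
print("share of sentences with ani:", sentence_share(sentences, [ANI]))
```

On a real corpus the hard part is the word segmentation itself, which this sketch deliberately leaves out; the counts are only as good as the segmentation and the spelling choices of the transcribers.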

བསྡད་ is not only used to form a present-tense conjugation. It brings a meaning of its own: the action started some time before the utterance, and it will continue in the future.

Most of the time, a question like “what are you doing?” is the perfect situation to elicit an answer that contains བསྡད་. If we are asking the question, it is because we see that our interlocutor is already doing something. So, naturally, the answer will be something to the effect of “I am in the middle of doing such-and-such”, which translates as “such-and-such བྱེད་བསྡད་ཡོད།”. Hence its wide usage. Its literary counterpart is བཞིན་, which carries exactly the same nuance.

On the other hand, I’m not quite sure we can say that this construction is as grammaticalized as the present continuous is in the English language…

I wouldn’t spell the contracted རེད་བ་ as a simple ར་, because the pronunciation is not the same.

Three reasons:

  • The contracted རེད་བ་ is not as short as ར་ “goat”. The full version has the slots of two syllables, which the contracted version covers by lengthening the vowel.
  • Phonologically speaking, it is not a ར་ that is in the mind of a Tibetan speaker when pronouncing it. Even though the phonetic result is རའ་, the words the speaker has in mind are རེད་བ་.
  • Unlike English, Tibetan doesn’t have (yet???) a strategy to mark the contraction of words. There is no equivalent to the “c’mon” that leaves no ambiguity that what is meant is “come on”. So anyone who reads ར་ will, at first, not understand what is written.

So if you HAVE to write it the way it is said, do something a bit crazier, just like “c’mon” for “come on” and write རེའ་ or something that leaves no ambiguity that this word has suffered an alteration and that its value is to be found beyond its surface form.

just a phonetic remark: there is more airflow in the pronunciation than for ཅི་ on its own, hinting that it would be more accurately spelt ཇི་

I think your example perfectly fits a “present continuous” use case. I agree with your description of its use, and it is what I was trying to say (in case that wasn’t clear :wink: )!

These others are spellings chosen by native speaker transcribers, and the examples come straight from the corpus.

Since I have no desire to be prescriptive about spellings, I will simply report things how Tibetans themselves have written them. (Though I personally like the suggestion for རའ་, I don’t think anyone will confuse it for ‘goat’, and it is easily understood in context).

It took me about one minute of thinking to understand there was no goat and we were not talking about the liquid that is left after making cheese (ད་ར་). :grin:

If it took me that long, I’m sure the same will happen to anyone who is not familiar with it…

lol i was confused the first time, too, but only one minute and only the first time, right? subsequent sentences will be clear.

maybe NT @trinley has some thoughts about spelling standardization for this kind of thing…?