I Introduction
Introduce the concept of language corpora: collections of authentic, “real-world” spoken or written texts that aim to represent a given linguistic variety.
Ask students which language corpora they are familiar with, whether they have used them and, if so, in which situations and for what purposes.
II Diving into Latgalian through the Corpus
(Recommended: work in pairs or in small groups of 3-4 people)
- Open the Latgalian spoken language corpus MuLaR (https://mularkorpuss.rta.lv/#!/).
- Get to know the sections (Language choice, Corpus, Statistics, About). Share with each other what each member of the group has found out about the Latgalian speech corpus.
- Open the section Corpus, subsection Search, choose 1-2 words, e.g. es (I/me), saime (family), volūda (language). Write the word in the Search window.
- Listen to all utterances that include the chosen word, reading the transcript at the same time. Repeat this for several different examples.
- Compare the language you hear with the written transcripts. Answer the following questions:
  - Which is easier for you to understand: spoken or written language? Why?
  - What differences do you hear and see between the oral and the written forms?
  - Describe some of the characteristics of the oral language (tone, pitch, pauses).
  - Is it easy to transcribe the speakers' emotional reactions (sadness, laughter and others)? Give examples.
  - What do the transcript of the oral form and the sentences created in the written form at the beginning have in common, and how do they differ?
  - Which transcripts would you improve? After listening to the recordings, are there examples that you would transcribe differently?
- Choose the option Metadata. Compare the language using criteria such as:
  - gender,
  - year of birth,
  - place.
Which speakers could you understand best? Why do you think that is?
III Digging Deeper using the Latgalian Corpora
- Choose one of the Latgalian corpora: the Latgalian Speech Corpus (https://mularkorpuss.rta.lv/#!/) or the Corpus of Written Texts (https://korpuss.lv/id/MuLa2022).
You can also decide to explore and use both of them.
- Choose one aspect of the Latgalian language that you wish to explore in depth. Possible options are:
  - phonetics (e.g., how particular sounds are pronounced; the most difficult sound in Latgalian; the biggest sound differences between Standard Latvian (SL) and Latgalian; sounds which are pronounced or transcribed differently)
  - lexical items and semantics (e.g., which Latgalian words differ from SL not only phonetically; which words are used most frequently in the corpus; which of the words are used in your family; which words were new to you)
  - grammar (e.g., which morphemes, such as prefixes and suffixes, are similar to SL and which are different; which words differ in gender or number between Latgalian and SL; differences in case)
- Form a small group and agree on 1-2 research questions to explore together.
IV Presentation of the Project
Prepare a short presentation: what has your group explored and learned about Latgalian?
Use examples from the Latgalian corpora as evidence.
For teachers: more ideas on how to use language corpora to teach Latgalian and its contrasts with SL can be found in the following video lectures: https://ltg.korpuss.rta.lv/en/video_lectures/
Area of Interest: New technologies and social media
Skills: Listening, Speaking, Reading, and Writing
Competences:
Age Bracket: 16 – 18
Time Commitment: Over 60 minutes
Affordability:
Materials:
computers, sound devices
Expert recommendations: