Strani Jezici (Jan 2024)
Building MaLi, a Croatian-Italian bilingual child corpus
Abstract
What we know about language and first language acquisition rests on earlier efforts to collect language data, e.g., recordings of spontaneous interactions between children and adults. The CHILDES database (MacWhinney, 2000) gathers child speech in many of the world’s languages, including Croatian (documented in Kovačević’s 2002 corpus). In this paper, we describe the construction of MaLi, a corpus documenting the language production of two bilingual children acquiring Croatian and Italian simultaneously. After a short survey of the methods used to collect child language data, with special regard to diary notes and audio recordings, we discuss the background and details of the data collected in MaLi: we provide an overview of the sociolinguistic context of the observed children’s bilingual first language acquisition and a description of the structure of the corpus. We first turn to the collection, management, and coding of the diary notes. We then examine the collection and processing of the audio recordings and their ongoing transcription. In our concluding remarks, we offer a short assessment of the advantages and limitations of the corpus, along with a survey of possible future uses of this resource.
Keywords