Calidoscópio (May 2010)

Corpus compilation: Representativeness and the CORPOBRAS

  • Lúcia Pacheco de Oliveira
  • Maria Carmelita Padua Dias

DOI
https://doi.org/10.4013/4872
Journal volume & issue
Vol. 7, no. 3
pp. 192 – 198

Abstract

This paper discusses an important parameter in corpus design and compilation: representativeness. This parameter concerns the need for corpora to include texts representing several uses of the language, so that comprehensive descriptions can be developed. The paper also presents CORPOBRAS, a corpus of Brazilian Portuguese that comprises 27 discourse genres and whose compilation is guided by the representativeness parameter. Finally, the paper lists several corpus-based studies that draw upon CORPOBRAS data.

Key words: CORPOBRAS, corpus linguistics, genre variation, representativeness, oral and written discourse.