ComPara: A corpus linguistics in English of computation in architecture dataset

Anca-Simona Horvath

Data in Brief (Jun 2022)

Affiliations

Anca-Simona Horvath: Research Laboratory for Art and Technology, Department of Communication and Psychology, Aalborg University, Rendsburggade 14, 6212, 9000 Aalborg, Aalborg, Denmark

Journal volume & issue: Vol. 42
p. 108169

Abstract

ComPara is a corpus linguistics dataset in English focused on computational architecture, or architecture where technology functions as a driver for its conceptualization, design, and materialization. Computational architecture is sometimes also referred to as digital, parametric, algorithmic, or generative architecture and, as has been shown, each of these terms has different flavours [9]. Other corpus linguistics datasets for architecture have been built, containing texts written over a relatively limited time span and focusing on the language used in the profession in general [1,2]. The text that makes up ComPara was written between 2005 and 2019 and focuses on computational architecture. The corpus is built from two sources: the journal Architectural Design [3] and the eVolo Skyscraper competition [4]. The former is one of the journals that has focused most on the theoretical discourse surrounding computation in architecture [5], while the latter is one of the most prestigious competitions focusing on ‘technological advancements in architecture’ [4]. From Architectural Design, the corpus includes the titles of the journal's issues, the titles of all articles, and the keywords associated with the Introduction article on the journal's web page for each issue from 2005 to 2019. From the eVolo Skyscraper competition, the titles of all winning projects and honorable mentions, as well as all abstracts describing the projects between 2006 and 2019, were collected. This amounts to around 100,000 words. The dataset was built to help gain a better understanding of the digitalization of architecture over a 15-year time span [8]. Further quantitative, qualitative, or mixed-method analysis can be carried out using the ComPara corpus by following specific topics or trends over time or by comparing the corpus to other sources.

Published in Data in Brief

ISSN: 2352-3409 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Science (General)
Website: http://www.journals.elsevier.com/data-in-brief/
