Statistical models for language representation

Rubén Dorado

doi:10.21158/23823399.v1.n1.2013.1208

Ontare (Sep 2015)

Journal volume & issue: Vol. 1, no. 1

Abstract

This paper discusses several models for the computational representation of language. First, some n-gram models based on Markov models are introduced. Second, a family of models known as exponential models is considered. This family allows several features to be incorporated into the model. Third, a more recent line of research, the probabilistic Bayesian approach, is discussed. In models of this kind, language is modeled as a probability distribution. Several distributions and stochastic processes, such as the Dirichlet distribution and the Pitman-Yor process, are used to approximate linguistic phenomena. Finally, the problem of data sparseness in language and its common solution, known as smoothing, is discussed.
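
As a rough illustration of two ideas named in the abstract, an n-gram model and smoothing, the following sketch estimates bigram probabilities with additive (add-one) smoothing. It is not taken from the paper: the abstract does not say which smoothing method is covered, so add-one smoothing is an assumption chosen for simplicity, and the function names `train_bigram` and `bigram_prob`, the parameter `alpha`, and the toy sentence are all hypothetical.

```python
from collections import Counter


def train_bigram(tokens):
    """Count contexts (previous words) and bigrams in a token sequence."""
    context_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))
    vocab = set(tokens)
    return context_counts, bigram_counts, vocab


def bigram_prob(prev, word, context_counts, bigram_counts, vocab, alpha=1.0):
    """Estimate P(word | prev) with additive smoothing.

    alpha=1.0 gives add-one (Laplace) smoothing, an assumption made for
    illustration; every bigram, seen or unseen, receives some probability.
    """
    numerator = bigram_counts[(prev, word)] + alpha
    denominator = context_counts[prev] + alpha * len(vocab)
    return numerator / denominator


# Toy corpus (hypothetical): most possible bigrams never occur in it.
tokens = "the cat sat on the mat".split()
context_counts, bigram_counts, vocab = train_bigram(tokens)

p_seen = bigram_prob("the", "cat", context_counts, bigram_counts, vocab)
p_unseen = bigram_prob("the", "sat", context_counts, bigram_counts, vocab)
total = sum(
    bigram_prob("the", w, context_counts, bigram_counts, vocab) for w in vocab
)
```

Without smoothing, the unseen bigram ("the", "sat") would receive probability zero. With add-one smoothing it receives a small positive probability, and the conditional distribution over the vocabulary still sums to one.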

Published in Ontare

ISSN: 2382-3399 (Print); 2745-2220 (Online)
Publisher: Universidad Ean
Country of publisher: Colombia
LCC subjects: Technology: Engineering (General). Civil engineering (General): Environmental engineering; Technology: Technology (General): Industrial engineering. Management engineering
Website: https://journal.universidadean.edu.co/index.php/Revistao/index
