Methods of Corpus Analysis and Topic/Domain Classification

Main Subject Approaching Grammar


Embedded in a general empirical-linguistic research programme, this main subject aims to work out feasible research strategies for developing explanatory grammatical theories. Throughout, grammar is understood mainly as a psychological and social phenomenon.


The single basic hypothesis of our approach builds on the general notion of an Emergent Grammar (Hopper 1987, 1998), according to which all grammatical regularities are essentially emergent and are constantly influenced and transformed by linguistic usage. These regularities have a psychological reality in the form of the linguistic routines of individual speakers, which emerge from each speaker's accumulating language experience and keep developing in an ongoing process. Consequently, grammatical regularities also have a social reality in the form of language conventions, which can be described informally as the intersection of the individual grammars (i.e. linguistic routines) of most speakers in a linguistic community.

It follows directly from this assumed twofold reality that grammatical regularities in turn influence linguistic usage: speakers evidently use their linguistic routines “routinely”, and, insofar as they are interested in successful communication, they prefer to use the conventions of the relevant linguistic community. These general assumptions, insofar as they hold, predict that each grammatical regularity has correlates in a suitable corpus, provided the corpus is sufficiently large and sufficiently stratified. In this main subject we pursue a strictly empirical research strategy that builds mainly on this prediction.

Empirical Research Strategy

The linguistic routines of individual speakers are part of their implicit linguistic knowledge and cannot simply be made explicit. Likewise, the linguistic conventions of a linguistic community, as characterised above, cannot be grasped explicitly. Grammatical regularities as real phenomena can therefore not be investigated directly. Instead, we try to approach them indirectly through their corpus correlates (hence the name of this main subject). We try to arrive at these corpus correlates by simulating the inductive psychological processes that underlie the emergent nature of the regularities, and we plan to proceed in small, inductive steps.

The corpus correlates obtained inductively can admittedly be very abstract structures, but they remain descriptions and are not yet part of a possible explanatory theory of grammar. Only by exploring a large number of corpus correlates (of the same type) can one make more general observations and abductively derive from them new hypotheses (on a theoretical level) about the real grammatical structures. Each of these hypotheses in turn has to be validated empirically (by deducing predictions and attempting to falsify them).

Central Challenges

For each type of suspected corpus correlate, the following tasks in particular have to be addressed:

  • check psychological reality
  • find suitable cognitive conceptualisation
  • explore systematically
  • deduce hypotheses and test them empirically


Current research concentrates on syntagmatic and paradigmatic structures.
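
To make the two kinds of structure concrete, here is a minimal sketch in Python. It is an illustration only and not the method used in this main subject: the toy corpus is invented, and the two measures (pointwise mutual information for adjacent word pairs, cosine similarity over neighbour profiles) are standard stand-ins chosen for brevity. All function and variable names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus (assumption); a real study needs a large, stratified corpus.
corpus = [
    "she drinks strong tea",
    "he drinks strong coffee",
    "she drinks hot tea",
    "he drinks hot coffee",
    "they like strong tea",
]
sentences = [s.split() for s in corpus]

# Frequency counts for single words and for adjacent word pairs.
unigrams = Counter(w for sent in sentences for w in sent)
bigrams = Counter(pair for sent in sentences for pair in zip(sent, sent[1:]))
n_tokens = sum(unigrams.values())
n_bigrams = sum(bigrams.values())


def pmi(w1, w2):
    """Syntagmatic association: pointwise mutual information of the
    adjacent pair (w1, w2); -inf if the pair never occurs."""
    joint = bigrams[(w1, w2)] / n_bigrams
    if joint == 0:
        return float("-inf")
    p1 = unigrams[w1] / n_tokens
    p2 = unigrams[w2] / n_tokens
    return math.log2(joint / (p1 * p2))


# Neighbour profile of each word: counts of its immediate left (L)
# and right (R) neighbours.
contexts = defaultdict(Counter)
for sent in sentences:
    for i, w in enumerate(sent):
        if i > 0:
            contexts[w][("L", sent[i - 1])] += 1
        if i < len(sent) - 1:
            contexts[w][("R", sent[i + 1])] += 1


def context_similarity(w1, w2):
    """Paradigmatic similarity: cosine of the two neighbour profiles.
    Words that fill the same slots score high."""
    a, b = contexts[w1], contexts[w2]
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


print(pmi("she", "drinks"))
print(context_similarity("tea", "coffee"))
print(context_similarity("tea", "drinks"))
```

In this toy data, "tea" and "coffee" share their left neighbours and therefore come out as paradigmatically similar, whereas "tea" and "drinks" share none. Such scores are at most descriptive corpus correlates in the sense used above; they are not yet hypotheses about grammatical structure.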

Publications (Selection)

We regularly report progress on this main subject in talks and articles. The titles of these publications usually begin with “Approaching Grammar”.

Keibel, Holger / Belica, Cyril / Kupietz, Marc / Perkuhn, Rainer (2011): Approaching grammar: Detecting, conceptualizing and generalizing paradigmatic variation. In: Konopka, Marek / Kubczak, Jacqueline / Mair, Christian / Štícha, František / Wassner, Ulrich (eds.): Grammar & Corpora 2009: Selected contributions from the conference Grammar and Corpora, Sept. 22-24, 2009, Mannheim. Tübingen: Narr.

Kupietz, Marc / Keibel, Holger (2009): Gebrauchsbasierte Grammatik: Statistische Regelhaftigkeit. In: Konopka, Marek / Strecker, Bruno (eds.): Deutsche Grammatik – Regeln, Normen, Sprachgebrauch. Berlin/New York: de Gruyter, 33-50.

Keibel, Holger / Kupietz, Marc (2009): Approaching grammar: Towards an empirical linguistic research programme. In: Minegishi, Makoto / Kawaguchi, Yuji (eds.): Working Papers in Corpus-based Linguistics and Language Education, No. 3. Tokyo: Tokyo University of Foreign Studies (TUFS), 61-76.

Keibel, Holger / Kupietz, Marc / Belica, Cyril (2008): Approaching grammar: Inferring operational constituents of language use from large corpora. In: Štícha, František / Fried, Mirjam (eds.): Grammar & Corpora 2007: Selected contributions from the conference Grammar and Corpora, Sept. 25-27, 2007, Liblice, Czech Republic. Prague: ACADEMIA, 235-242.



Dr. Holger Keibel <keibel@ids-...>

