Corpora of Written Language

Fields of Application

The IDS text corpora are an essential empirical basis not only for linguistic research at the IDS but also for national and international research in German studies, as the steadily growing number of online users shows. They are also increasingly used in interdisciplinary studies, for instance in psychology, neurology, cognitive science, language therapy, communication and media studies, and statistics.

You can find a list of DeReKo-based teaching and research activities here.


DeReKo is designed as a kind of primordial sample of language use. From this sample, users can each create their own virtual corpus that is

  • suited to their specific research question,
  • representative of the language domain under investigation, and
  • balanced with respect to the strata relevant to that purpose.
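
The idea of drawing a balanced virtual corpus from a larger sample can be illustrated with a minimal sketch. It does not use any DeReKo or IDS software interface. The function name build_virtual_corpus, the record fields (id, genre) and the toy data are all hypothetical, and balance is approximated here simply as an equal number of texts per stratum, whereas a real study might balance by token counts or other criteria.

```python
import random
from collections import defaultdict


def build_virtual_corpus(texts, stratum_of, per_stratum, seed=0):
    """Draw up to `per_stratum` texts from each stratum of a larger sample.

    texts       -- iterable of text records (here: plain dicts, an assumption)
    stratum_of  -- function mapping a record to its stratum label
    per_stratum -- maximum number of texts to draw from each stratum
    seed        -- fixed seed so that the selection is reproducible
    """
    rng = random.Random(seed)

    # Group the sample by stratum (e.g. genre, decade, region).
    by_stratum = defaultdict(list)
    for text in texts:
        by_stratum[stratum_of(text)].append(text)

    # Sample without replacement within each stratum.
    corpus = []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        k = min(per_stratum, len(members))
        corpus.extend(rng.sample(members, k))
    return corpus


# Invented toy records standing in for the much larger primordial sample.
sample = (
    [{"id": i, "genre": "newspaper"} for i in range(1, 6)]
    + [{"id": i, "genre": "fiction"} for i in range(6, 8)]
    + [{"id": i, "genre": "science"} for i in range(8, 11)]
)

virtual = build_virtual_corpus(sample, lambda t: t["genre"], per_stratum=2)
```

In this toy case the newspaper texts dominate the sample, but the resulting virtual corpus contains the same number of texts from each genre. Which strata matter, and how large each should be, depends on the research question.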

For more information on the primordial sample design, see:

Kupietz, Marc / Belica, Cyril / Keibel, Holger / Witt, Andreas (2010): The German Reference Corpus DeReKo: A primordial sample for linguistic research. In: Calzolari, Nicoletta et al. (eds.): Proceedings of the 7th conference on International Language Resources and Evaluation (LREC 2010). Valletta, Malta: European Language Resources Association (ELRA), 1848-1854.