Size and Extent
The IDS began building electronic text corpora in the mid-1960s. The corpora have grown from about 28 million text words in 1992 to 55 billion text words in 2023 (equivalent to roughly 140 million book pages, assuming an average of 400 words per page). Many staff members have contributed to creating the largest collection of its kind worldwide. The corpus archive is extended continually, and existing corpus material is revised on an ongoing basis as part of quality management. The results of this work are published regularly through the Corpus Query System project (see Release-Chronicle).
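The page estimate above can be checked with a short calculation. The sketch below is purely illustrative: the function and variable names are hypothetical, and the only inputs are the word counts and the 400-words-per-page assumption stated in the text.

```python
def pages_for_words(words: int, words_per_page: int = 400) -> float:
    """Convert a word count into an approximate number of book pages."""
    return words / words_per_page


# Figures taken from the text above.
corpus_words_1992 = 28_000_000
corpus_words_2023 = 55_000_000_000

pages = pages_for_words(corpus_words_2023)
growth = corpus_words_2023 / corpus_words_1992

print(f"{pages / 1e6:.1f} million pages")  # 137.5 million pages
print(f"growth factor since 1992: about {growth:.0f}x")
```

The exact result is 137.5 million pages, which the text rounds to about 140 million.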
Unfortunately, a small part of the archived corpora is not accessible from outside the IDS for copyright and licensing reasons. In recent years, this share has been reduced to under 5%. In general, the IDS corpora may be used for scientific, non-commercial purposes only. For more details about the options available for using the IDS corpora, see: Information regarding the availability
Dr. Marc Kupietz <kupietz@ids-...>
Cyril Belica <belica@ids-...>
Dr. Harald Lüngen <luengen@ids-...>
Rainer Perkuhn <perkuhn@ids-...>
Former IDS staff members who contributed to building the corpora: