Technologies for Advanced Knowledge Extraction

The project TAKE aims to adapt, develop and utilize a range of language and knowledge technologies for the gradual automatic extraction of knowledge from the World Wide Web. Rule-based and statistical methods for language processing will be combined for systematically extending a body of formalized knowledge.

The central technology for this endeavor is semantically driven advanced information extraction, especially relation extraction, i.e., the detection of instances of semantic relations in large volumes of texts. Such relevant relations may belong to several classes such as facts, definitions, events, citations and opinions.
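As a rough illustration of what "detecting instances of a semantic relation" can mean in its simplest rule-based form, the following Python sketch finds candidate definition relations of the shape "X is a Y". It is not the method used in TAKE. The pattern, the function name extract_definitions and the sample sentences are assumptions made up for this example. A realistic system would rely on linguistic analysis and statistical models rather than a single regular expression.

```python
import re

# Toy pattern for one relation class ("definition"): "<Term> is a/an <category>".
# Hypothetical and deliberately naive: the term must be capitalized words,
# the category lowercase words.
DEFINITION = re.compile(
    r"\b([A-Z][\w-]*(?: [A-Z][\w-]*)*) is an? ([a-z][\w-]*(?: [a-z][\w-]*)*)"
)


def extract_definitions(text):
    """Return (term, category) pairs for every match of the toy pattern."""
    return [(m.group(1), m.group(2)) for m in DEFINITION.finditer(text)]


# Made-up sample text, used only to exercise the pattern.
sample = "TAKE is a research project. The Searchbench is a search tool."

for term, category in extract_definitions(sample):
    print(f"definition({term!r}, {category!r})")
```

Each extracted pair is one candidate relation instance. Scaling from such hand-written patterns to large volumes of text and to further relation classes (facts, events, citations, opinions) is where the combination of rule-based and statistical methods comes in.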

In TAKE, information extraction is not viewed as a pragmatic shortcut to getting at least something out of natural language texts but rather as a method for gradually approaching the unsolved problem of text understanding in a systematic and controlled way.

Existing bodies of formalized linguistic knowledge, such as lexicons, morphologies and grammars, will be used alongside tools for statistical processing.

The developed methods, architectures and systems will be tested and demonstrated in two knowledge domains:

Publications of the project are listed below.

TAKE is funded under contract 01IW08003 by the Federal Ministry of Education and Research.

Project Managers: Hans Uszkoreit, Ulrich Schäfer


ACL Anthology Searchbench

Since the ACL-HLT 2011 conference, the ACL Anthology Searchbench has been available online at http://aclasb.dfki.de.
It can also be reached from the ACL Anthology start page.

The Searchbench combines semantic, full text and bibliographic search in more than 28,000 Computational Linguistics papers of the ACL Anthology from the past 50 years, including the complete Journal.

Highlights are

The Searchbench itself requires a recent web browser (Firefox 3.6 or higher, Safari 5, Opera 11, Chrome 12, IE 8/9) with JavaScript enabled.

The Searchbench is not perfect: it is a milestone in the ongoing TAKE research project. OCR and NLP errors have not been corrected manually. Missing author affiliation data for 2010 and 2011 papers will be added later.

However, we hope you find it a useful tool for your own scientific work. Your feedback is welcome (use the "Feedback" button at the bottom left)!

- The TAKE Searchbench team: Ulrich Schäfer, Bernd Kiefer, Christian Spurk, Jörg Steffen and Rui Wang
...with thanks to all others who have contributed to this endeavor (see "About" at the bottom left).

The Searchbench has been developed in the context of the BMBF-funded project TAKE, the DFG Cluster of Excellence on Multimodal Computing and Interaction (M2CI) and the international DELPH-IN collaboration.

A previous version of the ACL Anthology Searchbench is described in the ACL-2011 paper The ACL Anthology Searchbench.

ACL Anthology Searchbench Screenshot
ACL Citation Browser Screenshot


ACL-2012 Main Conference Workshop:
CfP: Rediscovering 50 years of discoveries
Contributed Task CfC + Final Workshop Proceedings (as a single PDF)

TAKE Publications