Towards a UNified Algorithm for the Generation of Referring Expressions
TUNA was a research project funded by the UK's Engineering and Physical Sciences Research Council (EPSRC). It was a collaboration between the Department of Computing Science, University of Aberdeen, the Open University, and the University of Tilburg. The project started in October 2003 and ended in February 2007.
Natural Language Generation programs generate text from an underlying Knowledge Base. It can be difficult to find a mapping from the information in the Knowledge Base to the words in a sentence. Difficulties arise, for example, when the Knowledge Base uses `names' (ie, database keys) that a hearer/reader does not understand. This can happen, for instance, if the Knowledge Base contains an artificial name like `#Jones083' because `Jones' alone is not uniquely distinguishing; it also happens if the Knowledge Base deals with entities for which no names at all are in common usage (eg, a specific tree or a chair). In all such cases, the program has to "invent" a description that enables the reader to identify the referent. In the case of Mr. Jones, for example, the program could give his name and address; in the case of a tree, some longer description may be necessary (eg, `the green oak on the corner of ... and ...'). The technical term for this set of problems is Generation of Referring Expressions (GRE). GRE is a key aspect of almost any Natural Language Generation system.
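To make the task concrete, here is a minimal sketch of how a program might "invent" a distinguishing description by adding properties until every other entity has been ruled out. It is a simplified illustration in the spirit of incremental attribute selection, not the algorithm developed in the TUNA project. The toy knowledge base, the attribute names, the preference order and the function name `describe` are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy knowledge base: entity id -> attribute/value pairs.
KB: Dict[str, Dict[str, str]] = {
    "d1": {"type": "dog", "colour": "black", "size": "small"},
    "d2": {"type": "dog", "colour": "white", "size": "small"},
    "c1": {"type": "cat", "colour": "black", "size": "small"},
}

# Assumed fixed order in which attributes are tried.
PREFERRED: List[str] = ["type", "colour", "size"]


def describe(
    target: str,
    kb: Dict[str, Dict[str, str]],
    preferred: List[str],
) -> Optional[List[Tuple[str, str]]]:
    """Return a list of (attribute, value) pairs that is true of the target
    and false of every other entity, or None if no such list is found."""
    distractors = {entity for entity in kb if entity != target}
    description: List[Tuple[str, str]] = []
    for attr in preferred:
        if not distractors:
            break  # the target is already uniquely identified
        value = kb[target].get(attr)
        if value is None:
            continue  # the target has no value for this attribute
        # Entities that do not share the target's value are ruled out.
        ruled_out = {e for e in distractors if kb[e].get(attr) != value}
        if ruled_out:
            description.append((attr, value))
            distractors -= ruled_out
    return description if not distractors else None


print(describe("d1", KB, PREFERRED))
# [('type', 'dog'), ('colour', 'black')]
```

The result for `d1` corresponds to a description such as `the black dog': the type alone leaves the white dog as a distractor, so the colour is added. The sketch only builds conjunctions of atomic properties and returns None when two entities share all their properties.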
Existing GRE algorithms tend to focus on one particular class of referring expressions, for example conjunctions of atomic or relational properties (eg, `the black dog', `the book on the table'). Our research is aimed at designing and implementing a new algorithm for the generation of referring expressions that generates appropriate descriptions in a far greater variety of situations than any of its predecessors. The algorithm will be more complete than its predecessors because it will be able to construct a greater variety of descriptions (involving negations, disjunctions, relations, vagueness, etc.). The descriptions generated should also be more appropriate (ie, more natural in the eyes of a human hearer/reader), because the algorithm will be based on empirical studies involving corpora and controlled experiments. Among other things, these empirical studies will address the question of when descriptions should be logically under- or overspecific; they will also allow us to prune the search space (ie, the space of all descriptions), which would otherwise threaten to make the problem intractable. The project combines (psycho)linguistic, computational and logical challenges and should be of interest to people whose intellectual home is in any of these areas.
- Kees van Deemter (PI, University of Aberdeen)
- Richard Power (Co-Investigator, Open University)
- Emiel Krahmer (Visiting Fellow, University of Tilburg)
- Ielka van der Sluis (Post-Doctoral Research Fellow)
- Albert Gatt (Research student)
- Sebastian Varges (Post-Doctoral Research Fellow, 2003-2005)
Papers that describe some of the technical background to the project (in pdf format):
- K. van Deemter (2002) Generating Referring Expressions: Boolean Extensions of the Incremental Algorithm. Computational Linguistics 28(1): 37-52.
- E. Krahmer, S. van Erk, A. Verleg (2003) Graph-based Generation of Referring Expressions. Computational Linguistics, 29(1): 53-72.
- For our project plans, see the TUNA project proposal (May, 2002)
- The TUNA final project report (May 2007).
- Gatt, A., and Van Deemter, K. Lexical choice and conceptual perspective in the generation of plural referring expressions. Journal of Logic Language and Information (JoLLI).
- Paraboni, I., Van Deemter, K., and Masthoff, J. (2007). Generating Referring Expressions: Making Referents Easy to Identify. Computational Linguistics, 33(2).
- van Deemter, K. (2006). Generating Referring Expressions that involve gradable properties. Computational Linguistics, 32(2).
- van der Sluis, I., and Krahmer, E. (2007). Generating Multimodal References. Discourse Processes [Special issue on Dialogue Modelling: Computational and Empirical Approaches]
- van Deemter, K., and Krahmer, E. (2007). Graphs and Booleans. H. Bunt and R. Muskens (eds.), Computing Meaning III. Dordrecht: Kluwer Academic Publishers.
- Croitoru, M., and van Deemter, K. (2007a). A conceptual-graph approach to the generation of referring expressions. Proceedings of the International Joint Conference on Artificial Intelligence.
- Croitoru, M., and van Deemter, K. (2007b). An inferential approach to the generation of referring expressions. Proceedings of the 15th International Conference on Conceptual Structures, ICCS-07
- Gatt, A. and van Deemter, K. (2007). Incremental generation of plural descriptions: Similarity and partitioning. Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP-07
- van der Sluis, I., Gatt, A., and van Deemter, K. (2007). Evaluating Algorithms for the Generation of Referring Expressions: Going Beyond Toy Domains. Proceedings of the International Conference on Recent Advances in Natural Language Processing, RANLP-07
[Note: This paper was accepted for publication after May 2007]
- Gatt, A. (2006a). Structuring knowledge for reference generation: A clustering algorithm. Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, EACL-06
- Gatt, A. (2006b). Generating collective spatial references. Proceedings of the 28th Annual Conference of the Cognitive Science Society, CogSci-06.
- Gatt, A., and van Deemter, K. (2006). Conceptual coherence in the generation of referring expressions. Proceedings of the 44th Annual Conference of the Association for Computational Linguistics (Main Poster Session), COLING-ACL-06.
[Slightly revised version also in Proceedings of the Workshop on Modelling Coherence for Generation and Dialogue Systems, associated with ESSLLI-2006, Malaga, Spain.]
- Khan, I.J., Ritchie, G., and van Deemter, K. (2006). The Clarity-Brevity Trade-off in Generating Referring Expressions. Proceedings of the 4th International Natural Language Generation Conference, INLG-06
- Paraboni, I., and van Deemter, K. (2006). Referring via document parts. Proceedings of the 7th International Conference on Intelligent Text Processing and Computational Linguistics, CICLING-06.
- Paraboni, I., Masthoff, J., and van Deemter, K. (2006a). Overspecified Reference in Hierarchical Domains: Measuring the Benefits for Readers. Proceedings of the 4th International Natural Language Generation Conference, INLG-06
- van Deemter, K., van der Sluis, I., and Gatt, A. (2006). Building a semantically transparent corpus for the generation of referring expressions. Proceedings of the 4th International Conference on Natural Language Generation, INLG-06 (Special session on Data Sharing and Evaluation).
- van Deemter, K. (2004). Finetuning an NLG system through experiments with human subjects: the case of vague descriptions. Proceedings of the 3rd International Conference on Natural Language Generation, INLG-04.
- van der Sluis, I., and Krahmer, E. (2004a). The Influence of Target Size and Distance on the Production of Speech and Gesture in Multimodal Referring Expressions. Proceedings of the 8th International Conference on Spoken Language Processing (ICSLP), October 4-8, Jeju Island, Korea.
- van der Sluis, I., and Krahmer, E. (2004b). Evaluating Multimodal NLG using Production Experiments. Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon, Portugal.
- Varges, S. (2004). Overgenerating Referring Expressions Involving Relations. Proceedings of the Third International Conference on Natural Language Generation (INLG-04), Brockenhurst, UK.
- Gatt, A., van der Sluis, I., and van Deemter, K. (2007a). Corpus-based evaluation of referring expressions generation. Position paper at the Workshop on Shared Tasks and Comparative Evaluation, Arlington, Va.
- Gatt, A., van der Sluis, I., and van Deemter, K. (2007b). Evaluating algorithms for the generation of referring expressions using a balanced corpus. Proceedings of the 11th European Workshop on Natural Language Generation, ENLG-07.
- Paraboni, I., van Deemter, K., and Masthoff, J. (2006b). Gerando Expressoes de Referencia com a `Quantidade Certa' de Informacao. Proceedings of the 4th Workshop on Information and Human Language Technology, TIL-06, Ribeirao Preto, Brazil.
- Gatt, A., and van Deemter, K. (2005). Semantic similarity and the generation of referring expressions: A first report. Proceedings of the 6th International Workshop on Computational Semantics (IWCS-6), Tilburg.
- van der Sluis, I., and Krahmer, E. (2005). Towards the Generation of Overspecified Multimodal Referring Expressions. Proceedings of the Symposium on Dialogue Modelling and Generation, Amsterdam.
- Varges, S. (2005a). Spatial Descriptions as Referring Expressions in the MapTask Domain. Proceedings of the 10th European Workshop on Natural Language Generation, Aberdeen, Scotland.
- Varges, S. (2005b). Chart Generation Using Production Systems. Proceedings of the 10th European Workshop on Natural Language Generation, Aberdeen, Scotland.
- Varges, S. and van Deemter, K. (2005). Generating referring expressions containing quantifiers. Proceedings of the 6th International Workshop on Computational Semantics (IWCS-6), Tilburg, The Netherlands.