Annotated Bibliography on the Generation of Referring Expressions and Related Problems

The bibliography is split into categories for convenience. Works may be relevant to more than one category. Each category contains links to relevant references listed in other categories. Links to papers are given where available. Some papers have an associated description.

Algorithms and meta-algorithms for GRE

Bateman, J.A. (1999). Using aggregation for selecting content when generating referring expressions. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, ACL-99. Extends a lattice-based approach to aggregation to the case of referring expressions generation. Lattices are used to represent common properties between entities (nodes in the lattice). This results in a static representation of domain knowledge which can be processed efficiently to select identifying properties of a target referent and approximate minimal descriptions.

Dale, R. (1989). Cooking up referring expressions. Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics, ACL-89. Describes the GRE algorithm implemented in the EPICURE system, which generates recipes. Input to the algorithm is a structured representation, an instance of the basic ontological category physobj containing information about the properties, quantity and state of the object, and whether it is mass or count. The algorithm proceeds to search for a distinguishing description by selecting properties on the basis of their discriminatory power, calculated in terms of the number of distractors they exclude. Based on a greedy heuristic, the algorithm seeks to satisfy the 'full brevity' interpretation of the Gricean maxim of quantity; the shortest possible distinguishing description is generated. See Oberlander and Dale (1991) for an extension of the algorithm to events.

Dale, R., and Reiter, E. (1995). Computational interpretation of the Gricean maxims in the generation of referring expressions. Cognitive Science, 19(2): 233-263. Describes previous approaches to GRE: the Full Brevity algorithm, based on the greedy heuristic, and Local Brevity. Argues for a weak interpretation of the Gricean maxim of quantity, based on psycholinguistic evidence. Demonstrates the intractability of the 'full brevity' approach to descriptions: finding a brief description is equivalent to minimal set cover, i.e. is NP-Hard. Proposes the Incremental Algorithm which performs hillclimbing along a predetermined preference ordering of descriptors, without backtracking, resulting in descriptions which contain some redundant descriptors. Redundancy is justified on the basis of psycholinguistic evidence. Complexity is linear in the number of descriptors. See Dale and Reiter (1996) for a more detailed outline of the theoretical stance on the Gricean maxims. See Jordan and Walker (2000) for an empirical evaluation of the Incremental Model relative to other theories of reference.

Horacek, H. (1997). An algorithm for generating referential descriptions with flexible interfaces. Proceedings of the 35th Annual meeting of the Association for Computational Linguistics, ACL-97.

Krahmer, E., van Erk, S., and Verleg, A. (2001). A meta-algorithm for the generation of referring expressions. Proceedings of the 8th European Workshop on Natural Language Generation.

Krahmer, E., van Erk, S., and Verleg, A. (2002). Graph-based generation of referring expressions. Computational Linguistics, 28(1). Reinterprets the GRE content selection task as a subgraph isomorphism problem, formalising the domain as a labelled directed graph D, with vertices representing entities and arcs representing properties. Atomic properties and 2-place relations are uniformly represented as disjoint subsets of the set of labels. Avoids the problem of infinite recursion in the generation of relational descriptions reported by Dale and Haddock (1991). Finding a distinguishing description for an intended referent e is a process of constructing a subgraph G of D which corresponds to e and to no other entity. A description of a branch-and-bound algorithm is given to resolve this problem: To identify a referent e, the algorithm starts with the subgraph containing only the vertex e and recursively expands the graph by adding edges from D which are adjacent to the subgraph G. It is shown that the graph-theoretic framework can accommodate other approaches, such as Dale and Reiter's Incremental Algorithm. Contains proposals to incorporate salience weightings and cost-functions to guide subgraph expansion.

Mouret, P., and Rolbert, M. (1998). Dealing with distinguishing descriptions in a guided composition system. Proceedings of the 17th Conference on Computational Linguistics, COLING/ACL-98. Approaches the GRE problem from the point of view of guided composition in user interfaces, where the user is informed at every step what the options are for completing the current utterance. In this paradigm, the GRE problem is not only to identify a description as distinguishing, but also to identify possible completions of an incomplete description. The paper offers a formalisation of Dale's (1989) notion of distinguishing descriptions, and extends it to cover cases of ~inclusion, where one description subsumes another either because of hyperonymy (the dog vs. the animal) or because of given information about the intended referent in prior discourse (the child who robbed -> the robber)

Reiter, E. (1990a). The computational complexity of avoiding false implicatures. Proceedings of the 28th Annual meeting of the Association for Computational Linguistics, ACL-90. Proves the intractability of the Full Brevity algorithm of (Dale, 1989) (equivalent to a Minimal Set Cover problem). Proposes a version of brevity called Local Brevity, incorporating a weaker version of the Gricean maxim of Quantity. Algorithm proceeds by checking that a component of a description cannot be replaced locally by a briefer new component without loss of discriminatory power. Complexity is polynomial. See also: Reiter, 1990b.

Reiter, E., and Dale, R. (1992). A fast algorithm for the generation of referring expressions. Proceedings of the 14th International Conference on Computational Linguistics, COLING-92. An earlier version of the Incremental Algorithm of Dale and Reiter (1995).

Stone, M., and Webber, B. (1998). Textual economy through close coupling of syntax and semantics. Proceedings of the 9th International Workshop on Natural Language Generation. Describes the approach to generating object descriptions in the SPUD system, which combines semantics/pragmatics and their associated syntactic structure in an incremental approach to description building using an ontologically promiscuous knowledge representation and a Lexicalised Tree Adjoining Grammar. At a particular state, the representation of a sentence consists of (a) an instantiated tree; (b) the semantic requirements associated with the tree, and (c) its semantic contributions.

van Deemter, K., and Krahmer, E. (2006). Graphs and Booleans: On the generation of referring expressions. H.Bunt, and R.Muskens (Eds.), Computing Meaning (Vol. III). Dordrecht: Kluwer. Extends the graph-based formulation of Krahmer et al. (2003) to include Boolean operations such as complementation (negation), reference to sets, and set union (disjunction). For the latter, a graph partition algorithm is proposed, which seeks to construct descriptions in Disjunctive Normal Form, rewritten as partitions. Partitions are constructed at increasing levels (starting from level 1), until a distinguishing description is found.

Pragmatics of reference, dialogue and description planning

Appelt, D. (1985a). Some pragmatic issues in the planning of definite and indefinite noun phrases. Proceedings of the 23^rd Annual Meeting of the Association for Computational Linguistics, ACL-85.

Appelt, D. (1985b).Planning English referring expressions. Artificial Intelligence, 26: 1-33. [Reprinted in: B. J. Grosz, K. Sparck Jones, and B. L. Webber (Eds.). (1986). Readings in Natural Language Processing. Los Altos, Ca.: Morgan Kaufmann]. A classic paper describing the NP generation component of the KAMP system. The generation process is modelled as an interactive process between a speaker (modelled the system) and a hearer. Inference about speaker and hearer goals, based on Speech Act theory, guide the generation process. The section on generation of definite, referential NPs contains one of the earliest insights into the computational complexity of generating provably minimal descriptions. Appelt proposes a naive incremental model which is, historically, a precursor to Dale and Reiter (1995).

Appelt, D. (1987a). Reference and pragmatic identification. Proceedings of Theoretical Issues in Natural Language Processing, TINLAP-87.

Appelt, D. (1987b). Towards a plan-based theory of referring actions. In: G. Kempen (Ed.), Natural Language Generation: New Results in Artificial Intelligence, Psychology and Linguistics. Dordrecht: Nijhoff and NATO Scientific Affairs Division.

Dale, R., and Reiter, E. (1996). The role of the Gricean maxims in the generation of referring expressions. Proceedings of the AAAI-96 Spring Symposium on Computational Models of Conversational Implicature. A theoretical outline of the role of the Gricean maxims in GRE, with reference to the Dale and Reiter (1995) Incremental Algorithm. The argument is that, rather than directives on human communication, the Gricean maxims are post hoc descriptions of aspects of rational communicative behaviour. Hence, rather than directly modelling the maxims as constraints, GRE algorithms should satisfy their observations if they are sufficiently goal-directed.

Heeman, P.A., and Hirst, G. (1995). Collaborating on referring expressions. Computational Linguistics, 21(3).

Kronfeld, A. (1986). Donnellan's distinction and a computational model of reference. Proceedings of the >24^th Annual Meeting of the Association for Computational Linguistics, ACL-86.

Kronfeld, A. (1987). Goals of referring acts. Proceedings of Theoretical Issues in Natural Language Processing, TINLAP-87.

Kronfeld, A. (1989). Conversationally relevant descriptions. Proceedings of the 27^th Annual Meeting of the Association for Computational Linguistics, ACL-89 Argues for a distinction between the functional relevance of references, whereby they distinguish their referents, and their conversational relevance, which is related to whether their content is relevant to the current discourse. One of the earliest insights into the problems of relevance and perspective in reference.

Kronfeld, A. (1990). Reference and Computation: An Essay in Applied Philosophy of Language. Cambridge: CUP.

O'Donnell, M., Cheng, H., and Hitzeman, J. (1998). Integrating referring and informing in NP planning. Proceedings of the COLING-ACL workshop on the Computational Treatment of Nominals, ACL-98. Describes the GRE component of the ILEX system. Extends the standard model of GRE as 'generation of identifying descriptions' to NPs which contain attributes that are informative, but are not necessary for identification. The model is based on systemic-functional grammar. During the formation of a referring NP, the informing task influences decisions on which attributes to include in the description, choice of head noun, as well as the form that the final NP is realised as (deictic, definite etc).

Paris, C.L., and McKeown, K.R. (1987). Discourse strategies for describing complex physical objects. In: G. Kempen (Ed.), Natural Language Generation: New Results in Artificial Intelligence, Psychology and Linguistics. Dordrecht: Nijhoff and NATO Scientific Affairs Division.

Reiter, E. (1990b). Generating descriptions that exploit a user's domain knowledge. In: R. Dale, C. Mellish, and M. Zock (Eds.), Current Research in Natural Language Generation. New York & London: Academic Press

Empirical Approaches and Evaluation

Gupta, S., and Stent, A. J. (2005). Automatic evaluation of referring expression generation using corpora. Proceedings of the 1^st Workshop on Using Corpora in NLG, Birmingham, UK. Evaluates a number of algorithms for GRE, including Dale and Reiter's (1995) Incremental algorithm and Siddharthan and Copestake's (2004) GRE algorithm. Algorithms were also augmented with a function for realising modifiers pre- and post-nominally. Evaluation was automatic, and carried out on the COCONUT and MAPTASK corpora. The output of the algorithms was compared to a baseline, which always selected type information and then arbitrarily added modifiers until the description was distinguishing. The baseline performed best on MAPTASK, but the Dale/Reiter and Siddharthan/Copestake algorithms performed better on the COCONUT data, in which domain objects tend to be more complex, requiring attribute selection.

Jordan, P., and Walker, M. (2000). Learning attribute selections for non-pronominal expressions. Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, ACL-00.

Jordan, P., and Walker, M. (2005). Learning content selection rules for generating object descriptions in dialogue. Journal of Artificial Intelligence Research, 24: 157--194. Evaluates three competing models of GRE: Dale and Reiter's (1995) incremental model; Brennan and Clark's (1996) conceptual pacts model, and an alternative Intentional Influences model proposed by Jordan (2000), which proposes that attribute selection in a referring expression is a function of current communicative intentions and task constraints. The models were evaluated by training a machine learner on descriptions from the COCONUT corpus, annotated with the relevant information. Comparison of the machine learner's output with descriptions in the corpus was analysed. When comparing the models, Intentnaional Influences provides the best fit to the data (42.4%), although a combination of all 3 models performs best (60%).

Relations, Spatial/Scene Descriptions, Reference to Events

Arbib, M.A., Conklin, E.J., and Hill, C. (1986). From Schema Theory to Language. Oxford: OUP.

Presents an integrated psycholinguistic, neuro-cognitive and computational approach to language, based on a schema-theoretic model of cooperative computation that seeks to integrate information from different modalities, including vision. Part II contains a description of a scene description generation system built by Conklin, with some discussion of the way salience dynamics were incorporated in the content-selection process for object description.

Conklin, E.J., and McDonald, D.D. (1982). Salience: The key to the selection problem in natural language generation. Proceedings of the 20th Annual Meeting of the Association for Computational Linguistics, ACL-82.

Dale, R., and Haddock, N. (1991). Generating referring expressions containing relations. Proceedings of the 5th Conference of the European Chapter of the ACL, EACL-91. A constraint-based approach to generating distinguishing descriptions containing relations. Algorithm takes three data structures as input: (a) a referent stack with the intended referents; (b) a property set for the intended referent; (c) a constraint network containing properties for the description and the set of domain variables constituting the distractor set. Algorithm proceeds by recursively updating the constraint network until a distinguishing description is found. The search procedure is depth-first. Problems occur when the algorithm keeps trying to recursively identify the referent and the relatum, generating descriptions such as the cup on the floor which is holding the cup which is... Proposed solution: a heuristic that prevents objects from being mentioned more than once in a description.

Horacek, H. (1995). More on generating referring expressions. Proceedings of the 5th European Workshop on Natural Language Generation.

Horacek, H. (1996). A new algorithm for generating referential descriptions. Proceedings of the European Conference on Artificial Intelligence, ECAI-96. Attempts to bridge the proposals of Dale and Reiter (1995) and Dale and Haddock (1991) into a unified algorithm that combines depth-first and breadth-first search, overcoming some of the limitations in the previous proposals by including (a) Categorial expectations, i.e. the contextually motivated expectation of the category of the intended referent (which is used to rule out distractors); (b) proposing a unified treatment for atomic and relational descriptors; (c) imposing a parameter max on the depth of search (avoiding the infinite recursion found by Dale and Haddock) and a predicate salient for particularly salient descriptors that warrant inclusion even if they have no discriminatory power; (d) combining depth-first and breadth-first search via iterative deepening, favouring flat over embedded descriptions. Algorithm is composed of two sub-routines: describe-by-relation and describe-by-attribute, both of which maintain a constraint network. Input is a communicative goal to identify an intended referent. Successfully generates descriptions such as (1) the table on which there are two bottles without going into infinite looping. Does not incorporate an account of negation and disjunction.

Neumann, B., and Novak, H-J. (1983). Event models for recognition and natural language description of events in real-world image sequences. Proceedings of the International Joint Conference on Artificial Intelligence, IJCAI-83. Neumann, B. (1984). Natural language description of time-varying scenes. FBI-HH-B-105/84, Fachbereich Informatik, Universitat Hamburg.

Novak, H-J. (1987).Strategies for generating coherent descriptions of object movements in street scenes. In: Kempen, G. (Ed.), Natural Language Generation: New Results in Artificial Intelligence, Psychology and Linguistics. Dordrecht: Nijhoff & NATO Scientific Affairs Division. Describes the NAOS system, which generates descriptions of visual street scenes containing objects and events, after event recognition occurs through micro-analysis of the visual scene. Generation system makes use of an event hierarchy, temporal distinctions between event types (durative/non-durative/inchoative), and case frames a` la Fillmore (1968). The GRE module REF is based on an open-world database. Two strategies for GRE are proposed: (a) The system generates referring expressions of an object based on its properties, taking into account whether an object of the same type has already been introduced (which triggers the use of 'other'-anaphora). (b) Objects that have the same properties are distinguished by ordinal numerals (the first X, the second X...).

Novak, H-J. (1988). Generating referring phrases in a dynamic environment. In: M. Zock, and G. Sabah (Eds.), Advances in Natural Language Generation, Vol. II. USA: Pinter

Oberlander, J., and Dale, R. (1991). Generating expressions referring to eventualities. Proceedings of the 15th Annual Meeting of the Cognitive Science Society. Extends the GRE algorithm proposed in Dale (1989) to cover reference to events, taking into account the event/process distinction. While the ontology in Dale (1989) distinguishes mass and count, this algorithm extends the ontology using the analogy between mass/count and event/process proposed by Bach (1986).

Walltz, D. (1981).Generating and understanding scene descriptions. In A. Joshi, B. Webber, and I. Sag (Eds.), Elements of Discourse Understanding. Cambridge: CUP.

Salience and context-sensitive GRE

Krahmer, E., and Theune, M. (2002).Efficient generation of descriptions in context. In: K. van Deemter, and R. Kibble (Eds.), Information Sharing. Stanford: CSLI.

An extension of Dale and Reiter's (1995) Incremental Algorithm to take context into account in the generation of reduced anaphoric descriptions and pronouns. The modified Incremental Algorithm uses a salience metric to identify which entities are salient in a given discourse segment, treating only these as the distractors of the intended referent in context. The salience metric is based on a combination of Centring Theory and the Prague theory of discourse focus. The paper also contains an experimental evaluation of the hypotheses that the algorithm seeks to model.

Pattabhiraman, T., and Cercone, N. (1990). Selection: Salience, relevance and the coupling between domain-level tasks and text planning. Proceedings of the 5th International Workshop on Natural Language Generation.

Stevenson, R. (2002). The role of salience in the production of referring expressions: A psycholinguistic perspective. In: K. van Deemter, and R. Kibble (Eds.), Information Sharing. Stanford: CSLI.