standup.joke
Class Generator

java.lang.Object
  extended by standup.joke.Generator

public class Generator
extends Object

The main class that handles joke generation. It implements the algorithms described in Chapters 4 and 5 of the Technical Specification doc (i.e. Joke Generation and Surface Generation). It provides access to this engine through the generateJoke(String, JokeConstraints) method. This method should typically not be called directly; instead, the Backend.getNewJoke() method should be used, as it accounts for the user profile's constraints and updates the user's generated joke log.

Author:
Ruli Manurung

Field Summary
private  List<Clause> clauses
          A List of all output specification function Clauses accessible to this Generator.
private  Hashtable<String,Boolean> exclusionTableSetupFlag
           
private  String masterType
          The label for the special 'master' JokeType.
private  Hashtable<String,Schema> schemas
          A Hashtable containing all Schemas accessible to this Generator.
private  List<Template> templates
          A List of all Templates accessible to this Generator.
private  JokeTypeSet types
          A JokeTypeSet representing all available JokeTypes.
 
Constructor Summary
Generator()
          Default constructor that uses the various XML joke resource files specified under the /standup/resources/xml directory (i.e. within the JAR).
Generator(URL url)
          Constructor that uses the various joke resources specified by the given URL.
 
Method Summary
 void doSTPFiltering(Schema ss)
          Experimenting with question-driven generation!
 void doSTPFiltering(Schema ss, float minimumSchemaFScoreThreshold, float maximumSchemaFScoreThreshold)
          This method computes the 'minmaxmin' FScore value of clause instantiations given a schema instantiation and a minimum and maximum FScore threshold for schema instantiations.
private  WordStruct elaborateBody(UnifiableList templateBody, String parentTemplateNodeID, int childNumber, String templateLabel, JokeGraph jokeGraph)
          The main template filling algorithm.
private  StructElement elaborateTemplateItem(String parentTemplateNodeID, int childNumber, String templateLabel, JokeGraphNodeKeyword nodeLex, JokeGraph g)
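          Elaborates a lexical element, i.e. adds a joke graph edge from the parent template node to the lexical node, and returns a WordStruct containing the lexeme/wordform/wordstring.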
private  StructElement elaborateTemplateItem(String parentTemplateNodeID, int childNumber, UnifiableCompound b, JokeGraph g)
          Recursively performs template filling on a template item that specifies another template.
private  StructElement elaborateTemplateItem(UnifiableConstant bString, Unifiable bsuccNODE, JokeGraph g)
           
(package private)  float findMaxMinFScoreValue(List<List<Keyword>> instantiations)
           
private  float findMinMaxMinFScoreValueForOSF2(UnifiableCompound osf)
          An alternative implementation that builds one long SQL query using the SQL UNION operator.
 JokeStructure generateJoke(String newJokeID, JokeConstraints constraints)
          Generates a new JokeStructure that satisfies the constraints arguments, with the ID specified by newJokeID.
private  JokeStructure generateJokeStructure(String newJokeID, Schema schema, List<Keyword> lex, List<String> templatesQuestion, List<String> templatesAnswer, double phonSimValue, JokeConstraints constraints)
          This is the core algorithm which instantiates a schema's lexical preconditions with suitable lexemes, binds the output specification function clauses, builds the initial JokeGraph, and passes it on to surface generation.
private  float getClauseFScore(Schema s, List<Keyword> lex)
           
 List<Clause> getCompatibleClauses(Unifiable func, int arity)
          Returns a list of Clauses that are 'compatible' with the given functor and arity, i.e. they share the same functor and arity.
private  List<Template> getCompatibleTemplates(UnifiableCompound templateSpecifier)
          Given a (dereferenced) template specifier, this method returns a list of compatible templates.
private  String getExclusionStatement(String schemaTableName, UnifiableListVar lexemeVars, List<Keyword> instantiations)
          Returns an SQL statement string that updates the exclusion table.
private static String[] getIDs(List<Keyword> instantiations)
          Returns an array of the lexeme IDs (enclosed in single quotes).
 JokeTypeSet getJokeTypes()
          Returns a JokeTypeSet of all JokeTypes accessible to this Generator.
 JokeType getMasterJokeType()
          Returns the 'master' JokeType, i.e. the JokeType that consists of all implemented SchemaTemplatePairs.
 Schema getSchema(String label)
          Returns the Schema specified by the label argument, or null if no Schema exists with the specified label.
 Set<String> getSchemaLabels()
          Returns a Set of all schema labels accessible to this Generator.
private  void initResources(URL url)
          Sets up the various joke resources required for generation, i.e. types, templates, clauses, and schemas, using the XML files indicated by the given url argument.
private  UnifiableCompound obtainTemplateSpecifier(UnifiableCompound osf, JokeGraph jokeGraph, JokeConstraints constraints)
          Given an output specification function, this method finds an appropriate clause, instantiates it with values that satisfy the given JokeConstraints, updates the jokegraph, and returns the resulting template specifier.
private  List<UnifiableCompound> obtainTemplateSpecifiers(List<UnifiableCompound> osfs, JokeGraph jokeGraph, JokeConstraints constraints)
           Performs clause instantiation.
(package private) static JokeTypeSet peekAtJokeTypes()
           
(package private) static String[] peekAtSchemaLabels()
           
private static List<Clause> readClauses(String filename)
          Reads the output specification function clause definitions from the given file and returns the resulting List<Clause>.
private static JokeTypeSet readJokeTypes(String filename)
          Reads the joke type definitions from the given file and returns the resulting JokeTypeSet.
private static Hashtable<String,Schema> readSchemas(String filename)
          Reads the schema definitions from the given file and returns a Hashtable<String,Schema>, where the keys are the schema labels, and the values are the Schemas themselves.
private static List<Template> readTemplates(String filename)
          Reads the template definitions from the given file and returns the resulting List<Template>.
 void resetExclusionTableSetupFlags()
           
 void setupExclusionTable(JokeSet jokelog, String schemaLabel)
           
 void setupExclusionTables(JokeSet jokelog)
           
private  WordStruct surfaceGenerate(List<UnifiableCompound> outputSpecificationFunction, String template, JokeGraph jokeGraph, JokeConstraints constraints)
          Takes an output specification function, instantiated with nodes in the given JokeGraph, and generates a WordStruct using the given template and satisfying the given JokeConstraints.
private  WordStruct templateFill(UnifiableCompound templateSpecifier, String parentTemplateNodeID, int childNumber, JokeGraph jokeGraph)
          Performs template filling, i.e. realisation, of the given template specifier, and updates the jokegraph.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

types

private JokeTypeSet types
A JokeTypeSet representing all available JokeTypes.


masterType

private String masterType
The label for the special 'master' JokeType. The actual JokeType itself can be obtained using getMasterJokeType().


schemas

private Hashtable<String,Schema> schemas
A Hashtable containing all Schemas accessible to this Generator. The keys are the schema labels.


templates

private List<Template> templates
A List of all Templates accessible to this Generator.


clauses

private List<Clause> clauses
A List of all output specification function Clauses accessible to this Generator.


exclusionTableSetupFlag

private Hashtable<String,Boolean> exclusionTableSetupFlag

Constructor Detail

Generator

public Generator()
          throws GeneratorException
Default constructor that uses the various XML joke resource files specified under the /standup/resources/xml directory (i.e. within the JAR).

Throws:
GeneratorException

Generator

public Generator(URL url)
          throws GeneratorException
Constructor that uses the various joke resources specified by the given URL.

Parameters:
url - a URL that specifies the various joke resources
Throws:
GeneratorException

Method Detail

initResources

private void initResources(URL url)
                    throws GeneratorException
Sets up the various joke resources required for generation, i.e. types, templates, clauses, and schemas, using the XML files indicated by the given url argument.

Parameters:
url - a URL that specifies the various joke resources
Throws:
GeneratorException

peekAtJokeTypes

static JokeTypeSet peekAtJokeTypes()

peekAtSchemaLabels

static String[] peekAtSchemaLabels()

readClauses

private static List<Clause> readClauses(String filename)
Reads the output specification function clause definitions from the given file and returns the resulting List<Clause>.

Parameters:
filename - The name of the file containing the clause definitions (currently assumed to be in /standup/xml/resources).
Returns:
a list of Clauses.

readJokeTypes

private static JokeTypeSet readJokeTypes(String filename)
Reads the joke type definitions from the given file and returns the resulting JokeTypeSet.

Parameters:
filename - The name of the file containing the joke type definitions (currently assumed to be in /standup/xml/resources).
Returns:
the set of joke types

readTemplates

private static List<Template> readTemplates(String filename)
Reads the template definitions from the given file and returns the resulting List<Template>.

Parameters:
filename - The name of the file containing the template definitions (currently assumed to be in /standup/xml/resources).
Returns:
a list of Templates.

readSchemas

private static Hashtable<String,Schema> readSchemas(String filename)
                                             throws GeneratorException
Reads the schema definitions from the given file and returns a Hashtable<String,Schema>, where the keys are the schema labels, and the values are the Schemas themselves.

Parameters:
filename - The name of the file containing the schema definitions (currently assumed to be in /standup/xml/resources).
Returns:
a hashtable that maps schema labels to schemas.
Throws:
GeneratorException

getSchema

public Schema getSchema(String label)
Returns the Schema specified by the label argument, or null if no Schema exists with the specified label.

Parameters:
label - label of the desired schema.
Returns:
Schema with the given label, or null if none exists.
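
The lookup contract described above (a label goes in, a Schema or null comes out) can be sketched with a plain java.util.Hashtable keyed by schema label. This is only an illustration, not the Generator's actual code: SchemaStub, SchemaLookupSketch, and the label "exampleSchema" are hypothetical stand-ins introduced here.

```java
import java.util.Hashtable;

class SchemaLookupSketch {
    // Hypothetical stand-in for standup.joke.Schema; only the label matters here.
    static final class SchemaStub {
        final String label;
        SchemaStub(String label) { this.label = label; }
    }

    // Keys are schema labels, values are the schemas themselves.
    private final Hashtable<String, SchemaStub> schemas = new Hashtable<>();

    void add(SchemaStub schema) {
        schemas.put(schema.label, schema);
    }

    // Returns the schema with the given label, or null if none exists.
    SchemaStub getSchema(String label) {
        return schemas.get(label);
    }

    public static void main(String[] args) {
        SchemaLookupSketch sketch = new SchemaLookupSketch();
        sketch.add(new SchemaStub("exampleSchema"));
        System.out.println(sketch.getSchema("exampleSchema") != null); // prints true
        System.out.println(sketch.getSchema("missing") == null);       // prints true
    }
}
```

Note that java.util.Hashtable rejects null keys, so a null label would throw a NullPointerException in this sketch; callers should check for a null result before using the returned schema.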

getSchemaLabels

public Set<String> getSchemaLabels()
Returns a Set of all schema labels accessible to this Generator.

Returns:
a Set of all schema labels

getJokeTypes

public JokeTypeSet getJokeTypes()
Returns a JokeTypeSet of all JokeTypes accessible to this Generator.

Returns:
a JokeTypeSet of all JokeTypes

getMasterJokeType

public JokeType getMasterJokeType()
Returns the 'master' JokeType, i.e. the JokeType that consists of all implemented SchemaTemplatePairs.

Returns:
The 'master' JokeType

generateJoke

public JokeStructure generateJoke(String newJokeID,
                                  JokeConstraints constraints)
Generates a new JokeStructure that satisfies the constraints arguments, with the ID specified by newJokeID.

This is the method that starts the joke generation process. Any number of constraints may be imposed, as long as the SQL server can accept the length of the resulting query string.

Parameters:
newJokeID - the unique ID to be assigned to the resulting JokeStructure
constraints - the various JokeConstraints that the resulting JokeStructure must satisfy
Returns:
a new joke

generateJokeStructure

private JokeStructure generateJokeStructure(String newJokeID,
                                            Schema schema,
                                            List<Keyword> lex,
                                            List<String> templatesQuestion,
                                            List<String> templatesAnswer,
                                            double phonSimValue,
                                            JokeConstraints constraints)
This is the core algorithm which instantiates a schema's lexical preconditions with suitable lexemes, binds the output specification function clauses, builds the initial JokeGraph, and passes it on to surface generation. See Algorithm 5 in Chapter 4 of the Technical Specification doc.

Parameters:
newJokeID - The unique ID assigned to the newly created joke
schema - The Schema to be used
lex - The Schema instantiations
templatesQuestion - A List of valid question template labels to be used
templatesAnswer - A List of valid answer template labels to be used
phonSimValue - The phonetic similarity value of the instantiations in lex
constraints - The JokeConstraints used to generate this joke (still needed for clause instantiation)

surfaceGenerate

private WordStruct surfaceGenerate(List<UnifiableCompound> outputSpecificationFunction,
                                   String template,
                                   JokeGraph jokeGraph,
                                   JokeConstraints constraints)
Takes an output specification function, instantiated with nodes in the given JokeGraph, and generates a WordStruct using the given template and satisfying the given JokeConstraints. Also updates the JokeGraph.

Parameters:
outputSpecificationFunction - the output specification function to be generated
template - the template header(?) to be generated
jokeGraph - the joke graph being constructed
constraints - the constraints to be satisfied
Returns:

obtainTemplateSpecifiers

private List<UnifiableCompound> obtainTemplateSpecifiers(List<UnifiableCompound> osfs,
                                                         JokeGraph jokeGraph,
                                                         JokeConstraints constraints)

Performs clause instantiation. Given a list of output specification functions, a set of joke constraints, and the currently built joke graph, this method finds an appropriate clause and instantiates it, updating the jokegraph in the process. It returns a list of template specifiers where the variables have been bound to corresponding JokeGraphNodeLexs in the graph.

Parameters:
osfs - list of output specification functions
jokeGraph - the intermediate joke graph
constraints - all constraints to be satisfied
Returns:

obtainTemplateSpecifier

private UnifiableCompound obtainTemplateSpecifier(UnifiableCompound osf,
                                                  JokeGraph jokeGraph,
                                                  JokeConstraints constraints)
Given an output specification function, this method finds an appropriate clause, instantiates it with values that satisfy the given JokeConstraints, updates the jokegraph, and returns the resulting template specifier. osf must be dereferenced to the joke graph nodes built so far, so that we don't end up building new nodes for the same (schema instantiated) lexemes/wordforms. Likewise, the returned template specifier also has its variables bound to nodes in the joke graph.

Parameters:
osf - this should be a dereferenced OSF, where the arguments are bound to JokeGraphNodeLexs...
jokeGraph -
constraints -
Returns:
the resulting template specifier, with its variables bound to nodes in the joke graph

getCompatibleClauses

public List<Clause> getCompatibleClauses(Unifiable func,
                                         int arity)
Returns a list of Clauses that are 'compatible' with the given functor and arity, i.e. they share the same functor and arity. It is public only because JokeHandcrafter, which is currently in a different package, needs it.

Parameters:
func -
arity -
Returns:
a list of Clauses that share the given functor and arity
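
The compatibility test described above (same functor, same arity) can be sketched as a simple filter. This is a minimal illustration under assumptions: ClauseStub is a hypothetical stand-in for Clause, and the functor is modelled as a plain String rather than a Unifiable, so it does not reflect how the real classes compare functors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompatibleClausesSketch {
    // Hypothetical stand-in for a Clause: a functor name plus its argument count.
    static final class ClauseStub {
        final String functor;
        final int arity;
        ClauseStub(String functor, int arity) {
            this.functor = functor;
            this.arity = arity;
        }
    }

    // Keeps only the clauses whose functor and arity both match the request.
    static List<ClauseStub> compatible(List<ClauseStub> all, String functor, int arity) {
        List<ClauseStub> result = new ArrayList<>();
        for (ClauseStub clause : all) {
            if (clause.functor.equals(functor) && clause.arity == arity) {
                result.add(clause);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<ClauseStub> all = Arrays.asList(
                new ClauseStub("f", 2),
                new ClauseStub("f", 1),
                new ClauseStub("g", 2),
                new ClauseStub("f", 2));
        System.out.println(compatible(all, "f", 2).size()); // prints 2
    }
}
```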

templateFill

private WordStruct templateFill(UnifiableCompound templateSpecifier,
                                String parentTemplateNodeID,
                                int childNumber,
                                JokeGraph jokeGraph)
Performs template filling, i.e. realisation, of the given template specifier, and updates the jokegraph. To record the structure in the jokegraph, this method needs information about its 'parent' node, i.e. the node representing the WordStruct that 'encloses' the newly generated WordStruct.

Parameters:
parentTemplateNodeID - the ID of the parent node of the resulting struct
childNumber - the newly created WordStruct will be the childNumber-th child
templateSpecifier -
jokeGraph -
Returns:

getCompatibleTemplates

private List<Template> getCompatibleTemplates(UnifiableCompound templateSpecifier)
Given a (dereferenced) template specifier, this method returns a list of compatible templates.

Parameters:
templateSpecifier -
Returns:

elaborateBody

private WordStruct elaborateBody(UnifiableList templateBody,
                                 String parentTemplateNodeID,
                                 int childNumber,
                                 String templateLabel,
                                 JokeGraph jokeGraph)
The main template filling algorithm. It iterates through the (instantiated) template body and expands each element accordingly, updating the joke graph as it goes along.

Parameters:
templateBody - the template body to be processed
parentTemplateNodeID - the ID of the JokeGraphNode representing the parent WordStruct
childNumber - the index denoting this resulting WordStruct's position relative to its siblings
templateLabel - the template's label
jokeGraph - the joke graph being built
Returns:
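
The iteration described above, where each element of the template body is expanded in turn and nested templates are filled recursively, can be sketched roughly as follows. This is a simplified illustration, not the actual algorithm: the Item class and its two kinds (a word, or a nested body) are hypothetical, and the joke graph bookkeeping is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TemplateFillSketch {
    // Hypothetical template item: either a lexical item (word) or a nested template (body).
    static final class Item {
        final String word;      // non-null for lexical items
        final List<Item> body;  // non-null for nested templates

        private Item(String word, List<Item> body) {
            this.word = word;
            this.body = body;
        }

        static Item word(String w) { return new Item(w, null); }
        static Item nested(Item... items) { return new Item(null, Arrays.asList(items)); }
    }

    // Walks the body left to right; nested templates are expanded recursively.
    static List<String> elaborate(List<Item> body) {
        List<String> words = new ArrayList<>();
        for (Item item : body) {
            if (item.word != null) {
                words.add(item.word);
            } else {
                words.addAll(elaborate(item.body));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<Item> body = Arrays.asList(
                Item.word("what"),
                Item.nested(Item.word("do"), Item.nested(Item.word("you"), Item.word("call"))),
                Item.word("this"));
        System.out.println(String.join(" ", elaborate(body))); // prints what do you call this
    }
}
```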

elaborateTemplateItem

private StructElement elaborateTemplateItem(String parentTemplateNodeID,
                                            int childNumber,
                                            UnifiableCompound b,
                                            JokeGraph g)
Recursively performs template filling on a template item that specifies another template.

Parameters:
parentTemplateNodeID -
childNumber -
b -
g -
Returns:

elaborateTemplateItem

private StructElement elaborateTemplateItem(String parentTemplateNodeID,
                                            int childNumber,
                                            String templateLabel,
                                            JokeGraphNodeKeyword nodeLex,
                                            JokeGraph g)
Elaborates a lexical element, i.e. adds a joke graph edge from the parent template node to the lexical node, and returns a WordStruct containing the lexeme/wordform/wordstring.

Parameters:
parentTemplateNodeID -
childNumber -
templateLabel -
nodeLex -
g -
Returns:

elaborateTemplateItem

private StructElement elaborateTemplateItem(UnifiableConstant bString,
                                            Unifiable bsuccNODE,
                                            JokeGraph g)

doSTPFiltering

public void doSTPFiltering(Schema ss)
Experimenting with question-driven generation!

Parameters:
ss -

doSTPFiltering

public void doSTPFiltering(Schema ss,
                           float minimumSchemaFScoreThreshold,
                           float maximumSchemaFScoreThreshold)
This method computes the 'minmaxmin' FScore value of clause instantiations given a schema instantiation and a minimum and maximum FScore threshold for schema instantiations. The two threshold parameters were introduced to speed up computation over very large tables, where values such as fs < 0.1 are of no interest.

Parameters:
ss -
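minimumSchemaFScoreThreshold - the minimum FScore threshold for schema instantiations
maximumSchemaFScoreThreshold - the maximum FScore threshold for schema instantiations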
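
The effect of the two thresholds can be illustrated with a small filter that discards scores outside a window before any further work is done. This is only a sketch under assumptions: the class and method names are hypothetical, the scores are bare floats rather than rows in a database table, and treating both bounds as inclusive is a guess, since the source does not say.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FScoreWindowSketch {
    // Keeps the scores that fall inside [min, max]; inclusive bounds are an assumption.
    static List<Float> withinThresholds(List<Float> scores, float min, float max) {
        List<Float> kept = new ArrayList<>();
        for (float score : scores) {
            if (score >= min && score <= max) {
                kept.add(score);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Float> scores = Arrays.asList(0.05f, 0.1f, 0.5f, 1.0f);
        // The low-scoring 0.05 entry is dropped before any expensive processing.
        System.out.println(withinThresholds(scores, 0.1f, 1.0f)); // prints [0.1, 0.5, 1.0]
    }
}
```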

getClauseFScore

private float getClauseFScore(Schema s,
                              List<Keyword> lex)

findMinMaxMinFScoreValueForOSF2

private float findMinMaxMinFScoreValueForOSF2(UnifiableCompound osf)
An alternative implementation that builds one long SQL query using the SQL UNION operator. It also removes the DISTINCT modifier from the individual Clause queries, as the UNION operator carries this out already.

Parameters:
osf -
Returns:
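
The query-building idea described above can be sketched as joining the individual clause queries with UNION. In standard SQL a plain UNION already removes duplicate rows from the combined result, which is why DISTINCT in each individual SELECT is redundant. The table and column names below are hypothetical; the real queries are not shown in this documentation.

```java
import java.util.Arrays;
import java.util.List;

class UnionQuerySketch {
    // Joins the per-clause SELECT statements into one query.
    // Plain UNION (unlike UNION ALL) deduplicates rows across the whole result.
    static String buildUnionQuery(List<String> clauseQueries) {
        return String.join(" UNION ", clauseQueries);
    }

    public static void main(String[] args) {
        List<String> clauseQueries = Arrays.asList(
                "SELECT fscore FROM clause_a",  // hypothetical table name
                "SELECT fscore FROM clause_b"); // hypothetical table name
        System.out.println(buildUnionQuery(clauseQueries));
        // prints SELECT fscore FROM clause_a UNION SELECT fscore FROM clause_b
    }
}
```

Sending one combined statement trades several round trips for a single longer query, so the server's limit on query length still applies.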

findMaxMinFScoreValue

float findMaxMinFScoreValue(List<List<Keyword>> instantiations)

setupExclusionTables

public void setupExclusionTables(JokeSet jokelog)

resetExclusionTableSetupFlags

public void resetExclusionTableSetupFlags()

setupExclusionTable

public void setupExclusionTable(JokeSet jokelog,
                                String schemaLabel)

getExclusionStatement

private String getExclusionStatement(String schemaTableName,
                                     UnifiableListVar lexemeVars,
                                     List<Keyword> instantiations)
Returns an SQL statement string that updates the exclusion table.

Parameters:
schemaTableName -
lexemeVars -
instantiations -
Returns:

getIDs

private static String[] getIDs(List<Keyword> instantiations)
Returns an array of the lexeme IDs (enclosed in single quotes). WordForm IDs are discarded, as they are not needed to uniquely identify an instantiation.

Parameters:
instantiations -
Returns:
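
The behaviour described above, where lexeme IDs are kept and wrapped in single quotes while wordform IDs are dropped, might look roughly like this. KeywordStub and its two fields are hypothetical stand-ins for the real Keyword class, and the sketch assumes that IDs never contain a single quote, so no escaping is attempted.

```java
import java.util.Arrays;
import java.util.List;

class QuotedIdsSketch {
    // Hypothetical stand-in for Keyword: a lexeme ID plus a wordform ID.
    static final class KeywordStub {
        final String lexemeId;
        final String wordFormId;
        KeywordStub(String lexemeId, String wordFormId) {
            this.lexemeId = lexemeId;
            this.wordFormId = wordFormId;
        }
    }

    // Returns the lexeme IDs wrapped in single quotes; wordform IDs are ignored.
    static String[] getIDs(List<KeywordStub> instantiations) {
        String[] ids = new String[instantiations.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = "'" + instantiations.get(i).lexemeId + "'";
        }
        return ids;
    }

    public static void main(String[] args) {
        List<KeywordStub> instantiations = Arrays.asList(
                new KeywordStub("lex1", "wf7"),
                new KeywordStub("lex2", "wf9"));
        System.out.println(String.join(", ", getIDs(instantiations))); // prints 'lex1', 'lex2'
    }
}
```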