clarion.system
Class QBPNet

java.lang.Object
  extended by clarion.system.AbstractImplicitModule
      extended by clarion.system.AbstractTrainableImplicitModule
          extended by clarion.system.AbstractNeuralNet
              extended by clarion.system.BPNet
                  extended by clarion.system.AbstractRuntimeTrainableBPNet
                      extended by clarion.system.QBPNet
All Implemented Interfaces:
InterfaceHandlesFeedback, InterfaceHandlesNewInput, InterfaceHasMatchCalculator, InterfaceRuntimeTrainable, InterfaceTracksMatchStatistics, InterfaceTrainable, InterfaceUsesQLearning

public class QBPNet
extends AbstractRuntimeTrainableBPNet
implements InterfaceHandlesNewInput, InterfaceUsesQLearning, InterfaceHasMatchCalculator

This class implements a Q-learning backpropagating neural network within CLARION. It extends the AbstractRuntimeTrainableBPNet class and implements the InterfaceHandlesNewInput, InterfaceUsesQLearning, and InterfaceHasMatchCalculator interfaces.

Usage:

A Q-learning backpropagating neural network is trained with the Q-learning reinforcement learning algorithm and can learn at runtime.

Note that the weight vectors and thresholds for the network can be hardcoded (see hardcodeWeights) if they were recorded from a previous training session, just as with a standard backpropagating neural network.

The input to the neural network is a collection of dimension-value pairs, where each value represents one input node.

If the network is being used as an action network in the ACS and you are using goals or specialized working memory chunks, remember that the input space must also contain all dimension-value pairs within those chunks that differ from the sensory information space.

Once the input space has been defined it cannot be changed. At any given forward pass through the network, the network can accept an arbitrary collection of dimension-values as input, but it will only adjust the activations of the inputs that were specified during initialization.
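The fixed input space described above can be pictured with a minimal sketch. It does not use the CLARION classes: the class name FixedInputSpace and the use of plain strings as dimension-value labels are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a network input layer; not part of CLARION.
public class FixedInputSpace {
    // One activation per input node, keyed by a "dimension=value" label.
    private final Map<String, Double> activations = new LinkedHashMap<>();

    // The input space is fixed here, at construction time.
    public FixedInputSpace(List<String> inputSpace) {
        for (String node : inputSpace) {
            activations.put(node, 0.0);
        }
    }

    // Accepts any collection of labelled activations, but only updates nodes
    // that were declared at construction; unknown entries are ignored.
    public void setInput(Map<String, Double> input) {
        for (Map.Entry<String, Double> e : input.entrySet()) {
            if (activations.containsKey(e.getKey())) {
                activations.put(e.getKey(), e.getValue());
            }
        }
    }

    public boolean contains(String node) {
        return activations.containsKey(node);
    }

    public int size() {
        return activations.size();
    }

    // Returns the activation of a declared node, or 0.0 for an unknown one.
    public double get(String node) {
        return activations.getOrDefault(node, 0.0);
    }
}
```

In this sketch, passing an entry such as "shape=round" to setInput has no effect unless that label was part of the input space given to the constructor.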

The nodes in the output layer are represented as output chunks.

The general procedure when using this class is:

  1. setInput
  2. forwardPass
  3. setChosenOutput
  4. setFeedback
  5. setNewInput
  6. backwardPass
  7. Go to step 1
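The call order of one learning cycle can be sketched as follows. The method names follow the Method Summary on this page (setChosenOutput and setNewInput for the chosen action and the new state). The interface QNetLike, the stub RecordingNet, and all parameter types are hypothetical simplifications, not the CLARION signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class TrainingCycleSketch {
    // Hypothetical stand-in for the six calls; parameter types are
    // simplified and are NOT the CLARION signatures.
    public interface QNetLike {
        void setInput(double[] state);
        void forwardPass();
        void setChosenOutput(int actionIndex);
        void setFeedback(double reward);
        void setNewInput(double[] newState);
        void backwardPass();
    }

    // Stub that records the order of calls instead of doing any learning.
    public static class RecordingNet implements QNetLike {
        public final List<String> calls = new ArrayList<>();
        public void setInput(double[] state)       { calls.add("setInput"); }
        public void forwardPass()                  { calls.add("forwardPass"); }
        public void setChosenOutput(int action)    { calls.add("setChosenOutput"); }
        public void setFeedback(double reward)     { calls.add("setFeedback"); }
        public void setNewInput(double[] newState) { calls.add("setNewInput"); }
        public void backwardPass()                 { calls.add("backwardPass"); }
    }

    // One pass through steps 1 to 6; a real loop would repeat this per time step.
    public static void runOneCycle(QNetLike net, double[] state, int action,
                                   double reward, double[] newState) {
        net.setInput(state);          // 1. present the current state
        net.forwardPass();            // 2. compute the output activations
        net.setChosenOutput(action);  // 3. record which action was taken
        net.setFeedback(reward);      // 4. record the immediate feedback
        net.setNewInput(newState);    // 5. present the resulting state
        net.backwardPass();           // 6. update the network
    }
}
```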

Note: The current implementation of CLARION does not allow for multiple action dimensions on the bottom level as stipulated in the CLARION tutorial. However, actions that contain multiple action dimensions are still possible by simply specifying action chunks on the top level that contain multiple activated action dimensions. This is the case because the output of the bottom level only specifies actions and not action dimension-values.

This class contains both global (static) and local constants. The default is to use the local constants. If you want to change any of the global constants, you need to do so before any instances of this class are initialized.
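One plausible reading of why global constants must be changed before instances are created is that each instance copies the global value into its local constant at construction. The sketch below illustrates that pattern only. The class DiscountDefaults, the initial value 0.9, and the copy-at-construction behavior are assumptions, not confirmed details of QBPNet.

```java
// Hypothetical illustration of global (static) versus local constants.
public class DiscountDefaults {
    // Global default shared by all instances created afterwards.
    // The value 0.9 is an arbitrary placeholder.
    public static double GLOBAL_DISCOUNT = 0.9;

    // Local value, copied from the global default at construction (assumed).
    public double DISCOUNT;

    public DiscountDefaults() {
        this.DISCOUNT = GLOBAL_DISCOUNT;
    }
}
```

Under this assumption, changing GLOBAL_DISCOUNT affects only instances constructed after the change; existing instances keep the local value they copied.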

Version:
6.0.4
Author:
Nick Wilson

Field Summary
 double DISCOUNT
          The discount factor for Q-learning (local to this instance).
 double GLOBAL_DISCOUNT
          The discount factor for Q-learning (global setting).
private  AbstractMatchCalculator LocalMatchCalculator
          The match calculator used for updating match statistics within this class.
protected  DimensionValueCollection NewInput
          The new input observed after the chosen output is performed (applies when the network is an action network whose output leads to a new state), represented as a collection.
 
Fields inherited from class clarion.system.AbstractRuntimeTrainableBPNet
Feedback, GLOBAL_POSITIVE_MATCH_THRESHOLD, NM, PM, POSITIVE_MATCH_THRESHOLD
 
Fields inherited from class clarion.system.BPNet
GLOBAL_LEARNING_RATE, GLOBAL_MOMENTUM, GLOBAL_RZERO, LEARNING_RATE, MOMENTUM, RZERO
 
Fields inherited from class clarion.system.AbstractNeuralNet
GLOBAL_LOWER_INIT_THRESHOLD, GLOBAL_LOWER_INIT_WEIGHT, GLOBAL_UPPER_INIT_THRESHOLD, GLOBAL_UPPER_INIT_WEIGHT, Hidden, HiddenThresholds, HiddenToOutputWeights, InputToHiddenWeights, LOWER_INIT_THRESHOLD, LOWER_INIT_WEIGHT, OutputThresholds, UPPER_INIT_THRESHOLD, UPPER_INIT_WEIGHT
 
Fields inherited from class clarion.system.AbstractTrainableImplicitModule
DesiredOutput
 
Fields inherited from class clarion.system.AbstractImplicitModule
ACTUATION_TIME, ChosenOutput, DECISION_TIME, GLOBAL_ACTUATION_TIME, GLOBAL_DECISION_TIME, GLOBAL_PERCEPTION_TIME, InputAsCollection, Output, PERCEPTION_TIME
 
Constructor Summary
QBPNet(java.util.Collection<Dimension> InputSpace, int NumHidden, AbstractOutputChunkCollection<? extends AbstractOutputChunk> Outputs)
          Initializes a backpropagating neural network that uses Q-Learning for training the network.
 
Method Summary
 void backwardPass()
          Updates the neural network using Q-Learning.
 boolean checkMatchCriterion()
          Checks to see if the positive match criterion is satisfied given the state before action a is performed, the state after action a is performed, any immediate feedback received, and the index of the action performed.
 double getDiscount()
          Gets the discount factor that is used as part of the Q-learning algorithm (see Sun Tutorial, 2003).
 AbstractMatchCalculator getMatchCalculator()
          Gets the match calculator used by the class.
 double getMaxQ()
          Gets the value of max over b of Q(y, b), where y is the new state.
 java.util.Collection<Dimension> getNewInput()
          Returns the new input in the form of a dimension-value collection.
 void setMatchCalculator(AbstractMatchCalculator MatchCalculator)
          Sets the match calculator.
 void setNewInput(java.util.Collection<Dimension> input)
          Sets the activations for the new input to the specified input.
 
Methods inherited from class clarion.system.AbstractRuntimeTrainableBPNet
getFeedback, getNM, getPM, incrementNM, incrementPM, resetMatchStatistics, setFeedback, setNM, setPM, updateMatchStatistics
 
Methods inherited from class clarion.system.BPNet
computeHiddenActivation, computeOutputActivation, modifyHiddenToOutput, modifyInputToHidden
 
Methods inherited from class clarion.system.AbstractNeuralNet
forwardPass, getHiddenThresholds, getHtoOWeightMatrix, getItoHWeightMatrix, getNumHidden, getOutputThresholds, hardcodeWeights
 
Methods inherited from class clarion.system.AbstractTrainableImplicitModule
getSumSqErrors, setDesiredOutput, setDesiredOutput
 
Methods inherited from class clarion.system.AbstractImplicitModule
getChosenOutput, getInput, getNumInput, getNumOutput, getOutput, getOutput, getResponseTime, setChosenOutput, setInput, setInput, setInput
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface clarion.system.InterfaceUsesQLearning
getChosenOutput
 
Methods inherited from interface clarion.system.InterfaceTrainable
getSumSqErrors, setDesiredOutput, setDesiredOutput
 
Methods inherited from interface clarion.system.InterfaceHandlesFeedback
getFeedback, setFeedback
 

Field Detail

GLOBAL_DISCOUNT

public double GLOBAL_DISCOUNT
The discount factor for Q-learning (global setting).


DISCOUNT

public double DISCOUNT
The discount factor for Q-learning (local to this instance).


LocalMatchCalculator

private AbstractMatchCalculator LocalMatchCalculator
The match calculator used for updating match statistics within this class.


NewInput

protected DimensionValueCollection NewInput
The new input observed after the chosen output is performed (applies when the network is an action network whose output leads to a new state), represented as a collection.

Constructor Detail

QBPNet

public QBPNet(java.util.Collection<Dimension> InputSpace,
              int NumHidden,
              AbstractOutputChunkCollection<? extends AbstractOutputChunk> Outputs)
Initializes a backpropagating neural network that uses Q-Learning for training the network.

The input to the neural network is a collection of dimension-value pairs, where each value represents one input node.

If the network is being used as an action network in the ACS and you are using goals or specialized working memory chunks, remember that the input space must also contain all dimension-value pairs within those chunks that differ from the sensory information space.

Once the input space has been defined it cannot be changed. At any given forward pass through the network, the network can accept an arbitrary collection of dimension-values as input, but it will only adjust the activations of the inputs that were specified during initialization.

The nodes in the output layer are represented as output chunks.

Parameters:
InputSpace - A collection of dimension-value pairs to set as the input nodes.
NumHidden - The number of hidden nodes.
Outputs - The chunks to associate with the output layer.

Method Detail

backwardPass

public void backwardPass()
Updates the neural network using Q-learning. This method should not be called before the setFeedback, setChosenOutput, and setNewInput methods have been called.

Specified by:
backwardPass in interface InterfaceTrainable
Overrides:
backwardPass in class BPNet
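As background, the textbook Q-learning update uses the target r + discount * max over b of Q(y, b), where r is the feedback and y is the new state. The sketch below computes that target and the resulting error for the chosen action. It is a generic illustration and is not taken from the QBPNet source; the class name QTargetSketch and the array representation of Q values are assumptions.

```java
// Generic Q-learning target computation; not the CLARION implementation.
public class QTargetSketch {
    // Largest Q value among the actions available in the new state.
    public static double maxQ(double[] qNewState) {
        if (qNewState.length == 0) {
            throw new IllegalArgumentException("at least one Q value is required");
        }
        double max = qNewState[0];
        for (double q : qNewState) {
            max = Math.max(max, q);
        }
        return max;
    }

    // Target value: r + discount * max_b Q(y, b).
    public static double target(double reward, double discount, double[] qNewState) {
        return reward + discount * maxQ(qNewState);
    }

    // Error for the chosen action: target - Q(x, a).
    public static double tdError(double reward, double discount,
                                 double[] qNewState, double qChosen) {
        return target(reward, discount, qNewState) - qChosen;
    }
}
```

In a network setting, an error of this kind would be backpropagated only through the output node of the chosen action, which is why the chosen output, the feedback, and the new input all need to be set before the update.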

getNewInput

public java.util.Collection<Dimension> getNewInput()
Returns the new input in the form of a dimension-value collection. The collection returned is unmodifiable and is meant for reporting the internal state only.

Specified by:
getNewInput in interface InterfaceHandlesNewInput
Returns:
An unmodifiable collection of dimension-value pairs representing the new input nodes of the network.
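A common way to expose internal state as a read-only collection in Java is Collections.unmodifiableCollection. Whether QBPNet uses this exact mechanism is an assumption; the sketch below, with the hypothetical class ReadOnlyViewSketch and string elements, only shows what callers can expect from an unmodifiable result.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical holder that exposes its contents as a read-only view.
public class ReadOnlyViewSketch {
    private final List<String> newInput = new ArrayList<>();

    // Internal mutation stays possible through the owning class.
    public void add(String node) {
        newInput.add(node);
    }

    // Callers can read the view, but add/remove on it throw
    // UnsupportedOperationException.
    public Collection<String> getNewInput() {
        return Collections.unmodifiableCollection(newInput);
    }
}
```

To change the new input, callers would go through a setter such as setNewInput rather than modifying the returned collection.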

setNewInput

public void setNewInput(java.util.Collection<Dimension> input)
Sets the activations for the new input to the specified input. If the new input is used to update the weights then this method should be called before the backwardPass method is called.

Specified by:
setNewInput in interface InterfaceHandlesNewInput
Parameters:
input - The collection from which to set the activations of the new input nodes.

checkMatchCriterion

public boolean checkMatchCriterion()
Checks to see if the positive match criterion is satisfied given the state before action a is performed, the state after action a is performed, any immediate feedback received, and the index of the action performed. This check is usually performed before the backwardPass method is called but after the setFeedback, setChosenOutput, and setNewInput methods have been called.

Specified by:
checkMatchCriterion in interface InterfaceHandlesFeedback
Returns:
True if the positive match criterion has been satisfied, otherwise false.
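The boolean result of a check like this is typically used to maintain match statistics. The sketch below assumes that PM and NM are running counts of positive and negative matches, which is suggested by the inherited incrementPM and incrementNM methods but is not stated on this page. The class MatchStatsSketch is hypothetical and does not reproduce the CLARION implementation.

```java
// Hypothetical positive/negative match counters; not the CLARION classes.
public class MatchStatsSketch {
    private int pm; // count of positive matches (assumed meaning of PM)
    private int nm; // count of negative matches (assumed meaning of NM)

    // Records one outcome of a match-criterion check.
    public void update(boolean criterionSatisfied) {
        if (criterionSatisfied) {
            pm++;
        } else {
            nm++;
        }
    }

    public int getPM() {
        return pm;
    }

    public int getNM() {
        return nm;
    }
}
```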

getMaxQ

public double getMaxQ()
Gets the value of max over b of Q(y, b), where y is the new state.

Specified by:
getMaxQ in interface InterfaceUsesQLearning
Returns:
The maximum Q value at the new state.

getDiscount

public double getDiscount()
Description copied from interface: InterfaceUsesQLearning
Gets the discount factor that is used as part of the Q-learning algorithm (see Sun Tutorial, 2003).

Specified by:
getDiscount in interface InterfaceUsesQLearning
Returns:
The discount factor.

getMatchCalculator

public AbstractMatchCalculator getMatchCalculator()
Description copied from interface: InterfaceHasMatchCalculator
Gets the match calculator used by the class. The match calculator returned by this method is usually passed directly into the class's updateMatchStatistics method.

Specified by:
getMatchCalculator in interface InterfaceHasMatchCalculator
Returns:
The match calculator.

setMatchCalculator

public void setMatchCalculator(AbstractMatchCalculator MatchCalculator)
Description copied from interface: InterfaceHasMatchCalculator
Sets the match calculator.

Specified by:
setMatchCalculator in interface InterfaceHasMatchCalculator
Parameters:
MatchCalculator - The match calculator to assign to the class.