java.lang.Object
clarion.common.BPNet
clarion.common.QBPNet
Field Summary | |
protected int |
CONTROL_OUTPUT_DIM_NUM
the # of NACS-control dims. |
protected double |
discount
the discount and reinforcement. |
protected short[] |
fullControlOutputDVs
the dim-val info of NACS-control outputs. |
protected short[] |
fullControlOutputOffsets
the start offsets of each NACS-control dim in a one-dimensional array. |
protected short[] |
fullOutputDVs
D-V info of bottom (IDN) outputs. |
protected short[] |
fullOutputOffsets
start position of each (bottom) output dim in a one-dimensional array. |
protected int |
netIdx
the net index |
protected int |
netType
network type : EX, GS or WM. |
protected int |
OUTPUT_DIM_NUM
|
protected short[] |
outputLocations
used to locate each type of output in the overall outputs. |
protected short[] |
outputNums
#s of each type of output in its physical full length. |
protected short[][][] |
performedAction
action performed. |
protected double[] |
preState
current state and previous state |
protected double |
reinforcement
the discount and reinforcement. |
protected double[] |
state
current state and previous state |
protected int |
subsysIdx
subsystem: ACS or MCS. |
Fields inherited from class clarion.common.BPNet |
BLAT, BLDT, BLPT, BLRT, debug, DesiredOutput, Errors, Eta, global, Hidden, HiddenDeriv, HiddenErrors, HiddenMomentum, HiddenThresholds, HiddenToOutputMomentum, HiddenToOutputWeights, HINITTHRESHOLD, HINITWEIGHT, HtoOWeights, Input, inputDVs, InputToHiddenMomentum, InputToHiddenWeights, ItoHWeights, LINITTHRESHOLD, LINITWEIGHT, Momentum, nHidden, nInput, NM, nOutput, Output, OutputDeriv, outputDVs, OutputMomentum, outputOffsets, OutputThresholds, PM, RZero, SUCC_C3, SUCC_C4, succRate, sumSqErrors |
Constructor Summary | |
QBPNet(int inputNum,
int nIdx,
Global g)
constructor. |
Method Summary | |
void |
backwardPass(double[][][] qValue)
Updates the neural network using the simplified QBP version. |
void |
backwardPass(double[][][] qValue1,
double[][][] qValue2)
Updates the neural network using the QBP version. |
protected void |
computeErrors(double[][][] qValue)
computes the errors of the output layer of the simplified QBP network. |
protected void |
computeErrors(double[][][] qValue1,
double[][][] qValue2)
computes the errors of the output layer of the QBP network. |
void |
getMaxQVals(double[] fillMe)
Returns the max Q-values of each output dimension. |
double |
getQDiscount()
returns the discount. |
double |
getQVal(int index)
returns the Q-value of a particular output unit. |
void |
getQVals(double[] fillMe)
returns all of the Q-values (the outputs). |
double |
getReinforcement()
returns the reinforcement. |
void |
getState(int[] fillMe)
returns current state. |
void |
reinit()
reinitialize this network. |
void |
setInput()
Sets the state as input to BPNet. |
void |
setPerformedAction(short[][][] action)
Sets the performed action. |
void |
setPreState()
Sets the previous state as input to BPNet. |
void |
setReinforcement(double value)
sets currently received reinforcement. |
void |
setState(double[] curState)
sets current state using an array of activations of all of the dimensional values. |
void |
setState(int[] curState)
sets current state using an array of active dimensional values. |
void |
update(short[][][] action,
double feedback,
double[][][] qValue)
update routine used for simplified Q-Learning. |
void |
update(short[][][] action,
double feedback,
double[][][] qVal1,
double[][][] qVal2)
update routine used for Q-Learning. |
Methods inherited from class clarion.common.BPNet |
backwardPass, calcRT, computeErrors, computeHiddenActivation, computeHiddenErrors, computeOutputActivation, forwardPass, getInput, getnHidden, getnInput, getNM, getnOutput, getOutput, getOutput, getOutput, getOutput, getOutput, getPM, getResponseTime, getSuccRate, getSumSqErrors, modifyHiddenToOutput, modifyInputToHidden, reinitWeights, resetMatches, restoreInitWeights, setDesiredOutput, setDesiredOutput, setDesiredOutput, setDesiredOutput, setInput, setInput, setInput, setLearningRate, setMomentum, updateMatches |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
protected int subsysIdx
protected int netType
protected int netIdx
protected int OUTPUT_DIM_NUM
protected int CONTROL_OUTPUT_DIM_NUM
protected short[] fullOutputDVs
protected short[] fullOutputOffsets
protected short[] fullControlOutputDVs
protected short[] fullControlOutputOffsets
protected short[] outputNums
protected short[] outputLocations
protected short[][][] performedAction
protected double discount
protected double reinforcement
protected double[] state
protected double[] preState
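The fullOutputOffsets and fullControlOutputOffsets fields are described above as the start position of each dimension in a one-dimensional array. The following is a minimal sketch of that kind of layout, assuming each dimension contributes a known number of values stored back to back. The class OffsetSketch, its methods, and the dvCounts parameter are hypothetical illustrations and are not part of clarion.common.

```java
import java.util.Arrays;

/** Illustrative only: a stand-in for the offset layout, not QBPNet code. */
final class OffsetSketch {

    /**
     * Builds the start offset of each dimension in a flat array.
     * ASSUMPTION: dvCounts[d] is the number of values in dimension d.
     */
    static short[] startOffsets(short[] dvCounts) {
        short[] offsets = new short[dvCounts.length];
        int next = 0;
        for (int d = 0; d < dvCounts.length; d++) {
            offsets[d] = (short) next; // dimension d starts where the previous one ended
            next += dvCounts[d];
        }
        return offsets;
    }

    /** Flat index of value v inside dimension d. */
    static int flatIndex(short[] offsets, int d, int v) {
        return offsets[d] + v;
    }

    public static void main(String[] args) {
        short[] counts = {3, 2, 4};
        short[] offsets = startOffsets(counts);
        System.out.println(Arrays.toString(offsets)); // [0, 3, 5]
        System.out.println(flatIndex(offsets, 2, 1)); // 6
    }
}
```

With offsets precomputed this way, locating a value needs one addition instead of a loop over the preceding dimensions.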
Constructor Detail |
public QBPNet(int inputNum, int nIdx, Global g)
inputNum
- the number of input units.
nIdx
- the net index.
Method Detail |
public void reinit()
public void getState(int[] fillMe)
fillMe
- the array to store current state.
public double getQDiscount()
public double getReinforcement()
public double getQVal(int index)
index
- index of the particular output unit.
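getQVal reads a single output unit, whereas the method summary describes getMaxQVals as returning the max Q-value of each output dimension. As a rough sketch of what a per-dimension maximum could look like, assuming each dimension occupies a contiguous run of the flat output array: the class MaxQSketch and the offsets and counts parameters are hypothetical, and the code does not call QBPNet.

```java
/** Illustrative only: per-dimension maximum over a flat Q-value array. */
final class MaxQSketch {

    /**
     * Fills fillMe[d] with the largest Q-value found in dimension d.
     * ASSUMPTION: dimension d occupies q[offsets[d]] through
     * q[offsets[d] + counts[d] - 1], and counts[d] is at least 1.
     */
    static void maxPerDimension(double[] q, int[] offsets, int[] counts, double[] fillMe) {
        for (int d = 0; d < offsets.length; d++) {
            double best = Double.NEGATIVE_INFINITY;
            for (int v = 0; v < counts[d]; v++) {
                best = Math.max(best, q[offsets[d] + v]);
            }
            fillMe[d] = best;
        }
    }

    public static void main(String[] args) {
        double[] q = {0.1, 0.7, 0.3, 0.9, 0.2};
        double[] max = new double[2];
        maxPerDimension(q, new int[] {0, 3}, new int[] {3, 2}, max);
        System.out.println(max[0] + " " + max[1]);
    }
}
```

The caller supplies the result array, mirroring the fillMe convention used by the getter methods on this page.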
public void getQVals(double[] fillMe)
fillMe
- the array to store the outputs.
public void getMaxQVals(double[] fillMe)
fillMe
- the array to store the maximal Q-values.
public void setState(int[] curState)
curState
- the array used to set current state.
public void setState(double[] curState)
curState
- the array used to set current state.
public void setPreState()
public void setInput()
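The two setState overloads accept either an array of active dimensional values or an array of activations of all dimensional values. This page does not say how one form maps to the other. The sketch below shows one plausible mapping, assuming exactly one active value per dimension and activations of 1.0 and 0.0; the class StateEncodingSketch and its parameters are hypothetical and are not taken from clarion.common.

```java
/** Illustrative only: expands active value indices into a flat activation array. */
final class StateEncodingSketch {

    /**
     * ASSUMPTION: active[d] is the index of the single active value in
     * dimension d, and counts[d] is the number of values in that dimension.
     */
    static double[] toActivations(int[] active, int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        double[] activations = new double[total]; // all zeros initially
        int offset = 0;
        for (int d = 0; d < counts.length; d++) {
            activations[offset + active[d]] = 1.0; // mark the active value
            offset += counts[d];
        }
        return activations;
    }

    public static void main(String[] args) {
        double[] a = toActivations(new int[] {1, 0}, new int[] {3, 2});
        System.out.println(java.util.Arrays.toString(a)); // [0.0, 1.0, 0.0, 1.0, 0.0]
    }
}
```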
public void setReinforcement(double value)
value
- the value used to set reinforcement.
public void setPerformedAction(short[][][] action)
action
- the performed action to set.
public void update(short[][][] action, double feedback, double[][][] qValue)
action
- the performed action.
feedback
- the currently received reinforcement.
qValue
- the array of Q-values the action has.
public void update(short[][][] action, double feedback, double[][][] qVal1, double[][][] qVal2)
action
- the performed action.
feedback
- the currently received reinforcement.
qVal1
- the array of maximal Q-values after the action is performed.
qVal2
- the array of Q-values the action has before the action is performed.
public void backwardPass(double[][][] qValue1, double[][][] qValue2)
public void backwardPass(double[][][] qValue)
qValue
- the array of Q-values the action has.
protected void computeErrors(double[][][] qValue1, double[][][] qValue2)
protected void computeErrors(double[][][] qValue)
qValue
- the array of Q-values the action has.
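The two-argument update, backwardPass, and computeErrors methods take the maximal Q-values after the action (qVal1) and the Q-values before it (qVal2), together with the reinforcement and discount fields. This page does not give the error formula. The sketch below uses the standard one-step Q-learning error, reinforcement + discount * maxNextQ - currentQ, as an assumption about what such an output-layer error could look like. The class QErrorSketch, its method names, and the choice to give only the performed action's output unit a non-zero error are hypothetical.

```java
/** Illustrative only: standard one-step Q-learning error, not QBPNet's own code. */
final class QErrorSketch {

    /** Temporal-difference error: r + discount * max Q(next) - Q(current). */
    static double tdError(double reinforcement, double discount,
                          double maxNextQ, double currentQ) {
        return reinforcement + discount * maxNextQ - currentQ;
    }

    /**
     * Output-layer error vector.
     * ASSUMPTION: only the unit of the performed action receives an error;
     * every other output unit gets zero.
     */
    static double[] outputErrors(int nOutput, int performedUnit, double reinforcement,
                                 double discount, double maxNextQ, double currentQ) {
        double[] errors = new double[nOutput];
        errors[performedUnit] = tdError(reinforcement, discount, maxNextQ, currentQ);
        return errors;
    }

    public static void main(String[] args) {
        double[] e = outputErrors(3, 1, 1.0, 0.5, 2.0, 1.5);
        System.out.println(java.util.Arrays.toString(e)); // [0.0, 0.5, 0.0]
    }
}
```

Here a reinforcement of 1.0, a discount of 0.5, a best next Q-value of 2.0, and a current Q-value of 1.5 give an error of 1.0 + 1.0 - 1.5 = 0.5 on the performed unit.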