
GridWorld Object Model

Overview

This page documents the object model of the GridWorld environment.
It has four main sections:

  • Environment, which describes the GridWorld itself
  • Cell, which represents a single cell of the grid and is also referred to as a State
  • Action, which represents the actions that can be taken in a given Cell or State
  • Functions, which are helper functions provided to the script editor for the user's convenience

 

Environment Object

The environment object represents the GridWorld and contains methods that allow access to individual cells.

getAllActions()
Purpose: Returns an array of IAction, listing all types of actions in the GridWorld environment.

getAvailableActions(state:IState)

Purpose: Returns an array of IAction, listing the actions available at the given state or cell.

Parameters:
state – Represents the cell or the state for which the list of available actions should be returned.

getNextState(state: IState, action: IAction): { state: IState, reward: number, transitionProba:number }

Purpose: Executes the given action at the given state and returns an object containing the next state, the reward and the transition probability.

Parameters:
state – The current state to which an action will be applied.
action – The action that should be applied to the given state.

Result:
state – The resulting state from the operation.
reward – The reward (positive or negative) acquired by moving to the new state.
transitionProba – The probability of the transition. Available only with the Dynamic Programming method, where the transition probabilities are known.
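As a rough illustration of how the result of getNextState() can be consumed, the sketch below computes a one-step lookahead value, reward + gamma * V(next state), for each available action. The LookState, LookAction and LookEnv types, the gamma discount factor and the tiny stand-in environment are assumptions made so the example runs on its own; they are not part of the documented object model, and the sketch assumes deterministic transitions.

```typescript
// Stand-in types (assumed for this sketch); the script editor supplies the real ones.
interface LookState { getV(): number; }
interface LookAction { getId(): number; }
interface LookEnv {
  getAvailableActions(state: LookState): LookAction[];
  getNextState(
    state: LookState,
    action: LookAction
  ): { state: LookState; reward: number; transitionProba: number };
}

// For each available action, return reward + gamma * V(next state).
// gamma is a hypothetical discount factor; transitions are assumed deterministic.
function lookahead(
  env: LookEnv,
  state: LookState,
  gamma: number
): { actionId: number; value: number }[] {
  return env.getAvailableActions(state).map((action) => {
    const { state: nextState, reward } = env.getNextState(state, action);
    return { actionId: action.getId(), value: reward + gamma * nextState.getV() };
  });
}

// Stand-in environment: action 0 leads to a cell with V = 10 (reward 1),
// action 1 leads to a cell with V = -10 (reward -1).
const startCell: LookState = { getV: () => 0 };
const winCell: LookState = { getV: () => 10 };
const pitCell: LookState = { getV: () => -10 };
const lookEnv: LookEnv = {
  getAvailableActions: () => [{ getId: () => 0 }, { getId: () => 1 }],
  getNextState: (_state, action) =>
    action.getId() === 0
      ? { state: winCell, reward: 1, transitionProba: 1 }
      : { state: pitCell, reward: -1, transitionProba: 1 },
};

const values = lookahead(lookEnv, startCell, 0.5);
console.log(values); // one entry per available action
```

In a real script the environment object would be passed in place of lookEnv.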


getState(row: number, col: number): GridWorldCell
Purpose: Returns the cell/state at the given row and column of the GridWorld.

Parameters:
row – Zero-based index of the row.
col – Zero-based index of the column.

Result:
Returns a GridWorldCell that implements the IState interface.
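A minimal sketch of walking the grid with getState(), combined with the isAccessible() and isTerminal() methods described in the Cell object section, might look like the following. The rows and cols parameters are assumptions (this page does not document how the grid dimensions are obtained), and GridCellLike, GridEnvLike and the 2x2 stand-in grid exist only to make the example self-contained.

```typescript
// Stand-in types (assumed for this sketch).
interface GridCellLike { isAccessible(): boolean; isTerminal(): boolean; }
interface GridEnvLike { getState(row: number, col: number): GridCellLike; }

// Collect every cell that is accessible and not terminal, i.e. the cells whose
// values an algorithm would normally update. rows and cols are hypothetical inputs.
function collectUpdatableCells(env: GridEnvLike, rows: number, cols: number): GridCellLike[] {
  const cells: GridCellLike[] = [];
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      const cell = env.getState(row, col);
      if (cell.isAccessible() && !cell.isTerminal()) {
        cells.push(cell);
      }
    }
  }
  return cells;
}

// Stand-in 2x2 grid: (0,1) is a wall, (1,1) is terminal, the rest are ordinary cells.
const kindAt = (row: number, col: number): string =>
  row === 0 && col === 1 ? "wall" : row === 1 && col === 1 ? "terminal" : "plain";
const demoGrid: GridEnvLike = {
  getState: (row, col) => ({
    isAccessible: () => kindAt(row, col) !== "wall",
    isTerminal: () => kindAt(row, col) === "terminal",
  }),
};

const updatable = collectUpdatableCells(demoGrid, 2, 2);
console.log(updatable.length); // number of ordinary cells in the stand-in grid
```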

Cell object

The Cell object represents an individual cell in the GridWorld. It implements an interface called IState.

isAccessible()
Purpose: Returns true if the cell/state is accessible. Inaccessible states are the ones that the RL agent cannot enter, such as the Wall state.

 

isTerminal()
Purpose: Returns true if the cell/state is terminal. The terminal states are the Win state and the Loss state.

 

getQ(actionId:number)
Purpose: Returns the action value Q for this state and the action with the given id.

Parameters:

actionId – The id of the action to be applied to this state.

setQ(actionId: number, value: number)
Purpose: Sets the action value Q for this state and the action with the given id.

Parameters:

actionId – The id of the action to be applied to this state.
value – The Q value to set for this state and actionId.
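To show how getQ() and setQ() fit together, here is a small sketch of a SARSA-style temporal-difference update, Q(s,a) ← Q(s,a) + alpha * (reward + gamma * Q(s',a') − Q(s,a)). The learning rate alpha, the discount factor gamma, the QCellLike type and the makeQCell() helper are assumptions introduced for illustration; the real cells come from the environment.

```typescript
// Stand-in type (assumed for this sketch).
interface QCellLike {
  getQ(actionId: number): number;
  setQ(actionId: number, value: number): void;
}

// Hypothetical helper that builds an in-memory cell with four Q values set to 0.
function makeQCell(): QCellLike {
  const qs = [0, 0, 0, 0];
  return {
    getQ: (actionId) => qs[actionId],
    setQ: (actionId, value) => { qs[actionId] = value; },
  };
}

// SARSA-style update; alpha (learning rate) and gamma (discount) are assumed parameters.
function sarsaUpdate(
  cell: QCellLike,
  actionId: number,
  reward: number,
  nextCell: QCellLike,
  nextActionId: number,
  alpha: number,
  gamma: number
): void {
  const oldQ = cell.getQ(actionId);
  const target = reward + gamma * nextCell.getQ(nextActionId);
  cell.setQ(actionId, oldQ + alpha * (target - oldQ));
}

const currentCell = makeQCell();
const followingCell = makeQCell();
followingCell.setQ(1, 2); // pretend RIGHT is already valued at 2 in the next cell

// Took UP (id 0), received reward 1, and will take RIGHT (id 1) next.
sarsaUpdate(currentCell, 0, 1, followingCell, 1, 0.5, 0.5);
console.log(currentCell.getQ(0)); // 0 + 0.5 * (1 + 0.5 * 2 - 0)
```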

getQs()
Purpose: Returns an array of numbers holding the Q value of each action. Each array index is an actionId:
0 is the UP action, 1 is the RIGHT action, 2 is the DOWN action, 3 is the LEFT action.
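Because the array index is the actionId, a greedy choice reduces to an argmax over getQs(). The sketch below is one possible way to write it; QsCellLike and the sample Q values are assumptions for illustration, and ties are resolved in favour of the lowest id, which is a choice of this example rather than documented behaviour.

```typescript
// Stand-in type (assumed for this sketch).
interface QsCellLike { getQs(): number[]; }

// Index-to-action mapping as described above: 0 UP, 1 RIGHT, 2 DOWN, 3 LEFT.
const ACTION_LABELS = ["UP", "RIGHT", "DOWN", "LEFT"];

// Return the id of the action with the highest Q value (lowest id wins ties).
function greedyActionId(cell: QsCellLike): number {
  let bestId = 0;
  let bestQ = -Infinity;
  cell.getQs().forEach((q, actionId) => {
    if (q > bestQ) {
      bestQ = q;
      bestId = actionId;
    }
  });
  return bestId;
}

// Sample cell with made-up Q values; RIGHT and LEFT are tied for the maximum.
const sampleCell: QsCellLike = { getQs: () => [0.1, 0.7, -0.2, 0.7] };
const chosenId = greedyActionId(sampleCell);
console.log(chosenId, ACTION_LABELS[chosenId]);
```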

getV()
Purpose: Returns the state value V of this state.

 

setV(val:number)
Purpose: Sets the state value V of this state. This is used when a new value is computed.
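One common pattern with getV() and setV() is an in-place sweep that overwrites each value and tracks the largest change, which can then be compared against a convergence threshold. The following is a sketch under assumptions: ValueCellLike, makeValueCell() and the update rule passed to sweep() are all hypothetical and stand in for whatever rule the chosen algorithm uses.

```typescript
// Stand-in type (assumed for this sketch).
interface ValueCellLike { getV(): number; setV(val: number): void; }

// Hypothetical helper that builds an in-memory cell holding a single V value.
function makeValueCell(initial: number): ValueCellLike {
  let v = initial;
  return {
    getV: () => v,
    setV: (val) => { v = val; },
  };
}

// Apply newValueOf to every cell in place and return the largest absolute change.
function sweep(
  cells: ValueCellLike[],
  newValueOf: (cell: ValueCellLike) => number
): number {
  let maxDelta = 0;
  for (const cell of cells) {
    const oldV = cell.getV();
    const newV = newValueOf(cell);
    cell.setV(newV);
    maxDelta = Math.max(maxDelta, Math.abs(newV - oldV));
  }
  return maxDelta;
}

const sweepCells = [makeValueCell(0), makeValueCell(4)];
// Made-up update rule: move each value halfway towards 10.
const maxChange = sweep(sweepCells, (cell) => cell.getV() + 0.5 * (10 - cell.getV()));
console.log(maxChange, sweepCells.map((cell) => cell.getV()));
```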
 
 

Action object

The Action object represents an action that can be taken at a cell or state. In GridWorld, four actions implement the IAction interface: UpAction, RightAction, DownAction, LeftAction.

getId()
Purpose: Returns the id of the action. UpAction id is 0, RightAction id is 1, DownAction id is 2, LeftAction id is 3.

 

getSymbol()
Purpose: Returns a letter representing the action. UpAction symbol is ‘U’, RightAction symbol is ‘R’, DownAction symbol is ‘D’, LeftAction symbol is ‘L’.

 

getTitle()
Purpose: Returns the label of the action. UpAction title is ‘Up’, RightAction title is ‘Right’, DownAction title is ‘Down’, LeftAction title is ‘Left’.

 

Functions

getActionsFromPolicyFunc(state)
Purpose: Returns the actions to be used at the given state according to the current policy, which can be fully random, Greedy, or Epsilon Greedy.

getPolicyForAction(cell, actionId)
Purpose: Returns the probability of action ‘actionId’ at ‘cell’. It represents the function π(s,a).

Parameters:
cell – Represents the cell or the state for which the policy will be verified.
actionId – Id of the action to check the policy for. 

Example:
let policy = getPolicyForAction(cell, action.getId());
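Since π(s,a) is a probability, it can weight the action values of a state: V(s) = Σ π(s,a) · Q(s,a). The sketch below illustrates that sum. getPolicyForAction is passed in as a parameter here only so the example is self-contained; PolicyCellLike, PolicyFn and the uniform stand-in policy are assumptions, not part of the documented helpers.

```typescript
// Stand-in types (assumed for this sketch).
interface PolicyCellLike { getQs(): number[]; }
type PolicyFn = (cell: PolicyCellLike, actionId: number) => number;

// Expected action value of a cell under the policy: sum over a of pi(s,a) * Q(s,a).
function expectedValue(cell: PolicyCellLike, getPolicyForAction: PolicyFn): number {
  let total = 0;
  cell.getQs().forEach((q, actionId) => {
    total += getPolicyForAction(cell, actionId) * q;
  });
  return total;
}

// Stand-in policy: every one of the four actions is equally likely.
const uniformPolicy: PolicyFn = () => 0.25;
const policyCell: PolicyCellLike = { getQs: () => [1, 2, 3, 6] };

const stateValue = expectedValue(policyCell, uniformPolicy);
console.log(stateValue); // 0.25 * (1 + 2 + 3 + 6)
```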

getStartStateFunc()
Purpose: Returns a starting state for the algorithm to begin with. This is useful in the Monte Carlo method. It takes the ‘Exploring Start’ parameter into account.
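As a rough sketch of how these helpers could drive a Monte Carlo episode, the loop below starts from getStartStateFunc(), picks an action from getActionsFromPolicyFunc(), and steps with getNextState() until a terminal state is reached. Several things here are assumptions: the helpers are bundled into a hypothetical EpisodeHooks object so the example can run standalone, getActionsFromPolicyFunc() is assumed to return an array from which the first action is taken, and maxSteps is an added safety limit.

```typescript
// Stand-in types (assumed for this sketch).
interface EpisodeCell { isTerminal(): boolean; }
interface EpisodeAction { getId(): number; }
interface EpisodeStep { state: EpisodeCell; actionId: number; reward: number; }
interface EpisodeHooks {
  getStartStateFunc(): EpisodeCell;
  // Assumed to return an array of actions; the return type is not documented here.
  getActionsFromPolicyFunc(state: EpisodeCell): EpisodeAction[];
  getNextState(state: EpisodeCell, action: EpisodeAction): { state: EpisodeCell; reward: number };
}

// Roll out one episode, recording (state, action, reward) at each step.
function generateEpisode(hooks: EpisodeHooks, maxSteps: number): EpisodeStep[] {
  const steps: EpisodeStep[] = [];
  let state = hooks.getStartStateFunc();
  while (!state.isTerminal() && steps.length < maxSteps) {
    const actions = hooks.getActionsFromPolicyFunc(state);
    if (actions.length === 0) break; // nothing to do from this state
    const action = actions[0];
    const result = hooks.getNextState(state, action);
    steps.push({ state, actionId: action.getId(), reward: result.reward });
    state = result.state;
  }
  return steps;
}

// Stand-in corridor of three cells; the last one is terminal and entering it pays 1.
const corridor: EpisodeCell[] = [0, 1, 2].map((i) => ({ isTerminal: () => i === 2 }));
const moveRight: EpisodeAction = { getId: () => 1 };
const corridorHooks: EpisodeHooks = {
  getStartStateFunc: () => corridor[0],
  getActionsFromPolicyFunc: () => [moveRight],
  getNextState: (state) => {
    const nextIndex = corridor.indexOf(state) + 1;
    return { state: corridor[nextIndex], reward: nextIndex === 2 ? 1 : 0 };
  },
};

const episodeSteps = generateEpisode(corridorHooks, 100);
console.log(episodeSteps.map((step) => step.reward));
```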

 
