Packages

package rl

Type Members

  1. trait Decider[A, R, M[_]] extends AnyRef

    Trait for things that can choose some monadic result.

  2. trait Learner[A, R, This <: Learner[A, R, This]] extends AnyRef
  3. trait Policy[A, R, This <: Policy[A, R, This]] extends Learner[A, R, This] with Decider[A, R, Generator]

    This is how agents actually choose what comes next. This is a stochastic policy: we need to be able to match it up with a state that has the same monadic return type, but for now that type is hardcoded to Generator. (A sketch of how these traits might fit together follows this list.)

    Type parameters: A - the action type, R - the reward type, This - the concrete policy type.

  4. trait State[A, Reward] extends AnyRef

    A world should probably have a generator of states and actions, which you can then use to get to the next thing. The state here will be useful in the Markov model; for the bandit we only have a single state, so it is not that useful.

  5. trait StochasticDecider[A, R] extends Decider[A, R, Generator]
  6. final case class Time(value: Long) extends AnyVal with Product with Serializable
  7. trait ValueFunction extends AnyRef

    TODO: figure out how to make something that tracks action and state values here (a sketch follows this list).
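
    As a rough orientation, since the listing above only shows signatures, here is a minimal sketch of how Decider, Learner, Policy, and State might fit together. The method names (decide, learn, act) and the shape of Generator are assumptions made for illustration, not the actual rl API.

      // Minimal sketch, NOT the actual rl API: the method names and the
      // shape of the Generator monad are assumptions based on the
      // descriptions above.
      trait Generator[+T] { def sample: T } // stand-in for a random-generation monad

      trait Decider[A, R, M[_]] {
        def decide(state: State[A, R]): M[A] // choose an action inside some monad M
      }

      trait Learner[A, R, This <: Learner[A, R, This]] {
        def learn(action: A, reward: R): This // fold in an observation, return the updated learner
      }

      trait Policy[A, R, This <: Policy[A, R, This]]
          extends Learner[A, R, This]
          with Decider[A, R, Generator] // stochastic: decisions come back inside Generator

      trait State[A, Reward] {
        def act(action: A): Generator[(Reward, State[A, Reward])] // step the world: reward plus next state
      }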
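
    For the ValueFunction TODO above, one common shape is an incremental sample average per action: Q(a) <- Q(a) + (r - Q(a)) / n. The sketch below is only a guess at that direction; none of these names come from the actual trait.

      // Hypothetical action-value tracker; not the repo's ValueFunction.
      final case class ActionValues[A](
          estimates: Map[A, Double] = Map.empty[A, Double],
          counts: Map[A, Long] = Map.empty[A, Long]
      ) {
        // Incremental average: Q(a) <- Q(a) + (r - Q(a)) / n
        def update(action: A, reward: Double): ActionValues[A] = {
          val n = counts.getOrElse(action, 0L) + 1
          val q = estimates.getOrElse(action, 0.0)
          ActionValues(estimates.updated(action, q + (reward - q) / n), counts.updated(action, n))
        }
        def value(action: A): Double = estimates.getOrElse(action, 0.0)
      }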

Value Members

  1. object Game

    Playing the game, currently. This is my test harness.

    What we really need here are both the top and bottom graphs. The top graph is the average reward across games per step, so we want to march all of the games forward together and grab the average reward at each step (see the sketch after this list).

  2. object Plot
  3. object Policy
  4. object State

    Then we have a bandit: a single-state thing (it appears in the harness sketch after this list).

  5. object Time extends Serializable
  6. object Util
  7. object ValueFunction
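
    To make the Game description above concrete, here is a sketch of that kind of harness: many independent single-state bandit games marched forward in lockstep, averaging the reward at every step (the data for the top graph). Everything here, from the Bandit and Player classes to the epsilon-greedy rule, is an illustrative assumption rather than the actual object Game.

      import scala.util.Random

      object GameSketch {
        // A single-state Gaussian bandit: each arm pays its mean plus unit noise.
        final case class Bandit(means: Vector[Double], rng: Random) {
          def pull(arm: Int): Double = means(arm) + rng.nextGaussian()
        }

        // Epsilon-greedy player over incrementally estimated action values.
        final case class Player(q: Vector[Double], n: Vector[Long], eps: Double, rng: Random) {
          def choose: Int =
            if (rng.nextDouble() < eps) rng.nextInt(q.length) else q.indices.maxBy(q)
          def learn(arm: Int, reward: Double): Player = {
            val k = n(arm) + 1
            copy(q = q.updated(arm, q(arm) + (reward - q(arm)) / k), n = n.updated(arm, k))
          }
        }

        def main(args: Array[String]): Unit = {
          val games = 200; val steps = 1000; val arms = 10
          val seed = new Random(0)
          var pairs = Vector.fill(games) {
            val rng = new Random(seed.nextLong())
            (Bandit(Vector.fill(arms)(rng.nextGaussian()), rng),
             Player(Vector.fill(arms)(0.0), Vector.fill(arms)(0L), 0.1, rng))
          }
          // March ALL the games forward together, recording the average reward per step.
          val avgRewardPerStep = (1 to steps).map { _ =>
            var total = 0.0
            pairs = pairs.map { case (bandit, player) =>
              val arm = player.choose
              val r = bandit.pull(arm)
              total += r
              (bandit, player.learn(arm, r))
            }
            total / games
          }
          println(avgRewardPerStep.take(5).mkString(", ")) // first few per-step averages
        }
      }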
