ReinforcementLearningTrajectories

Design

How the main concepts provided in this package relate to each other:

┌───────────────────────────────────┐
│ Trajectory                        │
│ ┌───────────────────────────────┐ │
│ │ EpisodesBuffer wrapping a     │ │
│ │ AbstractTraces                │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_A => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │                               │ │
│ │             ┌───────────────┐ │ │
│ │ :trace_B => │ AbstractTrace │ │ │
│ │             └───────────────┘ │ │
│ │  ...             ...          │ │
│ └───────────────────────────────┘ │
│          ┌───────────┐            │
│          │  Sampler  │            │
│          └───────────┘            │
│         ┌────────────┐            │
│         │ Controller │            │
│         └────────────┘            │
└───────────────────────────────────┘

`Trajectory`

A Trajectory contains three parts:

A container that stores the data (usually an AbstractTraces).
A sampler that determines how to sample a batch from the container.
A controller that decides when to sample a new batch from the container.
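
To make the division of labor between the three parts concrete, here is a minimal, dependency-free sketch. All names in it (ToyTrajectory, ToyBatchSampler, ToyRatioController, toy_push!, can_sample!, sample_batch, drain!) are hypothetical and exist only for this illustration. The ratio/threshold rule is an assumed policy chosen for the example and may differ from what the package's controllers actually do.

```julia
using Random

# Hypothetical toy types for illustration only; not the package's implementation.

# Sampler: knows HOW to draw a batch.
struct ToyBatchSampler
    batch_size::Int
end

# Controller: knows WHEN a batch may be drawn (assumed ratio/threshold rule).
mutable struct ToyRatioController
    ratio::Float64    # target number of batches per insert past the threshold
    threshold::Int    # minimum number of inserts before sampling starts
    n_inserted::Int
    n_sampled::Int
end

# Trajectory: bundles a container with a sampler and a controller.
struct ToyTrajectory
    container::Vector{Int}
    sampler::ToyBatchSampler
    controller::ToyRatioController
end

# Inserting stores the item and lets the controller count it.
function toy_push!(t::ToyTrajectory, x::Int)
    push!(t.container, x)
    t.controller.n_inserted += 1
    return t
end

# Returns true (and records the sample) if a new batch is allowed right now.
function can_sample!(c::ToyRatioController)
    if c.n_inserted >= c.threshold &&
       c.n_sampled <= (c.n_inserted - c.threshold) * c.ratio
        c.n_sampled += 1
        return true
    end
    return false
end

# Draw batch_size items uniformly at random, with replacement.
sample_batch(rng, s::ToyBatchSampler, data) = rand(rng, data, s.batch_size)

# Keep drawing batches until the controller says stop.
function drain!(rng, t::ToyTrajectory)
    batches = Vector{Vector{Int}}()
    while can_sample!(t.controller)
        push!(batches, sample_batch(rng, t.sampler, t.container))
    end
    return batches
end

rng = MersenneTwister(0)
t = ToyTrajectory(Int[], ToyBatchSampler(3), ToyRatioController(1.0, 3, 0, 0))
foreach(i -> toy_push!(t, i), 1:5)
batches = drain!(rng, t)
println(length(batches), " batches of size ", t.sampler.batch_size)
```

In this sketch the container never decides anything on its own: swapping the sampler changes what a batch looks like, and swapping the controller changes how many batches are produced for a given number of inserts.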

Typical usage:

julia> t = Trajectory(
               container = Traces(a=Int[], b=Bool[]), 
               sampler = BatchSampler(3), 
               controller = InsertSampleRatioController(1.0, 3, 0, 0)
           );

julia> push!(t, (a=1,));

julia> for i in 1:5
           push!(t, (a=i, b=iseven(i)))
       end

julia> for batch in t
           println(batch)
       end
(a = [1, 3, 1], b = Bool[1, 1, 1])
(a = [4, 1, 4], b = Bool[0, 0, 0])
(a = [1, 4, 1], b = Bool[1, 0, 0])
(a = [1, 1, 4], b = Bool[1, 0, 0])

Traces

Traces
MultiplexTraces
CircularSARTTraces
NormalizedTraces

Samplers

BatchSampler
MetaSampler
MultiBatchSampler
EpisodesSampler

Controllers

InsertSampleRatioController
AsyncInsertSampleRatioController

Please refer to the tests for common usage. (TODO: generate docs and add links to the data structures above.)

Acknowledgement

This async version is mainly inspired by deepmind/reverb.
