Need to revise definitions/implementation of examples_only, test_mode, etc.

There are various situations where we don't want AL to go through the full feedback loop, for example when AL should not produce actions or should not receive feedback.

From the perspective of AL_Train, typically:
1(A). An act() request is sent to AL for an action (i.e. AL needs to self-explain the next step)
2(F). AL's next action(s) are received and feedback is sent back to AL
or
3(D). AL has no next action and an example (demonstration) is sent back to AL

Right now:
  -examples_only: causes A to happen, then D to happen regardless of AL's response
  -test_mode: causes A to happen, but not F or D

But we would also like a way for D to happen without A.

At least the following cases should be possible. Let's call them "feedback_modes". The user could then choose one of these mutually exclusive options, instead of setting flags that have to guard against the three illegal combinations:
 -full/default:     A, F, D   <- Normal ITS training loop
 -no_hints:         A, F, _   <- Warning: infinite loop (really only works if AL has a finite action space and tries random things)
 -predict_observe:  A, _, D   <- Demonstrations are always given
 -observe_only:     _, _, D   <- Demonstrations are always given
 -test:             A, _, _   <- Moves to next item on first incorrect
 -stepwise_test:    A, _, _   <- Moves to next step (without sending demonstration) on incorrect
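
The modes listed above could be represented as a small lookup table keyed by mode name. This is only a sketch: `FeedbackMode`, `FEEDBACK_MODES`, `on_incorrect`, and `mode_from_legacy_flags` are hypothetical names, not existing AL_Train code. The mapping of the current flags onto modes (examples_only to predict_observe, test_mode to test) and the assumption that the two flags cannot be combined are guesses based on the descriptions in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackMode:
    act: bool       # A: send an act() request to AL
    feedback: bool  # F: send feedback on AL's action(s)
    demo: bool      # D: send a demonstration (example) to AL
    # Hypothetical field: separates test from stepwise_test, which share (A, _, _).
    on_incorrect: str = "continue"


FEEDBACK_MODES = {
    "full":            FeedbackMode(True,  True,  True),
    "no_hints":        FeedbackMode(True,  True,  False),
    "predict_observe": FeedbackMode(True,  False, True),
    "observe_only":    FeedbackMode(False, False, True),
    "test":            FeedbackMode(True,  False, False, on_incorrect="next_item"),
    "stepwise_test":   FeedbackMode(True,  False, False, on_incorrect="next_step"),
}


def mode_from_legacy_flags(examples_only=False, test_mode=False):
    """Map the current boolean flags onto a mode name (assumed mapping)."""
    if examples_only and test_mode:
        # Assumption: the two flags contradict each other (D always vs. D never).
        raise ValueError("examples_only and test_mode cannot both be set")
    if examples_only:
        return "predict_observe"
    if test_mode:
        return "test"
    return "full"
```

With a table like this, a single `feedback_mode` setting would replace the separate flags, and an unknown mode name fails with a `KeyError` instead of silently producing an illegal combination.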


(_, _, _), (_, F, D), and (_, F, _) are impossible: feedback (F) needs an action from A to respond to, and a mode with no steps does nothing.
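
The excluded combinations can be checked mechanically. A minimal sketch, assuming the rule is that F requires A and that at least one of A or D must occur; `is_valid` and `fmt` are hypothetical helpers, not AL_Train functions.

```python
from itertools import product


def is_valid(act, feedback, demo):
    # Feedback (F) responds to an action, so it requires an act() request (A).
    if feedback and not act:
        return False
    # A combination in which no step happens is not a usable mode.
    return act or demo


def fmt(act, feedback, demo):
    # Render a combination in the (A, F, D) notation used in this issue.
    return "({}, {}, {})".format(
        "A" if act else "_", "F" if feedback else "_", "D" if demo else "_"
    )


combos = list(product([False, True], repeat=3))
invalid = [fmt(*c) for c in combos if not is_valid(*c)]
print(invalid)
```

Under these assumptions, five of the eight combinations remain valid. That matches the six named modes, since test and stepwise_test share (A, _, _) and differ only in what happens after an incorrect action.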

There is added complexity if we incorporate levels of hint beyond bottom-out hints. Additionally, no_hints would probably require some kind of empty hint response to be given to AL, prompting the agent to guess.





