Adaptive Training Example
To use this example you must have the following Apprentice Learner repositories installed:
- AL_Core - (https://github.com/apprenticelearner/AL_Core)
- AL_Train - (https://github.com/apprenticelearner/AL_Train)
- AL_Outerloop - (https://github.com/apprenticelearner/AL_Outerloop)
In addition to interactive training and fixed-curriculum training, the Apprentice Learner can also be used to simulate instructional policy learning systems (e.g., Bayesian Knowledge Tracing or model-based Reinforcement Learning). For this purpose, we have created a third AL repository, AL_Outerloop, that controls how a policy learner interfaces with AL agents.
To see an example of how adaptive policy training works, navigate to the AL_Outerloop repository, which contains three policy controllers. One of them, AL_outerloop/controllers/random.py, chooses problems at random; you can confirm this by opening the file in a text editor and reading the next_problem method. To test it, run the following command (note that all of the adaptive training examples take a while to complete):
altrain examples/outer_loop_test_random.json --outer-loop
Next, the StreakController (AL_outerloop/controllers/streak.py) implements a strategy where learners have to maintain a streak of correct responses to problem steps to be considered to have mastered them. You can demo this controller with the following command:
altrain examples/outer_loop_test_streak.json --outer-loop
The BKT controller (AL_outerloop/controllers/bkt.py) implements Bayesian Knowledge Tracing (BKT) to try to choose problems more intelligently. You can demo this controller with the following command:
altrain examples/outer_loop_test_bkt.json --outer-loop
Both the Streak and BKT controllers focus on problems that they believe involve skills the student hasn't mastered. They do this by mapping the student's interactions with each interface component to skills, then estimating whether the student has mastered each skill from their correct and incorrect inputs to those interface components. Because each controller needs to know the relation between skills and problems, applying it to a new domain takes a little more work than the random controller (see al_outerloop/examples/outer_loop_test_bkt.json for the parameters that are applied in the controller), but in return it can focus the student's practice.
There are many possible variations on the BKT controller, such as using AL to set the parameters of a BKT controller for use with new students, or experimenting with how different parameters affect the student's final performance and the number of problems needed to reach that performance.
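The mastery estimation described above can be illustrated with a minimal Python sketch. This is not the AL_Outerloop implementation: the names BKTSkill, streak_mastered, and unmastered_problems are hypothetical, and the parameter values and the streak length of 3 are assumptions chosen only for illustration. It shows the standard BKT update for one skill, a simple streak check, and a filter that keeps only problems involving an unmastered skill.

```python
from dataclasses import dataclass


@dataclass
class BKTSkill:
    """Hypothetical per-skill BKT state; parameter values are illustrative."""
    p_known: float = 0.1  # P(L0): prior probability the skill is already known
    p_learn: float = 0.2  # P(T): probability of learning at each opportunity
    p_guess: float = 0.2  # P(G): probability of a correct answer without the skill
    p_slip: float = 0.1   # P(S): probability of an error despite knowing the skill

    def update(self, correct: bool) -> float:
        """Apply the standard BKT posterior update, then the learning transition."""
        if correct:
            num = self.p_known * (1 - self.p_slip)
            den = num + (1 - self.p_known) * self.p_guess
        else:
            num = self.p_known * self.p_slip
            den = num + (1 - self.p_known) * (1 - self.p_guess)
        posterior = num / den
        self.p_known = posterior + (1 - posterior) * self.p_learn
        return self.p_known


def streak_mastered(responses, streak_length=3):
    """Return True if the most recent `streak_length` responses were all correct."""
    recent = responses[-streak_length:]
    return len(recent) == streak_length and all(recent)


def unmastered_problems(problem_skills, mastered):
    """Return the problems that involve at least one skill not yet mastered."""
    return [
        problem
        for problem, skills in problem_skills.items()
        if any(skill not in mastered for skill in skills)
    ]
```

In this sketch a controller would call update (or record a response for the streak check) after each student input, treat a skill as mastered once its estimate crosses a chosen threshold or the streak is complete, and then pick the next problem from unmastered_problems. The skill-to-problem mapping passed in as problem_skills corresponds to the domain-specific information that the controllers need for each new domain.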