Newton's implementation during RLOS Fest 2020 #10
Open
newtonmwai wants to merge 2 commits into
jackgerrits (Member) requested changes on Aug 31, 2020
Thanks for opening this, Newton! A couple of things:
- Can you please remove the few pyc files that are in this PR?
- The license on this repo was added after this PR was opened. Can you confirm you are okay with merging under that license (standard BSD 3-clause)?
- Are there any outstanding items here that still need to be done?
- Added implementations of the doubly robust (DR) estimator and of DR in episodic settings to the estimator library
- Added a simulator interface that evaluates a target policy against a logging policy
- Added support for custom Vowpal Wabbit policies
- Added generation of a random logging policy and target policy to use for evaluation
- Added transformation of a supervised dataset into a contextual bandit (CB) dataset
- Added transformation of a supervised predictor into a stochastic policy using custom softening
- Added visualization of the comparison
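
For readers unfamiliar with the dataset transformation and softening items above, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the code in this PR: it uses a simple epsilon-style softening rather than the friendly or adversarial variants, and every name in it (`soften`, `supervised_to_cb`, the logged tuple layout) is hypothetical.

```python
import random


def soften(predicted_action, num_actions, epsilon):
    """Turn a deterministic prediction into action probabilities.

    Assumption: epsilon-style softening. Every action receives
    epsilon / num_actions, and the predicted action additionally
    receives the remaining 1 - epsilon.
    """
    probs = [epsilon / num_actions] * num_actions
    probs[predicted_action] += 1.0 - epsilon
    return probs


def supervised_to_cb(examples, predictor, num_actions, epsilon, rng):
    """Convert (features, label) pairs into logged bandit data.

    Each output tuple is (features, action, reward, probability), where
    the action is sampled from the softened predictor and the reward is
    1.0 only when the sampled action equals the true label.
    """
    logged = []
    for features, label in examples:
        probs = soften(predictor(features), num_actions, epsilon)
        action = rng.choices(range(num_actions), weights=probs, k=1)[0]
        reward = 1.0 if action == label else 0.0
        logged.append((features, action, reward, probs[action]))
    return logged


# Hypothetical usage: a toy 3-class dataset and a predictor that is always right.
rng = random.Random(0)
examples = [((i,), i % 3) for i in range(10)]
logged = supervised_to_cb(examples, lambda f: f[0] % 3, 3, 0.3, rng)
```

Only the reward of the sampled action is revealed in the logged data, which is what distinguishes the CB dataset from the fully labeled supervised one. Recording the sampling probability alongside each action is what later allows importance-weighted evaluation.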
To finalize:
- Fix friendly and adversarial softening, which do not work as expected and produce incorrect results
- Fix episodic DR
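
As background for the DR items, the following is a minimal sketch of the standard (non-episodic) doubly robust estimate in plain Python. It is not the implementation from this PR. The function name, the logged tuple layout `(features, action, reward, logging_probability)`, and the `target_policy` and `reward_model` callables are all assumptions made for illustration.

```python
def doubly_robust(logged, target_policy, reward_model, num_actions):
    """Estimate the value of target_policy from logged bandit data.

    logged: iterable of (features, action, reward, logging_probability).
    target_policy(features): list of action probabilities under the target.
    reward_model(features, action): predicted reward for that action.

    Per example, the estimate is the model-based value of the target
    policy plus an importance-weighted correction on the logged action.
    """
    total = 0.0
    count = 0
    for features, action, reward, logging_prob in logged:
        target_probs = target_policy(features)
        # Direct-method term: expected predicted reward under the target policy.
        direct = sum(
            target_probs[a] * reward_model(features, a)
            for a in range(num_actions)
        )
        # Correction term: importance weight times the model's residual.
        weight = target_probs[action] / logging_prob
        total += direct + weight * (reward - reward_model(features, action))
        count += 1
    return total / count
```

With a reward model that always predicts zero, this reduces to inverse propensity scoring. With a perfect reward model, the correction term vanishes and only the direct-method term remains.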