Jvmncs/dumber delta flash example#2
Open
jvmncs wants to merge 2 commits into
6a438ea to
b339d55
  ):
-     self.router_ip = router_ip if router_ip is not None else self.args.sglang_router_ip
-     self.router_port = router_port if router_port is not None else self.args.sglang_router_port
+     self.router_ip = router_ip if router_ip is not None else getattr(self.args, "sglang_router_ip", None)
I think this already defaults to None, so there is no need for getattr.
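The point about getattr can be checked with argparse alone: an option declared without an explicit default still gets an attribute on the parsed namespace, set to None. A minimal sketch, assuming the two router flags are declared with plain `add_argument` calls (the declarations and the `resolve_router_ip` helper below are stand-ins for illustration, not slime's actual code):

```python
import argparse

# Stand-in declarations: no explicit default, so argparse uses None.
parser = argparse.ArgumentParser()
parser.add_argument("--sglang-router-ip", type=str)
parser.add_argument("--sglang-router-port", type=int)

# Parse an empty command line, i.e. neither router flag was passed.
args = parser.parse_args([])


def resolve_router_ip(explicit, parsed_args):
    """Prefer the explicit value, else fall back to the parsed argument."""
    # Plain attribute access is safe because the attribute always exists.
    return explicit if explicit is not None else parsed_args.sglang_router_ip


print(args.sglang_router_ip, args.sglang_router_port)
print(resolve_router_ip("10.0.0.1", args))
```

getattr with a default would only matter if `self.args` could be a namespace built without these declarations at all.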
-     self.router_ip = router_ip if router_ip is not None else self.args.sglang_router_ip
-     self.router_port = router_port if router_port is not None else self.args.sglang_router_port
+     self.router_ip = router_ip if router_ip is not None else getattr(self.args, "sglang_router_ip", None)
+     self.router_port = router_port if router_port is not None else getattr(self.args, "sglang_router_port", None)
DeltaParam = None
DeltaSpec = None


class DeltaEncoding(str, Enum):
I don't think this is needed if you are using the latest slime Docker image, slimerl/slime:nightly-dev-20260527a.
    help="Port of the SGLang router",
)
parser.add_argument(
    "--sglang-router-url",
Ideally, don't give a slime arg a name that starts with sglang, because slime parses such args into SGLang's ServerArgs by stripping the sglang prefix, which would make SGLang's ServerArgs receive router-url. If passing router-url through to SGLang is intended, there is no need to add this extra argument; slime will parse it automatically.
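To make the prefix concern concrete, here is a rough sketch of the kind of prefix-stripping pass the comment describes. The helper `split_engine_kwargs` and the sample values are hypothetical and only illustrate the mechanism; slime's real parsing code is not shown in this thread and may differ.

```python
def split_engine_kwargs(namespace: dict, prefix: str = "sglang_") -> tuple[dict, dict]:
    """Split parsed args into (trainer-side, engine-side) dicts.

    Hypothetical helper: any arg named `<prefix>foo` is forwarded
    to the engine as `foo`; everything else stays on the trainer side.
    """
    own, engine = {}, {}
    for name, value in namespace.items():
        if name.startswith(prefix):
            engine[name[len(prefix):]] = value
        else:
            own[name] = value
    return own, engine


# Illustrative values only.
parsed = {
    "rollout_batch_size": 8,
    "sglang_mem_fraction_static": 0.8,
    "sglang_router_url": "http://router:30000",
}
own, engine = split_engine_kwargs(parsed)
print(own)
print(engine)
```

Under this assumption, a trainer-only flag named `--sglang-router-url` ends up among the engine-bound arguments as `router_url`, which is the collision the comment warns about.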
adds a dumb, heavily vibecoded, minimal example of doing delta compression between a slime trainer Modal function and a Flash rollout server. most of the logic is in the Modal app.
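for anyone new to the idea: delta compression here means shipping only what changed between consecutive weight snapshots rather than the full weights. a toy sketch, assuming byte-wise XOR against the previous buffer followed by zlib. this is an illustrative stand-in, not the encoding this PR uses, and `encode_delta` / `apply_delta` are made-up names:

```python
import zlib


def encode_delta(prev: bytes, new: bytes) -> bytes:
    """XOR new against prev, then compress; unchanged bytes become zeros."""
    if len(prev) != len(new):
        raise ValueError("buffers must have the same length")
    xored = bytes(a ^ b for a, b in zip(prev, new))
    return zlib.compress(xored)


def apply_delta(prev: bytes, delta: bytes) -> bytes:
    """Reconstruct the new buffer from the previous one plus the delta."""
    xored = zlib.decompress(delta)
    if len(xored) != len(prev):
        raise ValueError("delta does not match the previous buffer")
    return bytes(a ^ b for a, b in zip(prev, xored))


# Stand-in for a serialized weight buffer; two bytes change between steps.
prev = bytes(range(256)) * 64
changed = bytearray(prev)
changed[100] ^= 0xFF
changed[5000] ^= 0x01
new = bytes(changed)

delta = encode_delta(prev, new)
restored = apply_delta(prev, delta)
print(len(new), len(delta), restored == new)
```

the receiver needs the exact previous buffer for this to round-trip, so both sides have to agree on which snapshot is the base.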
slime side changes:
sglang side changes:
autoinference deployment side:
`update_weights_from_disk` and then calls `vol.reload()` before forwarding to the engine.

there's a lot I don't like here but it's just proof that it works the way we expect. some particular call outs that won't generalize past max_containers=1:
`http_server` with an arbitrary proxy like this? does flash proxy make any assumptions about its port being the engine?
`rollout-external-engine-addrs` is going to be dynamic over time, we can't set it at config-time.
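as far as the truncated deployment note above reads, the proxy handles a weight-update request by reloading the volume first and only then passing the request on to the engine. a minimal sketch of that ordering, with both side effects injected as callables so it runs without Modal or an engine. `make_update_handler` and the payload shape are assumptions for illustration; in the real app `reload_volume` would presumably be something like `vol.reload` on a `modal.Volume`:

```python
from typing import Any, Callable


def make_update_handler(
    reload_volume: Callable[[], None],
    forward_to_engine: Callable[[dict], Any],
) -> Callable[[dict], Any]:
    """Build a handler that reloads the volume before forwarding."""

    def handle_update_weights_from_disk(payload: dict) -> Any:
        # Make newly committed files visible to this container first.
        reload_volume()
        # Only then let the engine read the weights from disk.
        return forward_to_engine(payload)

    return handle_update_weights_from_disk


# Fakes that record call order instead of touching a volume or engine.
calls: list = []


def fake_reload() -> None:
    calls.append("reload")


def fake_forward(payload: dict) -> dict:
    calls.append(("forward", payload["model_path"]))
    return {"success": True}


handler = make_update_handler(fake_reload, fake_forward)
result = handler({"model_path": "/vol/checkpoints/step_1"})
print(calls, result)
```

this only pins down the ordering inside one container; it says nothing about how the reload would be coordinated once max_containers goes above 1.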