sml-sync has become, as far as I can tell, an important part of the data scientists' workflow at ASI. As such, it clearly has intrinsic value beyond its original goal as a proof-of-concept for a sync-based remote editing workflow.
There is a desire to advertise sml-sync as part of SherlockML to potential customers. I think we are now at the stage where it makes sense to elaborate a roadmap towards that goal.
This issue is an attempt at formalising what still needs to be done before I feel confident recommending sml-sync to a potential external client. The intent is to provoke a discussion that will lead to a clear roadmap. In my mind, this is a roadmap towards version 1.0.
Current blockers
These are the pains currently associated with sml-sync:
- Installation and setup remain too complicated.
- sml-sync remains unstable, primarily because it depends on unstable external dependencies (Python prompt toolkit).
- sml-sync can be dangerous: it will happily delete your home directory if you ask it to.
- Limited platform support:
  - only works on Python 3.5+
  - only really tested on macOS
1. Installation and setup
We should just be able to do pip install sml-sync. We need to:
- get sml-sync on PyPI
- make the issues associated with installing a branch of Python prompt toolkit go away. The easiest way to do this is probably to vendor prompt toolkit.
To make setup easier, we could offer some form of GUI for editing the configuration: something that automatically checks whether the projects exist, or whether the remote directories exist etc.
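As a starting point for discussion, the checks behind such a tool could be sketched as below. This is only a rough sketch: the `project` and `remote_dir` keys are hypothetical and not necessarily what the sml-sync configuration uses, and the two lookup callables stand in for whatever SherlockML and SFTP queries we would actually make.

```python
from typing import Callable, Dict, List


def check_config(
    config: Dict[str, str],
    project_exists: Callable[[str], bool],
    remote_dir_exists: Callable[[str], bool],
) -> List[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    problems = []
    # "project" and "remote_dir" are hypothetical key names used for illustration.
    project = config.get("project")
    remote_dir = config.get("remote_dir")

    if not project:
        problems.append("no project configured")
    elif not project_exists(project):
        problems.append("project {!r} does not exist".format(project))

    if not remote_dir:
        problems.append("no remote directory configured")
    elif not remote_dir_exists(remote_dir):
        problems.append("remote directory {!r} does not exist".format(remote_dir))

    return problems
```

Returning a list of problems rather than raising on the first one means a GUI or a command-line check could show everything that is wrong in one go.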
2. Stability
The main source of instability is our dependence on a moving target through Python prompt toolkit. This will go away if we vendor Python prompt toolkit.
There are also issues (predominantly with the Paramiko session, I think) when the user loses network connection.
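One possible way to handle a dropped connection is to wrap remote operations in a retry that reconnects between attempts. The sketch below is generic and does not call Paramiko itself: `operation` and `reconnect` are hypothetical callables, and the default exception types are a guess. We would need to check which exceptions the Paramiko session actually raises when the network goes away.

```python
import time


def retry_on_disconnect(operation, reconnect, retries=3, delay=1.0,
                        sleep=time.sleep, errors=(OSError, EOFError)):
    """Run operation(); on a connection-type error, reconnect and try again.

    The default `errors` tuple is an assumption, not a verified list of what
    Paramiko raises on connection loss.
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except errors:
            if attempt == retries:
                # Out of attempts: let the caller see the original error.
                raise
            # Back off exponentially before re-establishing the session.
            sleep(delay * 2 ** attempt)
            reconnect()
```

Passing `sleep` in as a parameter keeps the helper easy to unit test without real delays.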
Having the unit tests run on Travis with supported Python versions would also be useful.
3. Danger
sml-sync remains dangerous. This PR goes some way towards mitigating that by showing the list of changes, and this issue would help by avoiding syncing from obviously bad directories.
I doubt that we can detect all bad scenarios, so our best bet is to:
- make it very clear to the user what effect their actions will take by being more explicit about actions and by reviewing the copy in help screens etc.
- support actions smaller than syncing an entire directory tree (e.g. syncing individual files)
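To make the "obviously bad directories" idea concrete, here is a minimal sketch of one such check: refusing to sync from the home directory or any of its ancestors. The function name is hypothetical and this is not how sml-sync currently behaves; it only illustrates the kind of guard we could add.

```python
import os


def is_risky_sync_root(path, home=None):
    """Return True if `path` is the home directory or one of its ancestors."""
    home = os.path.realpath(home if home is not None else os.path.expanduser("~"))
    candidate = os.path.realpath(os.path.expanduser(path))
    try:
        # If the common prefix of both paths is the candidate itself, then the
        # candidate contains (or is) the home directory.
        return os.path.commonpath([candidate, home]) == candidate
    except ValueError:
        # Raised e.g. for paths on different drives on Windows: no overlap.
        return False
```

A check like this only catches one class of mistake, so it would complement, not replace, showing the user the list of changes before syncing.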
4. Limited support
Supporting Python 2 would be too much of a burden at this stage.
Testing on Travis will help give some confidence that this works on Linux platforms as well. It'd nevertheless be good to come up with a workflow to test this from Ubuntu (e.g. on SherlockML or using Vagrant).
We should test sml-sync on the Linux subsystem on Windows to see if we can make it work.
Actions
I have created a 1.0.0 milestone and the following issues:
- Vendor Python prompt toolkit
It would be useful for people (Jan, Andy, users) to think about what sml-sync is currently missing. If you think of anything, or if you disagree with any of the above, write a comment here!