-
Notifications
You must be signed in to change notification settings - Fork 52
Add flatten command #85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add flatten command #85
Conversation
|
Could we include a link to the nested bundle docs somewhere, at least in the help message? https://github.com/operator-framework/operator-registry#manifest-format |
|
Also would be nice if dir structure is already flat to not raise an error but log only INFO message |
To clarify this a little bit: we discussed with @ralphbean that in order to avoid confusion among OMPS users, the service should be able to handle both flattened and nested content, using the same endpoint. To be able to do this, OMPS either
OR
|
86f89bb to
2a9a333
Compare
|
@MartinBasti @csomh @kevinrizza Thank you guys very much for the very useful comments. I have addressed all of them and please review my changes when you are free. |
| api.nest(args.source_dir, args.registry_dir) | ||
|
|
||
| # Parse the flatten command | ||
| def flatten(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something is missing from the usage string of flatten -h:
$ operator-courier flatten -h
usage: operator-courier [-h] source_dir dest_dirThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@csomh This is strange... I am able to see the description on my side.
(operator-courier) 10:42 > operator-courier git:(add-flatten-command) ✗ operator-courier flatten -h
usage: operator-courier [-h] source_dir dest_dir
Given a directory with different versions of operator bundles (CRD, CSV,
package), this command extracts versioned CSVs and the latest version of each
CRD along with the package file and creates a new flat directory of yaml
files. See https://github.com/operator-framework/operator-registry#manifest-
format to find out more about how nested bundles should be structured.
positional arguments:
source_dir Path of the source directory that contains different versions of
operator bundles (CRD, CSV, package)
dest_dir The new flat directory that contains extracted bundle files
optional arguments:
-h, --help show this help message and exit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JEREMYLINLIN, it's not about the description, it's the usage string. For operator-courier flatten -h my expectation would be
operator-courier flatten [-h] source_dir dest_dir
but it's:
operator-courier [-h] source_dir dest_dir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@csomh Gotcha. But it seems to be a general issue for all other subcommands. I fiddled around for some time and wasn't able to find a quick fix. Maybe we should open up an issue and make a separate PR for this? @kevinrizza
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good point, but let's not check in more bad code. Let's do this one right and fix the others in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinrizza May I hard code the right usage for flatten for this time?
def flatten(self):
parser = argparse.ArgumentParser(
usage='operator-courier flatten [-h] source_dir dest_dir',
description='Given a directory with ...')Because after reading some docs about subcommands and doing some experiments, it seems the current subcommands implementation is not the standard way of doing it. And it will be very hard to make the flatten help message work using the "right way" without breaking others.
I can also make a PR to fix the CLI subcommands before fixing this PR, but this may take a while to get through while the current PR is more urgent.
I would very strongly prefer the latter. If we aren't going to successfully create the new directory, we should just do nothing and log something indicating that happened. |
2a9a333 to
64095c9
Compare
|
@JEREMYLINLIN: GitHub didn't allow me to request PR reviews from the following users: csomh. Note that only operator-framework members and repo collaborators can review this PR, and authors cannot review their own PRs. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
| for csv_file_name, csv_entries in create_csv_dict(csv_paths).items(): | ||
| for (version, csv_path) in csv_entries: | ||
| basename, ext = os.path.splitext(csv_file_name) | ||
| file_paths_to_copy.append((csv_path, f'{basename}-v{version}{ext}')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here we're renaming the CSV files to avoid naming overlap, is that right? In that case, this format for filename is nice.
Will the resulting file ext be of the format .clusterserviceversion.yaml?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here we're renaming the CSV files to avoid naming overlap, is that right?
Yes.
Will the resulting file ext be of the format .clusterserviceversion.yaml?
It will append a version string before the extension. You can check the test cases to get a better sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add the version in two spots of the file name? Is it because the first version was part of the file name?
To keep with conventions for yaml naming, the file format for CSVs should look something like:
{basename}{separator}v{version}.clusterserviceversion.yaml
For example:
etcd-operator.v0.9.2.clusterserviceversion.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah but that's not guaranteed @SamiSousa , it's just a convention. we don't want this utility to fail just because someone broke a convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SamiSousa See this conversation and this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinrizza Where would this fail based on a convention? It seems like we're renaming the files anyways, so in a way aren't we just setting our own convention?
It looks like we're pulling the basename from the original filename, so that would explain why the version pops up twice. I don't mind if we have the version back to back like so:
etcd-operator.v0.9.2-v0.9.2.clusterserviceversion.yaml
The reason I'm insisting is because other tools may naively be depending on this filenaming convention to discern which files are CSVs, which are CRDs, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there's nothing stopping anyone from naming their files whatever they want. This command literally just renames them so we don't get collisions. No other tool besides courier should ever touch these temp files -- if they do that is a mistake and should be corrected.
64095c9 to
4e330fb
Compare
|
lgtm, thank you @JEREMYLINLIN 👍 |
|
@csomh No problem. Thank you so much for reviewing my PR and providing great feedback. |
| if not os.path.isfile(item_path): | ||
| logger.warning('Ignoring %s as it is not a regular file.', item) | ||
|
|
||
| with open(item_path, 'r') as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this PR and I got following error (because I had binary (zip) file in that directory by accident):
$ operator-courier flatten orig dest
'utf-8' codec can't decode byte 0xd8 in position 14: invalid continuation byte
There is no traceback, but I assume that this line the source of that error.
push and verify commands reads only files with *.yaml and *.yml extensions, IMHO flatten should work consistently with other commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 This is valuable feedback. I think there should be two takeways from this: we certainly limit our parsing and copying in this command to just yaml files. And we should also explicitly log and return an error if the folder contains files which are not of that type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MartinBasti Thanks for the feedback! I've updated the code to handle that situation.
4e330fb to
574777d
Compare
MartinBasti
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
|
Actually coverage could be updated, https://coveralls.io/builds/22511114/source?filename=operatorcourier/flatten.py but it could be a new PR to not block this feature |
operatorcourier/flatten.py
Outdated
| from typing import Dict, Tuple | ||
| from shutil import copyfile | ||
| import semver | ||
| import operatorcourier.identify as identify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about?
from operatorcourier import identify
from operatorcourier import errorsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @MartinBasti. Updated.
574777d to
7768ed6
Compare
|
/lgtm |
A new operator-courier CLI command called
flattenwhich, given a directory of operator bundles, it will extract the versioned csvs and latest version of each crd along with the package file and create a new flat directory of yaml files.