Conversation
guidopetri
left a comment
There was a problem hiding this comment.
hey seb, @tanho63 asked me to look over this so i did. i probably went overboard so i apologize lol.
only one real actionable comment, the rest are non actionable / nits, so feel free to ignore. thanks for contributing :)
| abbr: a string or list of strings of abbreviations, full team names, or team nicknames. | ||
| current_location: If `True` (the default), the abbreviation of the most recent team | ||
| location will be used. | ||
| keep_non_matches: If `TRUE` (the default) an element of `abbr` that can't |
| A string list with the length of `abbr` and cleaned team abbreviations\ | ||
| if they are included in `team_abbr_mapping()` or `team_abbr_mapping_norelocate()`\ | ||
| (depending on the value of `current_location`). Non matches may be replaced\ | ||
| with `Nome` (depending on the value of `keep_non_matches`). |
| Otherwise it will be replaced with `None`. | ||
|
|
||
| Returns: | ||
| A string list with the length of `abbr` and cleaned team abbreviations\ |
There was a problem hiding this comment.
nit: you don't need backslashes at the end of the lines here
There was a problem hiding this comment.
I actually do need those linebreakers for the rendering of the documentation website.
There was a problem hiding this comment.
Ah gotcha. I didn't look at the docs website!
| if isinstance(abbr, str): | ||
| abbr = abbr.split() |
There was a problem hiding this comment.
no action: Not particularly pythonic, but given that we are aiming for feature parity with the R version, this is.... fine
There was a problem hiding this comment.
I will actually change this. It was stupid as I just wanted to make strings a list of strings.
There was a problem hiding this comment.
I don't think it's stupid! If we are aiming for feature parity with R then we should have the same kind of inputs allowed. Now, whether this should be allowed at all (in both the R and python version of this function), well. That's a point we can talk about :P
| if isinstance(abbr, str): | ||
| abbr = abbr.split() | ||
|
|
||
| # error if abbr is no list |
There was a problem hiding this comment.
nit: a lot of these comments are extraneous and can be removed (e.g. all the ones until L45 can be summarized by "arg validation" or even skipped entirely imo)
There was a problem hiding this comment.
I agree for most of these comments the code would be self-explanatory but since Tan and me aren't particularly experienced python coders I think we need some hints for future us.
| # mapping is a polars df. Convert it to a dictionary. We could do the conversion with | ||
| # a polars join but the code below is just easier to read. |
There was a problem hiding this comment.
no action: imo, preferable to do this as a polars operation if you expect that people will use long lists of abbr, since it will be more performant that way. But I also think this is fine as is
There was a problem hiding this comment.
I had it in polars first and hated the syntax. But now I think I might have to do it because it will be easier to make it work in a polars workflow, I.e. in map_elements or map_batches
| # out dropped nonmatches. We replace the None values here if the user wants to keep them | ||
| if keep_non_matches is True: | ||
| for index, item in enumerate(out): | ||
| if item is None: | ||
| out[index] = abbr[index] |
There was a problem hiding this comment.
Instead of replacing the dropped non-matches here, maybe something better would be in L59 above:
if keep_non_matches: # (also preferable to just use "implicit" comparison here, especially since you've validated that this is already a bool
out = [map_dict.get(key.upper(), key) for key in abbr] # use the default value of `key` if it does not find a match
else:
out = [map_dict.get(key.upper()) for key in abbr] # same as what you have in L59There was a problem hiding this comment.
This is so much better lol.
Still requires another loop to identify non-matches in the original abbr list but I prefer this solution.
| map_dict = dict(mapping.iter_rows()) | ||
|
|
||
| # lookup with .get method because it replaces nonmatches with a default value (None) | ||
| out = [map_dict.get(key.upper()) for key in abbr] |
There was a problem hiding this comment.
nit: out is not a very descriptive variable name
|
|
||
| def clean_team_abbrs( | ||
| abbr: str | list[str], current_location: bool = True, keep_non_matches: bool = True | ||
| ) -> list: |
There was a problem hiding this comment.
nit: this returns a list[str] specifically
There was a problem hiding this comment.
How would I approach a usecase where I want my function to work both inside polars and independently of polars?
It would have to handle str, list[str] and pl.Series as inputs, which isn't particularly hard to do. But it would also require to return either str (e.g. in map_elements), list[str], and pl.Series (maybe this could also be a list).
So the return type would depend on the input type. Which I believe isn't pythonic?
There was a problem hiding this comment.
I don't think it's very pythonic no. But I've also definitely seen functions in large python libraries that do this too so I don't think it's too bad if you want to do that.
The way that I'd define it in that case is actually using the overload syntax, where you would essentially define 3 function "headers" (one for each input type, returning the same output type) and then in your "actual" function you would type it with the 3 input types / returning the 3 input types (let me know if this isn't clear, sorry). Depending on the python version you're using / targeting it might also be possible to use typing generics? though I haven't tried that myself.
| new_abbr = nfl.clean_team_abbrs(x) | ||
| new_abbr_drop = nfl.clean_team_abbrs(x, keep_non_matches=False) | ||
| old_abbr = nfl.clean_team_abbrs(x, current_location=False) |
There was a problem hiding this comment.
nit: possibly could use pytest.parametrize() here instead to pass in the args / expected output and have this test be shorter/simpler, but also incredibly nitpicky of me
Thank you for your feedback @guidopetri! It's highly appreciated in my journey of doing some basic python dev. |
|
For sure, feel free to tag me when you're ready for another review |
next part of #36
Tests proof that it works