|
| 1 | +""" |
| 2 | + LearnAPI.getobs(model, LearnAPI.fit, I, data...) |
| 3 | +
|
| 4 | +Return a subsample of `data` consisting of all observations with indices in `I`. Here |
| 5 | +`data` is data of the form expected in a call like `LearnAPI.fit(model, verbosity, |
| 6 | +data...; metadata...)`. |
| 7 | +
|
| 8 | +Always returns a tuple of the same length as `data`. |
| 9 | +
|
| 10 | + LearnAPI.getobs(model, operation, I, data...) |
| 11 | +
|
| 12 | +Return a subsample of `data` consisting of all observations with indices in `I`. Here |
| 13 | +`data` is data of the form expected in a call of the specified `operation`, e.g., in a |
| 14 | +call like `LearnAPI.predict(model, data...)`, if `operation = LearnAPI.predict`. Possible |
| 15 | +values for `operation` are: $DOC_OPERATIONS_LIST. |
| 16 | +
|
| 17 | +Always returns a tuple of the same length as `data`. |
| 18 | +
|
| 19 | +# New model implementations |
| 20 | +
|
| 21 | +Implementation is optional. If implemented, it should ordinarily be implemented for each |
| 22 | +signature of `fit` and each operation implemented for `model`. |
| 23 | +
|
| 24 | +$(DOC_IMPLEMENTED_METHODS(:getobs)) |
| 25 | +
|
| 26 | +The subsample returned must be acceptable in place of `data` in a call to the function |
| 27 | +named in the second argument. |
| 28 | +
|
| 29 | +## Example implementation |
| 30 | +
|
| 31 | +Suppose that `MyClassifier` is a model type for simple supervised classification, with |
| 32 | +`LearnAPI.fit(model::MyClassifier, verbosity, A, y)` and `predict(model::MyClassifier, |
| 33 | +fitted_params, A)` implemented assuming the target `y` is an ordinary abstract vector and |
| 34 | +the features `A` is an abstract matrix with columns as observations. Then the following is |
| 35 | +a valid implementation of `getobs`: |
| 36 | +
|
| 37 | +```julia |
| 38 | +LearnAPI.getobs(::MyClassifier, ::typeof(LearnAPI.fit), I, A, y) = |
| 39 | + (view(A, :, I), view(y, I)) |
| 40 | +LearnAPI.getobs(::MyClassifier, ::typeof(LearnAPI.predict), I, A) = (view(A, :, I),) |
| 41 | +``` |
| 42 | +
|
| 43 | +""" |
| 44 | +function getobs end |
| 45 | + |
| 46 | +""" |
| 47 | + LearnAPI.reformat(model, LearnAPI.fit, user_data...; metadata...) |
| 48 | +
|
| 49 | +Return the model-specific representations `(data, metadata)` of user-supplied `(user_data, |
| 50 | +user_metadata)`, for consumption, after splatting, by `LearnAPI.fit`, `LearnAPI.update!` |
| 51 | +or `LearnAPI.ingest!`. |
| 52 | +
|
| 53 | + LearnAPI.reformat(model, operation, user_data...) |
| 54 | +
|
| 55 | +Return the model-specific representation `data` of user-supplied `user_data`, for |
| 56 | +consumption, after splatting, by the specified `operation`, dispatched on `model`. Here |
| 57 | +`operation` is one of: $DOC_OPERATIONS_LIST. |
| 58 | +
|
| 59 | +The following sample workflow illustrates the use of both versions of `reformat` above: |
| 60 | +
|
| 61 | +```julia |
| 62 | +data, metadata = LearnAPI.reformat(model, LearnAPI.fit, X, y; class_weights=dic) |
| 63 | +fitted_params, state, fit_report = LearnAPI.fit(model, 0, data...; metadata...) |
| 64 | +
|
| 65 | +test_data = LearnAPI.reformat(model, LearnAPI.predict, Xtest) |
| 66 | +ŷ, predict_report = LearnAPI.predict(model, fitted_params, test_data...) |
| 67 | +``` |
| 68 | +
|
| 69 | +# New model implementations |
| 70 | +
|
| 71 | +Implementation of `reformat` is optional. The fallback simply slurps the supplied |
| 72 | +data/metadata. You will want to implement it for each `fit` or operation signature |
| 73 | +implemented for `model`. |
| 74 | +
|
| 75 | +$(DOC_IMPLEMENTED_METHODS(:reformat, overloaded=true)) |
| 76 | +
|
| 77 | +Ideally, any potentially expensive transformation of user-supplied data that is carried |
| 78 | +out during training only once, at the beginning, should occur in `reformat` instead of |
| 79 | +`fit`/`update!`/`ingest!`. |
| 80 | +
|
| 81 | +Note that the second form of `reformat`, for operations, should always return a tuple, |
| 82 | +because the output is splatted in calls to the operation (see the sample workflow |
| 83 | +above). Similarly, in the return value `(data, metadata)` for the `fit` variant, `data` is |
| 84 | +always a tuple and `metadata` always a named tuple (or `Base.Pairs` object). If there is |
| 85 | +no metadata, a `NamedTuple()` can be returned in its place. |
| 86 | +
|
| 87 | +## Example implementation |
| 88 | +
|
| 89 | +Suppose that `MyClassifier` is a model type for simple supervised classification, with |
| 90 | +`LearnAPI.fit(model::MyClassifier, verbosity, A, y; names=...)` and |
| 91 | +`predict(model::MyClassifier, fitted_params, A)` implemented assuming that the target `y` |
| 92 | +is an ordinary vector, the features `A` are a matrix with columns as observations, and |
| 93 | +`names` are the names of the features. Supposing users supply features in tabular |
| 94 | +form, but the target as expected, we provide the following implementation of `reformat`: |
| 95 | +
|
| 96 | +```julia |
| 97 | +using Tables |
| 98 | +function LearnAPI.reformat(::MyClassifier, ::typeof(LearnAPI.fit), X, y) |
| 99 | + names = Tables.schema(Tables.rows(X)).names |
| 100 | + return ((Tables.matrix(X)', y), (; names)) |
| 101 | +end |
| 102 | +LearnAPI.reformat(::MyClassifier, ::typeof(LearnAPI.predict), X) = (Tables.matrix(X)',) |
| 103 | +``` |
| 104 | +""" |
| 105 | +reformat(::Any, ::Any, data...; model_data...) = (data, model_data) |
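
To make the "slurping" behaviour of the fallback concrete, here is a minimal sketch that does not depend on LearnAPI. The name `fallback_reformat` is a hypothetical stand-in with the same body as the fallback in the diff above, and `X`, `y` and `class_weights` are made-up illustration data.

```julia
# Hypothetical stand-in: same signature shape and body as the fallback above.
fallback_reformat(::Any, ::Any, data...; model_data...) = (data, model_data)

X = [1.0 2.0; 3.0 4.0]   # made-up features
y = [0, 1]               # made-up target
weights = Dict(0 => 0.5, 1 => 0.5)

# `fit`-style call: positional data is collected into a tuple,
# keyword arguments into an iterable of key-value pairs.
data, metadata = fallback_reformat(nothing, nothing, X, y; class_weights=weights)

# Both pieces can be splatted straight back into another call:
echo(args...; kwargs...) = (args, Dict(kwargs))
args, kwargs = echo(data...; metadata...)

# Operation-style call (no keywords): the result is still a two-element pair.
op_result = fallback_reformat(nothing, nothing, X)
```

In this sketch `data` is `(X, y)` and `metadata` carries the single key `:class_weights`. Note that the operation-style call also returns a two-element `(data, metadata)` pair rather than the bare tuple described for the operation form of `reformat`, so it may be worth checking whether the fallback is intended to behave that way for operations.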