Commit b397770

Merge pull request #22 from JuliaAI/dev
For a 0.1.0 release
2 parents aa393c0 + b460b71 commit b397770

31 files changed: +1370 -1351 lines changed

Project.toml

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ authors = ["Anthony D. Blaom <anthony.blaom@gmail.com>"]
 version = "0.1.0"

 [deps]
+InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
 Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

 [extras]

README.md

Lines changed: 5 additions & 3 deletions
@@ -1,18 +1,20 @@
 # LearnAPI.jl

-A Julia interface for training and applying machine learning models.
+A base Julia interface for machine learning and statistics.


 **Development Status:**

 - [X] Detailed proposal stage ([this
-documentation](https://juliaai.github.io/LearnAPI.jl/dev/))
-- [ ] Initial feedback stage (opened mid-January, 2023)
+documentation](https://juliaai.github.io/LearnAPI.jl/dev/)).
+- [ ] Initial feedback stage (opened mid-January, 2023). General feedback can be provided at [this Julia Discourse thread](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048/20).
 - [ ] Implement feedback and finish "To do" list (below)
 - [ ] Proof of concept implementation
 - [ ] Polish
 - [ ] Registration

+You can join a discussion on the LearnAPI proposal at [this](https://discourse.julialang.org/t/ann-learnapi-jl-proposal-for-a-basement-level-machine-learning-api/93048) Julia Discourse thread.
+
 To do:

 - [ ] Add methods to create/save persistent representation of learned parameters

docs/make.jl

Lines changed: 2 additions & 1 deletion
@@ -9,13 +9,14 @@ makedocs(;
     format=Documenter.HTML(prettyurls = get(ENV, "CI", nothing) == "true"),
     pages=[
         "Overview" => "index.md",
+        "Goals and Approach" => "goals_and_approach.md",
         "Anatomy of an Implementation" => "anatomy_of_an_implementation.md",
         "Reference" => "reference.md",
         "Fit, update and ingest" => "fit_update_and_ingest.md",
         "Predict and other operations" => "operations.md",
         "Accessor Functions" => "accessor_functions.md",
         "Optional Data Interface" => "optional_data_interface.md",
-        "Model Traits" => "model_traits.md",
+        "Algorithm Traits" => "algorithm_traits.md",
         "Common Implementation Patterns" => "common_implementation_patterns.md",
         "Testing an Implementation" => "testing_an_implementation.md",
     ],

docs/src/accessor_functions.md

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 # Accessor Functions

 > **Summary.** While byproducts of training are ordinarily recorded in the `report`
-> component of the output of `fit`/`update!`/`ingest!`, some families of models report an
-> item that is likely shared by multiple model types, and it is useful to have common
+> component of the output of `fit`/`update!`/`ingest!`, some families of algorithms report an
+> item that is likely shared by multiple algorithm types, and it is useful to have a common
 > interface for accessing these directly. Training losses and feature importances are two
 > examples.

docs/src/algorithm_traits.md

Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
# Algorithm Traits

> **Summary.** Traits allow one to promise particular behaviour for an algorithm, such as:
> *This algorithm supports per-observation weights, which must appear as the third
> argument of `fit`*, or *This algorithm's `transform` method predicts `Real` vectors*.

Algorithm traits are functions whose first (and usually only) argument is an algorithm. In
a new implementation, a single-argument trait is declared following this pattern:

```julia
LearnAPI.is_pure_julia(algorithm::MyAlgorithmType) = true
```

!!! important

    The value of a trait must be the same for all algorithms of the same type,
    including algorithms whose types differ only in their type parameters.
    Exceptions apply to some traits when `is_wrapper(algorithm) = true` for all
    instances `algorithm` of some type (composite algorithms). This requirement
    occasionally means that an existing algorithm implementation must be split
    into separate LearnAPI implementations (e.g., one for regression and another
    for classification).

The declaration above has the shorthand

```julia
@trait MyAlgorithmType is_pure_julia=true
```

Multiple traits can be declared like this:


```julia
@trait(
    MyAlgorithmType,
    is_pure_julia = true,
    pkg_name = "MyPackage",
)
```

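How a trait declaration interacts with its fallback can be sketched without LearnAPI installed. In the block below, `TraitSketch` is a hypothetical stand-in module (not the real package), `MyAlgorithmType` is given an assumed type parameter for illustration, and only the trait names and fallback values are taken from this page. It is a minimal sketch of the dispatch pattern, not the library's implementation.

```julia
# Hypothetical stand-in for LearnAPI, so the sketch runs with no packages.
module TraitSketch
    # Fallbacks, using the fallback values listed for these two traits.
    is_pure_julia(::Any) = false
    pkg_name(::Any) = "unknown"
end

# Hypothetical algorithm type with a type parameter.
struct MyAlgorithmType{T<:Real}
    lambda::T
end

# Declaring on the unparameterized type covers every `MyAlgorithmType{T}`,
# so the trait value cannot vary with the type parameter.
TraitSketch.is_pure_julia(::MyAlgorithmType) = true
TraitSketch.pkg_name(::MyAlgorithmType) = "MyPackage"

TraitSketch.is_pure_julia(MyAlgorithmType(0.1))  # true
TraitSketch.is_pure_julia(MyAlgorithmType(1))    # true, same value for T = Int
TraitSketch.is_pure_julia("not an algorithm")    # false, from the fallback
```

Dispatching on `::MyAlgorithmType` rather than on a specific `MyAlgorithmType{Float64}` is one way to satisfy the requirement that trait values agree across type parameters.
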
### Special two-argument traits

The two-argument versions of [`LearnAPI.predict_output_scitype`](@ref) and
[`LearnAPI.predict_output_type`](@ref) are the only overloadable traits with more than
one argument. They cannot be declared using the `@trait` macro.

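Because `@trait` does not apply to the two-argument traits, they would be declared as ordinary methods. The following is a sketch under stated assumptions: `ProxySketch` is a hypothetical stand-in module, `Point` and `MyClassifier` are invented for illustration, and it is assumed that the second argument is an instance of a `KindOfProxy` subtype. Only the names `KindOfProxy`, `Distribution` and `predict_output_type`, and the fallback `Any`, come from this page.

```julia
# Hypothetical stand-in for the relevant LearnAPI names.
module ProxySketch
    abstract type KindOfProxy end
    struct Distribution <: KindOfProxy end
    struct Point <: KindOfProxy end   # hypothetical second kind of proxy

    # Two-argument fallback: `Any`, as listed in the trait table.
    predict_output_type(::Any, ::KindOfProxy) = Any
end

struct MyClassifier end   # hypothetical algorithm type

# An ordinary method definition, one per kind of proxy that is supported.
ProxySketch.predict_output_type(::MyClassifier, ::ProxySketch.Point) =
    AbstractVector{<:Real}

# The overloaded pair returns the declared bound; other pairs fall back to `Any`.
ProxySketch.predict_output_type(MyClassifier(), ProxySketch.Point())
ProxySketch.predict_output_type(MyClassifier(), ProxySketch.Distribution())
```
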
## Trait summary

**Overloadable traits** are available for overloading by any new LearnAPI
implementation. **Derived traits** are not, and should not be called by
performance-critical code.

### Overloadable traits

In the examples column of the table below, `Table`, `Continuous` and `Sampleable` are names owned by the
package [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl/).

| trait                                            | fallback value        | return value  | example |
|:-------------------------------------------------|:----------------------|:--------------|:--------|
| [`LearnAPI.functions`](@ref)`(algorithm)` | `()` | implemented LearnAPI functions (traits excluded) | `(:fit, :predict)` |
| [`LearnAPI.preferred_kind_of_proxy`](@ref)`(algorithm)` | `LearnAPI.None()` | an instance `tp` of `KindOfProxy` for which an implementation of `LearnAPI.predict(algorithm, tp, ...)` is guaranteed. | `LearnAPI.Distribution()` |
| [`LearnAPI.position_of_target`](@ref)`(algorithm)` | `0` | ¹ the positional index of the **target** in `data` in `fit(..., data...; metadata)` calls | `2` |
| [`LearnAPI.position_of_weights`](@ref)`(algorithm)` | `0` | ¹ the positional index of **per-observation weights** in `data` in `fit(..., data...; metadata)` | `3` |
| [`LearnAPI.descriptors`](@ref)`(algorithm)` | `()` | lists one or more suggestive algorithm descriptors from `LearnAPI.descriptors()` | `(:classifier, :probabilistic)` |
| [`LearnAPI.is_pure_julia`](@ref)`(algorithm)` | `false` | is `true` if implementation is 100% Julia code | `true` |
| [`LearnAPI.pkg_name`](@ref)`(algorithm)` | `"unknown"` | name of package providing core code (may be different from package providing LearnAPI.jl implementation) | `"DecisionTree"` |
| [`LearnAPI.pkg_license`](@ref)`(algorithm)` | `"unknown"` | name of license of package providing core code | `"MIT"` |
| [`LearnAPI.doc_url`](@ref)`(algorithm)` | `"unknown"` | url providing documentation of the core code | `"https://en.wikipedia.org/wiki/Decision_tree_learning"` |
| [`LearnAPI.load_path`](@ref)`(algorithm)` | `"unknown"` | a string indicating where the struct for `typeof(algorithm)` is defined, beginning with name of package providing implementation | `"FastTrees.LearnAPI.DecisionTreeClassifier"` |
| [`LearnAPI.is_wrapper`](@ref)`(algorithm)` | `false` | is `true` if one or more properties (fields) of `algorithm` may be an algorithm | `true` |
| [`LearnAPI.human_name`](@ref)`(algorithm)` | type name with spaces | human name for the algorithm; should be a noun | `"elastic net regressor"` |
| [`LearnAPI.iteration_parameter`](@ref)`(algorithm)` | `nothing` | symbolic name of an iteration parameter | `:epochs` |
| [`LearnAPI.fit_keywords`](@ref)`(algorithm)` | `()` | tuple of symbols for keyword arguments accepted by `fit` (corresponding to metadata) | `(:class_weights,)` |
| [`LearnAPI.fit_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `fit(algorithm, verbosity, data...)`² | `Tuple{Table(Continuous), AbstractVector{Continuous}}` |
| [`LearnAPI.fit_observation_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(observation)` for `observation` in `data` and `data` in `fit(algorithm, verbosity, data...)`² | `Tuple{AbstractVector{Continuous}, Continuous}` |
| [`LearnAPI.fit_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `fit(algorithm, verbosity, data...)`² | `Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}` |
| [`LearnAPI.fit_observation_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(observation)` for `observation` in `data` and `data` in `fit(algorithm, verbosity, data...)`² | `Tuple{AbstractVector{<:Real}, Real}` |
| [`LearnAPI.predict_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `predict(algorithm, fitted_params, data...)`² | `Table(Continuous)` |
| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm, kind_of_proxy)` | `Any` | upper bound on `scitype(first(predict(algorithm, kind_of_proxy, ...)))` | `AbstractVector{Continuous}` |
| [`LearnAPI.predict_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `predict(algorithm, fitted_params, data...)`² | `AbstractMatrix{<:Real}` |
| [`LearnAPI.predict_output_type`](@ref)`(algorithm, kind_of_proxy)` | `Any` | upper bound on `typeof(first(predict(algorithm, kind_of_proxy, ...)))` | `AbstractVector{<:Real}` |
| [`LearnAPI.transform_input_scitype`](@ref)`(algorithm)` | `Union{}` | upper bound on `scitype(data)` in `transform(algorithm, fitted_params, data...)`² | `Table(Continuous)` |
| [`LearnAPI.transform_output_scitype`](@ref)`(algorithm)` | `Any` | upper bound on `scitype(first(transform(algorithm, ...)))` | `Table(Continuous)` |
| [`LearnAPI.transform_input_type`](@ref)`(algorithm)` | `Union{}` | upper bound on `typeof(data)` in `transform(algorithm, fitted_params, data...)`² | `AbstractMatrix{<:Real}` |
| [`LearnAPI.transform_output_type`](@ref)`(algorithm)` | `Any` | upper bound on `typeof(first(transform(algorithm, ...)))` | `AbstractMatrix{<:Real}` |

¹ If the value is `0`, then the variable in boldface type is not supported and not
expected to appear in `data`. If `length(data)` is less than the trait value, then `data`
is understood to exclude the variable, but note that `fit` can have multiple signatures of
varying lengths, as in `fit(algorithm, verbosity, X, y)` and `fit(algorithm, verbosity, X, y,
w)`. A non-zero value is a promise that `fit` includes a signature of sufficient length to
include the variable.

² Assuming no [optional data interface](@ref data_interface) is implemented. See docstring
for the general case.


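The rule in footnote ¹ can be illustrated with a small helper that locates the target in `data`. This is only a sketch: `position_of_target` is defined locally as a stand-in for the LearnAPI trait, and `MyRegressor` and `target_or_nothing` are hypothetical names that do not appear in the package.

```julia
# Local stand-in for the LearnAPI trait, with its documented fallback of 0.
struct MyRegressor end                  # hypothetical algorithm type
position_of_target(::Any) = 0           # fallback: target not supported
position_of_target(::MyRegressor) = 2   # target is the second element of `data`

# Hypothetical helper: return the target from `data`, or `nothing` if absent.
function target_or_nothing(algorithm, data...)
    p = position_of_target(algorithm)
    p == 0 && return nothing             # the variable is not supported
    length(data) < p && return nothing   # this call's `data` excludes it
    return data[p]
end

X = [1.0 2.0; 3.0 4.0]
y = [0.5, 1.5]

target_or_nothing(MyRegressor(), X, y)   # returns `y`
target_or_nothing(MyRegressor(), X)      # `nothing`: shorter signature
target_or_nothing("other", X, y)         # `nothing`: trait value is 0
```

The same logic would apply to `position_of_weights`, with the trait value `3` in the table's example.
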
### Derived traits

The following convenience methods are provided but are not intended for overloading:

| trait                                | return value                              | example    |
|:-------------------------------------|:------------------------------------------|:-----------|
| `LearnAPI.name(algorithm)` | algorithm type name as string | `"PCA"` |
| `LearnAPI.is_algorithm(algorithm)` | `true` if `functions(algorithm)` is not empty | `true` |
| [`LearnAPI.predict_output_scitype`](@ref)`(algorithm)` | dictionary of upper bounds on the scitype of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | |
| [`LearnAPI.predict_output_type`](@ref)`(algorithm)` | dictionary of upper bounds on the type of predictions, keyed on subtypes of [`LearnAPI.KindOfProxy`](@ref) | |


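A derived trait is computed from other information rather than declared per algorithm, which is why it is not overloaded. The sketch below assumes one plausible way of deriving `is_algorithm` and `name`; `DerivedSketch` is a hypothetical stand-in module, the `PCA` struct is invented for illustration, and the actual LearnAPI definitions may differ.

```julia
# Hypothetical stand-in for LearnAPI.
module DerivedSketch
    # Overloadable trait, with its documented fallback `()`.
    functions(::Any) = ()

    # Derived traits: defined once, in terms of other information.
    is_algorithm(algorithm) = !isempty(functions(algorithm))
    name(algorithm) = string(nameof(typeof(algorithm)))
end

struct PCA end   # hypothetical algorithm type

# Implementations overload only the overloadable trait.
DerivedSketch.functions(::PCA) = (:fit, :transform)

DerivedSketch.is_algorithm(PCA())   # true, since `functions` is non-empty
DerivedSketch.name(PCA())           # "PCA"
DerivedSketch.is_algorithm(42)      # false, from the `functions` fallback
```
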
## Reference

```@docs
LearnAPI.functions
LearnAPI.preferred_kind_of_proxy
LearnAPI.position_of_target
LearnAPI.position_of_weights
LearnAPI.descriptors
LearnAPI.is_pure_julia
LearnAPI.pkg_name
LearnAPI.pkg_license
LearnAPI.doc_url
LearnAPI.load_path
LearnAPI.is_wrapper
LearnAPI.fit_keywords
LearnAPI.human_name
LearnAPI.iteration_parameter
LearnAPI.fit_scitype
LearnAPI.fit_type
LearnAPI.fit_observation_scitype
LearnAPI.fit_observation_type
LearnAPI.predict_input_scitype
LearnAPI.predict_output_scitype
LearnAPI.predict_input_type
LearnAPI.predict_output_type
LearnAPI.transform_input_scitype
LearnAPI.transform_output_scitype
LearnAPI.transform_input_type
LearnAPI.transform_output_type
```
