Commit aa7a0fe

finish optional data interface

1 parent 7e0f37a, commit aa7a0fe

13 files changed (+108, -89 lines)

docs/make.jl

Lines changed: 2 additions & 1 deletion
@@ -13,6 +13,8 @@ makedocs(;
 "Reference" => "reference.md",
 "Fit, update and ingest" => "fit_update_and_ingest.md",
 "Predict and other operations" => "operations.md",
+"Accessor Functions" => "accessor_functions.md",
+"Optional Data Interface" => "optional_data_interface.md",
 "Model Traits" => "model_traits.md",
 "Common Implementation Patterns" => "common_implementation_patterns.md",
 "Testing an Implementation" => "testing_an_implementation.md",
@@ -26,4 +28,3 @@ deploydocs(
 devbranch="dev",
 push_preview=false,
 )
-

docs/src/anatomy_of_an_implementation.md

Lines changed: 20 additions & 38 deletions
@@ -6,7 +6,7 @@
 > `transform`). In this example we also implement an **accessor function**, called
 > `feature_importance`, returning the absolute values of the linear coefficients. The
 > ridge regressor has a target variable and `predict` makes literal predictions of the
-> target (rather than, say, probablistic predictions); this behaviour is flagged by the
+> target (rather than, say, probabilistic predictions); this behavior is flagged by the
 > `predict_proxy` model trait. Other traits articulate the model's training data type
 > requirements and the input/output type of `predict`.
 
@@ -35,7 +35,7 @@ nothing # hide
 ```
 
 The subtyping `MyRidge <: LearnAPI.Model` is optional but recommended where it is not
-otherwise disruptive (it allows models to be displayed in a standard way, for example).
+otherwise disruptive.
 
 Instances of `MyRidge` are called **models** and `MyRidge` is a **model type**.
 
@@ -75,7 +75,7 @@ function LearnAPI.fit(model::MyRidge, verbosity, X, y)
 feature_importances =
 [features[j] => abs(coefficients[j]) for j in eachindex(features)]
 sort!(feature_importances, by=last) |> reverse!
-verbosity > 1 && @info "Features in order of importance: $(first.(feature_importances))"
+verbosity > 0 && @info "Features in order of importance: $(first.(feature_importances))"
 report = (; feature_importances)
 
 return fitted_params, state, report
@@ -92,15 +92,15 @@ Regarding the return value of `fit`:
 or [`LearnAPI.ingest!`](@ref) method (see [Fit, update! and ingest!](@ref)).
 
 - The `report` is for other byproducts of training, apart from the learned parameters (the
-ones will need to provide `predict` below).
+ones we'll need to provide `predict` below).
 
-Our `fit` method assumes that `X` is a table (satifies the [Tables.jl
+Our `fit` method assumes that `X` is a table (satisfies the [Tables.jl
 spec](https://github.com/JuliaData/Tables.jl)) whose rows are the observations; and it
 will need need `y` to be an `AbstractFloat` vector. A model implementation is free to
 dictate the representation of data that `fit` accepts but articulates its requirements
 using appropriate traits; see [Training data types](@ref) below. We recommend against data
 type checks internal to `fit`; this would ordinarily be the responsibility of a higher
-level API, using those trasits.
+level API, using those traits.
 
 
 ## Operations
@@ -146,7 +146,7 @@ Another example of an accessor function is [`LearnAPI.training_losses`](@ref).
 
 Our model has a target variable, in the sense outlined in [Scope and undefined
 notions](@ref scope), and `predict` returns an object with exactly the same form as the
-target. We indicate this behaviour by declaring
+target. We indicate this behavior by declaring
 
 ```@example anatomy
 LearnAPI.predict_proxy(::Type{<:MyRidge}) = LearnAPI.TrueTarget()
@@ -166,10 +166,6 @@ for details.
 `LearnAPI.predict_proxy` is an example of a **model trait**. A complete list of traits
 and the contracts they imply is given in [Model Traits](@ref).
 
-> **MLJ only.** The values of all traits constitute a model's **metadata**, which is
-> recorded in the searchable MLJ Model Registry, assuming the implementation-providing
-> package is registered there.
-
 We also need to indicate that a target variable appears in training (this is a supervised
 model). We do this by declaring *where* in the list of training data arguments (in this
 case `(X, y)`) the target variable (in this case `y`) appears:
@@ -206,7 +202,7 @@ nothing # hide
 ## Training data types
 
 Since LearnAPI.jl is a basement level API, one is discouraged from including explicit type
-checks in an implementation of `fit`. Instead one uses traits to make promisises about the
+checks in an implementation of `fit`. Instead one uses traits to make promises about the
 acceptable type of `data` consumed by `fit`. In general, this can be a promise regarding
 the ordinary type of `data` or the [scientific
 type](https://github.com/JuliaAI/ScientificTypes.jl) of `data` (but not
@@ -238,37 +234,23 @@ Or, in other words:
 AbstractVector{Continuous}` - meaning that it is an abstract vector with `<:AbstractFloat`
 elements.
 
-## Input/output types for operations
+## Input types for operations
 
-An optional promise that an operation, such as `predict`, returns an object of given
-scientific type is articulated in this way:
+An optional promise about what `data` is guaranteed to work in a call like
+`predict(model, fitted_params, data...)` is articulated this way:
 
 ```@example anatomy
-@trait predict_output_scitype=AbstractVector{<:Continuous}
-nothing # hide
-```
-
-If `predict` had instead returned probability distributions that implement the
-`Distributions.pdf` interface, then one could instead make the declaration
-
-```julia
-@trait MyRidge predict_output_scitype=AbstractVector{Density{<:Continuous}}
+@trait MyRidge predict_input_scitype = Tuple{AbstractVector{<:Continuous}}
 ```
 
-Similarly, there exists a trait called [`LearnAPI.predict_output_type`](@ref) for making promises
-on the ordinary type returned by an operation.
-
-Finally, we'll make a promise about what `data` is guaranteed to work in a call like
-`predict(model, fitted_params, data...)`. Note that `data` is always a `Tuple`, even if it
-has only one component (the typical case).
-
-```@example anatomy
-@trait MyRidge predict_input_scitype = (; predict=Tuple{AbstractVector{<:Continuous}})
-```
+Note that `data` is always a `Tuple`, even if it has only one component (the typical
+case), which explains the `Tuple` on the right-hand side.
 
 Optionally, we may express our promise using regular types, using the
 [`LearnAPI.predict_input_type`](@ref) trait.
 
+One can optionally make promises about the outut of an operation. See [Model Traits](@ref)
+for details.
 
 ## [Illustrative fit/predict workflow](@id workflow)
 
@@ -286,21 +268,21 @@ X = (; a, b, c) |> Tables.rowtable
 y = 2a - b + 3c + 0.05*rand(n)
 nothing # hide
 ```
-Instantiate a model with relevant hyperparameters:
+Instantiate a model with relevant hyperparameters (which is all the object stores):
 
 ```@example anatomy
 model = MyRidge(lambda=0.5)
 ```
 
-Train the model:
+Train the model (the `0` means do so silently):
 
 ```@example anatomy
 import LearnAPI: fit, predict, feature_importances
 
-fitted_params, state, fit_report = fit(model, 1, X[train], y[train])
+fitted_params, state, fit_report = fit(model, 0, X[train], y[train])
 ```
 
-Inspect the learned paramters and report:
+Inspect the learned parameters and report:
 
 ```@example anatomy
 @info "training outcomes" fitted_params fit_report
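
To put the hunks above together: after this commit, the pattern documented in this file can be sketched roughly as follows. This is illustrative only, not the file's actual listing; the body of `fit` is a placeholder for the real ridge solver, and only the struct layout, the `(fitted_params, state, report)` triple, the `verbosity > 0` logging convention, and the `@trait` declaration come from the diff:

```julia
# Sketch: hyperparameters live in the model struct; learned parameters
# are returned by `fit`, never stored in the model itself.
struct MyRidge <: LearnAPI.Model
    lambda::Float64
end
MyRidge(; lambda=0.1) = MyRidge(lambda)

function LearnAPI.fit(model::MyRidge, verbosity, X, y)
    # ... solve the regularized least-squares problem here (placeholder) ...
    coefficients = nothing                # stand-in for the actual solution
    fitted_params = (; coefficients)      # consumed later by `predict`
    state = nothing                       # for `update!`/`ingest!`
    report = (; feature_importances=[])   # training byproducts
    verbosity > 0 && @info "Training complete."
    return fitted_params, state, report
end

# the trait declaration introduced by this commit:
@trait MyRidge predict_input_scitype = Tuple{AbstractVector{<:Continuous}}
```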

docs/src/common_implementation_patterns.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ implementations fall into one (or more) of the following informally understood p
 - [Incremental Models](@ref)
 
 - [Static Transformers](@ref): Transformations that do not learn but which have
-hyper-parameters and/or deliver ancilliary information about the transformation
+hyper-parameters and/or deliver ancillary information about the transformation
 
 - [Dimension Reduction](@ref): Transformers that learn to reduce feature space dimension

docs/src/fit_update_and_ingest.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@
 
 All three methods above return a triple `(fitted_params, state, report)` whose components
 are explained under [`LearnAPI.fit`](@ref) below. Items that might be returned in
-`report` include: feature rankings/importances, SVM support vectors, clustering centres,
+`report` include: feature rankings/importances, SVM support vectors, clustering centers,
 methods for visualizing training outcomes, methods for saving learned parameters in a
 custom format, degrees of freedom, deviances. Precisely what `report` includes might be
 controlled by model hyperparameters, especially if there is a performance cost to it's
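
As context for the hunk above, the triple returned by these training methods is unpacked by the caller like so; the model `m` and the data `X`, `y` are hypothetical, and `feature_importances` is just one of the `report` items listed above:

```julia
# `fit` returns learned parameters, resumable state, and training byproducts:
fitted_params, state, report = LearnAPI.fit(m, 1, X, y)

# inspect a training byproduct carried by `report`:
report.feature_importances
```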

docs/src/index.md

Lines changed: 20 additions & 21 deletions
@@ -10,28 +10,28 @@ A basic Julia interface for training and applying machine learning models
 
 ## Quick tours
 
-- For developers wanting to **IMPLEMEMT** LearnAPI: [Anatomy of
-an Implementation](@ref).
-
 - To see how to **USE** models implementing LearnAPI: [Basic fit/predict
 workflow](@ref workflow).
 
+- For developers wanting to **IMPLEMENT** LearnAPI: [Anatomy of
+an Implementation](@ref).
+
 ## Approach
 
 Machine learning algorithms, also called *models*, have a complicated
-taxonomy. Grouping models, or modelling tasks, into a relatively small number of types,
-such as "classifier" and "clusterer", and attempting to impose uniform behaviour within
+taxonomy. Grouping models, or modeling tasks, into a relatively small number of types,
+such as "classifier" and "clusterer", and attempting to impose uniform behavior within
 each group, is challenging. In our experience developing the [MLJ
 ecosystem](https://github.com/alan-turing-institute/MLJ.jl), this either leads to
 limitations on the models that can be included in a general interface, or additional
 complexity needed to cope with exceptional cases. Even if a complete user interface for
 machine learning might benefit from such groupings, a basement-level API for ML should, in
 our view, avoid them.
 
-In a addition to basic methods, like `fit` and `predict`, LearnAPI provides a large number
+In addition to basic methods, like `fit` and `predict`, LearnAPI provides a number
 of optional model
 [traits](https://ahsmart.com/pub/holy-traits-design-patterns-and-best-practice-book/),
-each promising a specific kind of behaviour, such as "The predictions of this model are
+each promising a specific kind of behavior, such as "The predictions of this model are
 probability distributions". There is no abstract type model hierarchy.
 
 Our preceding remarks notwithstanding, there is, for certain applications involving a
@@ -48,12 +48,12 @@ not supervised, can generalize to new data observations, or not generalize.
 ## Methods
 
 In LearnAPI.jl a *model* is just a container for the hyper-parameters of some machine
-learning algorithm, and that's all. It does not include learned parameters.
+learning algorithm, and does not typically include learned parameters.
 
 The following methods, dispatched on model type, are provided:
 
 - `fit`, for regular training, overloaded if the model generalizes to new data, as in
-classical supervised learning
+classical supervised learning; the principal output of `fit` is the learned parameters
 
 - `update!`, for adding model iterations, or responding efficiently to other
 post-`fit`changes in hyperparameters
@@ -66,11 +66,11 @@ The following methods, dispatched on model type, are provided:
 - common **accessor functions**, such as `feature_importances` and `training_losses`, for
 extracting, from training outcomes, information common to some models
 
-- **model traits**, such as `target_proxies(model)`, for promising specific behaviour
+- **model traits**, such as `predict_output_type(model)`, for promising specific behavior
 
-There is flexibility about how much of the interface is implemented by a given model
-object `model`. A special trait `functions(model)` declares what has been explicitly
-implemented to work with `model`, excluding traits.
+There is flexibility about how much of the interface is implemented by a given model type.
+A special trait `functions(model)` declares what has been explicitly implemented to work
+with `model`, excluding traits.
 
 Since this is a functional-style interface, `fit` returns model `state`, in addition to
 learned parameters, for passing to the optional `update!` and `ingest!` methods. These
@@ -89,10 +89,10 @@ formalize:
 - An object which generates ordered sequences of individual **observations** is called
 **data**. For example a `DataFrame` instance, from
 [DataFrames.jl](https://dataframes.juliadata.org/stable/), is considered data, the
-observatons being the rows. A matrix can be considered data, but whether the
+observations being the rows. A matrix can be considered data, but whether the
 observations are rows or columns is ambiguous and not fixed by LearnAPI.
 
-- Each machine learning model's behaviour is governed by a number of user-specified
+- Each machine learning model's behavior is governed by a number of user-specified
 **hyperparameters**. The regularization parameter in ridge regression is an
 example. Hyperparameters are data-independent. For example, the number of target classes
 is not a hyperparameter.
@@ -119,21 +119,20 @@ for the general user - such as a table (dataframe) or the path to a directory co
 image files - and a performant, model-specific representation of that data, such as a
 matrix or image "data loader". When retraining using the same data with new
 hyper-parameters, one wants to avoid recreating the model-specific representation, and,
-accordingly, a higher level ML interface may want to cache model-specific
+accordingly, a higher level ML interface may want to cache such
 representations. Furthermore, in resampling (e.g., performing cross-validation), a higher
 level interface wants only to resample the model-specific representation, so it needs to
 know how to do that. To meet these two ends, LearnAPI provides two additional **data
 methods** dispatched on model type:
 
-- `reformat(model, ...)`, for converting from a user data representation to a peformant model-specific
-representation
+- `reformat(model, ...)`, for converting from a user data representation to a performant model-specific representation, whose output is for use in `fit`, `predict`, etc. above
 
 - `getobs(model, ...)`, for extracting a subsample of observations of the model-specific
 representation
 
 It should be emphasized that LearnAPI is itself agnostic to particular representations of
-data or the particular methods of accessing observations within them. Each `model` is free
-to choose its own data interface.
+data or the particular methods of accessing observations within them. By overloading these
+methods, Each `model` is free to choose its own data interface.
 
 See [Optional data Interface](@ref data_interface) for more details.
 
@@ -158,6 +157,6 @@ interface is the [Reference](@ref reference) section.
 
 **Note.** In the future, LearnAPI.jl may become the new foundation for the
 [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) toolbox created by the same
-developers. However, LearnAPI.jl is meant as a general purpose, standalone, lightweight,
+developers. However, LearnAPI.jl is meant as a general purpose, stand-alone, lightweight,
 low level API for machine learning algorithms (and has no reference to the "machines" used
 there).
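
For orientation, the two data methods discussed in the last hunks might be exercised by a higher-level interface along these lines. This is a sketch under stated assumptions: the diff only specifies the method names `reformat(model, ...)` and `getobs(model, ...)`, so the trailing index argument to `getobs` and the names `user_data` and `train` are illustrative, not part of the commit:

```julia
# convert the user's representation to a performant, model-specific one:
data = LearnAPI.reformat(model, user_data...)

# resampling reuses the cached representation instead of rebuilding it
# (`train` is a vector of observation indices, an assumed signature):
fold = LearnAPI.getobs(model, data, train)

# train silently on the subsample:
fitted_params, state, report = LearnAPI.fit(model, 0, fold...)
```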

0 commit comments