Commit 79869d8

Sensitivity svm (#201)

* sensitivity reg
* local model, Plots explicit
* direct assign
* block separation
* fix block
* fix computations
* comments
* improved SVM

1 parent ba669f5 commit 79869d8

File tree: 1 file changed, +25 −24 lines

docs/src/examples/sensitivity-analysis-svm.jl

@@ -4,7 +4,7 @@
 
 # This notebook illustrates sensitivity analysis of data points in a [Support Vector Machine](https://en.wikipedia.org/wiki/Support-vector_machine) (inspired from [@matbesancon](http://github.com/matbesancon)'s [SimpleSVMs](http://github.com/matbesancon/SimpleSVMs.jl).)
 
-# For reference, Section 10.1 of https://online.stat.psu.edu/stat508/book/export/html/792 gives an intuitive explanation of what it means to have a sensitive hyperplane or data point. The general form of the SVM training problem is given below (without regularization):
+# For reference, Section 10.1 of https://online.stat.psu.edu/stat508/book/export/html/792 gives an intuitive explanation of what it means to have a sensitive hyperplane or data point. The general form of the SVM training problem is given below (with $\ell_2$ regularization):
 
 # ```math
 # \begin{split}
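(The hunk ends inside the math block. Reconstructed from the variable list in the next hunk, the regularized training problem presumably reads along these lines; this is a sketch, not the file's exact text:)

```math
\begin{split}
\min_{w, b, \xi} \quad & \lambda \lVert w \rVert_2^2 + \sum_{i=1}^{N} \xi_{i} \\
\text{s.t.} \quad & y_{i} (w^T X_{i} + b) \ge 1 - \xi_{i} \qquad i = 1, \dots, N \\
& \xi_{i} \ge 0 \qquad i = 1, \dots, N
\end{split}
```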
@@ -19,25 +19,27 @@
 # - `X`, `y` are the `N` data points
 # - `w` is the support vector
 # - `b` determines the offset `b/||w||` of the hyperplane with normal `w`
-# - `ξ` is the soft-margin loss.
-
+# - `ξ` is the soft-margin loss
+# - `λ` is the $\ell_2$ regularization.
+#
 # This tutorial uses the following packages
 
 using JuMP # The mathematical programming modelling language
 import DiffOpt # JuMP extension for differentiable optimization
 import Ipopt # Optimization solver that handles quadratic programs
 import Plots # Graphing tool
-import LinearAlgebra: dot, norm, normalize!
+import LinearAlgebra: dot, norm
 import Random
 
 # ## Define and solve the SVM
 
-# Construct separable, non-trivial data points.
+# Construct two clusters of data points.
 
 N = 100
 D = 2
+
 Random.seed!(62)
-X = vcat(randn(N ÷ 2, D), randn(N ÷ 2, D) .+ [4.5, 2.0]')
+X = vcat(randn(N ÷ 2, D), randn(N ÷ 2, D) .+ [2.0, 2.0]')
 y = append!(ones(N ÷ 2), -ones(N ÷ 2))
 λ = 0.05;
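The diff then jumps from the data setup straight to `wv = value.(w)` (line 86 of the file), eliding the model construction. As a reading aid only, here is a minimal sketch of how such a model could be written with JuMP and DiffOpt, assuming the old `DiffOpt.diff_optimizer` entry point and reusing the names `model`, `w`, `b`, `ξ`, `cons`, and `loss` that appear in the hunks below; the actual file may differ.

```julia
using JuMP
import DiffOpt
import Ipopt
import LinearAlgebra: dot

model = JuMP.Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
set_silent(model)

@variable(model, w[1:D])      # hyperplane normal
@variable(model, b)           # offset
@variable(model, ξ[1:N] >= 0) # soft-margin slack

## one margin constraint per data point; `cons` is what gets differentiated later
@constraint(model, cons[i in 1:N], y[i] * (dot(X[i, :], w) + b) >= 1 - ξ[i])
@objective(model, Min, λ * dot(w, w) + sum(ξ))

optimize!(model)
loss = objective_value(model)
```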

@@ -86,11 +88,10 @@ wv = value.(w)
 
 bv = value(b)
 
-svm_x = [0.0, 5.0] # arbitrary points
+svm_x = [-2.0, 4.0] # arbitrary points
 svm_y = (-bv .- wv[1] * svm_x )/wv[2]
 
 p = Plots.scatter(X[:,1], X[:,2], color = [yi > 0 ? :red : :blue for yi in y], label = "")
-Plots.yaxis!(p, (-2, 4.5))
 Plots.plot!(p, svm_x, svm_y, label = "loss = $(round(loss, digits=2))", width=3)
 
 # ## Gradient of hyperplane wrt the data point coordinates
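A quick note on the `svm_y` line above: with `D = 2`, the separating hyperplane is the set of points satisfying

```math
w_1 x_1 + w_2 x_2 + b = 0 \quad \Longleftrightarrow \quad x_2 = \frac{-b - w_1 x_1}{w_2},
```

which is exactly what `svm_y = (-bv .- wv[1] * svm_x )/wv[2]` evaluates at the two arbitrary abscissae in `svm_x`.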
@@ -101,25 +102,27 @@ Plots.plot!(p, svm_x, svm_y, label = "loss = $(round(loss, digits=2))", width=3)
 
 # How does a change in coordinates of the data points, `X`,
 # affect the position of the hyperplane?
-# This is achieved by finding gradients of `w`, `b` with respect to `X[i]`,
-# 2D coordinates of the data points.
+# This is achieved by finding gradients of `w` and `b` with respect to `X[i]`.
 
 # Begin differentiating the model.
 # This is analogous to varying θ in the expression:
 # ```math
 # y_{i} (w^T (X_{i} + \theta) + b) \ge 1 - \xi_{i}
 # ```
 ∇ = zeros(N)
-dX = zeros(N, D);
 for i in 1:N
-    dX[i, :] = ones(D) # set
     for j in 1:N
-        MOI.set(
-            model,
-            DiffOpt.ForwardInConstraint(),
-            cons[j],
-            y[j] * dot(dX[j,:], index.(w)),
-        )
+        if i == j
+            ## we consider identical perturbations on all x_i coordinates
+            MOI.set(
+                model,
+                DiffOpt.ForwardInConstraint(),
+                cons[j],
+                y[j] * sum(w),
+            )
+        else
+            MOI.set(model, DiffOpt.ForwardInConstraint(), cons[j], 0.0 * sum(w))
+        end
     end
 end
 DiffOpt.forward(model)
 dw = MOI.get.(
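Why `y[j] * sum(w)` for the perturbed constraint? Shifting every coordinate of the point $X_j$ by the same scalar $\theta$ and differentiating the constraint's left-hand side with respect to $\theta$ gives

```math
\frac{\partial}{\partial \theta} \, y_{j} \left( w^T (X_{j} + \theta \mathbf{1}) + b \right)
= y_{j} \, \mathbf{1}^T w
= y_{j} \sum_{k=1}^{D} w_{k},
```

which is `y[j] * sum(w)`. For the unperturbed constraints the derivative is zero, written as `0.0 * sum(w)`, presumably because `DiffOpt.ForwardInConstraint` expects an expression in the decision variables rather than a bare constant.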
@@ -133,19 +136,17 @@ for i in 1:N
         b,
     )
     ∇[i] = norm(dw) + norm(db)
-    dX[i, :] = zeros(D) # reset the change made at the beginning of the loop
 end
 
-normalize!(∇);
-
 # We can visualize the separating hyperplane sensitivity with respect to the data points.
-# Note that the norm of the gradients are normalized and all the small numbers
-# were converted into 1/10 of the largest value to show all the points of the set.
+# Note that all the small numbers were converted into 1/10 of the
+# largest value to show all the points of the set.
 
 p3 = Plots.scatter(
     X[:,1], X[:,2],
     color = [yi > 0 ? :red : :blue for yi in y], label = "",
-    markersize = 20 * max.(∇, 0.1 * maximum(∇)),
+    markersize = 2 * (max.(1.8∇, 0.2 * maximum(∇))),
 )
 Plots.yaxis!(p3, (-2, 4.5))
 Plots.plot!(p3, svm_x, svm_y, label = "", width=3)
+Plots.title!("Sensitivity of the separator to data point variations")
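To see what the new `markersize` expression does, here is a tiny standalone sketch with made-up gradient norms (hypothetical values, illustration only). Small entries are floored at `0.2 * maximum(∇)`, which puts the smallest marker at roughly 1/9 of the largest, close to the "1/10" mentioned in the comment:

```julia
∇ = [0.001, 0.02, 0.5] # hypothetical gradient norms

markersize = 2 .* max.(1.8 .* ∇, 0.2 * maximum(∇))
## -> [0.2, 0.2, 1.8]: the two tiny gradients still get a visible marker
```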
