This repository contains a from-scratch implementation of a feedforward neural network in Python using NumPy. The implementation supports:
- Fully connected layers
- Multiple activation functions (ReLU, tanh, sigmoid, softmax, identity)
- Binary and multiclass classification
- Regression (L2 loss)
- Mini-batch gradient descent training
The goal of this project is to demonstrate both the mechanics of neural networks and the behavior of optimization in practice.
- Modular neural network architecture
- Support for:
  - Binary classification (sigmoid + log loss)
  - Multiclass classification (softmax + cross-entropy)
  - Regression (L2 loss)
- Mini-batch stochastic gradient descent
- Training diagnostics (loss curves, accuracy)
- Synthetic test suite covering multiple problem types
The core class is `NeuralNetwork`, which allows you to define, train, and evaluate a feedforward neural network.

```python
NN = NeuralNetwork(input_dim, layer_spec, loss_func)
```

- `input_dim`: number of input features
- `layer_spec`: list describing the layers
- `loss_func`: one of `'binary_log_loss'`, `'multiclass_log_loss'`, `'l2'`
Each layer is specified as:

```python
[output_dimension, activation_function]
```

Example:

```python
layer_spec = [
    [8, 'tanh'],
    [8, 'tanh'],
    [1, 'sigmoid']
]
```

Supported activations: `'ReLU'`, `'tanh'`, `'sigmoid'`, `'GeLU'`, `'softmax'` (output layer only), `'identity'` (for regression output).
| Task | Output Activation | Loss Function |
|---|---|---|
| Binary classification | sigmoid | binary_log_loss |
| Multiclass classification | softmax | multiclass_log_loss |
| Regression | identity | l2 |
```python
NN.fit(X, y, batch_sz=32, epochs=100, learning_rate=0.01)
```

or

```python
Y_hat = NN.fit_transform(X, y)
```

Parameters:

- `batch_sz`: mini-batch size
- `epochs`: number of passes through the data
- `learning_rate`: gradient step size
- `shuffle`: whether to shuffle the data each epoch (default: `True`)

```python
Y_hat = NN.transform(X_test)
```

Returns a NumPy array of predictions.

```python
NN.predict(x)
```

Behavior depends on task:
- Binary classification → returns 0 or 1
- Multiclass → returns class index
- Regression → returns predicted value
Optional:

```python
NN.predict(x, class_names=["cat", "dog"])
```

```python
loss_history = NN.get_loss_history()
```

Returns the loss per epoch.
```python
import numpy as np
from neural_nets import NeuralNetwork

# Data
X = np.random.randn(1000, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Model
NN = NeuralNetwork(
    input_dim=2,
    layer_spec=[[4, 'ReLU'], [1, 'sigmoid']],
    loss_func='binary_log_loss'
)

# Train
NN.fit(X, y, epochs=100)

# Predict
pred = NN.predict([0.2, -0.1])
print(pred)
```

```python
NN = NeuralNetwork(
    input_dim=2,
    layer_spec=[[8, 'tanh'], [3, 'softmax']],
    loss_func='multiclass_log_loss'
)
NN.fit(X_train, Y_train, epochs=200)
pred_class = NN.predict([1.0, 0.5])
```

```python
NN = NeuralNetwork(
    input_dim=1,
    layer_spec=[[16, 'tanh'], [1, 'identity']],
    loss_func='l2'
)
NN.fit(X_train, Y_train, epochs=200)
y_pred = NN.predict([0.3])
```

You can print the network:

```python
print(NN)
```

Example output:
```
Neural Network Details
======================
Input Layer: ( 2 -> 8)
...
Total Network Parameters: 105
Loss Function: binary_log_loss
Trained = True
Training Loss = ...
Epochs Trained: ...
```
- Inputs must have shape `(n_samples, input_dim)`
- Internally, vectors are reshaped to column form `(input_dim, 1)`
- Training uses mini-batch gradient descent with backpropagation
- Weight initialization uses Xavier or He depending on the activation
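The activation-dependent initialization rule can be sketched as follows. This is a minimal illustration, not the repository's actual initializer; the function name and `seed` parameter are assumptions:

```python
import numpy as np

def init_weights(fan_in, fan_out, activation, seed=0):
    """Illustrative sketch: He init for rectified units, Xavier otherwise."""
    rng = np.random.default_rng(seed)
    if activation in ('ReLU', 'GeLU'):
        # He: variance 2 / fan_in keeps activations from collapsing under ReLU
        std = np.sqrt(2.0 / fan_in)
    else:
        # Xavier/Glorot: variance 2 / (fan_in + fan_out) suits tanh/sigmoid
        std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))

W = init_weights(8, 4, 'ReLU')
print(W.shape)  # (4, 8)
```

The `(fan_out, fan_in)` shape matches the column-vector convention above, so a forward pass is `W @ a + b`.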
The model is evaluated on a collection of synthetic datasets designed to test different aspects of neural network behavior:
| Test | Purpose |
|---|---|
| Binary Gaussians | Linear/separable classification |
| XOR | Nonlinear decision boundary |
| Two Moons (small net) | Underfitting example |
| Two Moons (large net) | Nonlinear modeling capacity |
| Multiclass Gaussian clusters | Softmax classification |
| Sine Regression | Function approximation |
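Datasets like those above can be generated with NumPy and scikit-learn (listed under dependencies). A sketch of plausible generators; the sample counts, noise levels, and cluster counts are assumptions, not the test suite's exact settings:

```python
import numpy as np
from sklearn.datasets import make_moons, make_blobs

rng = np.random.default_rng(0)

# XOR: label is 1 where the two coordinates have opposite signs
X_xor = rng.uniform(-1, 1, size=(500, 2))
y_xor = (X_xor[:, 0] * X_xor[:, 1] < 0).astype(int)

# Two moons: crescent-shaped classes needing a nonlinear boundary
X_moons, y_moons = make_moons(n_samples=500, noise=0.2, random_state=0)

# Multiclass Gaussian clusters; cluster_std plays the role of "Scale" below
X_blobs, y_blobs = make_blobs(n_samples=500, centers=4,
                              cluster_std=1.0, random_state=0)

# Sine regression: noisy samples of sin(2*pi*x)
X_sine = rng.uniform(0, 1, size=(500, 1))
y_sine = np.sin(2 * np.pi * X_sine[:, 0]) + 0.05 * rng.normal(size=500)
```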
**Binary Gaussians**

- Train Accuracy: 99.90%
- Test Accuracy: 99.85%
**XOR**

- Train Accuracy: 96.00%
- Test Accuracy: 96.64%

The loss curve exhibits oscillations near convergence:

- Caused by mini-batch stochastic gradient updates
- Occasional spikes reflect noisy gradient estimates
- Learning rate and optimization strategy significantly affect convergence stability
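The noise in mini-batch gradient estimates can be demonstrated directly: gradients computed on small batches scatter around the full-batch gradient. A toy illustration on a linear least-squares problem, not tied to this repository's code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=1000)
w = np.zeros(2)

def grad(Xb, yb, w):
    # Gradient of mean squared error for a linear model
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

full = grad(X, y, w)

# Sample many size-32 mini-batch gradients and measure their spread
batch_grads = np.array([
    grad(X[idx], y[idx], w)
    for idx in (rng.choice(len(X), size=32, replace=False) for _ in range(200))
])

print("full-batch gradient:", full)
print("mini-batch std per component:", batch_grads.std(axis=0))
```

The per-component spread is the source of the loss-curve spikes; averaging over larger batches (or lowering the learning rate) shrinks it.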
**Two Moons (small net)**

- Train Accuracy: 87.1%
- Test Accuracy: 88.2%
**Two Moons (large net)**

- Train Accuracy: 97.3%
- Test Accuracy: 96.9%
**Multiclass Gaussian Clusters**

| Scale | Train Accuracy | Test Accuracy |
|---|---|---|
| 0.5 (well-separated) | 100.00% | 99.98% |
| 1.0 (moderate overlap) | 97.09% | 96.89% |
| 1.5 (significant overlap) | 87.67% | 86.64% |
**Sine Regression**

- Train MSE: 0.00818
- Test MSE: 0.00822
```shell
python neural_nets_test.py
```

This will:

- Train all models
- Save plots
- Output metrics
- Generate a JSON summary (`NeuralNet_test_results.json`)
```json
{
  "binary_gaussians": {
    "training_accuracy": 0.999,
    "test_accuracy": 0.9985
  },
  "xor": {
    "training_accuracy": 0.96,
    "test_accuracy": 0.9664
  }
}
```

This implementation demonstrates:
- Correct forward/backward propagation
- Proper behavior across classification and regression tasks
- Sensible generalization performance
- Realistic optimization dynamics under SGD
Notably, the XOR loss curve illustrates how stochastic gradient descent can produce oscillatory convergence behavior.
- Momentum / Adam optimizer
- Learning rate scheduling
- Regularization (L2, dropout)
- Vectorized batching improvements
- GPU acceleration (PyTorch rewrite)
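The first item above, momentum, would be a small change to the update step. A sketch under assumed names (`sgd_momentum_step`, `beta`), not code from this repository:

```python
def sgd_momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    """One SGD-with-momentum update.

    velocity accumulates an exponentially decaying sum of past gradients,
    which damps the oscillations seen in the XOR loss curve.
    """
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

# Minimize f(w) = w^2 (gradient 2w) starting from w = 5
w, v = 5.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, 2 * w, v, lr=0.05)
print(w)  # close to the minimum at 0
```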
- NumPy
- Matplotlib
- scikit-learn (for test data generation)
Brent Young
MIT License