byoung77/Neural-Net-Implementation
Neural Network Implementation

This repository contains a from-scratch implementation of a feedforward neural network in Python using NumPy. The implementation supports:

  • Fully connected layers
  • Multiple activation functions (ReLU, tanh, sigmoid, softmax, identity)
  • Binary and multiclass classification
  • Regression (L2 loss)
  • Mini-batch gradient descent training

The goal of this project is to demonstrate both the mechanics of neural networks and the behavior of optimization in practice.


Features

  • Modular neural network architecture
  • Support for:
    • Binary classification (sigmoid + log loss)
    • Multiclass classification (softmax + cross-entropy)
    • Regression (L2 loss)
  • Mini-batch stochastic gradient descent
  • Training diagnostics (loss curves, accuracy)
  • Synthetic test suite covering multiple problem types

Using the Neural Network Class

The core class is NeuralNetwork, which allows you to define, train, and evaluate a feedforward neural network.

Basic Syntax

NN = NeuralNetwork(input_dim, layer_spec, loss_func)
  • input_dim: number of input features
  • layer_spec: list describing the layers
  • loss_func: one of:
    • 'binary_log_loss'
    • 'multiclass_log_loss'
    • 'l2'
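
For reference, the three loss options correspond to standard formulas. A minimal NumPy sketch (the function names here are illustrative, not the class's internals):

```python
import numpy as np

def binary_log_loss(y, p, eps=1e-12):
    # Mean negative log-likelihood for 0/1 labels and predicted probabilities p
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def multiclass_log_loss(Y, P, eps=1e-12):
    # Cross-entropy for one-hot labels Y and softmax outputs P, both shape (n, k)
    P = np.clip(P, eps, 1.0)
    return -np.mean(np.sum(Y * np.log(P), axis=1))

def l2_loss(y, y_hat):
    # Mean squared error
    return np.mean((y - y_hat) ** 2)
```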

Defining the Architecture

Each layer is specified as:

[output_dimension, activation_function]

Example:

layer_spec = [
    [8, 'tanh'],
    [8, 'tanh'],
    [1, 'sigmoid']
]

Supported Activations

  • 'ReLU'
  • 'tanh'
  • 'sigmoid'
  • 'GeLU'
  • 'softmax' (output layer only)
  • 'identity' (for regression output)
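
All of these activations are one-liners in NumPy; a sketch of the standard definitions (not necessarily the repository's exact code):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gelu(z):
    # Common tanh approximation of GeLU
    return 0.5 * z * (1 + np.tanh(np.sqrt(2 / np.pi) * (z + 0.044715 * z**3)))

def softmax(z):
    # Subtract the max for numerical stability; normalize over the last axis
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

def identity(z):
    return z
```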

Valid Output Layer / Loss Combinations

Task                       Output Activation   Loss Function
Binary classification      sigmoid             binary_log_loss
Multiclass classification  softmax             multiclass_log_loss
Regression                 identity            l2

Training a Model

NN.fit(X, y, batch_sz=32, epochs=100, learning_rate=0.01, shuffle=True)

or

Y_hat = NN.fit_transform(X, y)

Parameters:

  • batch_sz: mini-batch size
  • epochs: number of passes through data
  • learning_rate: gradient step size
  • shuffle: whether to shuffle data each epoch (default: True)
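
Conceptually, fit runs a loop like the following. This is an illustrative sketch of mini-batch gradient descent, not the class's actual internals; grad_fn is a hypothetical callback returning the batch loss and parameter gradients:

```python
import numpy as np

def sgd_epochs(params, X, y, grad_fn, batch_sz=32, epochs=100,
               learning_rate=0.01, shuffle=True):
    # grad_fn(params, Xb, yb) -> (loss, grads); grads mirrors params' structure
    n = X.shape[0]
    loss_history = []
    for _ in range(epochs):
        idx = np.random.permutation(n) if shuffle else np.arange(n)
        epoch_loss = 0.0
        for start in range(0, n, batch_sz):
            b = idx[start:start + batch_sz]
            loss, grads = grad_fn(params, X[b], y[b])
            for W, dW in zip(params, grads):
                W -= learning_rate * dW   # in-place gradient step
            epoch_loss += loss * len(b)
        loss_history.append(epoch_loss / n)
    return loss_history
```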

Making Predictions

Batch predictions

Y_hat = NN.transform(X_test)

Returns a NumPy array of predictions.


Single prediction

NN.predict(x)

Behavior depends on task:

  • Binary classification → returns 0 or 1
  • Multiclass → returns class index
  • Regression → returns predicted value

Optional:

NN.predict(x, class_names=["cat", "dog"])
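
The decision rules behind predict follow standard conventions (0.5 threshold on a sigmoid output, argmax over softmax outputs); an illustrative NumPy sketch, not code inspected from the class:

```python
import numpy as np

def decide(output, task, class_names=None):
    # output: raw network output for a single example
    if task == "binary":
        label = int(output >= 0.5)          # sigmoid probability -> 0 or 1
    elif task == "multiclass":
        label = int(np.argmax(output))      # softmax vector -> class index
    else:                                   # regression
        return float(output)
    return class_names[label] if class_names else label
```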

Accessing Training Loss

loss_history = NN.get_loss_history()

Returns loss per epoch.


Example: Binary Classification

import numpy as np
from neural_nets import NeuralNetwork

# Data
X = np.random.randn(1000, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Model
NN = NeuralNetwork(
    input_dim=2,
    layer_spec=[[4, 'ReLU'], [1, 'sigmoid']],
    loss_func='binary_log_loss'
)

# Train
NN.fit(X, y, epochs=100)

# Predict
pred = NN.predict([0.2, -0.1])
print(pred)

Example: Multiclass Classification

NN = NeuralNetwork(
    input_dim=2,
    layer_spec=[[8, 'tanh'], [3, 'softmax']],
    loss_func='multiclass_log_loss'
)

NN.fit(X_train, Y_train, epochs=200)

pred_class = NN.predict([1.0, 0.5])

Example: Regression

NN = NeuralNetwork(
    input_dim=1,
    layer_spec=[[16, 'tanh'], [1, 'identity']],
    loss_func='l2'
)

NN.fit(X_train, Y_train, epochs=200)

y_pred = NN.predict([0.3])

Model Summary

You can print the network:

print(NN)

Example output:

Neural Network Details
======================
 Input Layer: ( 2 ->  8)
 ...
Total Network Parameters: 105
Loss Function: binary_log_loss
Trained = True
Training Loss = ...
Epochs Trained: ...

Notes

  • Inputs must have shape (n_samples, input_dim)
  • Internally, vectors are reshaped to column form (input_dim, 1)
  • Training uses mini-batch gradient descent with backpropagation
  • Weight initialization uses Xavier or He depending on activation
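
A sketch of the activation-dependent initialization rule mentioned above, assuming the common He and Xavier/Glorot formulas (the repository's code may differ in details):

```python
import numpy as np

def init_weights(fan_in, fan_out, activation, rng=np.random.default_rng()):
    if activation == 'ReLU':
        # He initialization: variance 2 / fan_in suits rectified units
        std = np.sqrt(2.0 / fan_in)
    else:
        # Xavier/Glorot initialization for tanh/sigmoid-like activations
        std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))
```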

Test Suite Overview

The model is evaluated on a collection of synthetic datasets designed to test different aspects of neural network behavior:

Test                          Purpose
Binary Gaussians              Linear/separable classification
XOR                           Nonlinear decision boundary
Two Moons (small net)         Underfitting example
Two Moons (large net)         Nonlinear modeling capacity
Multiclass Gaussian clusters  Softmax classification
Sine Regression               Function approximation
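
The first two datasets take only a few lines of NumPy to generate (the moons come from scikit-learn in the actual suite; this is a NumPy-only sketch of the Gaussian and XOR problems, with illustrative parameter choices):

```python
import numpy as np

def binary_gaussians(n=500, sep=3.0, rng=np.random.default_rng(0)):
    # Two well-separated Gaussian blobs with 0/1 labels
    X0 = rng.normal(loc=-sep / 2, scale=1.0, size=(n, 2))
    X1 = rng.normal(loc=+sep / 2, scale=1.0, size=(n, 2))
    X = np.vstack([X0, X1])
    y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    return X, y

def xor_data(n=500, noise=0.1, rng=np.random.default_rng(0)):
    # Points in the four quadrants; the label is the XOR of the sign bits
    X = rng.uniform(-1, 1, size=(n, 2)) + rng.normal(0, noise, size=(n, 2))
    y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)
    return X, y
```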

Results

Binary Classification (Well-Separated Gaussians)

  • Train Accuracy: 99.90%
  • Test Accuracy: 99.85%


XOR (Nonlinear Classification)

  • Train Accuracy: 96.0%
  • Test Accuracy: 96.64%

Optimization Note

The loss curve exhibits oscillations near convergence:

  • Caused by mini-batch stochastic gradient updates
  • Occasional spikes reflect noisy gradient estimates

Learning rate and optimization strategy significantly affect convergence stability.
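
The effect is easy to reproduce on a one-dimensional quadratic: with a large step size, plain gradient descent overshoots the minimum and oscillates, while a smaller step decays smoothly (an illustrative toy, unrelated to the repository's code):

```python
def gd_trajectory(lr, steps=20, x0=1.0):
    # Gradient descent on f(x) = x^2, whose gradient is 2x
    x, xs = x0, []
    for _ in range(steps):
        x = x - lr * 2 * x
        xs.append(x)
    return xs

smooth = gd_trajectory(lr=0.1)   # x shrinks by factor 0.8 per step: monotone decay
wobble = gd_trajectory(lr=0.9)   # x is multiplied by -0.8 per step: sign-flipping oscillation
```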


Two Moons Dataset

Small Network (2 → 2 → 1)

  • Train Accuracy: 87.1%
  • Test Accuracy: 88.2%


Larger Network (2 → 8 → 8 → 1)

  • Train Accuracy: 97.3%
  • Test Accuracy: 96.9%


Multiclass Classification (3 Gaussian Clusters)

Scale                      Train Accuracy   Test Accuracy
0.5 (well-separated)       100.00%          99.98%
1.0 (moderate overlap)     97.09%           96.89%
1.5 (significant overlap)  87.67%           86.64%


Sine Regression

  • Train MSE: 0.00818
  • Test MSE: 0.00822


Running the Test Suite

python neural_nets_test.py

This will:

  • Train all models
  • Save plots
  • Output metrics
  • Generate a JSON summary (NeuralNet_test_results.json)

Example Output (JSON)

{
  "binary_gaussians": {
    "training_accuracy": 0.999,
    "test_accuracy": 0.9985
  },
  "xor": {
    "training_accuracy": 0.96,
    "test_accuracy": 0.9664
  }
}
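
Because the summary is plain JSON, the metrics can be loaded and inspected programmatically (shown here parsing a string; in practice, read NeuralNet_test_results.json from disk with json.load):

```python
import json

# Parse a fragment of the metrics the test suite writes out
summary = json.loads("""
{
  "xor": {"training_accuracy": 0.96, "test_accuracy": 0.9664}
}
""")

for name, metrics in summary.items():
    print(f"{name}: test accuracy = {metrics['test_accuracy']:.2%}")
```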

Discussion

This implementation demonstrates:

  • Correct forward/backward propagation
  • Proper behavior across classification and regression tasks
  • Sensible generalization performance
  • Realistic optimization dynamics under SGD

Notably, the XOR loss curve illustrates how stochastic gradient descent can produce oscillatory convergence behavior.


Future Work

  • Momentum / Adam optimizer
  • Learning rate scheduling
  • Regularization (L2, dropout)
  • Vectorized batching improvements
  • GPU acceleration (PyTorch rewrite)

Dependencies

  • NumPy
  • Matplotlib
  • scikit-learn (for test data generation)

Author

Brent Young


License

MIT License
