Machine Learning Algorithms in C

A collection of popular machine learning algorithms implemented in C, featuring robust error handling, performance optimizations, and idiomatic C practices.

🚀 Features

Implemented Algorithms

  1. Linear Regression - Gradient descent with early stopping
  2. Ridge Regression - L2 regularization with gradient descent
  3. K-Nearest Neighbors - Optimized for both regression and classification
  4. Random Forest - Ensemble of decision trees with bootstrap sampling
  5. XGBoost - Simplified gradient boosting implementation

📁 Project Structure

faster_ml/
├── algorithms.c      # Main implementation file
├── Makefile         # Build configuration
├── README.md        # This file
└── .gitignore       # Git ignore patterns

🛠️ Building and Running

Prerequisites

  • GCC compiler (version 4.9 or higher)
  • Make utility
  • Math library (linked automatically)

Quick Start

# Build release version
make

# Run the program
make test

# Or run directly
./ml_algorithms

Build Options

# Debug build with extra warnings
make debug
make test-debug

# Build with memory sanitizer (for debugging)
make sanitize
make test-sanitize

# Memory leak checking with valgrind
make memcheck

# Performance profiling
make profile

# Clean build artifacts
make clean

# Show all available targets
make help

📊 Algorithm Details

1. Linear Regression

  • Method: Gradient descent with early stopping
  • Features: Automatic learning rate, progress monitoring
  • Optimizations: Loop unrolling, early stopping with patience

2. Ridge Regression

  • Method: L2-regularized gradient descent
  • Features: Configurable regularization strength
  • Optimizations: Same as linear regression plus regularization

3. K-Nearest Neighbors

  • Method: Lazy learning with optimized distance calculation
  • Features: Supports both regression and classification
  • Optimizations:
    • Loop unrolling in distance calculation
    • Partial sorting for finding k nearest neighbors
    • Configurable sorting threshold

4. Random Forest

  • Method: Ensemble of decision trees with bootstrap sampling
  • Features:
    • Feature subsampling for diversity
    • Configurable tree depth and minimum samples
    • Progress reporting during training
  • Optimizations: Limited feature search for better performance

5. XGBoost

  • Method: Gradient boosting with decision trees
  • Features:
    • Configurable learning rate
    • Residual-based tree building
    • Progress monitoring
  • Optimizations: Efficient residual calculation and tree building

🔧 API Usage

Basic Usage Example

#include <stdio.h>
#include <stdlib.h>

/* Dataset, LinearRegression, and the functions used below are
   defined in algorithms.c */
// Create dataset
int n_samples = 100;
int n_features = 2;
Dataset *ds = create_dataset(n_samples, n_features);

// Fill dataset with your data
// ... populate ds->data and ds->target ...

// Train Linear Regression
LinearRegression *lr = create_linear_regression(n_features);
train_linear_regression(lr, ds, 0.01, 1000);

// Make prediction
double test_sample[2] = {5.0, 3.0};
double prediction = predict_linear_regression(lr, test_sample);
printf("Prediction: %.2f\n", prediction);

// Cleanup
free_linear_regression(lr);
free_dataset(ds);

Error Handling

All functions include comprehensive error checking:

// Functions return NULL on error and print error messages
LinearRegression *lr = create_linear_regression(-1);  // Will return NULL
if (lr == NULL) {
    // Handle error
    return -1;
}
