Conversation

@prajwalshah19
Collaborator

No description provided.

Collaborator

@shashank524 shashank524 left a comment


Good use of classes


Copilot AI left a comment


Pull request overview

This PR adds PCA (Principal Component Analysis) and GARCH (Generalized Autoregressive Conditional Heteroskedasticity) statistical models to the finmath library, along with comprehensive test coverage for the existing binomial option pricing functionality.

  • Implements PCA with standardization, dimensionality reduction, and reconstruction capabilities
  • Implements GARCH(1,1) model with maximum likelihood estimation via gradient descent
  • Adds extensive test cases for binomial tree option pricing including ITM/OTM scenarios, convergence tests, and volatility sensitivity

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| include/finmath/StatisticalModels/pca.h | Header file defining the PCA class API with transform/inverse-transform methods |
| src/cpp/StatisticalModels/pca.cpp | PCA implementation with eigenvalue decomposition and data standardization |
| test/StatisticalModels/pca_test.cpp | Test suite for PCA with simple 2D and random 3D data scenarios |
| include/finmath/StatisticalModels/garch.h | Header file defining the GARCH class API with volatility prediction |
| src/cpp/StatisticalModels/garch.cpp | GARCH(1,1) implementation with gradient-descent optimization and AIC/BIC calculation |
| test/StatisticalModels/garch_test.cpp | Test suite for GARCH with synthetic data and volatility-clustering scenarios |
| test/OptionPricing/binomial_option_pricing_test.cpp | Comprehensive test cases for binomial tree pricing, including convergence and sensitivity tests |
| CMakeLists.txt | Build configuration updates to integrate the Eigen3 library and add new test executables |


```cpp
if (steps_ahead <= 0) return std::sqrt(conditional_variances_.tail(1)(0));

// Compute long-run variance
double long_run_var = omega_ / (1.0 - alpha_ - beta_);
```

Copilot AI Dec 10, 2025


Potential division by zero when alpha_ + beta_ equals 1.0. The GARCH stationarity condition requires alpha_ + beta_ < 1, but the code allows values up to 0.99. If the sum equals or exceeds 1.0, the long-run variance calculation will result in division by zero or negative variance. Add a check to ensure alpha_ + beta_ stays well below 1.0 or handle this edge case.


```cpp
// Sort eigenvalues and eigenvectors in descending order
Eigen::VectorXd eigenvalues = eigen_solver.eigenvalues().reverse();
Eigen::MatrixXd eigenvectors = eigen_solver.eigenvectors().rowwise().reverse();
```

Copilot AI Dec 10, 2025


The eigenvector reversal deserves a second look. In Eigen, `rowwise().reverse()` reverses the elements within each row, which flips the column order; since `SelfAdjointEigenSolver` returns eigenvalues in increasing order, this keeps each eigenvector column paired with its reversed (descending) eigenvalue. `colwise().reverse()`, by contrast, would reverse the row order and corrupt the eigenvectors. To make the intent explicit and robust to solver ordering, consider sorting eigenvalue/eigenvector pairs by eigenvalue instead of relying on a reversal.

```cpp
std::cout << "N=200: " << price3 << std::endl;

// Prices should converge as N increases
assert(std::abs(price2 - price1) > std::abs(price3 - price2));
```

Copilot AI Dec 10, 2025


The convergence assertion assert(std::abs(price2 - price1) > std::abs(price3 - price2)) assumes monotonic convergence, but binomial tree pricing can exhibit oscillating convergence behavior. This assertion may fail even when the method is working correctly. Consider using a more robust convergence test, such as checking that both differences are decreasing in magnitude or that all prices are within a reasonable range of each other.

Suggested change

```diff
-assert(std::abs(price2 - price1) > std::abs(price3 - price2));
+double conv_tolerance = 0.01;
+assert(std::abs(price2 - price1) < conv_tolerance);
+assert(std::abs(price3 - price2) < conv_tolerance);
+assert(std::abs(price3 - price1) < 2 * conv_tolerance);
```

Comment on lines +37 to +40
```cpp
for (int t = 1; t < n; ++t) {
    conditional_variances_(t) = omega_ +
        alpha_ * returns(t-1) * returns(t-1) +
        beta_ * conditional_variances_(t-1);
```

Copilot AI Dec 10, 2025


Potential issue with negative conditional variances. While omega, alpha, and beta are constrained to be non-negative, the GARCH equation omega_ + alpha_ * returns(t-1)^2 + beta_ * conditional_variances_(t-1) could theoretically produce very small or negative values due to numerical precision issues, especially early in optimization. Consider adding a check to ensure conditional_variances_(t) is always positive (e.g., std::max(epsilon, ...) where epsilon is a small positive number).

Suggested change

```diff
-for (int t = 1; t < n; ++t) {
-    conditional_variances_(t) = omega_ +
-        alpha_ * returns(t-1) * returns(t-1) +
-        beta_ * conditional_variances_(t-1);
+constexpr double epsilon = 1e-8;
+for (int t = 1; t < n; ++t) {
+    conditional_variances_(t) = std::max(
+        epsilon,
+        omega_ + alpha_ * returns(t-1) * returns(t-1) + beta_ * conditional_variances_(t-1)
+    );
```

Comment on lines +20 to +22
```cpp
for (int t = 1; t < n; ++t) {
    double sigma_t = std::sqrt(omega + alpha * returns(t-1) * returns(t-1) + beta * sigma_t * sigma_t);
    returns(t) = sigma_t * (rand() / double(RAND_MAX) - 0.5);
```

Copilot AI Dec 10, 2025


The variable sigma_t is uninitialized when first used in the loop. It should be initialized before the loop or use conditional_variances_(t-1) which is properly initialized. This causes undefined behavior as sigma_t * sigma_t uses an uninitialized value.

Suggested change

```diff
-for (int t = 1; t < n; ++t) {
-    double sigma_t = std::sqrt(omega + alpha * returns(t-1) * returns(t-1) + beta * sigma_t * sigma_t);
-    returns(t) = sigma_t * (rand() / double(RAND_MAX) - 0.5);
+// Initialize previous volatility (unconditional variance or a small positive value)
+double sigma_prev = std::sqrt(omega / (1.0 - alpha - beta));
+for (int t = 1; t < n; ++t) {
+    double sigma_t = std::sqrt(omega + alpha * returns(t-1) * returns(t-1) + beta * sigma_prev * sigma_prev);
+    returns(t) = sigma_t * (rand() / double(RAND_MAX) - 0.5);
+    sigma_prev = sigma_t;
```

```cpp
// Basic checks
assert(residuals.size() == n);
assert(aic < 0); // AIC should be negative for good fit
assert(bic < aic); // BIC should be more negative than AIC
```

Copilot AI Dec 10, 2025


The assertion assert(bic < aic) is incorrect. BIC (Bayesian Information Criterion) penalizes model complexity more than AIC and typically has a higher (less negative) value than AIC for the same model. The assertion should be assert(bic > aic) or removed entirely if the relationship is not guaranteed.

Suggested change

```diff
-assert(bic < aic); // BIC should be more negative than AIC
+assert(bic > aic); // BIC should be less negative (higher) than AIC
```

```cpp
}

double GARCH::get_aic() const {
    return -2 * calculate_likelihood() + 6; // 3 parameters
```

Copilot AI Dec 10, 2025


The AIC penalty is hard-coded as the constant 6, which is 2 * 3 for the model's 3 parameters. Writing it as 2 * num_params (or a named constant) makes the formula self-documenting and easier to maintain if future GARCH variants change the parameter count.

Suggested change

```diff
-return -2 * calculate_likelihood() + 6; // 3 parameters
+return -2 * calculate_likelihood() + 2 * 3; // 3 parameters
```
