pca, garch, and tests for binomial tree #46
base: main
Conversation
shashank524 left a comment:
Good use of classes
Pull request overview
This PR adds PCA (Principal Component Analysis) and GARCH (Generalized Autoregressive Conditional Heteroskedasticity) statistical models to the finmath library, along with comprehensive test coverage for the existing binomial option pricing functionality.
- Implements PCA with standardization, dimensionality reduction, and reconstruction capabilities
- Implements GARCH(1,1) model with maximum likelihood estimation via gradient descent
- Adds extensive test cases for binomial tree option pricing including ITM/OTM scenarios, convergence tests, and volatility sensitivity
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| include/finmath/StatisticalModels/pca.h | Header file defining the PCA class API with transform/inverse transform methods |
| src/cpp/StatisticalModels/pca.cpp | PCA implementation with eigenvalue decomposition and data standardization |
| test/StatisticalModels/pca_test.cpp | Test suite for PCA with simple 2D and random 3D data scenarios |
| include/finmath/StatisticalModels/garch.h | Header file defining the GARCH class API with volatility prediction |
| src/cpp/StatisticalModels/garch.cpp | GARCH(1,1) implementation with gradient descent optimization and AIC/BIC calculation |
| test/StatisticalModels/garch_test.cpp | Test suite for GARCH with synthetic data and volatility clustering scenarios |
| test/OptionPricing/binomial_option_pricing_test.cpp | Comprehensive test cases for binomial tree pricing including convergence and sensitivity tests |
| CMakeLists.txt | Build configuration updates to integrate Eigen3 library and add new test executables |
```cpp
if (steps_ahead <= 0) return std::sqrt(conditional_variances_.tail(1)(0));

// Compute long-run variance
double long_run_var = omega_ / (1.0 - alpha_ - beta_);
```
Copilot AI · Dec 10, 2025
Potential division by zero when alpha_ + beta_ equals 1.0. The GARCH stationarity condition requires alpha_ + beta_ < 1, but the code allows values up to 0.99. If the sum equals or exceeds 1.0, the long-run variance calculation will result in division by zero or negative variance. Add a check to ensure alpha_ + beta_ stays well below 1.0 or handle this edge case.
```cpp
// Sort eigenvalues and eigenvectors in descending order
Eigen::VectorXd eigenvalues = eigen_solver.eigenvalues().reverse();
Eigen::MatrixXd eigenvectors = eigen_solver.eigenvectors().rowwise().reverse();
```
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double-check the eigenvector reversal. In Eigen, `rowwise().reverse()` reverses the entries within each row, which for column-stored eigenvectors is exactly what flips the column order to match `eigenvalues().reverse()`; `colwise().reverse()` would instead flip the row order and corrupt every eigenvector. The code therefore appears consistent, but the intent is easy to misread. Verify against a small matrix with known eigenpairs that each reordered column still pairs with the right eigenvalue, or sort the eigenpairs explicitly by index to make the intent unambiguous.
```cpp
std::cout << "N=200: " << price3 << std::endl;

// Prices should converge as N increases
assert(std::abs(price2 - price1) > std::abs(price3 - price2));
```
Copilot AI · Dec 10, 2025
The convergence assertion assert(std::abs(price2 - price1) > std::abs(price3 - price2)) assumes monotonic convergence, but binomial tree pricing can exhibit oscillating convergence behavior. This assertion may fail even when the method is working correctly. Consider using a more robust convergence test, such as checking that both differences are decreasing in magnitude or that all prices are within a reasonable range of each other.
Suggested change:
```diff
-assert(std::abs(price2 - price1) > std::abs(price3 - price2));
+double conv_tolerance = 0.01;
+assert(std::abs(price2 - price1) < conv_tolerance);
+assert(std::abs(price3 - price2) < conv_tolerance);
+assert(std::abs(price3 - price1) < 2 * conv_tolerance);
```
```cpp
for (int t = 1; t < n; ++t) {
    conditional_variances_(t) = omega_ +
        alpha_ * returns(t-1) * returns(t-1) +
        beta_ * conditional_variances_(t-1);
```
Copilot AI · Dec 10, 2025
Potential issue with negative conditional variances. While omega, alpha, and beta are constrained to be non-negative, the GARCH equation omega_ + alpha_ * returns(t-1)^2 + beta_ * conditional_variances_(t-1) could theoretically produce very small or negative values due to numerical precision issues, especially early in optimization. Consider adding a check to ensure conditional_variances_(t) is always positive (e.g., std::max(epsilon, ...) where epsilon is a small positive number).
Suggested change:
```diff
-for (int t = 1; t < n; ++t) {
-    conditional_variances_(t) = omega_ +
-        alpha_ * returns(t-1) * returns(t-1) +
-        beta_ * conditional_variances_(t-1);
+constexpr double epsilon = 1e-8;
+for (int t = 1; t < n; ++t) {
+    conditional_variances_(t) = std::max(
+        epsilon,
+        omega_ + alpha_ * returns(t-1) * returns(t-1) + beta_ * conditional_variances_(t-1)
+    );
```
```cpp
for (int t = 1; t < n; ++t) {
    double sigma_t = std::sqrt(omega + alpha * returns(t-1) * returns(t-1) + beta * sigma_t * sigma_t);
    returns(t) = sigma_t * (rand() / double(RAND_MAX) - 0.5);
```
Copilot AI · Dec 10, 2025
The variable sigma_t is uninitialized when first used in the loop. It should be initialized before the loop or use conditional_variances_(t-1) which is properly initialized. This causes undefined behavior as sigma_t * sigma_t uses an uninitialized value.
Suggested change:
```diff
-for (int t = 1; t < n; ++t) {
-    double sigma_t = std::sqrt(omega + alpha * returns(t-1) * returns(t-1) + beta * sigma_t * sigma_t);
-    returns(t) = sigma_t * (rand() / double(RAND_MAX) - 0.5);
+// Initialize previous volatility (unconditional variance or a small positive value)
+double sigma_prev = std::sqrt(omega / (1.0 - alpha - beta));
+for (int t = 1; t < n; ++t) {
+    double sigma_t = std::sqrt(omega + alpha * returns(t-1) * returns(t-1) + beta * sigma_prev * sigma_prev);
+    returns(t) = sigma_t * (rand() / double(RAND_MAX) - 0.5);
+    sigma_prev = sigma_t;
```
```cpp
// Basic checks
assert(residuals.size() == n);
assert(aic < 0); // AIC should be negative for good fit
assert(bic < aic); // BIC should be more negative than AIC
```
Copilot AI · Dec 10, 2025
The assertion assert(bic < aic) is incorrect. BIC (Bayesian Information Criterion) penalizes model complexity more than AIC and typically has a higher (less negative) value than AIC for the same model. The assertion should be assert(bic > aic) or removed entirely if the relationship is not guaranteed.
Suggested change:
```diff
-assert(bic < aic); // BIC should be more negative than AIC
+assert(bic > aic); // BIC should be less negative (higher) than AIC
```
```cpp
}

double GARCH::get_aic() const {
    return -2 * calculate_likelihood() + 6; // 3 parameters
```
Copilot AI · Dec 10, 2025
The AIC formula hard-codes the constant 6, which is 2k for k = 3 parameters. Writing it as 2 * 3 (or, better, as 2 times a named parameter count) makes the formula self-documenting and easier to maintain if future GARCH variants change the number of parameters.
Suggested change:
```diff
-return -2 * calculate_likelihood() + 6; // 3 parameters
+return -2 * calculate_likelihood() + 2 * 3; // 3 parameters
```
No description provided.