Many approximations were suggested to circumvent the cubic complexity of kernel-based algorithms, allowing their application to large-scale datasets. One strategy is to consider the primal formulation of the learning problem by mapping the data to a higher-dimensional space using tensor-product structured polynomial and Fourier features. The curse of dimensionality due to these tensor-product features was effectively solved by a tensor network reparameterization of the model parameters. However, another important aspect of model training — identifying optimal feature hyperparameters — has not been addressed and is typically handled using the standard cross-validation approach.
In this paper, we introduce the Feature Learning (FL) model, which addresses this issue by representing tensor-product features as a learnable Canonical Polyadic Decomposition (CPD). By leveraging this CPD structure, we efficiently learn the hyperparameters associated with different features alongside the model parameters using an Alternating Least Squares (ALS) optimization method. We prove the effectiveness of the FL model through experiments on real data of various dimensionality and scale. The results show that the FL model can be consistently trained 3-5 times faster than and have the prediction quality on par with a standard cross-validated model.
In this work, we use 5 publicly available UCI regression datasets (Dua and Graff, 2017): Airfoil, Energy, Yacht, Concrete, Wine. In order to show and explore the behavior of the FL model on large scale data, we consider the Airline dataset (Hensman et al., 2013), contatining recordings of commercial airplane flight delays that occurred in 2008 in the USA.
Datasets Statistics:
| Airfoil | Energy | Yacht | Concrete | Wine | Airline | |
|---|---|---|---|---|---|---|
| Sample Size | 1502 | 768 | 308 | 1030 | 6497 | 5929413 |
| Data Dimensionality | 5 | 8 | 6 | 8 | 11 | 8 |
We use conda package manager to install required python packages. Once conda is installed, run the following command (while in the root of the repository):
conda env create -f environment.yml
This will create a new environment named general_env with all required packages already installed. You can install additional packages by running:
conda install <package name>
In order to read and run Jupyter Notebooks you may follow either of two options:
- [recommended] using notebook-compatibility features of IDEs, e.g. via
pythonandjupyterextensions of VS Code. - install jupyter notebook packages:
either with
conda install jupyterlabor withconda install jupyter notebook
-
Create and activate the virtual environment (Environment section):
conda activate general_env -
Run:
python load_data.py
to load all the datasets locally and configure internal directories.
-
Run
experiments.ipynbin VS Code using thegeneral_envenvironment or usejupyter lab.
If you find our work helpful, please consider citing the paper:
@inproceedings{pmlr-v258-saiapin25a,
title={Tensor Network Based Feature Learning Model},
author={Saiapin, Albert and Batselier, Kim},
booktitle={Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
pages={3277-3285},
year={2025}
}