From 8d7fed7725db5cda4a0b33cc03775d91b63d93ad Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Fri, 27 Feb 2026 19:21:03 +0000 Subject: [PATCH 1/8] WIP: Make it more specific to code --- docs/source/math_description.rst | 56 ++++++++++++++++++-------------- 1 file changed, 31 insertions(+), 25 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index e6c4437..b067b5d 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -4,37 +4,43 @@ Mathematical abstraction in UM-Bridge ===================================== -In this section, we will describe UM-Bridge's interface mathematically. +In this section, we will describe UM-Bridge's interface mathematically. Note that both inputs and +ouputs are required to be list of lists in actual implementation, but we only consider a single +element within the outer list to simply notation. -Model Evaluation -================ -Let :math:`\mathcal{F}` denote the numerical model that maps the model input vector, :math:`\mathbf{x}` to -the output vector :math:`\mathbf{f(\mathbf{x})}`: +Let :math:`F` denote the numerical model that maps the model input vector, :math:`\mathbf{\theta}` +to the output vector :math:`\mathbf{F}(\mathbf{\theta})`: .. math:: - \mathcal{F}\, : \, - \mathbf{x} + F\, : \, + \mathbb{R}^n \;\longrightarrow\; - \mathbf{f}(\mathbf{x}), \quad - \mathbf{x} \in \mathbb{R}^d, \; - \mathbf{f}(\mathbf{x}) \in \mathbb{R}^n. + \mathbb{R}^m. + +Additionally, there may be an objective function :math:`L = L(\mathbf{F}(\mathbf{\theta}))`. UM-Bridge +allows the following four operations. + +Model Evaluation +================ +This is simply the so called forward map that takes an input +:math:`\mathbf{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n` and returns the output +:math:`\mathbf{F}(\mathbf{\theta}) = (F(\mathbf{\theta})_1, \ldots, F(\mathbf{\theta})_m) \in \mathbb{R}^m`. 
Gradient Evaluation =================== The ``gradient`` function evaluates the sensitivity of a scalar -objective, :math:`L(\mathbf{f}(\mathbf{x}))`, that depends on the model output, with respect to the model input. Using the -chain rule: +objective. Using the chain rule: .. math:: - \nabla_{\mathbf{x}}L - = \left(\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\right)^{\!\top} + \nabla_{\mathbf{\theta}}L + = \left(\frac{\partial \mathbf{f}}{\partial \mathbf{\theta}}\right)^{\!\top} \boldsymbol{\lambda}, \qquad \boldsymbol{\lambda} = \frac{\partial L}{\partial \mathbf{f}}, -where :math:`\lambda` is known as the sensitivity vector. +where :math:`\mathbf{\lambda}` is known as the sensitivity vector. Applying Jacobian @@ -46,7 +52,7 @@ is given by .. math:: J = - \frac{\partial \mathbf{f}}{\partial \mathbf{x}} = + \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}} = \left[ \begin{array}{ccc} \dfrac{\partial \mathbf{f}}{\partial x_1} & \cdots & \dfrac{\partial \mathbf{f}}{\partial x_d} @@ -67,9 +73,9 @@ The output of this function for a chosen :math:`\mathbf{v} \in \mathbb{R}^{d}` i .. math:: \texttt{output} = J\,\mathbf{v} - = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}\,\mathbf{v}. + = \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}}\,\mathbf{v}. -Additionally, we can use this (or vice versa) to expression the ``gradient`` function by setting +Additionally, we can use this (or vice versa) to express the ``gradient`` function by setting :math:`\mathbf{v} = \mathbf{\lambda}`. @@ -81,13 +87,13 @@ the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is g .. 
math:: H = - \frac{\partial^2 L}{\partial \mathbf{x}\,\partial \mathbf{x}} - = \frac{\partial}{\partial \mathbf{x}} + \frac{\partial^2 L}{\partial \mathbf{\theta}\,\partial \mathbf{\theta}} + = \frac{\partial}{\partial \mathbf{\theta}} \left( - \frac{\partial \mathbf{f}}{\partial \mathbf{x}} + \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}} \right)^{\!\top} \boldsymbol{\lambda} = - H = \begin{bmatrix} + \begin{bmatrix} \dfrac{\partial^2 L}{\partial x_1^2} & \dfrac{\partial^2 L}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 L}{\partial x_1 \partial x_n} \\[18pt] \dfrac{\partial^2 L}{\partial x_2 \partial x_1} & \dfrac{\partial^2 L}{\partial x_2^2} & \cdots & \dfrac{\partial^2 L}{\partial x_2 \partial x_n} \\[18pt] \vdots & \vdots & \ddots & \vdots \\[6pt] @@ -101,9 +107,9 @@ So the output for a chosen vector can be written as .. math:: H\,\mathbf{v} - = \frac{\partial^2 \mathcal{L}}{\partial \mathbf{x}\,\partial \mathbf{x}}\,\mathbf{v} = - \left[\frac{\partial}{\partial \mathbf{x}} + = \frac{\partial^2 \mathcal{L}}{\partial \mathbf{\theta}\,\partial \mathbf{\theta}}\,\mathbf{v} = + \left[\frac{\partial}{\partial \mathbf{\theta}} \left( - \frac{\partial \mathbf{f}}{\partial \mathbf{x}} + \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}} \right)^{\!\top} \boldsymbol{\lambda}\right]\,\mathbf{v}. 
From e481db99c33c8b893bd832a139d156424b1afcfe Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Sun, 1 Mar 2026 16:21:48 +0000 Subject: [PATCH 2/8] WIP:Hessian left --- docs/source/math_description.rst | 87 ++++++++++++++++++++------------ 1 file changed, 55 insertions(+), 32 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index b067b5d..691d1ef 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -5,8 +5,8 @@ Mathematical abstraction in UM-Bridge ===================================== In this section, we will describe UM-Bridge's interface mathematically. Note that both inputs and -ouputs are required to be list of lists in actual implementation, but we only consider a single -element within the outer list to simply notation. +ouputs are required to be a list of lists in the actual implementation, but we only consider a single +element within the outer list to simply the notation from hereon. Let :math:`F` denote the numerical model that maps the model input vector, :math:`\mathbf{\theta}` to the output vector :math:`\mathbf{F}(\mathbf{\theta})`: @@ -17,72 +17,96 @@ to the output vector :math:`\mathbf{F}(\mathbf{\theta})`: \;\longrightarrow\; \mathbb{R}^m. -Additionally, there may be an objective function :math:`L = L(\mathbf{F}(\mathbf{\theta}))`. UM-Bridge -allows the following four operations. +Additionally, there may be an objective function :math:`L = L(\mathbf{F}(\mathbf{\theta}))`. + +UM-Bridge allows the following four operations. Model Evaluation ================ This is simply the so called forward map that takes an input -:math:`\mathbf{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n` and returns the output +:math:`\mathbf{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n` and returns the model output :math:`\mathbf{F}(\mathbf{\theta}) = (F(\mathbf{\theta})_1, \ldots, F(\mathbf{\theta})_m) \in \mathbb{R}^m`. 
-Gradient Evaluation -=================== +Gradient of the objective +========================= -The ``gradient`` function evaluates the sensitivity of a scalar -objective. Using the chain rule: +The gradient function evaluates the sensitivity of the scalar objective. Using the chain rule: .. math:: + :name: eq:1 + \nabla_{\mathbf{\theta}}L - = \left(\frac{\partial \mathbf{f}}{\partial \mathbf{\theta}}\right)^{\!\top} + = \left(\frac{\partial \mathbf{F}}{\partial \mathbf{\theta}}\right)^{\!\top} \boldsymbol{\lambda}, \qquad - \boldsymbol{\lambda} = \frac{\partial L}{\partial \mathbf{f}}, + \boldsymbol{\lambda} = \frac{\partial L}{\partial \mathbf{F}}, where :math:`\mathbf{\lambda}` is known as the sensitivity vector. +Most UQ algorithms do not evaluate the full gradient vector but rather select a specific +component within the input (:math:`\theta_i`) and output vectors (:math:`F_j`). These indices are +chosen using ``inWrt`` and ``outWrt``, respectively, in the implementation. So :ref:`(1) <eq:1>` becomes + +.. math:: + + \frac{\partial L}{\partial \theta_i} + = \frac{\partial F_j}{\partial \theta_i} + \lambda_j, + \qquad + \lambda_j = \frac{\partial L}{\partial F_j}, + +where :math:`\lambda_j` is the ``sens`` argument in the code. -Applying Jacobian -================= +Applying Jacobian to a vector +============================= -The ``apply_jacobian`` function evaluates the product of the model's Jacobian, :math:`J`, and a -vector, :math:`\mathbf{v}`, of the user's choice. The Jacobian of a vector-valued function +The apply jacobian function evaluates the product of the model's Jacobian, :math:`J`, and a +vector, :math:`\mathbf{v}`, of the user's choice (``vec``). The Jacobian of a vector-valued function is given by .. 
math:: J = - \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}} = + \frac{\partial \mathbf{F}}{\partial \mathbf{\theta}} = \left[ \begin{array}{ccc} - \dfrac{\partial \mathbf{f}}{\partial x_1} & \cdots & \dfrac{\partial \mathbf{f}}{\partial x_d} + \dfrac{\partial \mathbf{F}}{\partial \theta_1} & \cdots & \dfrac{\partial \mathbf{F}}{\partial \theta_n} \end{array} \right] = \begin{pmatrix} - \dfrac{\partial f_{1}}{\partial x_{1}} & \cdots & - \dfrac{\partial f_{1}}{\partial x_{d}} \\[12pt] + \dfrac{\partial F_{1}}{\partial \theta_{1}} & \cdots & + \dfrac{\partial F_{1}}{\partial \theta_{n}} \\[12pt] \vdots & \ddots & \vdots \\[4pt] - \dfrac{\partial f_{n}}{\partial x_{1}} & \cdots & - \dfrac{\partial f_{n}}{\partial x_{d}} + \dfrac{\partial F_{n}}{\partial \theta_{1}} & \cdots & + \dfrac{\partial F_{n}}{\partial \theta_{n}} \end{pmatrix} - \in \mathbb{R}^{n \times d}. + \in \mathbb{R}^{m \times n}. -The output of this function for a chosen :math:`\mathbf{v} \in \mathbb{R}^{d}` is then +For a chosen :math:`\mathbf{v} \in \mathbb{R}^{n}`, this is simply .. math:: - \texttt{output} - = J\,\mathbf{v} - = \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}}\,\mathbf{v}. + J\,\mathbf{v} + = \frac{\partial \mathbf{F}}{\partial \mathbf{\theta}}\,\mathbf{v}. -Additionally, we can use this (or vice versa) to express the ``gradient`` function by setting +Additionally, we can use this to express the gradient function by setting :math:`\mathbf{v} = \mathbf{\lambda}`. +However, we don't actually assemble the full Jacobian. We apply specific indices of the Jacobian, +:math:`J_{ji} = \frac{\partial F_j}{\partial \theta_i}`, to the vector instead. The output of this +action is then -Applying Hessian -================ +.. math:: + \texttt{output} = + J_{ji}\,\mathbf{v} + = \frac{\partial F_j}{\partial \theta_i}\,\mathbf{v}, + +where the the :math:`i^{th}` and :math:`j^{th}` indices coresspond to ``inWrt`` and ``outWrt``. 
+ +Applying Hessian to a vector +============================ -This is a combination of the previous two sections: the output is still a matrix-vector product, but +This is a combination of the previous two sections: the action is still a matrix-vector product, but the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is given by .. math:: @@ -100,8 +124,7 @@ the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is g \dfrac{\partial^2 L}{\partial x_n \partial x_1} & \dfrac{\partial^2 L}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 L}{\partial x_n^2} \end{bmatrix}, -where :math:`L` is the objective function and :math:`\mathbf{\lambda}` is the sensitivity vector as defined in the ``gradient`` -section. +where :math:`L` is the objective function and :math:`\mathbf{\lambda}` is the sensitivity vector as defined previously. So the output for a chosen vector can be written as From 1817db4109a8bb1f99de02c23bfeb555e1d2cac1 Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Sun, 1 Mar 2026 23:31:02 +0000 Subject: [PATCH 3/8] To check correctness --- docs/source/math_description.rst | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index 691d1ef..c799f8d 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -61,7 +61,7 @@ where :math:`\lambda_j` is the ``sens`` argument in the code. Applying Jacobian to a vector ============================= -The apply jacobian function evaluates the product of the model's Jacobian, :math:`J`, and a +The apply Jacobian function evaluates the product of the model's Jacobian, :math:`J`, and a vector, :math:`\mathbf{v}`, of the user's choice (``vec``). 
The Jacobian of a vector-valued function is given by @@ -106,7 +106,7 @@ where the the :math:`i^{th}` and :math:`j^{th}` indices coresspond to ``inWrt`` Applying Hessian to a vector ============================ -This is a combination of the previous two sections: the action is still a matrix-vector product, but +The apply Hessian action is a combination of the previous two sections: the action is still a matrix-vector product, but the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is given by .. math:: @@ -126,7 +126,7 @@ the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is g where :math:`L` is the objective function and :math:`\mathbf{\lambda}` is the sensitivity vector as defined previously. -So the output for a chosen vector can be written as +So the product of :math:`H` and the chosen vector can be written as .. math:: H\,\mathbf{v} @@ -136,3 +136,15 @@ So the output for a chosen vector can be written as \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}} \right)^{\!\top} \boldsymbol{\lambda}\right]\,\mathbf{v}. + +Again, we don't evaluate the full Hessian in UM-Bridge. As in the apply Jacobian action, we select certain indices and +apply them the vector. Since :math:`H` contains the seconds derivative of :math:`L`, we require two indices for the input: +:math:`inWrt1` and :math:`inWrt2`. The output of this action is + +.. math:: + \texttt{output} = + \left( \frac{\partial}{\partial \theta_i} + \left[ \frac{\partial F_k}{\partial x_j} \, \lambda_k \right] + \, \mathbf{v} \right). 
+ + From 7bc32aeb4d08fef58c302211c8779386769774c0 Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Tue, 3 Mar 2026 10:46:43 +0000 Subject: [PATCH 4/8] typo fix --- docs/source/math_description.rst | 60 ++++++++++++++++---------------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index c799f8d..fdf164c 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -8,8 +8,8 @@ In this section, we will describe UM-Bridge's interface mathematically. Note tha ouputs are required to be a list of lists in the actual implementation, but we only consider a single element within the outer list to simply the notation from hereon. -Let :math:`F` denote the numerical model that maps the model input vector, :math:`\mathbf{\theta}` -to the output vector :math:`\mathbf{F}(\mathbf{\theta})`: +Let :math:`F` denote the numerical model that maps the model input vector, :math:`\boldsymbol{\theta}` +to the output vector :math:`\mathbf{F}(\boldsymbol{\theta})`: .. math:: F\, : \, @@ -17,15 +17,15 @@ to the output vector :math:`\mathbf{F}(\mathbf{\theta})`: \;\longrightarrow\; \mathbb{R}^m. -Additionally, there may be an objective function :math:`L = L(\mathbf{F}(\mathbf{\theta}))`. +Additionally, there may be an objective function :math:`L = L(\mathbf{F}(\boldsymbol{\theta}))`. UM-Bridge allows the following four operations. Model Evaluation ================ This is simply the so called forward map that takes an input -:math:`\mathbf{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n` and returns the model output -:math:`\mathbf{F}(\mathbf{\theta}) = (F(\mathbf{\theta})_1, \ldots, F(\mathbf{\theta})_m) \in \mathbb{R}^m`. 
+:math:`\boldsymbol{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n` and returns the model output +:math:`\mathbf{F}(\boldsymbol{\theta}) = (F(\boldsymbol{\theta})_1, \ldots, F(\boldsymbol{\theta})_m) \in \mathbb{R}^m`. Gradient of the objective @@ -36,13 +36,13 @@ The gradient function evaluates the sensitivity of the scalar objective. Using t .. math:: :name: eq:1 - \nabla_{\mathbf{\theta}}L - = \left(\frac{\partial \mathbf{F}}{\partial \mathbf{\theta}}\right)^{\!\top} + \nabla_{\boldsymbol{\theta}}L + = \left(\frac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}\right)^{\!\top} \boldsymbol{\lambda}, \qquad \boldsymbol{\lambda} = \frac{\partial L}{\partial \mathbf{F}}, -where :math:`\mathbf{\lambda}` is known as the sensitivity vector. +where :math:`\boldsymbol{\lambda}` is known as the sensitivity vector. Most UQ algorithms do not evaluate the full gradient vector but rather select a specific component within the input (:math:`\theta_i`) and output vectors (:math:`F_j`). These indices are @@ -50,11 +50,11 @@ chosen using ``inWrt`` and ``outWrt``, respectively, in the implementation. So : .. math:: - \frac{\partial L}{\partial \theta_i} - = \frac{\partial F_j}{\partial \theta_i} + \dfrac{\partial L}{\partial \theta_i} + = \dfrac{\partial F_j}{\partial \theta_i} \lambda_j, \qquad - \lambda_j = \frac{\partial L}{\partial F_j}, + \lambda_j = \dfrac{\partial L}{\partial F_j}, where :math:`\lambda_j` is the ``sens`` argument in the code. @@ -67,7 +67,7 @@ is given by .. math:: J = - \frac{\partial \mathbf{F}}{\partial \mathbf{\theta}} = + \frac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}} = \left[ \begin{array}{ccc} \dfrac{\partial \mathbf{F}}{\partial \theta_1} & \cdots & \dfrac{\partial \mathbf{F}}{\partial \theta_n} @@ -87,10 +87,10 @@ For a chosen :math:`\mathbf{v} \in \mathbb{R}^{n}`, this is simply .. math:: J\,\mathbf{v} - = \frac{\partial \mathbf{F}}{\partial \mathbf{\theta}}\,\mathbf{v}. 
+ = \dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}\,\mathbf{v}. Additionally, we can use this to express the gradient function by setting -:math:`\mathbf{v} = \mathbf{\lambda}`. +:math:`\mathbf{v} = \boldsymbol{\lambda}`. However, we don't actually assemble the full Jacobian. We apply specific indices of the Jacobian, :math:`J_{ji} = \frac{\partial F_j}{\partial \theta_i}`, to the vector instead. The output of this @@ -99,7 +99,7 @@ action is then .. math:: \texttt{output} = J_{ji}\,\mathbf{v} - = \frac{\partial F_j}{\partial \theta_i}\,\mathbf{v}, + = \dfrac{\partial F_j}{\partial \theta_i}\,\mathbf{v}, where the the :math:`i^{th}` and :math:`j^{th}` indices coresspond to ``inWrt`` and ``outWrt``. @@ -111,40 +111,40 @@ the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is g .. math:: H = - \frac{\partial^2 L}{\partial \mathbf{\theta}\,\partial \mathbf{\theta}} - = \frac{\partial}{\partial \mathbf{\theta}} + \frac{\partial^2 L}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}} + = \frac{\partial}{\partial \boldsymbol{\theta}} \left( - \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}} + \frac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}} \right)^{\!\top} \boldsymbol{\lambda} = \begin{bmatrix} - \dfrac{\partial^2 L}{\partial x_1^2} & \dfrac{\partial^2 L}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 L}{\partial x_1 \partial x_n} \\[18pt] - \dfrac{\partial^2 L}{\partial x_2 \partial x_1} & \dfrac{\partial^2 L}{\partial x_2^2} & \cdots & \dfrac{\partial^2 L}{\partial x_2 \partial x_n} \\[18pt] + \dfrac{\partial^2 L}{\partial \theta_1^2} & \dfrac{\partial^2 L}{\partial \theta_1 \partial x_2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_1 \partial \theta_n} \\[18pt] + \dfrac{\partial^2 L}{\partial \theta_2 \partial \theta_1} & \dfrac{\partial^2 L}{\partial \theta_2^2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_2 \partial \theta_n} \\[18pt] \vdots & \vdots & \ddots & \vdots \\[6pt] - \dfrac{\partial^2 
L}{\partial x_n \partial x_1} & \dfrac{\partial^2 L}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 L}{\partial x_n^2} + \dfrac{\partial^2 L}{\partial \theta_n \partial \theta_1} & \dfrac{\partial^2 L}{\partial \theta_n \partial \theta_2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_n^2} \end{bmatrix}, -where :math:`L` is the objective function and :math:`\mathbf{\lambda}` is the sensitivity vector as defined previously. +where :math:`L` is the objective function and :math:`\boldsymbol{\lambda}` is the sensitivity vector as defined previously. So the product of :math:`H` and the chosen vector can be written as .. math:: H\,\mathbf{v} - = \frac{\partial^2 \mathcal{L}}{\partial \mathbf{\theta}\,\partial \mathbf{\theta}}\,\mathbf{v} = - \left[\frac{\partial}{\partial \mathbf{\theta}} + = \dfrac{\partial^2 L}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}}\,\mathbf{v} = + \left[\dfrac{\partial}{\partial \boldsymbol{\theta}} \left( - \frac{\partial \mathbf{f}}{\partial \mathbf{\theta}} + \dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}} \right)^{\!\top} \boldsymbol{\lambda}\right]\,\mathbf{v}. Again, we don't evaluate the full Hessian in UM-Bridge. As in the apply Jacobian action, we select certain indices and -apply them the vector. Since :math:`H` contains the seconds derivative of :math:`L`, we require two indices for the input: -:math:`inWrt1` and :math:`inWrt2`. The output of this action is +apply them the vector. Since :math:`H` contains the second derivative of :math:`L`, we require two indices for the input: +``inWrt1`` and ``inWrt2``. The output of this action is .. math:: \texttt{output} = - \left( \frac{\partial}{\partial \theta_i} - \left[ \frac{\partial F_k}{\partial x_j} \, \lambda_k \right] - \, \mathbf{v} \right). + \left( \dfrac{\partial}{\partial \theta_i} + \left[ \dfrac{\partial F_k}{\partial \theta_j} \, \lambda_k \right] \right) + \, \mathbf{v}. 
From c704c08b9f51e3c90383a49ecd3bd2d0a7619679 Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Wed, 4 Mar 2026 10:20:55 +0000 Subject: [PATCH 5/8] Typo fixes --- docs/source/math_description.rst | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index fdf164c..3c8d9f9 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -6,13 +6,13 @@ Mathematical abstraction in UM-Bridge In this section, we will describe UM-Bridge's interface mathematically. Note that both inputs and ouputs are required to be a list of lists in the actual implementation, but we only consider a single -element within the outer list to simply the notation from hereon. +element within the outer list to simplify the notation from hereon. -Let :math:`F` denote the numerical model that maps the model input vector, :math:`\boldsymbol{\theta}` +Let :math:`\mathbf{F}` denote the numerical model that maps the model input vector, :math:`\boldsymbol{\theta}` to the output vector :math:`\mathbf{F}(\boldsymbol{\theta})`: .. math:: - F\, : \, + \mathbf{F}\, : \, \mathbb{R}^n \;\longrightarrow\; \mathbb{R}^m. @@ -77,8 +77,8 @@ is given by \dfrac{\partial F_{1}}{\partial \theta_{1}} & \cdots & \dfrac{\partial F_{1}}{\partial \theta_{n}} \\[12pt] \vdots & \ddots & \vdots \\[4pt] - \dfrac{\partial F_{n}}{\partial \theta_{1}} & \cdots & - \dfrac{\partial F_{n}}{\partial \theta_{n}} + \dfrac{\partial F_{m}}{\partial \theta_{1}} & \cdots & + \dfrac{\partial F_{m}}{\partial \theta_{n}} \end{pmatrix} \in \mathbb{R}^{m \times n}. @@ -101,7 +101,7 @@ action is then J_{ji}\,\mathbf{v} = \dfrac{\partial F_j}{\partial \theta_i}\,\mathbf{v}, -where the the :math:`i^{th}` and :math:`j^{th}` indices coresspond to ``inWrt`` and ``outWrt``. +where the :math:`i^{th}` and :math:`j^{th}` indices coresspond to ``inWrt`` and ``outWrt``. 
Applying Hessian to a vector ============================ @@ -118,7 +118,7 @@ the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is g \right)^{\!\top} \boldsymbol{\lambda} = \begin{bmatrix} - \dfrac{\partial^2 L}{\partial \theta_1^2} & \dfrac{\partial^2 L}{\partial \theta_1 \partial x_2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_1 \partial \theta_n} \\[18pt] + \dfrac{\partial^2 L}{\partial \theta_1^2} & \dfrac{\partial^2 L}{\partial \theta_1 \partial \theta_2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_1 \partial \theta_n} \\[18pt] \dfrac{\partial^2 L}{\partial \theta_2 \partial \theta_1} & \dfrac{\partial^2 L}{\partial \theta_2^2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_2 \partial \theta_n} \\[18pt] \vdots & \vdots & \ddots & \vdots \\[6pt] \dfrac{\partial^2 L}{\partial \theta_n \partial \theta_1} & \dfrac{\partial^2 L}{\partial \theta_n \partial \theta_2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_n^2} From dd24b9e16d248d84eb7f8cb50937d3b4a6a9bd2b Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Wed, 11 Mar 2026 17:52:52 +0000 Subject: [PATCH 6/8] added more clarification and fixes some typo --- docs/source/math_description.rst | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index 3c8d9f9..e3a9fb0 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -42,7 +42,9 @@ The gradient function evaluates the sensitivity of the scalar objective. Using t \qquad \boldsymbol{\lambda} = \frac{\partial L}{\partial \mathbf{F}}, -where :math:`\boldsymbol{\lambda}` is known as the sensitivity vector. +where :math:`\boldsymbol{\lambda}` is known as the sensitivity vector and +:math:`\dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}` is actually the Jacobian of the +forward map. 
Most UQ algorithms do not evaluate the full gradient vector but rather select a specific component within the input (:math:`\theta_i`) and output vectors (:math:`F_j`). These indices are @@ -58,10 +60,15 @@ chosen using ``inWrt`` and ``outWrt``, respectively, in the implementation. So : where :math:`\lambda_j` is the ``sens`` argument in the code. +The output of this operation is a vector even though we are essentially selecting an element of the +Jacobian through ``inWrt`` and ``outWrt``. For the multiplication with ``sens`` to make sense in +accordance with :ref:`(1) <eq:1>`, ``sens`` must be a zero vector of length :math:`n` except at +the :math:`i^{th}` location where the value is :math:`\lambda_j`. + Applying Jacobian to a vector ============================= -The apply Jacobian function evaluates the product of the model's Jacobian, :math:`J`, and a +The apply Jacobian function evaluates the product of the transpose of the model's Jacobian, :math:`J^{\top}`, and a vector, :math:`\mathbf{v}`, of the user's choice (``vec``). The Jacobian of a vector-valued function is given by @@ -86,11 +93,11 @@ is given by For a chosen :math:`\mathbf{v} \in \mathbb{R}^{n}`, this is simply .. math:: - J\,\mathbf{v} - = \dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}\,\mathbf{v}. + J^{\!\top}\,\mathbf{v} + = \left( \dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}} \right) ^ {\!\top} \,\mathbf{v}. Additionally, we can use this to express the gradient function by setting -:math:`\mathbf{v} = \boldsymbol{\lambda}`. +:math:`\mathbf{v} = \boldsymbol{\lambda}` as mentioned before. However, we don't actually assemble the full Jacobian. We apply specific indices of the Jacobian, :math:`J_{ji} = \frac{\partial F_j}{\partial \theta_i}`, to the vector instead. The output of this @@ -103,6 +110,10 @@ action is then where the :math:`i^{th}` and :math:`j^{th}` indices coresspond to ``inWrt`` and ``outWrt``. 
+Unlike the gradient operation, the vector :math:`\mathbf{v}` is not required to be a zero vector except at certain index, +but it should still be of length :math:`n`, which is the size of the input dimension. This also applies to the following +action. + Applying Hessian to a vector ============================ From 99d82bb7ac9f633bf44a93c8ed8b75a40fcf2628 Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Thu, 12 Mar 2026 23:32:13 +0000 Subject: [PATCH 7/8] Rewrite math to accomodate list of lists --- docs/source/math_description.rst | 65 ++++++++++++++++---------------- 1 file changed, 33 insertions(+), 32 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index e3a9fb0..a59c5fd 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -4,18 +4,22 @@ Mathematical abstraction in UM-Bridge ===================================== -In this section, we will describe UM-Bridge's interface mathematically. Note that both inputs and -ouputs are required to be a list of lists in the actual implementation, but we only consider a single -element within the outer list to simplify the notation from hereon. +In this section, we will describe UM-Bridge's interface mathematically. Let :math:`\mathbf{F}` denote the numerical model that maps the model input vector, :math:`\boldsymbol{\theta}` -to the output vector :math:`\mathbf{F}(\boldsymbol{\theta})`: +to the output vector :math:`\mathbf{F}(\boldsymbol{\theta})`. We will use bold font to +indicate vectors. Note that both inputs and ouputs are required to be a list of lists in the actual +implementation. For a list of :math:`d` input vectors each with :math:`n` dimensions, we have .. math:: \mathbf{F}\, : \, - \mathbb{R}^n + \mathbb{R}^n \times d \;\longrightarrow\; - \mathbb{R}^m. + \mathbb{R}^m \times d. 
+ +The arguments ``inWrt`` and ``outWrt`` in functions, where derivatives are involved, allow the user to +select particular indices (out of :math:`d` indices) at which the derivative should be evaluated with +respect to. However, more of this will be clarified in the respective sections. Additionally, there may be an objective function :math:`L = L(\mathbf{F}(\boldsymbol{\theta}))`. @@ -23,8 +27,9 @@ UM-Bridge allows the following four operations. Model Evaluation ================ -This is simply the so called forward map that takes an input -:math:`\boldsymbol{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n` and returns the model output + +This is simply the so called forward map that takes an element from the list of input vectors, +:math:`\boldsymbol{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n`, and returns the model output, :math:`\mathbf{F}(\boldsymbol{\theta}) = (F(\boldsymbol{\theta})_1, \ldots, F(\boldsymbol{\theta})_m) \in \mathbb{R}^m`. @@ -46,24 +51,24 @@ where :math:`\boldsymbol{\lambda}` is known as the sensitivity vector and :math:`\dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}` is actually the Jacobian of the forward map. -Most UQ algorithms do not evaluate the full gradient vector but rather select a specific -component within the input (:math:`\theta_i`) and output vectors (:math:`F_j`). These indices are -chosen using ``inWrt`` and ``outWrt``, respectively, in the implementation. So :ref:`(1) <eq:1>` becomes +Since there are multiple choices due to the format of the input and output, we can select a specific +component within the input (:math:`\boldsymbol{\theta}_i \in \mathbb{R}^n`) and output list of +lists (:math:`\mathbf{F}_j \in \mathbb{R}^m`). These indices are chosen using ``inWrt`` and ``outWrt``, respectively, +in the implementation. + +So :ref:`(1) <eq:1>` becomes .. 
math:: - \dfrac{\partial L}{\partial \theta_i} - = \dfrac{\partial F_j}{\partial \theta_i} - \lambda_j, + \dfrac{\partial L}{\partial \boldsymbol{\theta}_i} + = \dfrac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i} + \boldsymbol{\lambda}_j, \qquad - \lambda_j = \dfrac{\partial L}{\partial F_j}, + \boldsymbol{\lambda}_j = \dfrac{\partial L}{\partial \mathbf{F}_j}, -where :math:`\lambda_j` is the ``sens`` argument in the code. +where :math:`\boldsymbol{\lambda}_j` is the ``sens`` argument in the code. -The output of this operation is a vector even though we are essentially selecting an element of the -Jacobian through ``inWrt`` and ``outWrt``. For the multiplication with ``sens`` to make sense in -accordance with :ref:`(1) <eq:1>`, ``sens`` must be a zero vector of length :math:`n` except at -the :math:`i^{th}` location where the value is :math:`\lambda_j`. +The output of this operation is a vector because we are essentially doing a matrix vector product. Applying Jacobian to a vector ============================= @@ -99,21 +104,17 @@ For a chosen :math:`\mathbf{v} \in \mathbb{R}^{n}`, this is simply Additionally, we can use this to express the gradient function by setting :math:`\mathbf{v} = \boldsymbol{\lambda}` as mentioned before. -However, we don't actually assemble the full Jacobian. We apply specific indices of the Jacobian, -:math:`J_{ji} = \frac{\partial F_j}{\partial \theta_i}`, to the vector instead. The output of this +However, as before, we can choose an index each from the input and output to construct the Jacobian such that +:math:`J_{ji} = \frac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i}`. The output of this action is then .. math:: \texttt{output} = J_{ji}\,\mathbf{v} - = \dfrac{\partial F_j}{\partial \theta_i}\,\mathbf{v}, + = \dfrac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i}\,\mathbf{v}, where the :math:`i^{th}` and :math:`j^{th}` indices coresspond to ``inWrt`` and ``outWrt``. 
-Unlike the gradient operation, the vector :math:`\mathbf{v}` is not required to be a zero vector except at certain index, -but it should still be of length :math:`n`, which is the size of the input dimension. This also applies to the following -action. - Applying Hessian to a vector ============================ @@ -137,7 +138,7 @@ the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is g where :math:`L` is the objective function and :math:`\boldsymbol{\lambda}` is the sensitivity vector as defined previously. -So the product of :math:`H` and the chosen vector can be written as +So the product of :math:`H` and the chosen vector (of size :math:`n`) can be written as .. math:: H\,\mathbf{v} @@ -148,14 +149,14 @@ So the product of :math:`H` and the chosen vector can be written as \right)^{\!\top} \boldsymbol{\lambda}\right]\,\mathbf{v}. -Again, we don't evaluate the full Hessian in UM-Bridge. As in the apply Jacobian action, we select certain indices and -apply them the vector. Since :math:`H` contains the second derivative of :math:`L`, we require two indices for the input: +As in the apply Jacobian action, we can select certain indices from the list of lists to construct the Hessian. +Since :math:`H` contains the second derivative of :math:`L`, we require two indices from the input: ``inWrt1`` and ``inWrt2``. The output of this action is .. math:: \texttt{output} = - \left( \dfrac{\partial}{\partial \theta_i} - \left[ \dfrac{\partial F_k}{\partial \theta_j} \, \lambda_k \right] \right) + \left( \dfrac{\partial}{\partial \boldsymbol{\theta}_i} + \left[ \dfrac{\partial \mathbf{F}_k}{\partial \boldsymbol{\theta}_j} \, \boldsymbol{\lambda}_k \right] \right) \, \mathbf{v}. 
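A note on the Hessian action in the hunk above, as a minimal numerical sketch rather than UM-Bridge code. It assumes a single input block and a single output block, treats the sensitivity vector as constant, and uses a made-up toy map :math:`F(\theta) = (\theta_0^2, \theta_0 \theta_1)`. The helper names ``jacobian``, ``gradient`` and ``apply_hessian`` are hypothetical local functions, not the library's API. The Hessian-vector product is approximated by a central difference of :math:`J^{\top}\boldsymbol{\lambda}` along :math:`\mathbf{v}`, so the full Hessian is never assembled.

```python
import numpy as np


def jacobian(theta):
    """Analytic Jacobian of the toy map F(theta) = (theta0^2, theta0*theta1)."""
    return np.array([[2.0 * theta[0], 0.0],
                     [theta[1], theta[0]]])


def gradient(theta, sens):
    """Gradient of the objective w.r.t. theta: J^T @ sens (sens = dL/dF)."""
    return jacobian(theta).T @ sens


def apply_hessian(theta, sens, vec, eps=1e-4):
    """Approximate d/dtheta [J^T sens] @ vec with a central difference.

    sens is held fixed, matching the formula in the patch.
    """
    g_plus = gradient(theta + eps * vec, sens)
    g_minus = gradient(theta - eps * vec, sens)
    return (g_plus - g_minus) / (2.0 * eps)


theta = np.array([1.0, 2.0])
sens = np.array([3.0, 4.0])
vec = np.array([5.0, 6.0])

hv = apply_hessian(theta, sens, vec)

# For this toy map the Hessian of sens . F(theta) is known in closed form.
hessian_exact = np.array([[2.0 * sens[0], sens[1]],
                          [sens[1], 0.0]])
hv_exact = hessian_exact @ vec
```

For this toy map the gradient is linear in :math:`\theta`, so the central difference agrees with ``hv_exact`` up to rounding. A real model would generally need a smaller step or an adjoint/AD-based implementation.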
From 49c406cf6078ca0d6ddbce7069ea54a0dc2718e5 Mon Sep 17 00:00:00 2001 From: chun9l <97897047+chun9l@users.noreply.github.com> Date: Fri, 13 Mar 2026 11:04:01 +0000 Subject: [PATCH 8/8] typo fixews --- docs/source/math_description.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/source/math_description.rst b/docs/source/math_description.rst index a59c5fd..bebc337 100644 --- a/docs/source/math_description.rst +++ b/docs/source/math_description.rst @@ -13,9 +13,9 @@ implementation. For a list of :math:`d` input vectors each with :math:`n` dimens .. math:: \mathbf{F}\, : \, - \mathbb{R}^n \times d + \mathbb{R}^{n \times d} \;\longrightarrow\; - \mathbb{R}^m \times d. + \mathbb{R}^{m \times d}. The arguments ``inWrt`` and ``outWrt`` in functions, where derivatives are involved, allow the user to select particular indices (out of :math:`d` indices) at which the derivative should be evaluated with @@ -60,8 +60,8 @@ So :ref:`(1) ` becomes .. math:: - \dfrac{\partial L}{\partial \boldsymbol{\theta}_i} - = \dfrac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i} + \nabla_{\boldsymbol{\theta}_i} L + = \left( \dfrac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i} \right) ^ {\!\top} \boldsymbol{\lambda}_j, \qquad \boldsymbol{\lambda}_j = \dfrac{\partial L}{\partial \mathbf{F}_j}, @@ -156,7 +156,7 @@ Since :math:`H` contains the second derivative of :math:`L`, we require two indi .. math:: \texttt{output} = \left( \dfrac{\partial}{\partial \boldsymbol{\theta}_i} - \left[ \dfrac{\partial \mathbf{F}_k}{\partial \boldsymbol{\theta}_j} \, \boldsymbol{\lambda}_k \right] \right) + \left[ \left( \dfrac{\partial \mathbf{F}_k}{\partial \boldsymbol{\theta}_j} \right) ^ {\!\top} \, \boldsymbol{\lambda}_k \right] \right) \, \mathbf{v}.
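The transpose added in this patch can be checked numerically. The following is a small sketch under stated assumptions, not UM-Bridge's implementation: the model ``forward`` is an invented toy with two input blocks and two output blocks (lists of lists, as in the text), the Jacobian block is estimated by central differences, and ``jacobian_block``, ``gradient`` and ``apply_jacobian`` are hypothetical local helpers. ``in_wrt`` and ``out_wrt`` pick the input and output block, playing the role of ``inWrt`` and ``outWrt``.

```python
import numpy as np


def forward(parameters):
    """Toy model on lists of lists: inputs [[a, b], [c]] -> outputs [[a*b, a+c], [b*c]]."""
    a, b = parameters[0]
    (c,) = parameters[1]
    return [[a * b, a + c], [b * c]]


def jacobian_block(out_wrt, in_wrt, parameters, eps=1e-6):
    """Central-difference estimate of d F[out_wrt] / d theta[in_wrt], shape (m, n)."""
    m = len(forward(parameters)[out_wrt])
    n = len(parameters[in_wrt])
    jac = np.zeros((m, n))
    for col in range(n):
        plus = [list(block) for block in parameters]
        minus = [list(block) for block in parameters]
        plus[in_wrt][col] += eps
        minus[in_wrt][col] -= eps
        f_plus = np.array(forward(plus)[out_wrt])
        f_minus = np.array(forward(minus)[out_wrt])
        jac[:, col] = (f_plus - f_minus) / (2.0 * eps)
    return jac


def gradient(out_wrt, in_wrt, parameters, sens):
    """J^T @ sens: sens has the length of the selected output block (m)."""
    return jacobian_block(out_wrt, in_wrt, parameters).T @ np.asarray(sens)


def apply_jacobian(out_wrt, in_wrt, parameters, vec):
    """J @ vec: vec has the length of the selected input block (n)."""
    return jacobian_block(out_wrt, in_wrt, parameters) @ np.asarray(vec)


parameters = [[2.0, 3.0], [4.0]]
grad = gradient(0, 0, parameters, sens=[1.0, 10.0])
jv = apply_jacobian(0, 0, parameters, vec=[1.0, 1.0])
```

Here the selected block is :math:`\partial \mathbf{F}_0 / \partial \boldsymbol{\theta}_0 = [[b, a], [1, 0]]`. Without the transpose, ``sens`` (length :math:`m`) could only be multiplied by this block when :math:`m = n`, and the result would be indexed by outputs rather than by inputs.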