Matrix-by-matrix derivative formula

$\begingroup$

I need to compute $\frac{\partial(X^{T}MX)}{\partial X}$, where $X$ and $M$ are $n \times n$ matrices.

I know that $\frac{\partial(AXB)}{\partial X}=B^{T} \otimes A$, but I am having a hard time deriving what I need from that identity or from scratch.

$\endgroup$

3 Answers

$\begingroup$

Start with a matrix function, then take the differential, then vectorize and identify the gradient.

$$\eqalign{ F &= X^TMX \cr dF &= dX^TMX + X^TMdX \cr {\rm vec}(dF) &= {\rm vec}(dX^TMX) + {\rm vec}(X^TMdX) \cr df &= (X^TM^T\otimes I)\,{\rm vec}(dX^T) + (I\otimes X^TM)\,{\rm vec}(dX) \cr &= \Big((X^TM^T\otimes I)K + (I\otimes X^TM)\Big)\,{\rm vec}(dX) \cr \frac{\partial f}{\partial x} &= (X^TM^T\otimes I)K + (I\otimes X^TM) \cr }$$ where $f = {\rm vec}(F)$, $x = {\rm vec}(X)$, and $K$ is the Commutation Matrix for Kronecker products, which satisfies $K\,{\rm vec}(dX) = {\rm vec}(dX^T)$.
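A quick numerical sanity check of this Jacobian may help. The sketch below assumes NumPy and a column-major (column-stacking) ${\rm vec}$, which is the convention under which ${\rm vec}(AXB) = (B^T \otimes A)\,{\rm vec}(X)$ holds; the helper names `commutation_matrix`, `vec`, and `jacobian` are made up for illustration.

```python
import numpy as np

def commutation_matrix(m, n):
    """Return K with K @ vec(A) == vec(A.T) for any m x n matrix A (column-major vec)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] sits at index i + j*m in vec(A) and at j + i*n in vec(A.T).
            K[j + i * n, i + j * m] = 1.0
    return K

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.reshape(-1, order="F")

def jacobian(X, M):
    """Evaluate (X^T M^T kron I) K + (I kron X^T M) for square X and M."""
    n = X.shape[0]
    I = np.eye(n)
    K = commutation_matrix(n, n)
    return np.kron(X.T @ M.T, I) @ K + np.kron(I, X.T @ M)

rng = np.random.default_rng(0)
n = 3
X = rng.standard_normal((n, n))
M = rng.standard_normal((n, n))
dX = rng.standard_normal((n, n))

# The differential dF = dX^T M X + X^T M dX is linear in dX,
# so vec(dF) should equal the Jacobian applied to vec(dX).
lhs = vec(dX.T @ M @ X + X.T @ M @ dX)
rhs = jacobian(X, M) @ vec(dX)
max_err = np.max(np.abs(lhs - rhs))
print(max_err)
```

Since both sides are linear in `dX`, the two vectors should agree up to floating-point round-off for any choice of `dX`, not only small ones.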

$\endgroup$ $\begingroup$

Let $f(X) = X^T M X$. Then for a variation $\epsilon Y$, with $\epsilon$ a real number, we have by direct calculation

$$ f(X+\epsilon Y) = f(X) + \epsilon \left( Y^T M X + X^T M Y \right) + \epsilon^2 \left( Y^T M Y \right). $$

Therefore, we can compute the directional derivative as follows:

$$ \nabla_Y f(X) := \lim_{\epsilon \to 0} \frac{f(X+\epsilon Y)-f(X)}{\epsilon} = Y^T MX + X^T M Y. $$

Hence, the derivative $\nabla f$ at $X$ is the linear map

$$ \nabla f(X): Y \mapsto Y^T M X + X^T M Y. $$
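The expansion can be checked with a finite difference. This is a minimal sketch assuming NumPy; the function names `f` and `directional_derivative` and the step size `eps` are choices made here for illustration.

```python
import numpy as np

def f(X, M):
    """The matrix function f(X) = X^T M X."""
    return X.T @ M @ X

def directional_derivative(X, M, Y):
    """The claimed derivative of f at X in direction Y: Y^T M X + X^T M Y."""
    return Y.T @ M @ X + X.T @ M @ Y

rng = np.random.default_rng(1)
n = 4
X = rng.standard_normal((n, n))
M = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))

eps = 1e-4  # assumed step size; small enough to see the limit, large enough to limit round-off
finite_diff = (f(X + eps * Y, M) - f(X, M)) / eps
exact = directional_derivative(X, M, Y)

# By the expansion, the difference quotient overshoots by exactly eps * Y^T M Y.
residual = finite_diff - exact
second_order = eps * (Y.T @ M @ Y)
print(np.max(np.abs(residual - second_order)))
```

The residual matching `eps * Y.T @ M @ Y` up to round-off reflects that the expansion in $\epsilon$ terminates at second order, so there are no higher-order terms to account for.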

$\endgroup$ $\begingroup$

$\Phi: M_n \times M_n \to M_n$ given by $\Phi(X,Y) = X^T M Y$ is a bilinear map on a finite-dimensional normed vector space, so $\Phi$ is bounded. Hence, $\Phi$ is differentiable and:

$$D \Phi(X,Y)(H,K) = \Phi(H,Y) + \Phi(X,K) = H^T M Y + X^T M K$$

for all $X,Y,H,K \in M_n$.

Let $q: M_n \to M_n$, $q(X) = \Phi(X,X)$. Then, $q$ is differentiable and for all $X, H \in M_n$,

$$Dq(X)H = D\Phi(X,X)(H,H) = H^T M X + X^T M H$$
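The first equality is an instance of the chain rule. As a sketch, write $q = \Phi \circ \Delta$, where $\Delta$ is the diagonal map (a name introduced here for illustration, not part of the argument above):

$$\Delta: M_n \to M_n \times M_n, \qquad \Delta(X) = (X,X), \qquad D\Delta(X)H = (H,H)$$

Since $\Delta$ is linear, it is its own derivative, and the chain rule gives

$$Dq(X)H = D\Phi(\Delta(X))\big(D\Delta(X)H\big) = D\Phi(X,X)(H,H) = \Phi(H,X) + \Phi(X,H) = H^T M X + X^T M H$$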

$\endgroup$
