# UMass LIVING

## August 29, 2009

### (Sep 03 3-5pm) Pre-meeting Overview: Convex Relaxations for MAP Estimation (Kumar, Kolmogorov, Torr)

Filed under: Uncategorized — umassliving @ 5:03 pm
Tags: ,

This week we’ll be covering Pawan, Kolmogorov, and Torr, “An Analysis of Convex Relaxations for MAP Estimation of Discrete MRFs”, JMLR 2009.

A related paper is Kumar, Torr, and Zisserman, “Solving Markov Random Fields using Second Order Cone Programming Relaxations“, CVPR 2006, which may be useful as a quick introduction as well as an example of how to apply these methods to a vision problem (object recognition), but the main focus of the meeting will be on the JMLR paper.

The paper begins by describing the integer programming formulation of MAP estimation, equivalently energy minimization, hence the integer program and all relaxations are minimization problems in the paper.  Next, the paper gives three relaxations of the integer program – LP-S, the linear programming relaxation (the main one we dealt with in Wainwright); QP-RL, a quadratic programming relaxation; and SOCP-MS, a second order cone programming relaxation.  One of the focuses of the paper is to compare these different relaxations to one another, and describes how to do so in the next section, through the notion of dominance.  The paper goes on to prove that LP-S dominates SOCP-MS, and prove that SOCP-MS and QP-RL are equivalent.

The next focus of the paper is to prove that for a class of relaxations (on trees and certain types of cycles) LP-S strictly dominates SOCP-MS (and consequently QP-RL).  The paper then gives two additional types of constraints that can be added to the SOCP relaxation, such that the new SOCP relaxation dominates LP-S for some problems and has a feasibility region not a super-set of the LP-S feasibility region.

The majority of the paper is in the proofs, so beyond understanding the basic set-up, I think that’s where we should be spending our time.  In particular, if there are any steps that you do not follow, post them in the comments and we can try to work through it during the meeting.

# Notes:

Going from SDP to SOCP relaxation:

One fact that seems to be used without an explicit proof is $A \succeq 0 \Leftrightarrow A \bullet C \geq 0 \quad \forall C \succeq 0$, where $A \bullet C = \sum_{ij} A_{ij}C_{ij}$ is the Frobenius inner product on matrices.  Here’s a quick proof, where the second direction uses part of the CVPR paper.

First, note that $x x^T \succeq 0$, since for any vector $a$, $a^T x x^T a = (a^T x)^2 \geq 0$.  Therefore, $A \bullet C \geq 0 \quad \forall C \succeq 0 \Rightarrow A \bullet x x^T \geq 0 \quad \forall x$ $\Rightarrow \sum_{ij} A_{ij} x_i x_j \geq 0 \Rightarrow x^T A x \geq 0 \Rightarrow A \succeq 0$.

To go the reverse direction, note that any symmetric positive semidefinite matrix $C$ can be written as $UU^T$ using the Cholesky decomposition.  Let $A \succeq 0$, and $C = UU^T$ be any positive semidefinite matrix.  $A \bullet C = A \bullet UU^T = \sum_{ijk} A_{ij} U_{ik} U^T_{kj}$ $= \sum_{ijk} U^T_{ki} A_{ij} U_{jk} = \sum_k u^T_k A u_k \geq 0$, where $u_k$ is the $k$th column of $U$ (not row, as stated in the CVPR paper), and the last inequality uses the fact that $A \succeq 0$.

Recall that in the SDP relaxation, our positive semidefinite constraint was $X - xx^T \succeq 0$.  From the CVPR paper, we then have, for any positive semidefinite matrix $C = UU^T$, $(X - xx^T) \bullet C \geq 0 \Leftrightarrow xx^T \bullet C \leq X \bullet C$ $\Leftrightarrow x^T UU^T x \leq X \bullet C \Leftrightarrow \|U^T x\|^2 \leq X \bullet C$.

Finally, note that the final inequality is equivalent to the following SOCP constraint: $\|u\| \leq t$ where $u = \left( \begin{array}{c} 1 - X \bullet C \\ 2U^T x\end{array} \right)$ and $t = 1 + X \bullet C$, which can be verified by squaring both sides.

At this point, the SOCP relaxation is equivalent to the SDP relaxation.  However, the SOCP relaxation is made more tractable (at the cost of removing constraints) by only considering a subset of psd matrices $C$.  The typical set of $C$ used is those with all zeros except for a 2 x 2 submatrix corresponding to two vertices connected by an edge in the MRF.

1. While working out the SDP to SOCP relaxation, I found a proof for something we were unsure of earlier, that $N_1[\mu] \succeq 0 \Leftrightarrow \text{cov}(X) \succeq 0$, back in chapter 7 of Wainwright. (We only proved the left to right direction.) I’ve gone ahead and updated the meeting summary – https://umassliving.wordpress.com/2009/07/31/jul-30-meeting-summary-wainwright-jordan-ch-7/ – with the proof.