Computational Symplectic Capacities

Graduate student team-based research workshop supported by the Institute for Computational and Experimental Research in Mathematics (ICERM), and by the European Research Council (ERC)

About the Project

In May 2015 a group of graduate students gathered for the Computational Symplectic Topology Workshop's first seat, in which they started work on the symplectic capacities project. The goal of this project was to design a computer program that calculates a symplectic invariant called symplectic capacity, specifically the symplectic capacity known as Ekeland-Hofer-Zehnder capacity for convex domains in the classical phase space.

The development had started off by implementing an algorithm given in the PhD thesis of Göing-Jaeschke from 1998 (see [GJ]). The team then adapted it according to the project's needs, also using new computational tools now available, such as some of MATLAB's built-in functions.

Once this was achieved, the team went on to use this computational tool and pursue with ambition numerical evidence to some famous long standing conjectures regarding symplectic capacities, such as Viterbo's conjecture, which is detailed in the background section of this site; and moreover, attempt to demystify several aspect of the behaviour of the capcity function.

This website is meant to succinctly explain about the project and its results.
The implementation of the project is written in MATLAB, and its code is based here (github), including test results.

Theoretical Background

We give a brief survey of the required theoretical background from symplectic topology, a full theoretical exposition can be found in the books by McDuff and Salamon ([McDS]) or by Hofer and Zehnder ([HZ2]).

Symplectic Manifolds and Symplectic Capacities

A symplectic manifold $(M,\om)$ is a smooth manifold $M$ equipped with a closed non-degenerate 2-form $\om$, called a symplectic form. A first example would be the standard symplectic vector space $\R^{2n}$, with coordinates $q_1, \ldots, q_n, p_1,\ldots, p_n$, equipped with the standard symplectic form $\om_0 = \sum_{i=1}^n dp_i \wedge dq_i$. Locally, by Darboux theorem, any symplectic manifold looks like the standard $(\R^{2n},\om_0)$. Thus, there are no local invariants in symplectic geometry, as opposed to the situation in Riemannian geometry. However, numerical global invariants do exist, symplectic capacities are a significant example of such symplectic measurements.

This theory of symplectic capacities started with the fundamental work of Gromov on pseudoholomorphic curves ([G]), followed then by the influential works of Ekeland and Hofer ([EH]) and Hofer and Zehnder ([HZ1]) on variational theory in Hamiltonian systems, and the work of Viterbo on generating functions ([V2]).

We deal with symplectic capacities of submanifolds of $\R ^{2n}$ equipped with the standard symplectic form, so we'll only describe this case.

Let us use the following notations:

A symplectic capacity on $\R^{2n}$ is a function $c$ defined on open subsets of $\R^{2n}$, taking non-negative values, with the following properties:
  • (Monotonicity) If $U \subset V$, then $c(U) \leq c(V)$,
  • (Invariance) If $\vphi$ is a symplectomorphism, then $c(\vphi(U)) = c(U)$,
  • (Homogeneity) For all $\alpha > 0$, $c(\alpha U) = \alpha^2 c(U)$,
  • (Non-triviality) $c (B_{2n}(1)) > 0$ and $c(Z_{2n}(1)) < \infty$.
A symplectic capacity $c$ is said to be normalized, if $c (B_{2n}(1)) = c(Z_{2n}(1)) = \pi $, and henceforth all symplectic capacities in this text are assumed to be normalized.

It is a-priori unclear whether a symplectic capacity exists, but currently there is already a list of symplectic capacities introduced in various ways. The concept was introduced by I. Ekeland and H. Hofer in 1990, although, in fact, the first symplectic capacity (the symplectic width) was constructed by M. Gromov, related to the Non-Squeezing theorem he proved in 1985.

The project focuses on a certain capacity, called the Ekeland-Hofer-Zehnder capacity, $c_{EHZ}$ defined for convex domains in $\R^{2n}$, which is also normalized.
Let $K$ be a compact strictly convex set in $\R^{2n}$, such that $0$ is in the interior of $K$ and the boundary $\de K$ is smooth of class $C^2$. Denote $H(x) = \| x \|_K^2$, where $\| x \|_K = inf \{ \lambda>0 : x \in \lambda K \} $ is the semi-norm with respect to the convex body $K$, where by convex body we mean here and henceforth a compact convex set with non-empty interior in $\R^{2n}$. (It is a semi-norm in the sense that it is only positively homogeneous, since $K$ might not be symmetric; Being a semi-norm would be enough for our purposes). By definition, $K = \{ x\in \R^{2n} : \|x\|_K \leq 1 \}$ and $\de K = H^{-1} (1)$. Note also that the Hamiltonian vector field $X_H = J \nabla H$ defines a flow on $\de K$, where $J$ is a fixed almost complex structure on $\R^{2n}$.
Consider the space of periodic orbits of $X_H$, \begin{equation} \calE_K = \left\{ z \in C^2(\R,\de K) : \dot z = X_H\circ z,\ z(0)=z(T) \text{ for some }T>0\right\} \;. \label{eq:def_space_of_periodic_orbits} \end{equation}

The action of an orbit $z \in \calE_K$ with period $T$ is $\calA (z) = \frac{1}{2}\int_{0}^T \left<-J\dot z,z\right> dt \;.$
The Ekeland-Hofer-Zehnder capacity of $K$ is the quantity \begin{equation} c_{EHZ} (K) = \inf \left\{ |\calA (z)| : z \in \calE _K \right\} \;. \label{eq:def_EHZ_capacity} \end{equation}
We denote by $\calK$ the set of all compact strictly convex subsets of $\R^{2n}$ which have $C^2$ boundary.
(Ekeland, Hofer, Zehnder) The functional $c_{EHZ}$ is a capacity on $\calK$.
This theorem is a combination of results from [EH] and [HZ2], that amounts to the fact that on the class of convex domains in $\R^{2n}$, two symplectic capacities — the Ekeland-Hofer capacity $c_{EH}$ and the Hofer-Zehnder capacity $c_{HZ}$ which are defined in a much more general setting — coincide, and they are equal to the Ekeland-Hofer-Zehnder capacity $c_{EHZ}$ as defined above.

For computational purposes we use the next result which is based on Clarke's duality principle (see [C]).

For a convex body $K$ containing $0$, it holds that \begin{equation} c_{EHZ}(K)=\min_{z\in\mathcal{E}}\frac{\pi}{2}\int_0^{2\pi}h_K^2(\dot{z}(t))dt, \end{equation} where \begin{equation} \mathcal{E}=\left\{z\in W^{1,2}(\mathbb{S}^1,\mathbb{R}) \Bigg\vert \int_0^{2\pi}z(t)dt=0, \frac{1}{2}\int_0^{2\pi}\langle Jz,\dot{z}\rangle=1\right\}, \end{equation} and $h_K(v)=\lVert v\rVert_{K^\circ}$.
For proof see [HZ2, section 1.5]; cf. Rabinowitz in [R] and Weinstein in [W].

Known Capacity Values

There are a few cases in which symplectic capacities are known analytically. Foy any compact connected domain with smooth boundary in $\R^2$, Siburg showed in [Si] that its symplectic capacity would equal its Lebesgue measure. Other examples where symplectic capacities can be computed precisely include polydiscs, ellipsoids (see [HZ2]), and convex Reinhardt domains (see [Hn]). Further, we wish to remark that Nir proved in [N] that the $c_{EHZ}$ capacity of the standard simplex in $\R^{2n}$ is $\frac{1}{2n}$; where by the standard simplex we mean the simplex whose vertices are 0 and the standard unit vectors in $\R^{2n}$.

Known Conjectures

Following Viterbo's work, it is interesting to establish a connection between symplectic capacities and the approach of Riemannian geometry. For instance, one can ask about the relations between symplectic capacities and volume. Some of the questions arising in this fashion are in the flavor of the isoperimetric inequality.

Viterbo conjectured that among all convex bodies in $\R^{2n}$ of a given volume, the Euclidean ball has maximal symplectic capacity. The precise statement is the following.

(Viterbo's Symplectic Isoperimetric Conjecture) Let $K$ be any convex body in $\R^{2n}$, and any symplectic capacity $c$, \begin{equation} \frac{c(K)}{c\left( B \right)} \leq \left( \frac{Vol (K) }{Vol (B)} \right) ^{1/n} \;, \label{viterbo_ineq_expression} \end{equation} where $B=B^{2n}(1)$ is the unit ball in $\R^{2n}$.
We want to note also that for the special class of convex Reinhardt domains, Hermann showed that this conjecture holds (see [Hn]).

We also considered the following conjecture as presented by Akopyan, Karasev and Petrov in [AKP].

(Symplectic subadditivity conjecture) If a convex body $K \subset \R^{2n}$ is covered by a finite set of convex bodies $\{ K_i \}$ then for some symplectic capacity $c$, \begin{equation} c(K) \leq \sum_i c(K_i) \;, \end{equation}
Note that we only checked this conjecture for $c_{EHZ}$.

As part of the project, we designed and ran tests to try and refute both these conjectures. However, we could not find any counterexamples.

Towards Computations - Discretization

In her thesis, Göing-Jaeschke developed a method of discretizing the minimization problem at hand, which we used in our implementation. Let us briefly describe this method, and note that the full details are given in [GJ]. First, recall that we want to solve the following minimization problem, which is equivalent to our initial one. We consider the set \begin{equation} M = \left\{ x\in H^{1}([0,1],\R^{2n})\,:\, \int_0^1 \dot x(t)dt = 0\right\} \end{equation} and the functional \begin{equation} \calF_K(x) = \int_0^1h_K^2(-J\dot x(t))dt \;. \label{eq:functional} \end{equation} Our minimization problem is the following constrained one: \begin{equation} \lambda = \inf\left\{ \calF_K(x) : \, x\in M,\, \int_0^1\langle -J\dot x(t),x(t)\rangle dt = 1\right\} \label{MinProblem} \end{equation} We know that it has at least one solution, denoted by $u$.

In order to numerically solve the minimization problem (\ref{MinProblem}), we follow the general outline presented in [GJ].
Given $m\in\N$, divide the interval $[0,1]$ to $m$ subintervals $I_{m,k}:=(\frac{k}{m},\frac{k+1}{m})$, $k=0,\dots,m-1$, of length $\frac{1}{m}$, and consider only $H^1$-functions that are linear on each subinterval, \begin{equation} W_m:=\left\{u\in H^1([0,1],\R^{2n})\ \big|\ |\dot u|_{I_{m,k}} \equiv\text{const}, \ \int_0^1u(t)dt=0,\ \int_0^1\left<-J\dot u,u\right>=1 \right\}, \end{equation} where $J\in\Mat(2n,\R)$ is the standard complex structure on $\R^{2n}$, \begin{equation*} J= \left(\begin{array}{cc} 0 & I_n\\ -I_n & 0 \end{array}\right). \end{equation*} Let $u\in W_m$, then $\dot u$ is piece-wise constant, and can be represented by a vector $\dot \x= (\dot x_1,\dots,\dot x_{m})\in \R^{2n\cdot m}$. In this case, the functional to minimize (\ref{eq:functional}) takes the form \begin{equation} F(\dot\x) = \frac{1}{m} \sum_{j=1}^{m} G(-J \dot x_j), \end{equation} and its gradient is \begin{equation} \nabla F (\dot\x) = \frac{1}{m} (J\nabla G(-J\dot x_1),\dots,J\nabla G(-J\dot x_{m})). \end{equation} The constraints are given by \begin{eqnarray} 0 &=& \ell(\dot\x) = \sum_{j=0}^{m-1}\dot x_j, \label{eq:lincon}\\ 0 &=& q(\dot\x) = \frac{1}{m^2} \sum_{k=1}^{m-1}\sum_{j=0}^{k-1}\left<-J \dot x_k, \dot x_j\right> -1 = \frac{1}{m^2} \dot \x^T A \dot \x -1,\label{eq:quadcon} \end{eqnarray} where \begin{equation*} A:=\left(\begin{array}{cccc} 0_{2n} & -J_{2n} & \cdots & -J_{2n}\\ \vdots & \ddots & & \vdots\\ \vdots & & \ddots & -J_{2n}\\ 0_{2n} & \cdots & \cdots & 0_{2n} \end{array}\right) \in \Mat(\R, 2n\cdot m), \end{equation*} and the gradient of the quadratic constraint is \begin{equation*} \nabla q(\dot\x) = \frac{1}{m^2} (A+A^T)\dot \x. \end{equation*} To minimize the functional $F$ under the constraints $\ell,q$ we use the MATLAB library function named fmincon. This function receives as input the functional, its gradient, the constraints and their gradients, together with a starting point in $\R^{2n\cdot m}$ and iteratively searches for a local minimum. The function fmincon enables choosing a minimization algorithm out of a given list. After numerically experimenting the possible algorithms, we chose "active-set" as it produced the best results in terms of accuracy and run-time. To choose a starting vector, we randomly pick a vector in $\R^{2n\cdot m}$, then shift and rescale it to satisfy conditions (\ref{eq:lincon}), (\ref{eq:quadcon}).

The Algorithm

The project contains several sub-modules, each with its own computational purpose, yet all of them use the basic function called Capacity which calculates the capacity of a given polytope. In this section we explain the algorithm of this function which relies on the discretization presented in A. Göing-Jaeschke's PhD thesis [GJ].

We present the algorithm, with modifications, here:

  1. If needed move the polytope's centroid to 0.
  2. Generate a random path.
    1. Put x0:= a random vector of size (2*n*m, 1) with entries in [-1,1].
    2. Correct the vector so that the sum of the entries in each dimension is 0.
    3. Put l:=x0TA2nx0.
    4. if (l<ε), go back to step 2.3.
    5. If (l<0) invert the direction of the path.
    6. Normalize the path by 1l.
  3. Find the action minimizer
    1. Use MATLAB's fmincon to find the action minimizer. Keep the output in [minimizer, minimizerAction].
    2. for k=1 to exp
      1. Break the minimizer's path into twice as many smaller sections.
      2. Multiply m by 2, and recalculate the constraints matrix.
      3. If m is large enough, increase the tolerance used for fmincon.
      4. Use fmincon as before.
      5. If the change in the minimizer's action is small, stop the iterations.
  4. If required, reconstruct the characteristic.
  5. Return 2*F(minimizer).

Let's explain some of the points in the algorithm. First, this is the general form of the algorithm, in the sense that it's agnostic to the question of "what type of convex body (e.g. ellipsoids, simplices, etc.) are we considering?". It is known that one can approximate a convex body by polytopes, and that approximation translates to capacities. Yet, this requires proof, and further, this approach takes a heavy toll in terms of running time. In cases where the norm of the dual body is known analytically, the algorithm is much more precise and takes far less time.

With this in mind, we'll continue with polytopes as a prime example. The input of the algorithm, therefore, is a matrix $P$ whose rows are the extreme points of the convex hull for which we want to calculate the capacity. Therefore, the dimensions of $P$ is $k\times 2n$ where $k$ is the number of points and $2n$ is the dimension of the ambient Euclidean space.

We begin by centralizing the polytope, so to increase the accuracy of the action function. Recall that the Hamiltonian is the norm of the convex body, whence we need the centroid of the convex body to be at 0. We then generate a random starting path in the dual space. This path is represented by constant vectors, as explained in the theoretical section of this site. We then fix the path so that it closes to a closed path, and then calculate the constraint function on this path, i.e., $x^TA_{2n}x$. We continue generating such random paths until we reach one whose constraint value is not too small, since we later normalize by this value and would like to avoid loss of significance due to small numbers division. Finally for this part of the algorithm, we check if the constraint value is negative, since this indicates the path is going in the wrong direction, in which case we fix by reversing the order of the vertices.

For the action minimizing section of the algorithm, note that we iterate the minimization process a few times (the stopping conditions for this are configurable parameters of the algorithm). The idea behind this procedure is to begin with a rough approximation of the action, and then fine tune this value until we don't produce further meaningful improvements. The reason we do this is to improve the running time of the algorithm, and at least empirically this produced satisfying results. Note that we have no proof that this phenomenon holds true in all cases.

The procedure of iteration consists in both increasing the number of subintervals, thus allowing finer paths; and in increasing the tolerance values for the fmincon function. Some of these values are configurable, and generally have been tested to produce better running time.

For completeness we remind the reader that the function to minimize is given by \begin{equation} F(x)=\frac{1}{m}\sum_{j=0}^{m-1}G(-J_{2n}x_j), \end{equation} where $x_j$ is the derivative of the characteristic in the interval $I_j$. Note that in $G$ is where a change of the convex body, e.g. from polytope to ellipsoid, affects the implementation. The linear constraints supplied to fmincon is the discretized version of the constraint $\int_0^1\tilde{u}\operatorname{d}t=0$, namely, \begin{equation} \sum_{j=0}^{m-1}x_j=0, \end{equation} where $\tilde{u}$ is the piecewise constant derivatives. The nonlinear constraints supplied to fmincon is given by the function \begin{equation} c(x)=\frac{1}{m^2}x^TA_{2n}x-1 \end{equation}

A final remark on the minimization, is that the algorithm we use in fmincon is called 'active set'. This choice was made after several attempts to find the most suitable of algorithms, during which all others gave worse results - either in running time or in numerical result.

The last non-trivial step in the Capacity function is to reconstruct the characteristic out of the minimizer of the dual problem. Theoretically this is achieved by integration, and so the implementation of the ReconstructCharacteristic function simply consists of summing of the correct terms in the right factors.


We ran different tests, attempting to better understand the behaviour of the capacity function, and to test the correctness of some conjectures.

Symplectic Capacity of a Disk in $\R^2$

One of the basic convex bodies is a disk in the 2 dimensional Euclidean plane. Since the Ekeland-Hofer-Zehnder capacity is normalized, we know that the output of our program for the disk of radius 1 should be $\pi$. To calculate the capacity of a disk one can either calculate its dual norm and plug it into the algorithm above, or one can approximate by regular polygons. Here we'll discuss approximation by polygons, since this approximation is more general in the sense that if the result for the disk is correct, then it's likely that the program would also work for other approximated bodies.

Therefore, here are the results of our program for regular polygons in $\R^2$ with increasing number of edges. Indeed, we see that as the number of edges increases, so does the result approach $\pi$.

Symplectic capacity of $m$-gons approximating a disk in $\R^2$
Number of Edges = $m$ Capacity
21 3.0974
22 3.1016
23 3.1052
24 3.1089
25 3.1112
26 3.1137
27 3.1159
28 3.1179
29 3.1196
30 3.1212
31 3.1226
32 3.1291
33 3.1253
34 3.1264
35 3.1273
36 3.1283
37 3.1291
38 3.1299
39 3.1306
40 3.1314

Symplectic Capacity of $l_p$ Balls in $\R^2$

Running the Capacity function on $l_p$ unit balls in $\R^2$, denoted by $B_p(1)$, yields the following results when approximating them with 50 random points scaled by the appropriate norm.

Symplectic capacity of unit balls in $\R^2$ with $l_p$ norm for varried $p$, approximated by random 50-gons
Capacity $p$ Value
1.9774 1.01
3.1137 2
3.5076 3
3.6910 4
3.8887 8
3.8548 10

Symplectic Capacity of the Standard Simplex in $\R^{2n}$

The standard simplex is the simplex whose vertices are 0 and the unit vectors in $\R^{2n}$. The capacity of the standard simplex has been computed by Nir in [N] to be $\frac{1}{2n}$.

Our capacity function gave the following results:

Symplectic capacity of the standard simplex in $\R^{2n}$
Half Dimension = $n$ E.H.Z. Capacity Theoretical Value = $\frac{1}{2n}$ Running Time (sec)
1 0.5008 0.5 20
2 0.2520 0.25 15
3 0.1700 0.1667 9
4 0.1264 0.125 15
5 0.1021 0.1 152
6 0.0856 0.0833 53
7 0.0740 0.0714 46
8 0.0634 0.0625 549
9 0.0577 0.0556 135
10 0.0533 0.05 186

Note that the results of the program are very close to the theoretical value.

Symplectic Capacity of Ellipsoids in $\R^{2n}$

As mentioned in the theoretical section, the symplectic capacities for ellipsoids are known analytically. Moreover, all symplectic capacities coincide for ellipsoids, and give the following result. For details see [HZ2], and in the notations there we calculate that for a symplectic capacity $c$ and an ellipsoid $E$ we have that, \begin{equation} c(E)=\pi r_1(E)^2, \end{equation} where $r_1(E)$ is the smallest symplectic radius of the ellipsoid.

Therefore, we implemented a version of the Capacity function that calculates the capacity of ellipsoids, given in symplectic form. That is, the input for this version of the function receives a vector of symplectic radii as input, and finds its capacity using the general algorithm described above.

In this example we show the results of the Capacity function for a couple of ellipsoids. Note that they are of dimension twice the number of radii. Further, note that although our implementation can only find the $c_{EHZ}$ capacity of ellipsoids already given in symplectic form, the order of the radii makes no difference.

Symplectic capacity of ellipsoids in $\R^{2n}$
Half dimension = $n$ Radii Capacity Theoretical Value Running time (sec)
6 [1 2 3 4 5 6] 3.1485 3.1415 288
3 [5 3 6] 28.3117 28.2743 13.4

Note that the calculated value by the program is indeed very close to the theoretical value.

Symplectic Capacity of the Lagrangian Product of a Triangle and a Circle in $\mathbb{R}^4$

Following F. Schlenk, denote by $\triangle(a) = \left\{0< q_1,q_2 \Big| \frac{q_1}{a} + \frac{q_2}{a} < 1 \right\}$ an open right triangle with side length $a$, and by $\square = \left\{0< p_1,p_2 \Big| 0< p_i<1,\quad i=1,2 \right\}$ an open square of side length 1.

We consider the Lagrangian product $\triangle(\pi)\times \square$. The interior of this product is symplectomorphic to a Euclidean ball with the same volume. See Lemma 5.3.1 in [Sc]. It therefore follows that all symplectic capacities should yield $\pi$ for this product.

In our implementation, to compute the Lagrangian product of bodies for which the dual norm is not known analytically, we construct the product of two polytopes representing the bodies, then compute the extreme points of this product. The result of this calculation is then supplied as input to the Capacity function.

Following the theoretical results above, we constructed the polytope $\triangle(\pi)\times \square$ and tested the program for various values of the number of subintervals $m$ (for details see Discretization above), and got the following:

Symplectic capacity of $\triangle(\pi)\times \square$ for different number $m$ of subintervals
Number of Subintervals = $m$ Capacity
30 3.1651
60 3.1603
120 3.1539


As mentioned, while working on the project, we did not find any counter examples for the conjectures in the theoretical section above, but there are several directions for further investigation. It is our hope that this project would help anyone interested in symplectic capacities, both theoretically and computationally. We hope this project could be used further in advancing research, in asking new questions, and perhaps in answering them as well.


The Team

Gautam Banhatti, University of Muenster,
Robert Castellano, Columbia University,
Yaniv Ganor, Tel-Aviv University,
Jeremy Lane, University of Toronto,
Christopher Policastro, University of California, Berkeley,
Itamar Rauch, Tel-Aviv University,
Karina Samvelyan, Tel-Aviv University,
Kyler Bryce Siegel, Stanford University,
Emmanuel Tsukerman, University of California,
Shira Tanny, Tel-Aviv University.