The variational quantum eigensolver (VQE)
This lesson will introduce the variational quantum eigensolver (VQE), explain its importance as a foundational algorithm in quantum computing, and explore its strengths and weaknesses. VQE by itself, without augmenting methods, is not likely to be sufficient for modern utility-scale quantum computations. It is nevertheless important as an archetypal classical-quantum hybrid method, and it is an important foundation upon which many more advanced algorithms are built.
This video gives an overview of VQE and factors that affect its efficiency. The text below adds more detail and implements VQE using Qiskit.
1. What is VQE?
The variational quantum eigensolver is an algorithm that uses classical and quantum computing in conjunction to accomplish a task. There are four main components of a VQE calculation:
- An operator: often a Hamiltonian, which we'll call $\hat{H}$, that describes a property of your system that you wish to optimize. Another way of saying this is that you are seeking the eigenvector of this operator that corresponds to the minimum eigenvalue. We often call that eigenvector the "ground state".
- An “ansatz” (a German word meaning “approach”): this is a quantum circuit that prepares a quantum state approximating the eigenvector you’re seeking. Really the ansatz is a family of quantum circuits, because some of the gates in the ansatz are parametrized, that is, they are fed a parameter which we can vary. This family of quantum circuits can prepare a family of quantum states approximating the ground state.
- An estimator: a means of estimating the expectation value of the operator over the current variational quantum state. Sometimes what we really care about is simply this expectation value, which we call a cost function. Sometimes, we care about a more complicated function that can still be written starting from one or more expectation values.
- A classical optimizer: an algorithm that varies parameters to try to minimize the cost function.
Let's look at each of these components in more depth.
1.1 The operator (Hamiltonian)
At the core of a VQE problem is an operator that describes a system of interest. We will assume here that the lowest eigenvalue and the corresponding eigenvector of this operator are useful for some scientific or business purpose. Examples might include a chemical Hamiltonian describing a molecule, such that the lowest eigenvalue of the operator corresponds to the ground state energy of the molecule, and the corresponding eigenstate describes the geometry or electron configuration of the molecule. Or the operator could describe a cost of a certain process to be optimized, and the eigenstates could correspond to routes or practices. In some fields, like physics, a "Hamiltonian" almost always refers to an operator describing the energy of a physical system. But in quantum computing, it is common to see quantum operators that describe a business or logistical problem also referred to as a "Hamiltonian". We will adopt that convention here.

Mapping a physical or optimization problem to qubits is typically a non-trivial task, but those details are not the focus of this course. A general discussion of mapping a problem to a quantum operator can be found in Quantum computing in practice. A more detailed look at the mapping of chemistry problems into quantum operators can be found in Quantum Chemistry with VQE.
For the purposes of this course, we will assume the form of the Hamiltonian is known. For example, a Hamiltonian for a simple hydrogen molecule (under certain active space assumptions, and using the Jordan-Wigner mapper) is:
from qiskit.quantum_info import SparsePauliOp

hamiltonian = SparsePauliOp(
    [
        "IIII",
        "IIIZ",
        "IZII",
        "IIZI",
        "ZIII",
        "IZIZ",
        "IIZZ",
        "ZIIZ",
        "IZZI",
        "ZZII",
        "ZIZI",
        "YYYY",
        "XXYY",
        "YYXX",
        "XXXX",
    ],
    coeffs=[
        -0.09820182 + 0.0j,
        -0.1740751 + 0.0j,
        -0.1740751 + 0.0j,
        0.2242933 + 0.0j,
        0.2242933 + 0.0j,
        0.16891402 + 0.0j,
        0.1210099 + 0.0j,
        0.16631441 + 0.0j,
        0.16631441 + 0.0j,
        0.1210099 + 0.0j,
        0.17504456 + 0.0j,
        0.04530451 + 0.0j,
        0.04530451 + 0.0j,
        0.04530451 + 0.0j,
        0.04530451 + 0.0j,
    ],
)
Note that in the Hamiltonian above, there are terms like ZZII and YYYY that do not commute with each other. That is, to evaluate ZZII, we would need to measure the Pauli Z operator on qubit 3 (among other measurements). But to evaluate YYYY, we need to measure the Pauli Y operator on that same qubit, qubit 3. There is an uncertainty relation between Y and Z operators on the same qubit; we cannot measure both of those operators at the same time. We will revisit this point below, and indeed throughout the course.
The Hamiltonian above is a matrix operator; for four qubits it is only 16 × 16. Diagonalizing the operator to find its lowest energy eigenvalue is not difficult.
import numpy as np
A = hamiltonian.to_matrix()
eigenvalues, eigenvectors = np.linalg.eigh(A)
print("The ground state energy is ", min(eigenvalues), "hartrees")
The ground state energy is -1.1459778447627311 hartrees
Brute-force classical eigensolvers cannot scale to describe the energies or geometries of very large systems of atoms, such as medications or proteins. VQE is one of the earliest attempts to leverage quantum computing for this problem.
Later in this lesson we will encounter Hamiltonians much larger than the one above. But it would be wasteful to push the limits of what VQE can do before we introduce, later in this course, some of the more advanced tools that can augment or replace VQE.
1.2 Ansatz
The word "ansatz" is German for "approach". The correct plural in German is "Ansätze", though one often sees "ansatzes" or "ansatze". In the context of VQE, an ansatz is the quantum circuit you use to create a multi-qubit wave function that most closely approximates the ground state of the system you are studying, and which thus produces the lowest expectation value of your operator. This quantum circuit contains variational parameters, often collected together in the vector of variables $\vec{\theta}$.

An initial set of values of the variational parameters is chosen. We will call the unitary operation of the ansatz on the circuit $U(\vec{\theta})$. By default, all qubits in IBM® quantum computers are initialized to the state $|0\rangle$. When the circuit is run, the state of the qubits will be

$$|\psi(\vec{\theta})\rangle = U(\vec{\theta})\,|0\rangle^{\otimes n}$$

If all we needed were the lowest energy (using the language of physical systems), we could estimate this by simply measuring the energy many times and taking the lowest. But we typically also want the configuration that yields that lowest energy or eigenvalue. So the next step is the estimation of the expectation value of the Hamiltonian, which is accomplished through quantum measurements. A lot goes into that. But we can understand this process qualitatively by noting that the probability $p_i$ of measuring an energy $E_i$ (again using the language of physical systems) is related to the expectation value by:

$$\langle \hat{H} \rangle = \langle \psi(\vec{\theta})|\hat{H}|\psi(\vec{\theta})\rangle = \sum_i p_i E_i$$

The probability $p_i$ is also related to the overlap between the eigenstate $|\phi_i\rangle$ and the current state of the system $|\psi(\vec{\theta})\rangle$:

$$p_i = |\langle \phi_i|\psi(\vec{\theta})\rangle|^2$$
So by making many measurements of the Pauli operators making up our Hamiltonian, we can estimate the Hamiltonian's expectation value in the current state of the system $|\psi(\vec{\theta})\rangle$. The next step is to vary the parameters $\vec{\theta}$ and try to more closely approach the lowest-energy (ground) state of the system. Because of the variational parameters in the ansatz, one sometimes hears it referred to as the variational form.
Before we move on to that variational process, note that it is often useful to start your state from a "good guess" state. You might know enough about your system to make a better initial guess than $|0\rangle^{\otimes n}$. For example, it is common to initialize qubits to the Hartree-Fock state in chemical applications. This starting guess, which does not contain any variational parameters, is called the reference state. Let us call the quantum circuit used to create the reference state $U_R$, and the remaining, parametrized part of the ansatz $U_V(\vec{\theta})$. Whenever it becomes important to distinguish the reference state from the rest of the ansatz, use:

$$|\psi(\vec{\theta})\rangle = U_V(\vec{\theta})\,U_R\,|0\rangle^{\otimes n}$$

Equivalently, $|\psi(\vec{\theta})\rangle = U_V(\vec{\theta})\,|\rho\rangle$, where $|\rho\rangle = U_R\,|0\rangle^{\otimes n}$ is the reference state.
1.3 Estimator
We need a way to estimate the expectation value of our Hamiltonian in a particular variational state $|\psi(\vec{\theta})\rangle$. If we could directly measure the entire operator $\hat{H}$, this would be as simple as making many (say $N$) measurements and averaging the measured values $\lambda_k$:

$$\langle \hat{H} \rangle \approx \frac{1}{N} \sum_{k=1}^{N} \lambda_k$$

Here, the symbol $\approx$ reminds us that this expectation value would only be precisely correct in the limit as $N \to \infty$. But with thousands of measurements being made on a circuit, the sampling error of the expectation value is fairly low. There are other considerations, such as noise, that become an issue for very precise calculations.
However, it is generally not possible to measure all of $\hat{H}$ at once. $\hat{H}$ may contain multiple non-commuting Pauli X, Y, and Z operators. So the Hamiltonian must be broken up into groups of operators that can be simultaneously measured; each such group must be estimated separately, and the results combined to obtain an expectation value. We will revisit this in greater detail in the next lesson, when we discuss the scaling of classical and quantum approaches. This complexity in measurement is one reason we need highly efficient code for carrying out such estimation. In this lesson and beyond, we will use the Qiskit Runtime primitive Estimator for this purpose.
1.4 Classical optimizers
A classical optimizer is any classical algorithm designed to find extrema of a target function (typically a minimum). They search through the space of possible parameters looking for a set that minimizes some function of interest. They can be broadly categorized into gradient-based methods, which utilize gradient information, and gradient-free methods, which operate as black-box optimizers. The choice of classical optimizer can significantly impact an algorithm's performance, especially in the presence of noise in quantum hardware. Popular optimizers in this field include Adam, AMSGrad, and SPSA, which have shown promising results in noisy environments. More traditional optimizers include COBYLA and SLSQP.
A common workflow (demonstrated in Section 3.3) is to use one of these algorithms as the method inside a minimizer like SciPy's minimize function. This takes as its arguments:
- Some function to be minimized. This is often the energy expectation value. But these are generally referred to as "cost functions".
- A set of parameters from which to begin the search. Often called $x_0$ or $\vec{\theta}_0$.
- Arguments, including arguments of the cost function. In quantum computing with Qiskit, these arguments will include the ansatz, the Hamiltonian, and the estimator, which is discussed more in the next subsection.
- A 'method' of minimization. This refers to the specific algorithm used to search the parameter space. This is where we would specify, for example, COBYLA or SLSQP.
- Options. The options available may differ by method. But an example which practically all methods would include is the maximum number of iterations of the optimizer before ending the search: 'maxiter'.

At each iterative step, the expectation value of the Hamiltonian is estimated by making many measurements. This estimated energy is returned by the cost function, and the minimizer updates the information it has about the energy landscape. Exactly what the optimizer does to choose the next step varies from method to method. Some use gradients and select the direction of steepest descent. Others may take noise into account and may require that the cost decrease by a large margin before accepting that the true energy decreases along that direction.
# Example syntax for minimization
# from scipy.optimize import minimize
# res = minimize(cost_func, x0, args=(ansatz, hamiltonian, estimator), method="cobyla", options={'maxiter': 200})
1.5 The variational principle
In this context the variational principle is very important; it states that no variational wave function can yield an energy (or cost) expectation value lower than that yielded by the ground state wave function. Mathematically,

$$\langle \psi(\vec{\theta})|\hat{H}|\psi(\vec{\theta})\rangle \geq E_0$$

where $E_0$ is the lowest eigenvalue of $\hat{H}$. This is easy to verify if we note that the set of all eigenstates $|\phi_i\rangle$ of $\hat{H}$ form a complete basis for the Hilbert space. In other words, any state, and in particular $|\psi(\vec{\theta})\rangle$, can be written as a weighted (normalized) sum of these eigenstates of $\hat{H}$:

$$|\psi(\vec{\theta})\rangle = \sum_i c_i\,|\phi_i\rangle$$

where the $c_i$ are constants to be determined, and $\sum_i |c_i|^2 = 1$. We leave this as an exercise to the reader. But note the implication: the variational state that produces the lowest-energy expectation value is the best estimate of the true ground state.
Check your understanding
Read the question below, think about your answer, then click the triangle to reveal the solution.
Verify mathematically that $\langle \psi(\vec{\theta})|\hat{H}|\psi(\vec{\theta})\rangle \geq E_0$ (the ground-state eigenvalue) for any variational state $|\psi(\vec{\theta})\rangle$.
Answer:
Using the given expansion of the variational state in terms of the energy eigenstates,

$$|\psi(\vec{\theta})\rangle = \sum_i c_i\,|\phi_i\rangle$$

we can write the variational energy expectation value as

$$\langle \psi(\vec{\theta})|\hat{H}|\psi(\vec{\theta})\rangle = \sum_i |c_i|^2 E_i$$

For all coefficients, $|c_i|^2 \geq 0$ and $E_i \geq E_0$. So we can write

$$\sum_i |c_i|^2 E_i \geq E_0 \sum_i |c_i|^2 = E_0$$
2. Comparison with classical workflow
Let's say we are interested in a matrix $M$ with $N$ rows and $N$ columns. Suppose your matrix is so large that exact diagonalization is not an option. Suppose further that you know enough about your problem that you can make some guesses about the overall structure of the target eigenstate, and you want to probe states similar to your initial guess to see if your cost/energy can be lowered further. This is a variational approach, and it is one method that is used when exact diagonalization is not an option.
2.1 Classical workflow
Using a classical computer, this would work as follows:
- Make a guess state, with some parameters that you will vary: $|\psi(\vec{\theta})\rangle$. Although this initial guess could be random, that is not advisable. We want to use knowledge of the problem at hand to tailor our guess as much as possible.
- Calculate the expectation value of the operator with the system in that state: $\langle M \rangle = \langle \psi(\vec{\theta})|M|\psi(\vec{\theta})\rangle$
- Alter the variational parameters and repeat: $\vec{\theta} \rightarrow \vec{\theta}'$.
- Use accumulated information about the landscape of possible states in your variational subspace to make better and better guesses and approach the target state. The variational principle guarantees that our variational state cannot yield an eigenvalue lower than that of the target ground state. So the lower the expectation value, the better our approximation of the ground state: $\langle \psi(\vec{\theta})|M|\psi(\vec{\theta})\rangle \geq E_0$
Let us examine the difficulty of each step in this approach. Setting or updating parameters is computationally easy; the difficulty there is in selecting useful, physically motivated initial parameters. Using accumulated information from prior iterations to update parameters in such a way that you approach the ground state is non-trivial. But classical optimization algorithms exist that do this quite efficiently. This classical optimization is only expensive because it may require many iterations; in the worst case, the number of iterations may scale exponentially with $N$. The most computationally expensive single step is almost certainly calculating the expectation value of your matrix using a given state $|\psi(\vec{\theta})\rangle$:

$$\langle M \rangle = \langle \psi(\vec{\theta})|M|\psi(\vec{\theta})\rangle$$

The $N \times N$ matrix must act on the $N$-element vector, which corresponds to $N^2$ multiplication operations in the worst case. This must be done at each iteration of parameters. For extremely large matrices, this has high computational cost.
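To make that cost concrete, here is a small NumPy sketch of the classical expectation-value step; the Hermitian matrix and trial state are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(7)
N = 8

# Random Hermitian (real symmetric) matrix standing in for the operator M
M = rng.standard_normal((N, N))
M = (M + M.T) / 2

# Normalized trial state standing in for |psi(theta)>
v = rng.standard_normal(N)
v = v / np.linalg.norm(v)

# The matrix-vector product M @ v costs O(N^2) multiplications, and it must
# be repeated at every iteration of the optimizer
expectation = v @ (M @ v)

# The variational principle in action: the result never lies below the
# smallest eigenvalue of M
print(expectation >= np.linalg.eigvalsh(M)[0])  # True
```

Doubling $N$ quadruples the work of this single step, which is what makes very large matrices intractable classically.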
2.2 Quantum workflow and commuting Pauli groups
Now imagine relegating this portion of the calculation to a quantum computer. Instead of calculating this expectation value, you estimate it by preparing the state on the quantum computer using your variational ansatz, and then making measurements.
That may sound easier than it is. $\hat{H}$ is generally not easy to measure. For example, it could be made up of many non-commuting Pauli X, Y, and Z operators. But $\hat{H}$ can be written as a linear combination of terms, each of which is easily measurable (for example, Pauli operators or groups of qubit-wise commuting Pauli operators):

$$\hat{H} = \sum_\alpha w_\alpha \hat{P}_\alpha$$

where $\hat{P}_\alpha$ is a Pauli string like IZZX…XIYX, or several such strings that commute with each other, and the $w_\alpha$ are weights. The expectation value of $\hat{H}$ over some state $|\psi\rangle$ is the weighted sum of expectation values of the constituent terms $\hat{P}_\alpha$. This expression holds for any state $|\psi\rangle$, but we will specifically be using this with our variational states $|\psi(\vec{\theta})\rangle$. So a description of the expectation value that more closely matches the realities of measurement on quantum computers is

$$\langle \hat{H} \rangle = \sum_\alpha w_\alpha \langle \hat{P}_\alpha \rangle$$

And in the context of our variational wave function:

$$\langle \hat{H} \rangle_{\vec{\theta}} = \sum_\alpha w_\alpha \langle \psi(\vec{\theta})|\hat{P}_\alpha|\psi(\vec{\theta})\rangle$$
Each of the terms $\langle \hat{P}_\alpha \rangle$ can be measured $N$ times, yielding measurement samples $\lambda_{\alpha,k}$ with $k = 1, \ldots, N$, and returns an expectation value $\langle \hat{P}_\alpha \rangle$ and a standard deviation $\sigma_\alpha$. We can sum these terms and propagate errors through the sum to obtain an overall expectation value $\langle \hat{H} \rangle$ and standard deviation $\sigma$.
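Assuming the per-term estimates are statistically independent, this combination step can be sketched with NumPy. The weights, means, and standard deviations below are made-up numbers for illustration only:

```python
import numpy as np

# Hypothetical per-term results: weights w_alpha, estimated expectation
# values <P_alpha>, and their standard deviations sigma_alpha
weights = np.array([0.5, -0.3, 0.8])
means = np.array([0.92, -0.41, 0.15])
stds = np.array([0.02, 0.03, 0.02])

# The weighted sum gives the overall expectation value of H
expval = np.sum(weights * means)

# For independent estimates, the variances of the weighted terms add,
# so the overall standard deviation is the root of the summed variances
sigma = np.sqrt(np.sum((weights * stds) ** 2))
print(expval, sigma)
```

Note that the overall uncertainty shrinks as each term is measured more times, since each $\sigma_\alpha$ scales like $1/\sqrt{N}$.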