Get started with qsimcirq

View on QuantumAI Run in Google Colab View source on GitHub Download notebook

The qsim library provides a Python interface to Cirq in the qsimcirq PyPI package.

Setup

Install the Cirq and qsimcirq packages:

try:
    import cirq
except ImportError:
    !pip install -q cirq --quiet
    import cirq

try:
    import qsimcirq
except ImportError:
    !pip install -q qsimcirq --quiet
    import qsimcirq

Simulating Cirq circuits with qsim is easy: just define the circuit as you normally would, then create a QSimSimulator to perform the simulation. This object implements Cirq's simulator.py interfaces, so you can drop it in anywhere the basic Cirq simulator is used.

Full state-vector simulation

qsim is optimized for computing the final state vector of a circuit. Try it by running the example below.

# Define qubits and a short circuit.
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit(cirq.H(q0), cirq.CX(q0, q1))
print("Circuit:")
print(circuit)
print()

# Simulate the circuit with Cirq and return the full state vector.
print('Cirq results:')
cirq_simulator = cirq.Simulator()
cirq_results = cirq_simulator.simulate(circuit)
print(cirq_results)
print()

# Simulate the circuit with qsim and return the full state vector.
print('qsim results:')
qsim_simulator = qsimcirq.QSimSimulator()
qsim_results = qsim_simulator.simulate(circuit)
print(qsim_results)
Circuit:
0: ───H───@───
          │
1: ───────X───

Cirq results:
measurements: (no measurements)
output vector: 0.707|00⟩ + 0.707|11⟩

qsim results:
measurements: (no measurements)
output vector: 0.707|00⟩ + 0.707|11⟩

To sample from this state, you can invoke Cirq's sample_state_vector method:

samples = cirq.sample_state_vector(
    qsim_results.state_vector(), indices=[0, 1], repetitions=10)
print(samples)
[[1 1]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [1 1]]

Measurement sampling

qsim also supports sampling from user-defined measurement gates.

# Define a circuit with measurements.
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit(
    cirq.H(q0), cirq.X(q1), cirq.CX(q0, q1),
    cirq.measure(q0, key='qubit_0'),
    cirq.measure(q1, key='qubit_1'),
)
print("Circuit:")
print(circuit)
print()

# Simulate the circuit with Cirq and return just the measurement values.
print('Cirq results:')
cirq_simulator = cirq.Simulator()
cirq_results = cirq_simulator.run(circuit, repetitions=5)
print(cirq_results)
print()

# Simulate the circuit with qsim and return just the measurement values.
print('qsim results:')
qsim_simulator = qsimcirq.QSimSimulator()
qsim_results = qsim_simulator.run(circuit, repetitions=5)
print(qsim_results)
Circuit:
0: ───H───@───M('qubit_0')───
          │
1: ───X───X───M('qubit_1')───

Cirq results:
qubit_0=10111
qubit_1=01000

qsim results:
Provided circuit has no intermediate measurements. Sampling repeatedly from final state vector.
qubit_0=11110
qubit_1=00001

The warning above highlights an important distinction between the simulate and run methods:

  • simulate only executes the circuit once.
    • Sampling from the resulting state is fast, but if there are intermediate measurements the final state vector depends on the results of those measurements.
  • run will execute the circuit once for each repetition requested.
    • As a result, sampling is much slower, but intermediate measurements are re-sampled for each repetition. If there are no intermediate measurements, run redirects to simulate for faster execution.

The warning goes away if intermediate measurements are present:

# Define a circuit with intermediate measurements.
q0 = cirq.LineQubit(0)
circuit = cirq.Circuit(
    cirq.X(q0)**0.5, cirq.measure(q0, key='m0'),
    cirq.X(q0)**0.5, cirq.measure(q0, key='m1'),
    cirq.X(q0)**0.5, cirq.measure(q0, key='m2'),
)
print("Circuit:")
print(circuit)
print()

# Simulate the circuit with qsim and return just the measurement values.
print('qsim results:')
qsim_simulator = qsimcirq.QSimSimulator()
qsim_results = qsim_simulator.run(circuit, repetitions=5)
print(qsim_results)
Circuit:
0: ───X^0.5───M('m0')───X^0.5───M('m1')───X^0.5───M('m2')───

qsim results:
m0=00110
m1=11000
m2=01101

Amplitude evaluation

qsim can also calculate amplitudes for specific output bitstrings.

# Define a simple circuit.
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit(cirq.H(q0), cirq.CX(q0, q1))
print("Circuit:")
print(circuit)
print()

# Simulate the circuit with qsim and return the amplitudes for |00) and |01).
print('Cirq results:')
cirq_simulator = cirq.Simulator()
cirq_results = cirq_simulator.compute_amplitudes(
    circuit, bitstrings=[0b00, 0b01])
print(cirq_results)
print()

# Simulate the circuit with qsim and return the amplitudes for |00) and |01).
print('qsim results:')
qsim_simulator = qsimcirq.QSimSimulator()
qsim_results = qsim_simulator.compute_amplitudes(
    circuit, bitstrings=[0b00, 0b01])
print(qsim_results)
Circuit:
0: ───H───@───
          │
1: ───────X───

Cirq results:
[0.70710677+0.j 0.        +0.j]

qsim results:
[(0.7071067690849304+0j), 0j]

Performance benchmark

The code below generates a depth-16 circuit on a 4x5 qubit grid, then runs it against the basic Cirq simulator. For a circuit of this size, the difference in runtime can be significant - try it out!

import time

# Get a rectangular grid of qubits.
qubits = cirq.GridQubit.rect(4, 5)

# Generates a random circuit on the provided qubits.
circuit = cirq.experiments.random_rotations_between_grid_interaction_layers_circuit(
    qubits=qubits, depth=16)

# Simulate the circuit with Cirq and print the runtime.
cirq_simulator = cirq.Simulator()
cirq_start = time.time()
cirq_results = cirq_simulator.simulate(circuit)
cirq_elapsed = time.time() - cirq_start
print(f'Cirq runtime: {cirq_elapsed} seconds.')
print()

# Simulate the circuit with qsim and print the runtime.
qsim_simulator = qsimcirq.QSimSimulator()
qsim_start = time.time()
qsim_results = qsim_simulator.simulate(circuit)
qsim_elapsed = time.time() - qsim_start
print(f'qsim runtime: {qsim_elapsed} seconds.')
Cirq runtime: 4.301793813705444 seconds.

qsim runtime: 0.13714051246643066 seconds.

qsim performance can be tuned further by passing options to the simulator constructor. These options use the same format as the qsim_base binary - a full description can be found in the qsim usage doc. The example below demonstrates enabling multithreading in qsim; for best performance, use the same number of threads as the number of cores (or virtual cores) on your machine.

# Use eight threads to parallelize simulation.
options = {'t': 8}

qsim_simulator = qsimcirq.QSimSimulator(options)
qsim_start = time.time()
qsim_results = qsim_simulator.simulate(circuit)
qsim_elapsed = time.time() - qsim_start
print(f'qsim runtime: {qsim_elapsed} seconds.')
qsim runtime: 0.037862539291381836 seconds.

Another option is to adjust the maximum number of qubits over which to fuse gates. Increasing this value (as demonstrated below) increases arithmetic intensity, which may improve performance with the right environment settings.

# Increase maximum fused gate size to three qubits.
options = {'f': 3}

qsim_simulator = qsimcirq.QSimSimulator(options)
qsim_start = time.time()
qsim_results = qsim_simulator.simulate(circuit)
qsim_elapsed = time.time() - qsim_start
print(f'qsim runtime: {qsim_elapsed} seconds.')
qsim runtime: 0.12808942794799805 seconds.

Advanced applications: Distributed execution

qsimh (qsim-hybrid) is a second library in the qsim repository that takes a slightly different approach to circuit simulation. When simulating a quantum circuit, it's possible to simplify the execution by decomposing a subset of two-qubit gates into pairs of one-qubit gates with shared indices. This operation is called "slicing" (or "cutting") the gates.

qsimh takes advantage of the "slicing" operation by selecting a set of gates to "slice" and assigning each possible value of the shared indices across a set of executors running in parallel. By adding up the results afterwards, the total state can be recovered.

# Pick a pair of qubits.
q0 = cirq.GridQubit(0, 0)
q1 = cirq.GridQubit(0, 1)

# Create a circuit that entangles the pair.
circuit = cirq.Circuit(
    cirq.H(q0), cirq.CX(q0, q1), cirq.X(q1)
)
print("Circuit:")
print(circuit)
Circuit:
(0, 0): ───H───@───────
               │
(0, 1): ───────X───X───

In order to let qsimh know how we want to split up the circuit, we need to pass it some additional options. More detail on these can be found in the qsim usage doc, but the fundamentals are explained below.

options = {}

# 'k' indicates the qubits on one side of the cut.
# We'll use qubit 0 for this.
options['k'] = [0]

# 'p' and 'r' control when values are assigned to cut indices.
# There are some intricacies in choosing values for these options,
# but for now we'll set p=1 and r=0.
# This allows us to pre-assign the value of the CX indices
# and distribute its execution to multiple jobs.
options['p'] = 1
options['r'] = 0

# 'w' indicates the value pre-assigned to the cut.
# This should change for each execution.
options['w'] = 0

# Create the qsimh simulator with those options.
qsimh_simulator = qsimcirq.QSimhSimulator(options)
results_0 = qsimh_simulator.compute_amplitudes(
    circuit, bitstrings=[0b00, 0b01, 0b10, 0b11])
print(results_0)
[0j, (0.7071067690849304+0j), 0j, 0j]

Now to run the other side of the cut...

options['w'] = 1

qsimh_simulator = qsimcirq.QSimhSimulator(options)
results_1 = qsimh_simulator.compute_amplitudes(
    circuit, bitstrings=[0b00, 0b01, 0b10, 0b11])
print(results_1)
[0j, 0j, (0.7071067690849304+0j), 0j]

...and add the two together. The results of a normal qsim simulation are shown for comparison.

results = [r0 + r1 for r0, r1 in zip(results_0, results_1)]
print("qsimh results:")
print(results)

qsim_simulator = qsimcirq.QSimSimulator()
qsim_simulator.compute_amplitudes(circuit, bitstrings=[0b00, 0b01, 0b10, 0b11])
print("qsim results:")
print(results)
qsimh results:
[0j, (0.7071067690849304+0j), (0.7071067690849304+0j), 0j]
qsim results:
[0j, (0.7071067690849304+0j), (0.7071067690849304+0j), 0j]

The key point to note here is that results_0 and results_1 are completely independent - they can be run in parallel on two separate machines, with no communication between the two. Getting the full result requires 2^p executions, but each individual result is much cheaper to calculate than trying to do the whole circuit at once.