manage_search
Case Study

A Quantum Speedup for String Matching

Last updated: January 18, 2024
Pradeep Niroula, Yunseong Nam Quantum Theory Lead

https://images.ctfassets.net/hqm865gc1xfs/2jeaegfiQ5JYDEEvmiXltv/50bf79b09a6208baa0e0e2c359622ada/2021-04-01-string-matching.jpeg

Recently, we published a paper in Nature’s npj Quantum Information that describes a quantum string matching algorithm. This algorithm is faster than known classical algorithms and also avoids the shortcomings of known quantum algorithms.

String and pattern matching algorithms are ubiquitous in information processing. They are used in analyzing bioinformatics data, image processing, detecting patterns in large datasets and more.

In its most basic form, a string matching algorithm looks for a search text in a longer document. This can be easily generalized to two dimensions to search images and also to higher dimensions to search for higher dimensional patterns.

In this paper, we focus on exact string matching, in which the pattern has to match the text exactly. Although we don’t demonstrate this in the paper, it’s also possible to extend the algorithm to do a fuzzy search or even search with wildcards with the same speedup.

Our paper exploits one of the core techniques in quantum algorithms called Grover’s search, which lets you search through a database quadratically faster than a classical search algorithm. It is, however, not always clear how to construct the necessary ‘oracles’ or black box functions that Grover’s algorithm needs. We have known about quantum algorithms for string matching with similar speedups as early as 2003, but these earlier algorithms relied on a technique called a “memory oracle,” which allows you to load data from memory onto the quantum processing unit, much like how the CPU in a classical computer fetches data from the RAM1 .

The central idea behind this paper is the observation that you can do away with such memory oracles using a network of SWAP gates. This is a valuable advancement, because while we have blueprints for how to build a quantum random access memory, their feasibility is not well understood and the cost of initializing such a memory is expected to overwhelm any quantum advantage. It would, therefore, be helpful to have algorithms that didn’t resort to such random access memories.

Ultimately, while it’s possible to demonstrate our algorithm on existing current hardware, the quadratic speedup of this algorithm will only materialize for extremely large datasets. It will be a while before we build quantum computers capable of loading that much data.

That said, we’re hopeful that our algorithm will be able to accelerate various string matching tasks used across industry and research sooner than we might think. We are also hopeful that in the near-term the technique we use to obviate the need of a memory oracle can help design new algorithms and improve older ones.

1 Loading classical data onto the quantum processor is a common challenge with many quantum algorithms, and algorithms often resort to this sort of quantum random access memory as an efficient way to solve this problem.