Instruction-level parallelism

Content sourced from Wikipedia, licensed under CC BY-SA 3.0.

Instruction-level parallelism (ILP) is the parallel or simultaneous execution of several instructions within a single thread of a program. More specifically, ILP is measured as the average number of instructions executed per step of that parallel execution.

ILP should not be confused with concurrency. ILP concerns a single thread of execution, while concurrency involves assigning multiple threads to a CPU, either by interleaving them on one core or by running them truly in parallel when enough cores are available.

There are two approaches to ILP: hardware and software. Hardware ILP is dynamic: the processor decides at run time which instructions to execute in parallel. Software ILP is static: the compiler decides in advance which instructions can run together. Modern x86 CPUs use hardware ILP techniques, while the Itanium architecture relied more on software ILP.
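To make the software approach concrete, here is a minimal sketch of a static scheduler that greedily packs instructions into fixed-width groups ahead of time. It is an illustration under simplifying assumptions (unit latency, a hypothetical issue width, made-up instruction names), not a description of how any real compiler or the Itanium toolchain works.

```python
def schedule_bundles(program, deps, width):
    """Greedily pack instructions into bundles of at most `width` entries.

    program: instruction names in their original order (hypothetical names).
    deps: maps an instruction to the instructions whose results it needs.
    An instruction may join a bundle only if everything it depends on
    was placed in an earlier bundle.
    """
    bundles = []
    done = set()
    remaining = list(program)
    while remaining:
        bundle = []
        for instr in remaining:
            if len(bundle) == width:
                break
            if all(d in done for d in deps.get(instr, [])):
                bundle.append(instr)
        if not bundle:
            raise ValueError("cyclic or unsatisfiable dependencies")
        # Results become visible only after the whole bundle has issued.
        done.update(bundle)
        remaining = [i for i in remaining if i not in bundle]
        bundles.append(bundle)
    return bundles


program = ["load_a", "load_b", "add", "load_c", "mul"]
deps = {"add": ["load_a", "load_b"], "mul": ["add", "load_c"]}
print(schedule_bundles(program, deps, width=2))
# [['load_a', 'load_b'], ['add', 'load_c'], ['mul']]
```

The plan is fixed before the program runs, which is the defining trait of the static approach; a hardware scheduler would make the equivalent grouping decisions cycle by cycle at run time.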

Example: Suppose instruction 3 depends on the results of instructions 1 and 2. Instruction 3 cannot run until both have finished, but instructions 1 and 2 are independent of each other and can run at the same time. If each instruction takes one time unit, all three complete in two units, giving an ILP of 3/2.
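The arithmetic in this example can be checked with a short sketch. It assumes every instruction takes one time unit and that hardware resources are unlimited, so the number of steps equals the length of the longest dependency chain; the function name `ilp` and the instruction labels are illustrative only.

```python
def ilp(deps):
    """Return (instruction_count, steps, ilp) for a dependency graph.

    deps maps each instruction name to the list of instructions whose
    results it needs. Assumes unit latency and unlimited hardware.
    """
    finish = {}  # instruction -> time step at which it completes

    def finish_time(instr):
        if instr not in finish:
            # An instruction starts once its slowest input has finished.
            finish[instr] = 1 + max((finish_time(d) for d in deps[instr]), default=0)
        return finish[instr]

    steps = max(finish_time(i) for i in deps)
    return len(deps), steps, len(deps) / steps


# The three-instruction example: instruction 3 needs 1 and 2.
example = {"i1": [], "i2": [], "i3": ["i1", "i2"]}
count, steps, value = ilp(example)
print(count, steps, value)  # 3 2 1.5
```

Three instructions finishing in two steps gives 3/2 = 1.5, matching the figure in the text.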

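A goal of compiler and processor designers is to identify and exploit as much ILP as possible. Most programs are written under a sequential model, in which instructions execute one after another in the order the programmer specified, but ILP lets the compiler and the processor overlap the execution of instructions, or reorder them, wherever doing so preserves the program's results.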

How much ILP is available depends on the application. Graphics and scientific computing workloads often have a great deal of ILP, while workloads such as cryptography may have much less.
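The contrast can be sketched with two idealized dependency patterns. The lists below are stand-ins chosen for illustration, not real graphics or cryptographic code: one models element-wise work where no instruction needs another's result, the other models a chain in which each round consumes the previous round's output. Unit latency and unlimited hardware are assumed.

```python
def ilp_in_order(program):
    """ILP of a program given as (name, dependencies) pairs in program order.

    Assumes unit latency, unlimited hardware, and that every dependency
    appears earlier in the list.
    """
    finish = {}
    for name, deps in program:
        finish[name] = 1 + max((finish[d] for d in deps), default=0)
    return len(program) / max(finish.values())


n = 8
# Element-wise work, as in a vector add: every instruction is independent.
independent = [(f"add{i}", []) for i in range(n)]
# Chained work, as in an iterated hash: each round needs the previous round.
chained = [(f"round{i}", [f"round{i - 1}"] if i else []) for i in range(n)]

print(ilp_in_order(independent))  # 8.0
print(ilp_in_order(chained))      # 1.0
```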

Compiler techniques for extracting ILP include instruction scheduling, register allocation and renaming, and memory-access optimization. Dataflow architectures are a class of architectures in which ILP is specified explicitly; the TRIPS architecture is one example.
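Register renaming is the easiest of these techniques to show in a few lines. The sketch below assumes a toy instruction format of (destination, sources) pairs and an unlimited supply of fresh names `p0`, `p1`, and so on; both are assumptions made for illustration. Giving every write a fresh register removes false dependencies that arise only because a register name was reused.

```python
from itertools import count


def rename(instructions):
    """Rename destination registers so each write gets a fresh name.

    instructions: list of (dest, sources) pairs using architectural
    register names. Returns the same list rewritten with fresh names.
    Registers that are read before any write keep their original name.
    """
    fresh = count()  # source of new register numbers
    current = {}     # architectural name -> latest fresh name
    renamed = []
    for dest, sources in instructions:
        # Reads use whichever register holds the latest value.
        new_sources = tuple(current.get(s, s) for s in sources)
        # Each write gets a brand-new register.
        current[dest] = f"p{next(fresh)}"
        renamed.append((current[dest], new_sources))
    return renamed


program = [
    ("r1", ("r2", "r3")),  # r1 = r2 op r3
    ("r4", ("r1", "r5")),  # r4 = r1 op r5  (true dependency on r1)
    ("r1", ("r6", "r7")),  # r1 = r6 op r7  (reuses r1: false dependency)
    ("r8", ("r1", "r9")),  # r8 = r1 op r9  (needs the second r1)
]
for instr in rename(program):
    print(instr)
# ('p0', ('r2', 'r3'))
# ('p1', ('p0', 'r5'))
# ('p2', ('r6', 'r7'))
# ('p3', ('p2', 'r9'))
```

After renaming, the third instruction writes `p2` rather than overwriting the register the second instruction still needs to read, so the first and third instructions no longer conflict and could execute in the same step.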

ILP techniques have delivered performance gains despite the growing gap between processor speed and memory access time. However, a cache miss that goes to main memory can cost hundreds of cycles, and hiding delays that long with ILP alone demands disproportionate hardware complexity and power. For that reason, the industry is moving toward higher levels of parallelism, such as multiprocessing and multithreading, rather than relying on ILP alone.

This page was last edited on 3 February 2026, at 01:26 (CET).