Diving into V8

March 9, 2026

Part 1. History

Introduction

In the previous article we talked a lot about Node.js: how its Event Loop works and what role the libuv library plays in it.

Now it's time to talk about what unites Node.js and browsers. That would be V8 — the engine that executes JavaScript.

In this deep-dive we'll cover the main components of V8, interesting features, and internal mechanisms. We'll try to look inside it, and also analyze JavaScript code execution through profiling. But let's start with history.

The Origins of V8

V8 is a high-performance open-source engine developed to execute JavaScript. It was introduced by Google in 2008 alongside the first release of the Chrome browser.

Today V8 is used in Chrome and Node.js. In the latter, it allows executing JavaScript outside the browser and accessing system capabilities.

The name "V8", by the way, is a reference to the eight-cylinder internal combustion engine — and indeed, this engine became one of the fastest thanks to hybrid compilation, numerous optimizations, and an efficient memory management system.

JIT Compilation and Bytecode

Everything that runs on a computer ultimately turns into machine code — a sequence of zeros and ones that the processor understands. This is a low-level language specific to the device's architecture (e.g., x86 or ARM).

Developers, of course, don't write in machine code but in high-level languages. For such code to be executed, it must go through a series of transformations. There are different approaches to this process:

  • Interpretation. Code is executed line by line, without prior conversion to machine code. This approach is simple and flexible, but slow.
  • AOT compilation (ahead-of-time). The program is fully compiled to machine code in advance, before launch. This gives high execution speed but reduces flexibility.
  • JIT compilation (just-in-time). An intermediate approach: code is compiled to machine code during execution, as needed.

Before V8, most JavaScript engines worked as interpreters. Code was executed line by line, which significantly limited performance. For example, Mozilla's SpiderMonkey used exactly this model.

As dynamic web applications grew in popularity, a race for speed began. WebKit introduced the SquirrelFish Extreme project, which already used JIT compilation, but its architecture was less performant and versatile.

The revolutionary aspect of V8 was that it was the first to compile JavaScript directly to machine code, bypassing an intermediate bytecode stage. This provided a significant performance boost compared to both interpreters and early JIT approaches.

Bytecode is an intermediate code created from the source code. It is processor-architecture-independent and isn't executed directly, but can be interpreted or serve as the basis for compilation to machine code.

Over time it turned out that the absence of bytecode made memory management harder and slowed down startup, especially when executing code that is called only once. So in 2016 V8 introduced Ignition — an interpreter that converts JavaScript to bytecode and executes it. This allowed faster program startup.

Today, JIT compilation in V8 works as follows:

  1. JavaScript code is parsed into an Abstract Syntax Tree (AST).
  2. The AST is compiled to bytecode by the Ignition interpreter.
  3. The bytecode is executed. Meanwhile, V8 tracks which sections of code repeat most often.
  4. "Hot" code is passed to the compiler, which transforms it into machine code.
  5. If the program's behavior changes (for example, data types become different), deoptimization may occur — a return to bytecode interpretation.

This approach achieves a balance: start executing as quickly as possible, then speed up the critical sections of code.
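The hot-path and deoptimization steps above can be illustrated with ordinary JavaScript. The function name and iteration count here are arbitrary, and the engine's internal decisions can't be observed directly from the code; the comments describe what V8 is typically doing behind the scenes.

```javascript
// A function V8 can optimize as long as the operand types stay stable.
function add(a, b) {
  return a + b;
}

// "Warm-up": repeated calls with numbers let the profiler mark add() as hot
// and speculate that both operands are always numbers.
let sum = 0;
for (let i = 0; i < 100000; i++) {
  sum += add(i, 1);
}

// A call with a different type violates that speculation: the optimized code
// bails out (deoptimization) and execution falls back to the bytecode.
const mixed = add('4', 2); // string concatenation, not numeric addition

console.log(sum, mixed); // 5000050000 '42'
```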

Additionally, V8 uses hidden classes, inline caches, and advanced memory management algorithms, which allows running complex web applications with performance close to native.
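The effect of hidden classes and inline caches can be sketched at the JavaScript level. The shapes themselves are invisible from code; the comments state the commonly documented behavior.

```javascript
// Objects created with the same properties in the same order share one
// hidden class, so the property accesses below stay monomorphic and
// inline caches can serve them with a single fast path.
function makePoint(x, y) {
  return { x, y }; // same shape every time
}

function lengthSquared(p) {
  return p.x * p.x + p.y * p.y; // this access site sees one hidden class
}

const a = makePoint(3, 4);
const b = makePoint(6, 8);
console.log(lengthSquared(a), lengthSquared(b)); // 25 100

// Declaring properties in a different order produces a different hidden
// class, which can make the same access site polymorphic and slower.
const c = { y: 8, x: 6 };
console.log(lengthSquared(c)); // still 100, but through a different shape
```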

Development Milestones

V8 appeared in 2008 alongside the Chrome browser. Its development was led by Lars Bak. Early versions used a simple parser and compiler. The Strongtalk assembler was also used at that time.

Later the engine evolved as follows:

  • 2010 — the introduction of Crankshaft, an optimizing compiler that significantly increased execution speed.
  • 2015 — transition to TurboFan, which was more versatile and improved support for new JavaScript features.
  • 2016 — Ignition was added, an interpreter that executes bytecode. This sped up startup and reduced launch overhead.
  • 2021 — Sparkplug appeared, a compiler aimed at closing the gap between interpretation and optimization.
  • 2023 — Maglev was added, a compiler with a focus on speed and efficient memory usage.

Alternatives to V8

Other JavaScript engines exist, used in different browsers and systems:

  • JavaScriptCore (JSC) — Apple's engine, which powers Safari. It focuses not only on speed but also on energy efficiency, which is especially important for mobile devices.
  • SpiderMonkey — Mozilla Firefox's engine. One of the first engines for JavaScript, with support for all modern standards.

V8's main distinguishing feature is its focus on JIT compilation and adaptation to modern web applications. Its widespread adoption through Chrome and Node.js has made it the de facto standard in the world of frontend development.


Part 2. What the Engine Is Made Of

Introduction

In the first part we covered the history of V8's creation and development, as well as its differences from other JavaScript engines. We mentioned that V8 consists of several components, and in this part we'll examine each of them in more detail. We'll look at how they work, in what order they operate, and what ideas underlie their architecture.

Full-codegen and V8's First Steps

Originally, V8 was introduced in the official Chromium blog as an engine that compiles JavaScript directly to machine code. Along with this it introduced a hidden class system that is still used today. The initial compiler was fairly simple, didn't support serious optimizations, and treated all code the same way. It later became known as Full-codegen.

Crankshaft and Adaptive Optimizations

The next stage was the Crankshaft compiler, whose name continues the automotive theme running throughout V8. The announcement of Crankshaft was accompanied by a notable performance improvement.

Crankshaft first introduced the adaptive compilation strategy: dividing code into "hot" code, which runs frequently and warrants deep optimization, and "cold" code, which can be left as-is. This explains why benchmarks without a "warm-up" phase didn't show significant performance gains: optimizations were only applied over time.

Key techniques used by Crankshaft:

  • SSA (Static Single Assignment form) — a code representation where each variable is assigned a value only once. This simplifies dependency analysis and enables other optimizations. For example:
  let x = 1;
  x = x + 2;
  x = x * 3;

In SSA this looks like:

  let x1 = 1;
  let x2 = x1 + 2;
  let x3 = x2 * 3;
  • Loop-invariant code motion — moving expressions that don't change inside a loop outside of it. For example, arr.length doesn't need to be recalculated on every iteration.
  • Linear-scan register allocation — distributing variables between registers and memory with the goal of keeping as many active values as possible in CPU registers, where access is faster.
  • Inlining — embedding the bodies of small functions instead of calling them, which reduces the overhead of "jumping."
  • Type assumptions. JavaScript is a dynamic language and types aren't known in advance. So the engine collects statistics: for example, a variable is most often a number. This allows more efficient optimizations to be applied.
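The loop-invariant code motion from the list above can be written out by hand. The transformation shown is the one an optimizing compiler such as Crankshaft performed automatically; the function names are invented for illustration.

```javascript
// Before: arr.length is conceptually re-read on every iteration.
function sumBefore(arr) {
  let total = 0;
  for (let i = 0; i < arr.length; i++) total += arr[i];
  return total;
}

// After: the invariant expression is hoisted out of the loop, so the
// loop body only does the work that actually changes per iteration.
function sumAfter(arr) {
  let total = 0;
  const n = arr.length; // hoisted: does not change inside the loop
  for (let i = 0; i < n; i++) total += arr[i];
  return total;
}

console.log(sumBefore([1, 2, 3]), sumAfter([1, 2, 3])); // 6 6
```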

TurboFan: Multi-Stage Optimization

Later, Crankshaft was replaced by TurboFan, which is still used in V8 today. The main difference is a more powerful and versatile approach to generating optimized machine code.

  • TurboFan uses a dependency graph (sea of nodes), an intermediate representation where operations are represented as nodes and the dependencies between them as edges. This provides flexibility when reorganizing code and finding unused sections.
  • The compiler is built as a multi-stage pipeline: code passes through a series of transformations, with specific rules and analyses applied at each stage.
  • Instead of complex algorithms, local rewriting rules are used: x * 1 → x, x + 0 → x, if (true) → remove branching.
  • TurboFan can perform value range analysis, enabling even more precise optimizations.
  • Since code in graph form isn't tied to a strict order, TurboFan can freely move nodes: hoist unnecessary computations out of loops, move them to less frequently executed paths, and keep frequently executed paths maximally "clean."
  • TurboFan generates code targeting a specific processor architecture: it accounts for SIMD instructions, register and instruction characteristics, etc.
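The local rewriting rules mentioned above can be modeled with a toy rewriter. The node shape `{ op, left, right }` and the rule set are invented for this sketch; TurboFan applies such rules to its sea-of-nodes graph, not to a plain expression tree.

```javascript
// A toy version of local rewriting: each rule looks at a single node and
// replaces it with a simpler equivalent, bottom-up.
function simplify(node) {
  if (typeof node === 'number') return node; // constants stay as-is
  const left = simplify(node.left);
  const right = simplify(node.right);
  if (node.op === '*' && right === 1) return left; // x * 1 -> x
  if (node.op === '+' && right === 0) return left; // x + 0 -> x
  return { op: node.op, left, right };
}

// (5 + 0) * 1 collapses all the way down to the constant 5.
const tree = { op: '*', left: { op: '+', left: 5, right: 0 }, right: 1 };
console.log(simplify(tree)); // 5
```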

Ignition: Register Interpreter and Bytecode

Despite the efficiency of JIT compilers like Crankshaft and TurboFan, for some scenarios — especially on mobile devices — startup speed and low memory consumption were more important. So V8 introduced the Ignition interpreter, which is still used today.

Ignition works as a register machine.

With a stack machine approach (like the JVM), every operation works with a stack: "take two values from the stack → add them → put the result back on the stack." This looks simpler and more compact in the resulting bytecode, but adds a lot of execution overhead.

With a register machine, each instruction directly states which virtual registers to work with. This gives a speed advantage and simplifies further JIT compilation.

Ignition also has a special accumulator: it's used as an "implicit" register for most temporary values. For example, the expression a + b * c can store intermediate results directly in the accumulator, avoiding a lot of unnecessary loads and stores.
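A tiny accumulator machine makes this concrete. The instruction names and semantics below are a simplified sketch in the spirit of Ignition, not V8's real bytecode set: `Ldar` loads a register into the accumulator, while `Mul` and `Add` combine a register with the accumulator, keeping intermediate results out of the register file.

```javascript
// A toy accumulator machine: one implicit register (acc) carries the
// intermediate value from instruction to instruction.
function run(program, registers) {
  let acc = 0;
  for (const [op, reg] of program) {
    if (op === 'Ldar') acc = registers[reg];
    else if (op === 'Mul') acc = registers[reg] * acc;
    else if (op === 'Add') acc = registers[reg] + acc;
  }
  return acc;
}

// a + b * c with a=2, b=3, c=4: load c, multiply by b, add a.
const result = run(
  [['Ldar', 'c'], ['Mul', 'b'], ['Add', 'a']],
  { a: 2, b: 3, c: 4 }
);
console.log(result); // 14
```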

Additionally, simple optimizations are applied immediately during bytecode generation: pattern replacement, elimination of redundant operations, minimization of register movements. As a result, the bytecode is not only compact but also fast to execute.

Dropping Full-codegen and Crankshaft

Before Ignition and TurboFan, V8 used a Full-codegen and Crankshaft combination. Full-codegen handled basic code execution, while Crankshaft kicked in on repeated calls to apply optimizations. But over time it became clear that this architecture had serious drawbacks: Full-codegen generated too much machine code, even for code executed only once, and Crankshaft was too complex and poorly supported new JavaScript features.

Ignition and TurboFan made it possible to drop this combination. In 2017 the V8 team officially removed Full-codegen and Crankshaft. From that point on, code in V8 followed two scenarios:

  • If the code is "cold," it stays as bytecode and is executed by the Ignition interpreter.
  • If the code becomes "hot," its bytecode is profiled and passed to TurboFan, where it's compiled into highly optimized machine code.

This approach improved startup time when loading scripts and reduced memory consumption.

Sparkplug: The Baseline Compiler

Despite the efficiency of Ignition and TurboFan, in real applications there's a large layer of code that doesn't run frequently, but still runs enough to justify moving from interpretation to compilation. However, TurboFan is a heavy tool and it's impractical to apply it to everything. So Sparkplug was created — a lightweight baseline compiler that complements the existing pipeline.

Sparkplug compiles functions not from the original JavaScript but from already-generated bytecode. This means the work of parsing syntax and resolving variables and scopes has already been done. Sparkplug doesn't create intermediate representations like a dependency graph the way TurboFan does. Instead, it passes through the bytecode linearly and generates machine code directly. The entire compiler is essentially a large switch inside a for, where each bytecode instruction maps to a piece of ready-made generation logic.

It performs almost no optimizations, except local ones (e.g., it removes x + 0). But this isn't a problem, because it's not designed to achieve peak performance. Its job is to free the interpreter from overhead: decoding instructions, branch prediction, extracting operands from memory. Instead, Sparkplug simply "serializes" the interpreter's execution. It also actively uses built-in machine code fragments (code stubs), shared with Ignition, to avoid duplicating the complex implementation of JavaScript operations. Thanks to this, compilation is fast and memory is used economically.
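The "large switch inside a for" structure can be sketched directly. The opcodes and the string "emission" below are invented for illustration; Sparkplug emits real machine code and shares code stubs with Ignition rather than building strings.

```javascript
// A sketch of a baseline compiler's shape: one linear pass over the
// bytecode, where each opcode maps to a fixed piece of emission logic.
function compile(bytecode) {
  const out = [];
  for (const insn of bytecode) {
    switch (insn.op) {
      case 'LdaSmi': // load a small integer into the accumulator
        out.push(`mov acc, ${insn.value}`);
        break;
      case 'Star': // store the accumulator into a register
        out.push(`mov ${insn.reg}, acc`);
        break;
      case 'Return':
        out.push('ret');
        break;
      default:
        throw new Error(`unknown opcode ${insn.op}`);
    }
  }
  return out;
}

const code = compile([
  { op: 'LdaSmi', value: 42 },
  { op: 'Star', reg: 'r0' },
  { op: 'Return' },
]);
console.log(code.join('\n'));
```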

Maglev: Fast Optimizations Between Sparkplug and TurboFan

In 2023 the V8 team introduced Maglev — another JIT compiler that took an intermediate position between Sparkplug and TurboFan. It appeared because the performance gap between them was too large: Sparkplug compiles fast but barely optimizes, while TurboFan optimizes deeply but takes a long time to compile.

Maglev builds a dependency graph and uses an SSA-like representation (where each variable is assigned a value only once), simplified compared to TurboFan. This allows it to perform basic optimizations:

  • dead code elimination,
  • expression simplification,
  • branch restructuring,
  • instruction movement.

Maglev also uses profiling data and makes type assumptions (e.g., "this variable is always a number"), which allows inserting fast versions of operations without additional checks.

The key feature of Maglev is the balance between compilation speed and code quality. It compiles 10–20 times faster than TurboFan, while delivering comparable performance in most common scenarios. This allows it to be applied more frequently without risking overloading the system in terms of time or memory.

The Overall Code Execution Pipeline

Today, V8 runs code through the following multi-level pipeline:

**JS**
↓
**Ignition** (bytecode + interpretation for cold operations)
↓
**Sparkplug** (compiles baseline machine code from bytecode)
↓
**Maglev** (compiles moderately optimized machine code from bytecode)
↓
**TurboFan** (compiles highly optimized machine code from bytecode)

This approach allows efficient code execution without spending resources on rarely used sections, while achieving high performance where it truly matters.

However, these are far from all the components of the large V8 engine system. We'll talk about other parts in the following sections.

Part 3. Parsing, AST and Code Analysis

Introduction

In the previous parts we covered the history of V8 and the architecture of its Ignition interpreter and Sparkplug, Maglev, and TurboFan compilers. But before any of them can begin working, JavaScript code must go through several preparatory stages.

When the engine receives source code, the first thing it must do is understand what that code means. To do this, the string of characters is transformed into a structured representation that compilers can work with.

In this part we'll break down how V8 parses JavaScript code: from the first character to the finished bytecode that Ignition can execute.

Lexical Analysis

The first stage of processing any code is lexical analysis, or tokenization. Its job is to break the stream of characters into tokens — the smallest meaningful units of the language.

For example, the code:

const x = 42;

is turned into a sequence of tokens:

  • Token::CONST (keyword)
  • Token::IDENTIFIER with value "x"
  • Token::ASSIGN (the = operator)
  • Token::NUMBER with value 42
  • Token::SEMICOLON (the terminating semicolon)

V8 uses a scanner that reads the source code character by character and groups them into tokens.

It's important to note that the scanner doesn't work with a ready-made string but with a stream of characters. This allows it to process a script before the entire file has fully loaded over the network. Thanks to this, V8 can start scanning and preparing for parsing from the very first bytes of a loading script, without waiting for the transfer to complete. This is especially critical for large bundles or slow connections.

Handling Whitespace and Comments

Although spaces, tabs, and comments seem like "noise," they play an important role for V8. All such characters are classified as Token::WHITESPACE, and before processing a new token the engine skips them sequentially. However, there's one important detail: if a line break was encountered among the skipped whitespace, this can affect subsequent tokenization — especially due to automatic semicolon insertion rules.

function test() {
    return
    42;
}

Using the line-break information recorded by the scanner, the parser must understand that a semicolon should be inserted after return, and that the function will return undefined, not 42.
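This behavior is easy to verify, and also to avoid: parentheses keep the return expression inside the same statement. The function names here are arbitrary.

```javascript
function broken() {
  return
  42; // unreachable: ASI terminates the statement right after `return`
}

function fixed() {
  return (
    42 // the open parenthesis prevents semicolon insertion
  );
}

console.log(broken(), fixed()); // undefined 42
```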

Abstract Syntax Tree (AST)

After tokenization, syntactic analysis begins — the construction of an Abstract Syntax Tree (AST). An AST is a tree-like structure that reflects the syntactic structure of a program, but abstracts away the specific details of how it's written.

For example, the expression a + b * c becomes a tree:

    +
   / \
  a   *
     / \
    b   c

For this JavaScript example:

function add(a, b) {
    return a + b;
}

add(2, 3);

The AST representation for the add function will look like this:

--- AST ---
FUNC at 12
. KIND 0
. LITERAL ID 1
. SUSPEND COUNT 0
. NAME "add"
. INFERRED NAME ""
. PARAMS
. . VAR (0x11c00e33470) (mode = VAR, assigned = false) "a"
. . VAR (0x11c00e334f0) (mode = VAR, assigned = false) "b"
. DECLS
. . VARIABLE (0x11c00e33470) (mode = VAR, assigned = false) "a"
. . VARIABLE (0x11c00e334f0) (mode = VAR, assigned = false) "b"
. RETURN at 25
. . kAdd at 34
. . . VAR PROXY parameter[0] (0x11c00e33470) (mode = VAR, assigned = false) "a"
. . . VAR PROXY parameter[1] (0x11c00e334f0) (mode = VAR, assigned = false) "b"

And for the script root:

--- AST ---
FUNC at 0
. KIND 0
. LITERAL ID 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. DECLS
. . FUNCTION "add" = function add
. EXPRESSION STATEMENT at 42
. . kAssign at -1
. . . VAR PROXY local[0] (0x11c00e334e8) (mode = TEMPORARY, assigned = true) ".result"
. . . CALL
. . . . VAR PROXY unallocated (0x11c00e333e0) (mode = VAR, assigned = true) "add"
. . . . LITERAL 2
. . . . LITERAL 3
. RETURN at -1
. . VAR PROXY local[0] (0x11c00e334e8) (mode = TEMPORARY, assigned = true) ".result"

Lazy Parsing of Functions

To speed up script loading and save resources, V8 uses a lazy parsing strategy. Instead of immediately parsing all the code and building an AST for it, the engine can temporarily limit itself to a preliminary check of a function using the preparser.

The preparser is a lightweight syntactic analysis that:

  • Checks code validity (are there any syntax errors).
  • Tracks variable declarations and references to them — to correctly place variables on the stack or in context.
  • Skips inner functions and doesn't build an AST for them until they are actually needed.

This is especially useful when working with large scripts that have many functions that may not be used at startup. For example:

function outerFunction() {
    // This function will be preparsed
    function innerFunction() {
        console.log("Hello");
    }

    // This part will be fully parsed
    console.log("Outer");
}

Unlike a full parser, the preparser doesn't save a parse tree. Instead, it collects a minimal set of data — for example, information about closures (closure scope) and the need to use context (heap allocation).

An exception for the preparser is made for functions that are likely to be executed immediately, for example:

(function() { /* ... */ })();

Such constructs are called PIFEs (Possibly-Invoked Function Expressions) in V8. The preparser doesn't defer them and immediately sends them to full parsing. To do this, it uses a heuristic based on syntactic construct analysis, including the presence of parentheses around the function declaration.

The preparser allows V8 to balance between startup speed and completeness of code parsing, saving resources without losing compatibility.

Scope and Variable Analysis

In addition to parsing, V8 also performs scope analysis during code processing — determining variable visibility scopes and their relationships.

Each variable is either placed on the stack or in context, if it can be accessed by nested functions.

Consider this example:

function outer() {
    const b = 1;
    function inner(a) {
      console.log(a + b);
    }
    return inner;
}

If we output the scope information for it, we get the following data:

Inner function scope:
function outer () { // (0x12400e33220) (14, 111)
  // NormalFunction
  // 2 heap slots
  // local vars:
  CONST b;  // (0x12400e30a48) forced context allocation, never assigned
  VAR inner;  // (0x12400e30d58) never assigned

  function () { // (0x12400e30a78) (54, 91)
    // NormalFunction
    // 2 heap slots
    // local vars:
    VAR a;  // (0x12400e30cb0) never assigned
  }
}
Global scope:
global { // (0x12400e33030) (0, 112)
  // will be compiled
  // NormalFunction
  // local vars:
  VAR outer;  // (0x12400e334d8) 

  function outer () { // (0x12400e33220) (14, 111)
    // lazily parsed
    // NormalFunction
    // 2 heap slots
  }
}

Note the line CONST b; // (0x12400e30a48) forced context allocation, never assigned — this means b is accessible in inner's context.

If we remove the use of b inside inner:

function outer() {
    const b = 1;
    function inner(a) {
      console.log(a);
    }
    return inner;
}

the data changes to CONST b; // (0x10c00e30a48) never assigned.

And if we separate the functions entirely:

function outer() {
    const b = 1;
}

function inner(a) {
    console.log(a);
}

the scopes will be separate too:

Inner function scope:
function outer () { // (0x13c00e33220) (14, 37)
  // NormalFunction
  // 2 heap slots
  // local vars:
  CONST b;  // (0x13c00e30a48) never assigned
}
Inner function scope:
function inner () { // (0x13c00e33410) (53, 80)
  // NormalFunction
  // 2 heap slots
  // local vars:
  VAR a;  // (0x13c00e30a30) never assigned
}
Global scope:
global { // (0x13c00e33030) (0, 81)
  // will be compiled
  // NormalFunction
  // local vars:
  VAR inner;  // (0x13c00e335d0) 
  VAR outer;  // (0x13c00e333e0) 

  function inner () { // (0x13c00e33410) (53, 80)
    // lazily parsed
    // NormalFunction
    // 2 heap slots
  }

  function outer () { // (0x13c00e33220) (14, 37)
    // lazily parsed
    // NormalFunction
    // 2 heap slots
  }
}

From AST to Bytecode

After analysis is complete, the AST is passed to the bytecode generator. This component traverses the tree and generates corresponding bytecode instructions for each node.

For our first example:

function add(a, b) {
    return a + b;
}

add(2, 3);

the bytecode for the add function will look like this:

0x153a001000c8 @    0 : 0b 04             Ldar a1
0x153a001000ca @    2 : 40 03 00          Add a0, [0]
0x153a001000cd @    5 : b7                Return

And for the script root:

0x153a00100084 @    0 : 13 00             LdaConstant [0]
0x153a00100086 @    2 : d1                Star1
0x153a00100087 @    3 : 1b fe f7          Mov <closure>, r2
0x153a0010008a @    6 : 6e 70 01 f8 02    CallRuntime [DeclareGlobals], r1-r2
0x153a0010008f @   11 : 23 01 00          LdaGlobal [1], [0]
0x153a00100092 @   14 : d1                Star1
0x153a00100093 @   15 : 0d 02             LdaSmi [2]
0x153a00100095 @   17 : d0                Star2
0x153a00100096 @   18 : 0d 03             LdaSmi [3]
0x153a00100098 @   20 : cf                Star3
0x153a00100099 @   21 : 6c f8 f7 f6 02    CallUndefinedReceiver2 r1, r2, r3, [2]
0x153a0010009e @   26 : d2                Star0
0x153a0010009f @   27 : b7                Return

At the parsing and bytecode generation stages, V8 already applies simple optimizations, such as replacing instruction patterns with more efficient ones or removing unreachable code.

// Will be removed
function test() {
 const y = 7;
}

const x = 2 + 3; // Becomes const x = 5;

Impact on Performance

The speed of code parsing directly affects application startup time. The faster the engine parses a script, the faster it starts executing. There are several approaches that help speed up this process:

  • Minification. As research has shown, shortened variable names and the absence of unnecessary whitespace do actually speed up tokenization.
  • Code splitting. Dividing code into modules allows parsing only the part needed at the current moment. The rest can be deferred until first use, saving resources.
  • Simple function structure. The fewer deeply nested functions and IIFEs there are, the easier it is for the preparser to determine what can be deferred for now. This reduces load in the early stages and speeds up startup.
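Code splitting from the list above can be sketched with a dynamic import(): the module is fetched, parsed, and compiled only when the line actually runs, not at startup. A data: URL stands in for a real file just to keep the sketch self-contained; in an application this would be a path such as './heavy-module.js' (a hypothetical name).

```javascript
// The module source that would normally live in a separate file.
const source = 'export const heavy = () => 42;';

// Nothing is parsed or compiled for this module until import() executes.
import('data:text/javascript,' + encodeURIComponent(source))
  .then((mod) => {
    console.log(mod.heavy()); // 42
  });
```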

Part 4. Memory Management and Garbage Collection

Introduction

Many modern language engines, such as V8, dynamically manage memory for running applications, freeing developers from the need to handle this themselves. The engine periodically scans the memory allocated to the application, determines which data is no longer needed, and clears it, freeing up space. This process is called Garbage Collection (GC). For optimization, V8 uses a complex memory management system with several types of garbage collectors.

Memory Management Basics in V8

The JavaScript engine works with two main memory areas: the stack and the heap.

The stack is a fast and compact structure that stores primitives and references to objects. It works on a LIFO (Last-In, First-Out) principle, and data in it has a very short lifespan — for example, for the duration of a function call. All "heavy" entities like objects, arrays, or closures are stored in the heap, and garbage collectors work specifically with it.

The heap in V8 is divided into generations. There is the Young Generation, where all new objects appear, and the Old Generation, where survivors are eventually moved. This division is based on statistics: most objects "die young" — they have a short lifespan and quickly become garbage. If GC checked the entire heap every time, it would be very expensive. But by focusing on the young generation, a large amount of garbage can be collected quickly and almost imperceptibly to the user.

Generational Garbage Collection

The generational model in V8 works as follows: all new objects are created in the Nursery — the smallest area of the heap. If they survive the first garbage collection, they are moved to the Intermediate Generation, and if they survive the second — to the Old Generation. This is beneficial for the engine: short-lived objects disappear immediately, and GC doesn't waste time checking long-lived ones again and again.

Imagine calling a function a thousand times and creating a temporary object each time. Almost all of them will die immediately after leaving the function, and the collector will quickly clean up the Nursery. But the application's global cache will survive to the Old Generation. There, more complex algorithms come into play, because memory becomes large and fragmented.
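That allocation pattern looks like this in code. The names are invented for illustration; which generation each object lands in is the engine's decision and isn't visible from JavaScript.

```javascript
// One long-lived object: survives collections and is eventually
// promoted to the Old Generation.
const cache = new Map();

function process(i) {
  const tmp = { value: i * 2 }; // dies as soon as the call returns
  return tmp.value;
}

let total = 0;
for (let i = 0; i < 1000; i++) {
  total += process(i); // 1000 temporaries, all garbage almost immediately
}
cache.set('total', total); // only this data keeps living

console.log(cache.get('total')); // 999000
```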

Young Generation and Minor GC

Previously, the young generation was cleaned using the simple Cheney algorithm. Its essence is as follows. Memory is divided into two equal parts: from-space and to-space. When collection is triggered, GC takes all live objects from from-space, copies them into to-space, and updates the pointers. Everything remaining in from-space is considered garbage and freed entirely. Then the roles of the areas swap.

This approach eliminates fragmentation in one pass: new objects in to-space end up in a dense chunk of memory, and allocating new ones will be fast. The cost is the need to copy data, but since we're only dealing with the young generation, which typically occupies tens of megabytes, the pause is very short.
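The copying idea can be modeled in a few lines. This is a simplified sketch of the concept, not V8's implementation: real Cheney scanning is breadth-first with a scan pointer over raw memory, while here objects are plain records, the traversal is recursive, and all names are invented.

```javascript
// Copy every object reachable from `roots` into to-space, remembering
// forwarding addresses so shared references stay shared. Everything left
// in from-space afterwards is garbage and is freed wholesale.
function scavenge(fromSpace, roots) {
  const toSpace = [];
  const forwarded = new Map(); // old object -> its copy in to-space

  function copy(obj) {
    if (forwarded.has(obj)) return forwarded.get(obj); // already moved
    const clone = { ...obj, refs: [] };
    forwarded.set(obj, clone); // record the forwarding address first
    toSpace.push(clone);
    clone.refs = obj.refs.map(copy); // update pointers to the new copies
    return clone;
  }

  const newRoots = roots.map(copy);
  fromSpace.length = 0; // from-space is reclaimed in one step
  return { toSpace, roots: newRoots };
}

const a = { name: 'a', refs: [] };
const b = { name: 'b', refs: [a] };
const dead = { name: 'dead', refs: [] }; // unreachable: never copied
const { toSpace } = scavenge([a, b, dead], [b]);
console.log(toSpace.map((o) => o.name)); // [ 'b', 'a' ]
```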

However, although this algorithm was efficient, it couldn't take advantage of multi-core processors. So over time V8 switched to Parallel Scavenger — a parallel copying collector that can use multiple threads to process the young generation.

Parallel Scavenger works as follows:

  • Young generation memory is still divided into two areas (from-space and to-space).
  • Live objects are copied from one area to another.
  • The key difference: work is dynamically distributed among multiple threads.
  • A work-stealing algorithm is used — if one thread finishes its work, it can help others.

This approach reduces young generation collection time while preserving all the advantages of the copying algorithm: no fragmentation and compact placement of surviving objects.

Old Generation and Major GC

Once an object reaches the Old Generation, things become more complex. Memory is much larger here, and simple copying is no longer appropriate. The Mark-Sweep and Mark-Compact algorithms come into play.

First, the engine performs the mark phase: it traverses the object graph and marks all live objects. Then comes sweep: unmarked memory sections are freed. But this method leaves holes, and over time the heap becomes fragmented. To fix this, compacting is run — objects are shifted, memory is "compressed," and large contiguous chunks are freed.

If all of this were done at once, we'd get long stop-the-world pauses that would cause the interface to stutter and lag. To avoid this, V8 uses parallelism and incrementality:

  • Concurrent Marking — marking is performed in background threads in parallel with JavaScript execution.
  • Parallel Sweeping — memory freeing happens in multiple threads simultaneously.
  • Incremental Mark-Compact — memory compaction is broken into small stages.

Thanks to this, GC stops being 100% blocking and moves most of its work to the background.

Orinoco

Orinoco is the general name for the new generation of GC in V8. It includes Parallel Scavenger (multi-threaded Young Generation collection), as well as all improvements to Old Generation collection. This means the collector can now simultaneously collect garbage and compact memory while JavaScript continues running.

To make this possible, auxiliary tools are needed.

One of the key ones is write barriers. This mechanism fires when references to objects in the heap are modified. It's used to keep the garbage collector informed about all changes in the object graph during incremental or parallel garbage collection. The main goal is to guarantee that processed objects don't point to unprocessed ones, and to avoid missing live objects during the marking process. When a new pointer is written, the write barrier checks it and, if necessary, marks the newly referenced object and adds it to the marking work list so the collector can process it later. This prevents the erroneous deletion of objects that were just referenced during garbage collection.
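The write-barrier idea can be sketched as a helper that every pointer store is routed through. All names and structures below are invented for illustration; in V8 the barrier is machine code emitted around stores, not a JavaScript function.

```javascript
// Minimal model of incremental-marking state.
const marked = new WeakSet();   // objects the collector has processed
const markingWorklist = [];     // objects still to be scanned
let markingInProgress = true;

function writeBarrier(obj, field, value) {
  // If an already-processed object starts pointing at an unmarked one,
  // the collector must be told, or the target could be swept as garbage.
  if (markingInProgress && marked.has(obj) && value !== null && !marked.has(value)) {
    marked.add(value);
    markingWorklist.push(value);
  }
  obj[field] = value; // the actual store
}

const parent = { child: null };
marked.add(parent);            // collector already processed `parent`
const fresh = { data: 123 };   // allocated after marking started

writeBarrier(parent, 'child', fresh);
console.log(markingWorklist.length); // 1: `fresh` will not be lost
```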

In turn, remembered sets are data structures that track references between different garbage collection generations — for example, references from the Old Generation to the young generation (Nursery or Intermediate). This allows the garbage collector to efficiently find objects referenced from other generations without needing to do a full scan of the entire heap.

Additionally, V8 schedules garbage collection not only reactively when memory overflows, but proactively — using the browser's idle-time scheduler. When the engine sees a free window (for example, a few milliseconds before the next frame), it triggers an appropriate GC stage: a quick minor GC for the young generation, an incremental marking phase, or a background sweep for the old generation. If the deadline is tight, only part of the work is done; if the window is longer — compacting can be afforded. The key idea is to break heavy operations into chunks and parallelize them so as not to block the main thread.

In the end, all these mechanisms aim to make garbage collection as unnoticeable as possible. And although pauses can't be eliminated entirely, they've become shorter and less frequent.

Other Memory Management Techniques in V8

When we talk about memory in V8, we most often mean the work of the garbage collector. But in reality, optimizations exist much earlier — at the level of how the engine stores data and manages the heap.

  • Pointer compression and memory cage. Modern processors use 64-bit addresses, and if every pointer is stored as 8 bytes, the memory spent on references alone grows significantly. To avoid this, V8 uses pointer compression: instead of a full address, a 32-bit offset from a base address is stored, and all objects are placed within a single 4-gigabyte region — the so-called memory cage — which is sufficient for JS applications. This approach reduces memory consumption by almost half and improves CPU cache performance. It also enhances security: the cage acts as a "sandbox," preventing stray references from leaving the heap.

  • Tracking external memory. Not all data that JS code works with is in the heap. For example, ArrayBuffer or Buffer in Node.js can store gigabytes of binary data outside V8. To avoid losing control, the engine tracks such memory and can trigger GC if the total volume goes beyond a reasonable limit. This way, even resources outside the heap are included in the overall memory budget.
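In Node.js this tracking is visible through `process.memoryUsage()`: a large ArrayBuffer shows up in the `external` and `arrayBuffers` counters rather than in the JS heap. A quick sketch (exact figures vary by Node version and platform):

```javascript
// Allocating a large ArrayBuffer grows the external (off-heap) counters,
// while heapUsed barely moves.
const before = process.memoryUsage();
const buf = new ArrayBuffer(64 * 1024 * 1024); // 64 MB of backing store outside the V8 heap
const after = process.memoryUsage();

const grewMB = (after.external - before.external) / (1024 * 1024);
console.log('external memory grew by ~' + Math.round(grewMB) + ' MB');
console.log('buffer kept alive, bytes:', buf.byteLength);
```

It's exactly this accounting that lets the engine schedule a GC when off-heap allocations, not just heap objects, push the process past its budget.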

  • Oilpan and DOM integration. In the Chromium ecosystem there's a separate project called Oilpan, responsible for memory management for C++ objects in the Blink engine. Its idea is to link V8's GC and C++ object collection into a single mechanism. So if you have a JS object and a corresponding DOM node, they will be collected synchronously, without leaks or dangling references.


Part 5. Hidden Optimizations

Introduction

After parsing and bytecode generation, V8 begins executing JavaScript and "learning" from it. Everything we write in code goes through a complex cycle of interpretation, optimization, and possible deoptimization. JavaScript is a dynamic language where objects can change on the fly, types can be swapped, and functions can be called with any arguments. However, V8 has learned to turn this into predictable and fast machine code.

In this part we'll break down how the engine optimizes working with objects, properties, and functions, how it builds internal data representations, and how the TurboFan optimizing compiler works. We'll also look at the recommendations the V8 team gives to make your code as optimized by the engine as possible.

Numbers and Pointers

In 2014, Chrome switched from 32-bit to 64-bit architecture. This improved Chrome's security, stability, and performance, but led to increased memory consumption since each pointer now takes eight bytes instead of four. So in 2020 the V8 team introduced the concept of Pointer Compression. The goal was simple — bring the effective pointer size back to 32 bits. We already mentioned this concept in the previous part, but now let's examine some nuances.

The V8 heap contains many elements: floating-point values, string characters, interpreter bytecode. But about 70% of the V8 heap is typically occupied by tagged values — a special encoding method where a few bits of a word are reserved for storing type information.

In V8, the least significant bit determines whether a value is a pointer to an object in the heap or a simple integer. This allows an integer to be stored directly in the tagged word itself, without allocating heap memory for it or making extra checks. Simple integers stored this way are called Smi (Small Integers). This saves memory and speeds up access to the most common values in JavaScript — integers in a small range.

For pointers, the rest of the word stores an offset from the heap's base address rather than a full 64-bit address. Because heap objects are word-aligned, the tag bit costs no addressable range, and the 32-bit offset covers a 4 GB address space.
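The Smi scheme can be mimicked in a few lines. This is a toy model of the tagging arithmetic only: the real thing happens in machine code, and 64-bit builds without pointer compression lay Smis out differently (in the upper 32 bits of the word).

```javascript
// Toy model of Smi tagging on a 32-bit word: bit 0 = 0 marks a Smi,
// the integer lives in the upper 31 bits.
const tagSmi = (n) => n << 1;          // shift left: bit 0 stays 0
const isSmi = (word) => (word & 1) === 0;
const untagSmi = (word) => word >> 1;  // arithmetic shift restores the sign

console.log(isSmi(tagSmi(42)));   // a tagged integer is recognizable by its low bit
console.log(untagSmi(tagSmi(-7))); // round-trips negative values too
```

The check `word & 1` is exactly why Smi access is cheap: one bit test distinguishes "this is the value itself" from "this is a pointer to follow".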

In the end, pointer compression reduced the V8 heap size by up to 43% and the memory consumed by the Chrome rendering process by up to 20% on desktop.

But there's a catch: when using pointer compression, the heap size is automatically limited. In particular, this optimization in Node.js won't work with a heap size larger than 4 GB.

Arrays

Arrays deserve special attention. In V8 they are a special type of object, optimized for storing data of the same type. To speed up operations, the engine uses a system of Elements Kinds — internal categories that determine exactly how array elements are stored in memory and how they're accessed. These categories allow V8 to avoid unnecessary checks and choose the most efficient data access method.

When an array is created, the engine analyzes its contents. If all elements are small integers, the array gets the type PACKED_SMI_ELEMENTS, where values are stored compactly and directly.

const array = [1, 2, 3];

If floating-point numbers appear among the elements, V8 transitions the array to PACKED_DOUBLE_ELEMENTS, and if objects are added — to PACKED_ELEMENTS.

const array = [1, 2, 3];
// elements kind: PACKED_SMI_ELEMENTS
array.push(4.56);
// elements kind: PACKED_DOUBLE_ELEMENTS
array.push('x');
// elements kind: PACKED_ELEMENTS

The appearance of even a single "hole" (an index that has never been assigned — for example, after delete or after writing far past the current length) forces the engine to switch to a version with the HOLEY prefix, for example HOLEY_SMI_ELEMENTS.

const array = [1, 2, 3, 4.56, 'x'];
// elements kind: PACKED_ELEMENTS
array[9] = 1; // indices 5 through 8 are now holes
// elements kind: HOLEY_ELEMENTS

These transitions, called elements-kind transitions, happen automatically and affect performance. When the engine is forced to work with "holey" arrays or heterogeneous element types, it can no longer use fast offset-based access and must perform additional checks on every read and write. Additionally, such arrays lose some JIT compiler optimizations, and their operations execute through more general but slower paths.

To minimize the number of transitions, V8 tries to preserve the original element type as long as possible. For example, if an array was originally numeric but a string was later added, the engine is forced to change the storage method for all elements. Therefore, the more stable the data structure and the fewer mixed types inside the array, the faster it performs.

Objects and Hidden Classes

Another important structure in JavaScript is the object. Unlike statically typed languages where object structure is fixed, JavaScript allows adding or removing a property at any time. If the engine created a unique structure for each object every time, this approach couldn't be performant. So V8 uses Hidden Classes (or Maps) — internal templates that describe the placement of properties in memory. When an object is created, the engine forms a hidden class for it that stores information about the sequence of properties and their offsets.

Every time a new property is added to an object, a new version of the class is created, and the old one is linked to it via a transition. For example, if we have an object {a: 1} and then add b, the engine creates a new map for the state {a, b} and stores the transition from old to new. This way, objects with the same property order share the same Hidden Class, which allows V8 to access properties directly by offset — like structures in C++.
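The transition chain shows up in perfectly ordinary code. Hidden classes aren't observable from JavaScript itself, so the map labels in the comments (M1, M2) are just names for the bookkeeping the engine does behind the scenes:

```javascript
// Two objects that take the same path through the transition tree end up
// sharing one hidden class.
const p1 = { a: 1 };  // hidden class M1: {a}
p1.b = 2;             // transition recorded: M1 --add b--> M2: {a, b}

const p2 = { a: 10 }; // starts from the same M1
p2.b = 20;            // follows the already-recorded transition, so p1 and p2 share M2
```

Because both objects took the same path, reads like `p1.b` and `p2.b` compile down to the same offset-based access.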

If two objects are created with the same properties but in different order, they'll get different maps, and the engine won't be able to optimize access the same way. This is one reason why it's recommended to initialize all properties in the constructor and avoid adding new properties dynamically. The more objects share one map, the more predictable their behavior is for the JIT compiler, and the faster the code runs.

Hidden Classes are closely related to another mechanism — Inline Caching, which allows V8 to remember property access patterns and reuse them.

Inline Caching (IC)

Inline Caching is the way the engine remembers the results of previous property or function accesses.

In JavaScript, the code obj.a doesn't mean you can simply take the value of a from the object. First, a getter might be called here. If neither a value nor a getter is present on the object itself, the prototype chain must be traversed, and so on.

On first access to a property, the engine performs a full resolution — it goes through all the options and chains and finds the needed value. After that, it saves the result along with information about the object's type. On the next access, V8 checks whether the Hidden Class of the object matches what was cached earlier. If it matches, the engine accesses the property directly, bypassing the entire search chain.

There are several levels of IC. A monomorphic cache means the property is always requested from objects of one type. If the engine encounters multiple types, the cache becomes polymorphic. When the number of different maps exceeds a certain threshold, the cache is declared megamorphic, and further optimizations become impossible. This scenario is especially common when working with dynamic data structures or code where the same method is called on different types of objects.

Understanding how IC works helps avoid situations where code becomes too "megamorphic". If functions are called with different types of objects, the engine spends more time on checks and may ultimately deoptimize the code. So it's worth writing predictable code where the same function handles data of similar structure.

TurboFan and Intermediate Representation

TurboFan is the heart of V8's performance — an optimizing compiler that for a long time used the Sea of Nodes architecture. In this model, a program is represented not as familiar lines of code or sequential instructions, but as a dependency graph between nodes. Each node is an operation (for example, addition, comparison, reading a variable), and the graph's edges reflect three types of dependencies: value edges, control edges, and effect edges.

  • value edges describe data passing: the result of one operation is used in another. For example, if we have x + y, the nodes x and y connect to the addition through value edges.
  • control edges define execution order — what must happen earlier and what later (for example, the body of an if depends on the result of the check).
  • effect edges capture side effects: reading and writing variables, memory access, function calls that can change something.

Thanks to separating these three types of dependencies, the compiler can freely rearrange pure (side-effect-free) operations and eliminate unnecessary computations without breaking program correctness.

Consider a simple example:

let x = 1;
let y = 2;
let z = x + y;

In Sea of Nodes this isn't three lines but a network of nodes: 1 and 2 are constant nodes, their values flow via value edges to the Add node, whose result flows via value edges to the Store[z] node. Order isn't specified explicitly — it's determined by control and effect edges. control edges ensure Store[z] doesn't execute before Add is computed, while effect edges ensure the memory write happens after all computations that affect it. This structure provides flexibility for optimizations: TurboFan can, for example, remove unused nodes, merge identical expressions, or rearrange independent operations.

However, this model has weaknesses. When branches, exceptions, or async operations appear in code, the graph grows complex chains of dependencies: every condition, every side effect pulls along control and effect links, making rearrangements limited and the graph itself hard to read and optimize. Adding a new execution branch or new memory state to such a structure is difficult: part of the graph must be rebuilt and dependencies recalculated.

To solve these problems, the V8 team began transitioning to a CFG (Control Flow Graph) architecture, implemented in the Turboshaft project. CFG is a classic representation of a program as a set of basic blocks — linear sequences of instructions without branches, connected by transitions (edges) describing possible execution flow. Unlike Sea of Nodes, execution order is explicitly defined here: each block knows who comes before and after it. This makes control-flow-related optimizations simpler and more predictable, and compiler behavior more manageable.

If we rewrite the previous example at the CFG level, we get a linear structure:

Block0:
  x = 1
  y = 2
  z = x + y
  goto Block1
Block1:
  return

No "floating" dependencies — everything is sequential and transparent.

And if we add a condition, for example:

if (x > 0) {
  y = y + 1;
} else {
  y = y - 1;
}

in Sea of Nodes this would be one graph where the comparison, addition, and subtraction nodes are connected by control and effect edges indicating which of them are active depending on the check result. In CFG the structure is divided into blocks:

Block0:
  t0 = x > 0
  if t0 goto Block1 else goto Block2

Block1:
  y = y + 1
  goto Block3

Block2:
  y = y - 1
  goto Block3

Block3:
  return

This representation makes branches and transitions obvious and allows the compiler to easily analyze execution paths, merge blocks, eliminate dead code, and apply classical control-flow optimizations.

In the end, the V8 team decided to move away from Sea of Nodes: the entire JavaScript backend has already moved to Turboshaft, WebAssembly is also fully on Turboshaft, and parts of Sea of Nodes are being gradually removed.

Code Writing Recommendations

V8 is actively developing and supporting new JavaScript standards, and user devices have reached a new level in terms of CPU power and memory, so we no longer need to keep tables of "performance killers" in our heads. However, based on everything we've discussed above, we can gather some recommendations to make code work even more optimally and get the most out of the engine.

  • Keep arrays "packed" and without holes. Arrays are divided into packed / holey. Any "hole" (a skipped index) transitions the array to a holey variant, which adds checks on access. Try not to create sparse arrays by assigning far beyond the current length, avoid using delete, and initialize arrays fully if you know the size.
// worse: creates a hole, array will become HOLEY_*
const a = [1, 2, 3];
a[10] = 42;        // many "empty" slots between 3 and 10

// better: grow without holes
const b = [];
b.push(1); b.push(2); b.push(3); // stays PACKED_*
  • Keep element type stable. Once a floating-point number enters an SMI array, it transitions to DOUBLE variant; if an object/null/undefined arrives, it becomes PACKED_ELEMENTS. Avoid unintentional type upgrades.
// worse: degradation SMI -> DOUBLE
const xs = [1, 2, 3];  // PACKED_SMI_ELEMENTS
xs.push(3.14);         // now PACKED_DOUBLE_ELEMENTS

// better: decide on the type upfront
const ys = [1, 2, 3, 4];        // all integers
const zs = [1.0, 2.0, 3.5];     // all doubles, no switching
  • If you know the size — fill it, don't leave "holes." new Array(n) creates an array with "empty" slots (holey) until they're explicitly filled. Fill immediately to stay in packed mode.
// worse: HOLEY until elements are assigned
const tmp = new Array(3);   // [ <3 empty items> ]

// better: immediately PACKED_SMI
const ok = new Array(3).fill(0); // [0, 0, 0]
  • Try to maintain monomorphism. The Inline Cache remembers the "shape" of objects (their hidden class) at a call site. If objects of different shapes frequently appear at the same place in code, the IC becomes polymorphic, and with large differences — megamorphic, making optimizations worthless. Construct objects in one order and don't add fields dynamically.
// worse: the same call-site sees different shapes
function getX(o) { return o.x; }    // want monomorphic IC

const a1 = {}; a1.x = 1;            // shape: {x}
const b1 = {}; b1.y = 2; b1.x = 3;  // shape: {y,x} (different order)

getX(a1); // IC #1
getX(b1); // IC sees different map -> polymorphic

// better: fix the shape via constructor/class
function A(x) { this.x = x; this.y = 0; } // consistent order
const a2 = new A(1);
const b2 = new A(3);
getX(a2); getX(b2); // monomorphic IC
  • Normalize inputs to "hot" functions. If objects of different shapes arrive at the same hot code path, insert lightweight normalization "before," or split the path into multiple functions (by shape) so each access point stays monomorphic. This is better than one "universal" point sliding into megamorphism.
// splitting paths helps keep monomorphic IC in each
function readX_A(o /* shape A */) { return o.x; }
function readX_B(o /* shape B */) { return o.x; }

function readX(o) {
  return (o.hasOwnProperty('y') ? readX_A(o) : readX_B(o));
}
  • Remember that elements and properties are different stores. Numeric keys go into the "elements store," named ones into the "properties store"; mixing access patterns can lead to less predictable map transitions. This is another reason not to turn arrays into dictionaries or use "holey" indices.
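A minimal illustration of the two stores (the split itself isn't visible from JS; `length` hints at it, since it counts elements only):

```javascript
// Numeric keys live in the elements store, named keys in the properties store.
const arr = [10, 20, 30]; // indices 0..2 go to the elements store
arr.label = 'scores';     // a named key goes to the properties store

console.log(arr.length);  // still 3: named properties don't affect length
console.log(arr.label);   // the named property is still reachable
```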

These practices avoid interfering with the optimizer. Then V8 can more often generate direct access to the needed data in objects and arrays, and less often fall back to deoptimization.


Part 6. From Environment to Environment

Introduction

In the previous parts we examined V8's internal workings in detail: from its creation history and compiler architecture to memory management mechanisms and hidden optimizations. But V8 is not just an engine running in isolation. It's the foundation for an entire ecosystem of technologies, each with its own requirements and constraints.

In this final part we'll look at how V8 integrates into various environments: the Chrome browser, the Node.js server platform, and the WebAssembly ecosystem. We'll talk about the specifics of each environment, what tasks the engine solves in each context, and look ahead — what changes await V8 and what directions the team is pursuing.

Bindings

The browser is V8's native environment. This is where the engine appeared in 2008 alongside Chrome and continues to develop in close conjunction with the browser's other components.

In the browser environment, V8 doesn't operate on its own but as part of a larger system. Alongside it runs the Blink rendering engine, responsible for building the DOM tree, computing styles, layout, and paint. Between JavaScript and the DOM there is a bridge — bindings — that allows code to access page elements, listen to events, and manipulate styles and attributes.

Bindings enable communication between the C++ world of the Blink engine and JavaScript objects. When a script accesses, for example, document.body, V8 is actually working not with the DOM object itself but with a JS wrapper created for a specific context. The same DOM element can have multiple such wrappers if it's accessed from different contexts — for example, from the main document and from a frame (<iframe>). Each wrapper is stored separately to guarantee security and data isolation between scripts.

The primary unit of JavaScript execution in the browser is an isolate. It represents a separate instance of the V8 engine, with its own memory and garbage collector. The main thread of a page has one isolate, and each Web Worker or Service Worker gets its own.

Inside an isolate there can be multiple contexts — essentially different JavaScript global environments with their own window, document, and prototype objects. For example, a page and each <iframe> have their own context, isolated from the others.

On top of this layer exists the concept of worlds. The main world is the code of the page itself. Isolated worlds are environments for browser extensions. The worker world is the world of a Web Worker or Service Worker.

To summarize: the main thread isolate consists of one main world and N isolated worlds. A worker isolate consists of one worker world and 0 isolated worlds. All worlds in one isolate share the common underlying C++ DOM objects, but each world has its own JS wrappers. Each world has its own context and therefore its own global variable scope and prototype chains.

This multi-level architecture — isolate → world → context — allows the browser to simultaneously ensure both performance and security. For example, it allows Chrome extensions to safely interact with a page without disrupting its scripts.

Browser

One of the key features of the browser environment is the need to ensure interface smoothness. If JavaScript runs too long, it blocks the main thread and causes "freezes" — the user can't interact with the page, animations stop, the interface becomes unresponsive. So V8 in the browser actively uses strategies aimed at minimizing pauses: incremental and parallel garbage collection, idle-time GC scheduling, and distributing compiler work across background threads. We already discussed all these optimizations in previous parts.

Integration with the browser's Event Loop plays an important role. JavaScript executes in a single thread where script execution, event handling, frame rendering, and GC work alternate. V8 must be able to fit into this cycle without interfering with rendering. For example, if only a few milliseconds remain before the next frame, the engine can run a quick Minor GC or part of incremental marking. If there's more time — it can perform heavier operations like compacting.

V8 is also tightly integrated with the browser's security system. Chrome's sandbox isolates rendering processes from each other and from the system. Pointer compression and the memory cage, which we discussed earlier, strengthen protection by limiting the memory area JavaScript can access. This reduces vulnerability risks and makes attacks harder.

Node.js

Node.js took V8 beyond the browser and turned JavaScript into a language for server-side development. Here the engine works in a completely different environment: there's no DOM, no rendering, but there is access to the file system, the network, and operating system processes.

At the core of Node.js is the libuv library, which provides async I/O and the Event Loop. V8 executes JavaScript code, while libuv manages operations that can take time: reading files, network requests, timers. When an operation completes, libuv returns control to V8, where the corresponding callback is executed.

Unlike the browser, where the main priority is interface responsiveness, in Node.js throughput takes center stage. Server applications often handle thousands of requests simultaneously, and any delay in the Event Loop can lead to increased response time. So GC settings in Node.js differ from browser settings: pauses are kept shorter, and garbage collector behavior is made more predictable.

Node.js makes active use of flags for V8 tuning. For example, you can increase heap size via --max-old-space-size, configure GC frequency, or enable experimental optimizations. This gives developers flexibility but requires an understanding of how the engine works.

An important feature of Node.js is working with native modules. Using node-gyp, you can connect C/C++ libraries and call them from JavaScript. This allows using existing code, performing performance-critical operations outside V8, and integrating with system APIs. But such integration requires care: incorrect memory management in native code can lead to leaks, and blocking operations can freeze the Event Loop.

Another aspect is Worker Threads support. This is the server-side analog of Web Workers. Each worker gets its own V8 isolate and can execute code in parallel. This is useful for CPU-bound tasks that would otherwise block the main thread. However, creating isolates requires resources, and having too many can lead to performance degradation.

Node.js also uses the capabilities that V8 provides through its API: profiling, heap snapshots, coverage. They allow connecting to a running process and analyzing it in real time — viewing the call stack, memory, and performance.

WebAssembly

WebAssembly (Wasm) is another direction where V8 plays a key role. Wasm is a low-level bytecode that can be executed in the browser and in Node.js. It was created for performance-demanding tasks: games, video processing, simulations, porting existing C/C++ applications.

V8 compiles WebAssembly not through the same pipeline as JavaScript but through a specialized one. First, the Wasm module goes through Liftoff — a fast baseline compiler that generates machine code almost instantly. This allows execution to begin without delays. Then, for "hot" code, the optimizing tier (historically TurboFan, now Turboshaft) kicks in, applying deep optimizations and producing high-performance machine code.

An important difference between Wasm and JavaScript is static typing and predictable structure. This gives the compiler more opportunities for optimizations — for example, there's no need to make type assumptions. Thanks to this, WebAssembly often runs faster than equivalent JavaScript code.
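As a small sketch of what this pipeline consumes, here's the canonical "add two i32s" module, hand-assembled into bytes and instantiated synchronously. In real projects the bytes come from a toolchain such as Emscripten or a language compiler, not from a literal:

```javascript
// A minimal Wasm module exporting `add: (i32, i32) -> i32`, byte by byte.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: func 0 has type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

const { exports } = new WebAssembly.Instance(new WebAssembly.Module(bytes));
console.log(exports.add(2, 3));
```

Note that the types are fixed in the bytes themselves: `add` can only ever receive and return i32 values, which is exactly the predictability that lets the compiler skip the type assumptions JavaScript requires.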

One of the new capabilities is WasmGC. Previously, WebAssembly only worked with linear memory and managed allocation manually. This was inconvenient for languages with automatic garbage collection — such as Kotlin, Dart, or Java. Developers of such languages were forced to compile an entire custom runtime alongside their code, including a GC implementation and memory model, which increased module size and reduced performance. Interaction with JavaScript was also cumbersome: objects from the Kotlin or Java world were simply bytes in linear memory, and accessing them required additional layers and data serialization.

WasmGC solves these problems. It adds support for managed objects to the WebAssembly specification — structs, arrays, and reference types (ref) — that can be stored in the shared V8 heap and collected by the same garbage collector as JavaScript objects. This allows GC-based languages to abandon their own memory manager and directly use the mechanism built into the engine. As a result, modules become more compact, faster, and easier to integrate with JS: objects can now be exchanged between worlds without costly copying.

Such changes happen at the compiler level. The backend that previously transformed language classes into a set of bytes in linear memory can now compile them into native WebAssembly GC structures. All of this opens a path to truly native integration of high-level languages with the WebAssembly ecosystem and makes the boundary between JavaScript and other languages much thinner.

The Future of V8

V8's development doesn't stop. The engine team constantly works on improving performance, reducing memory consumption, and supporting new standards. Let's look at some of the directions:

  • Transition to Turboshaft. As we discussed in the previous part, V8 is gradually moving away from the Sea of Nodes architecture in favor of the classical CFG-based representation — the Turboshaft project. This transition is already complete for WebAssembly, and gradually all the old code will be removed. Turboshaft simplifies adding new optimizations, makes the compiler more predictable, and speeds up compilation time.

  • Maglev and further compiler work. Maglev, which appeared in 2023, took an intermediate position between Sparkplug and TurboFan. But the team continues experimenting: new compilation levels may appear or the logic for switching between them may change. The goal is to achieve the optimal balance between startup speed, compilation time, and optimization quality.

Conclusion

V8 is not just an engine for executing JavaScript. It's a complex system that evolves alongside the language, the platforms, and the requirements of users. From the first release in 2008 to today's multi-level compilers and advanced garbage collectors — the engine has come a long way and continues to develop.

Understanding how V8 works helps you write more efficient code, find bottlenecks, and get the most out of the platform. I hope this series of articles (now compiled into one big piece!) gave you a foundation for further exploration into the world of JavaScript engines and inspired you to dive deeper into this area.