7 Chapter 6: Data Extraction
A save editor needs game data: weapon stats, part definitions, manufacturer information. You might assume this data lives neatly in game files, waiting to be extracted. The reality is more complicated—and more interesting.
This chapter explores what data we can extract, what we can’t, and why. Along the way, we’ll document our investigation into authoritative category mappings, including the binary analysis that revealed why some data simply doesn’t exist in extractable form.
7.1 The Game File Landscape
BL4’s data lives in Unreal Engine pak files, stored in IoStore format:
Borderlands 4/OakGame/Content/Paks/
├── pakchunk0-Windows_0_P.utoc ← Main game assets
├── pakchunk0-Windows_0_P.ucas ← Compressed data
├── pakchunk2-Windows_0_P.utoc ← Audio (Wwise)
├── pakchunk3-Windows_0_P.utoc ← Localized audio
├── global.utoc ← Shared engine data
└── ...
IoStore is UE5’s container format, splitting asset indices (.utoc) from compressed data (.ucas). This differs from older PAK-only formats and requires specialized tools.
!!! note BL4 uses IoStore (UE5’s format), not legacy PAK. Tools like repak won’t work on .utoc/.ucas files. You need retoc or similar IoStore-aware extractors.
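Before pointing an extractor at the game, it helps to confirm which containers are present. The following is a minimal sketch that pairs each `.utoc` index with its `.ucas` data file; the function name `find_iostore_containers` is our own and not part of any tool mentioned here.

```python
from pathlib import Path


def find_iostore_containers(paks_dir):
    """Return (utoc_path, ucas_path_or_None) for every .utoc file in paks_dir."""
    pairs = []
    for utoc in sorted(Path(paks_dir).glob("*.utoc")):
        # The compressed data lives next to the index, same stem, .ucas suffix.
        ucas = utoc.with_suffix(".ucas")
        pairs.append((utoc, ucas if ucas.exists() else None))
    return pairs
```

A `None` in the second slot means the index has no data file beside it, which is worth checking before blaming the extractor.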
7.2 What We Can Extract
Some game data extracts cleanly from pak files:
Balance data: Stat templates and modifiers for weapons, shields, and gear. These define base damage, fire rate, accuracy scales.
Naming strategies: How weapons get their prefix names. “Damage → Tortuous” mappings live in extractable assets.
Body definitions: Weapon body assets that reference parts and mesh fragments.
Loot pools: Drop tables and rarity weights for different sources.
Gestalt meshes: Visual mesh fragments that parts reference.
These assets follow Unreal’s content structure:
OakGame/Content/
├── Gear/
│ ├── Weapons/
│ │ ├── _Shared/BalanceData/
│ │ ├── Pistols/JAK/Parts/
│ │ └── ...
│ └── Shields/
├── PlayerCharacters/
│ ├── DarkSiren/
│ └── ...
└── GameData/Loot/
7.3 What We Can’t Extract
Here’s where it gets interesting. The mappings between serial tokens and actual game parts—the heart of what makes serial decoding work—don’t exist as extractable pak file assets.
We wanted authoritative category mappings. Serial token {4} on a Vladof SMG should mean a specific part, and we wanted the game’s own data to tell us which one. So we investigated.
7.3.1 The Investigation: Binary Analysis
We used Rizin (a radare2 fork) to analyze the Borderlands4.exe binary directly:
rz-bin -S Borderlands4.exe
Results:
- Total size: 715 MB
- .sdata section: 157 MB (code)
- .rodata section: 313 MB (read-only data)
We searched for part prefix strings like “DAD_PS.part_” and “VLA_SM.part_barrel”. Nothing. The prefixes don’t exist as literal strings in the binary.
We searched for category value sequences. Serial decoding uses Part Group IDs like 2, 3, 4, 5, 6, 7 (consecutive integers stored as i64). We found one promising sequence at offset 0x02367554:
# Found sequence 2,3,4,5,6,7 as consecutive i64 values at 0x02367554
But the surrounding bytes showed that the hit sat next to crypto code, specifically the string “Poly1305 for x86_64, CRYPTOGAMS”. Those consecutive integers were coincidental, not category definitions.
!!! warning “False Positives” When searching binaries for numeric patterns, verify the context. Small consecutive integers appear in many places: crypto code, lookup tables, version numbers. Always examine surrounding bytes.
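The search-then-verify habit can be sketched in a few lines of Python. This is an illustration of the idea rather than the Rizin workflow we actually used; the helper names `find_i64_run` and `printable_strings` are hypothetical.

```python
import re
import struct


def find_i64_run(data, values, context=32):
    """Yield (offset, surrounding_bytes) for each little-endian i64 run of `values`."""
    needle = struct.pack(f"<{len(values)}q", *values)
    start = 0
    while (pos := data.find(needle, start)) != -1:
        lo = max(0, pos - context)
        yield pos, data[lo:pos + len(needle) + context]
        start = pos + 1


def printable_strings(blob, min_len=6):
    """Extract printable ASCII runs to see what a hit is sitting next to."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, blob)]
```

Running `printable_strings` over the context of each hit is what exposes a neighbour like the CRYPTOGAMS banner and lets you discard the match.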
7.3.2 UE5 Metadata: What We Know
From usmap analysis, we confirmed the exact structure linking parts to serials:
GbxSerialNumberIndex (12 bytes)
├── Category (Int64): Part Group ID
├── scope (Byte): EGbxSerialNumberIndexScope (Root=1, Sub=2)
├── status (Byte): EGbxSerialNumberIndexStatus
└── Index (Int16): Position in category
Every InventoryPartDef contains this structure. The Category field maps to Part Group IDs (2=Daedalus Pistol, 22=Vladof SMG, etc.). The Index field determines which part token decodes to this part.
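As a sanity check on the 12-byte figure, the layout can be expressed with Python's `struct` module. This sketch assumes the fields are packed little-endian in the order listed, with no padding; the class and function names are ours.

```python
import struct
from typing import NamedTuple

# Int64 + Byte + Byte + Int16, little-endian, no alignment padding.
LAYOUT = struct.Struct("<qBBh")


class GbxSerialNumberIndex(NamedTuple):
    category: int  # Part Group ID
    scope: int     # EGbxSerialNumberIndexScope (Root=1, Sub=2)
    status: int    # EGbxSerialNumberIndexStatus
    index: int     # Position in category


def parse_serial_index(raw):
    """Decode one 12-byte GbxSerialNumberIndex record."""
    return GbxSerialNumberIndex(*LAYOUT.unpack(raw))
```

`LAYOUT.size` evaluates to 12, matching the size reported by the usmap.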
But here’s the problem: we found zero InventoryPartDef assets in pak files.
uextract /path/to/Paks find-by-class InventoryPartDef
# Result: 0 assets found
7.3.3 Where Parts Actually Live
Parts aren’t stored as individual pak file assets. They’re:
- Runtime UObjects — Created when the game initializes
- Code-defined — Registrations happen in native code
- Self-describing — Each part carries its own index internally
7.3.4 The Key Insight: Self-Describing Parts
Here’s the crucial design pattern we discovered: there is no separate mapping file because each part stores its own index.
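Every part UObject stores the Scope, Status, and Index fields of its GbxSerialNumberIndex at offset +0x28 (the Category field is not found there; section 7.4.1 explains how to derive it):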
UObject + 0x28: GbxSerialNumberIndex (4 bytes)
├── Scope (1 byte) ← EGbxSerialNumberIndexScope (Root=1, Sub=2)
├── Status (1 byte) ← Reserved/state flags
└── Index (2 bytes) ← THE serial index for this part
This is a “reverse mapping” architecture:
- Traditional approach: Separate lookup table maps index → part_name
- BL4’s approach: Each part stores its own index; the “mapping” IS the parts themselves
Why this design makes sense:
| Benefit | Explanation |
|---|---|
| No central registry | Adding DLC parts doesn’t require updating a mapping file |
| Self-contained | Each part is fully self-describing |
| Stable indices | A part’s index never changes because it’s intrinsic to that part |
| No sync issues | Impossible for mapping to drift from actual parts |
Practical implication: When we extract parts from memory, we’re not building a mapping from separate data—we’re reading the authoritative index directly from each part. The memory dump contains the complete, correct mapping because that mapping IS the parts.
!!! note “Why Memory Dumps Are Essential” Since each part carries its own index internally, and parts only exist as runtime UObjects (not pak file assets), memory dumps are the only way to capture this data. The game’s binary contains the code to create parts, but the actual GbxSerialNumberIndex values are set during initialization.
7.4 Memory Extraction: The Breakthrough
Through systematic memory analysis, we discovered authoritative part-to-index mappings can be extracted from memory dumps. Here’s the structure:
7.4.1 The Part Registration Structure
When the game loads, it creates UObjects for each part and registers them in an internal array. This array has a discoverable pattern:
Part Array Entry (24 bytes):
├── FName Index (4 bytes) ← References the part name in FNamePool
├── Padding (4 bytes) ← Always zero
├── Pointer (8 bytes) ← Address of the part's UObject
├── Marker (4 bytes) ← 0xFFFFFFFF sentinel value
└── Priority (4 bytes) ← Selection priority (not the serial index!)
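The entry layout above translates directly into a `struct` format string. The sketch below assumes little-endian packing, which matches what we see in dumps; `parse_entry` and `PartArrayEntry` are illustrative names, not part of the bl4 tool.

```python
import struct
from typing import NamedTuple, Optional

ENTRY = struct.Struct("<IIQII")  # FName index, padding, pointer, marker, priority
SENTINEL = 0xFFFFFFFF


class PartArrayEntry(NamedTuple):
    fname_index: int
    pointer: int
    priority: int


def parse_entry(raw) -> Optional[PartArrayEntry]:
    """Decode a 24-byte entry; return None if padding or marker look wrong."""
    fname_index, padding, pointer, marker, priority = ENTRY.unpack(raw)
    if padding != 0 or marker != SENTINEL:
        return None
    return PartArrayEntry(fname_index, pointer, priority)
```

Rejecting entries with non-zero padding or a missing marker filters out most accidental hits before any pointer is followed.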
The serial Index is stored inside the pointed UObject, at offset +0x28:
UObject at Pointer (offset +0x28):
├── Scope (1 byte) ← EGbxSerialNumberIndexScope (always 2 for parts)
├── Reserved (1 byte) ← Usually 0
└── Index (2 bytes, Int16) ← THE SERIAL INDEX we need!
!!! important “Category Derivation” The Part Group ID (category) is not stored in the UObject at a fixed offset. Instead, derive it from the part name prefix (e.g., DAD_PS → category 2, VLA_AR → category 17). The bl4 tool includes a complete prefix-to-category mapping.
7.4.2 Verified Example
Searching for FName DAD_PS.part_barrel_01 (FName index 0x736a0a):
- Find the array entry: FName appears in the part array with pointer 0x7ff4ca7d75d0
- Read offset +0x28: At 0x7ff4ca7d75f8 we find 02 00 07 00
- Parse: Scope=2, Reserved=0, Index=7
- Derive category: DAD_PS prefix → category 2
- Verify: Reference data confirms DAD_PS.part_barrel_01 has index 7 ✓
Additional verified mappings:
- DAD_PS.part_barrel_02 → Index 8 ✓
- DAD_PS.part_barrel_01_Zipgun → Index 1 ✓
- DAD_PS.part_barrel_02_rangefinder → Index 78 ✓
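The address arithmetic and byte parsing in this example can be reproduced in a few lines. The values are the ones quoted above; only the variable names are ours.

```python
import struct

pointer = 0x7FF4CA7D75D0          # UObject address from the array entry
field_addr = pointer + 0x28       # where Scope/Reserved/Index live

raw = bytes.fromhex("02 00 07 00")  # the four bytes read at field_addr
# Byte, Byte, little-endian Int16
scope, reserved, index = struct.unpack("<BBh", raw)

print(hex(field_addr), scope, reserved, index)
```

The Int16 is little-endian, which is why the bytes `07 00` decode to 7 rather than 1792.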
7.4.3 Extraction Algorithm
# Pseudocode for extracting all part mappings.
# fnamepool, scan, read, and is_valid_pointer stand in for dump-access helpers.
def extract_parts(memory_dump):
    # Step 1: Build FName lookup table
    # Scan FNamePool for all names containing ".part_"
    fname_table = {}  # fname_idx -> name
    for block in fnamepool.blocks:
        for entry in block:
            if ".part_" in entry.name.lower():
                fname_table[entry.index] = entry.name
    parts = []
    # Step 2: Scan memory for 0xFFFFFFFF markers
    for marker_addr in scan(memory_dump, "ff ff ff ff"):
        # Read the 24-byte entry (marker is at offset 16)
        entry = read(marker_addr - 16, 24)
        fname_idx = int.from_bytes(entry[0:4], "little")   # FName index
        padding = int.from_bytes(entry[4:8], "little")     # must be zero
        pointer = int.from_bytes(entry[8:16], "little")    # UObject pointer
        # Validate: known FName, padding=0, valid pointer
        if fname_idx not in fname_table:
            continue
        if padding != 0 or not is_valid_pointer(pointer):
            continue
        # Read serial index from pointed UObject at offset +0x28
        uobject = read(pointer, 0x2C)
        scope = uobject[0x28]
        # Int16 at bytes 2-3 of the 4-byte field, little-endian
        index = int.from_bytes(uobject[0x2A:0x2C], "little", signed=True)
        # Derive category from part name prefix
        name = fname_table[fname_idx]
        category = get_category_from_prefix(name)
        if category is not None:
            parts.append({
                'name': name,
                'category': category,
                'index': index
            })
    return parts

def get_category_from_prefix(name):
    prefix = name.split(".part_")[0].lower()
    # Pistols
    if prefix == "dad_ps": return 2
    if prefix == "jak_ps": return 3
    # ... (complete mapping in bl4 source)
    return None
7.4.4 Why This Works
The game registers parts at startup into internal arrays. Each entry links:
- FName reference → The part’s name (e.g., “VLA_SM.part_barrel_01”)
- UObject pointer → The full part definition, including serial index
By scanning for the 0xFFFFFFFF sentinel pattern that marks entry boundaries, we can walk these arrays and extract every part mapping the game knows about.
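To see the sentinel walk run end to end without a real dump, here is a self-contained sketch over a synthetic 4 KiB buffer. It assumes the dump is a single flat image mapped at `BASE`, which real dumps are not (they consist of many regions); `scan_parts`, `BASE`, and the priority value 3 are made up for the demonstration, while the FName index, pointer, and index bytes come from the verified example.

```python
import struct

ENTRY = struct.Struct("<IIQII")   # FName index, padding, pointer, marker, priority
SENTINEL = b"\xff\xff\xff\xff"
INDEX_OFFSET = 0x28               # Scope/Reserved/Index inside the UObject


def scan_parts(image, base, fname_table):
    """Scan a flat memory image (mapped at `base`) for part array entries."""
    parts = []
    pos = image.find(SENTINEL)
    while pos != -1:
        start = pos - 16          # the marker sits 16 bytes into the entry
        if 0 <= start <= len(image) - ENTRY.size:
            fname, padding, pointer, _marker, _prio = ENTRY.unpack_from(image, start)
            field = pointer - base + INDEX_OFFSET
            in_image = pointer >= base and field + 4 <= len(image)
            if fname in fname_table and padding == 0 and in_image:
                scope, _reserved, index = struct.unpack_from("<BBh", image, field)
                parts.append({"name": fname_table[fname], "scope": scope, "index": index})
        pos = image.find(SENTINEL, pos + 1)
    return parts


# Build a synthetic image holding one array entry and one UObject.
BASE = 0x7FF4CA7D7000
image = bytearray(0x1000)
ENTRY.pack_into(image, 0x100, 0x736A0A, 0, 0x7FF4CA7D75D0, 0xFFFFFFFF, 3)
struct.pack_into("<BBh", image, 0x5D0 + INDEX_OFFSET, 2, 0, 7)

found = scan_parts(bytes(image), BASE, {0x736A0A: "DAD_PS.part_barrel_01"})
print(found)
```

The three validation checks (known FName, zero padding, pointer inside the image) matter because a bare `ff ff ff ff` pattern is extremely common in memory.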
!!! tip “Practical Implication” Memory dumps contain authoritative part-to-index mappings. Extract them directly—no empirical testing required for known parts. Empirical validation is only needed for new parts added in patches.
7.4.5 Extraction Results
Running the extraction on a Dec 2025 memory dump yields:
| Metric | Value |
|---|---|
| Total parts extracted | 1,070 |
| FNames scanned | 1,399 |
| Categories covered | 49 |
| Match rate vs reference | 84.3% |
| Core weapons (cat 2-29) | 100% accurate |
Distribution by type:
| Type | Categories | Parts |
|---|---|---|
| Pistols | 2-7 | 207 |
| Shotguns | 8-13 | 190 |
| Assault Rifles | 14-19 | 169 |
| SMGs | 20-23 | 135 |
| Snipers | 25-29 | 176 |
| Heavy Weapons | 244-247 | 37 |
| Shields | 279-287 | 44 |
| Gadgets | 300-330 | 146 |
| Enhancements | 400-409 | 23 |
!!! success “Core Weapon Accuracy” The extraction achieves zero mismatches for core weapon categories (2-29). Mismatches only occur in heavy weapons, shields, gadgets, and enhancements—likely due to different UObject layouts or reference data issues for those categories.
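One way to produce figures like these is to compare the extracted mapping against reference data and count mismatches per category. The sketch below is our guess at a reasonable definition of "match rate" (agreement over parts present in both sets), not necessarily how the bl4 tool computes it; `HYPO_SH.part_x` is a made-up part name.

```python
from collections import Counter


def compare_to_reference(extracted, reference):
    """Return (match_rate, mismatches_per_category) over parts in both mappings.

    Both arguments map part name -> (category, index).
    """
    shared = extracted.keys() & reference.keys()
    mismatches = Counter(
        extracted[name][0] for name in shared if extracted[name] != reference[name]
    )
    matched = len(shared) - sum(mismatches.values())
    rate = matched / len(shared) if shared else 0.0
    return rate, dict(mismatches)


extracted = {
    "DAD_PS.part_barrel_01": (2, 7),
    "DAD_PS.part_barrel_02": (2, 8),
    "HYPO_SH.part_x": (279, 3),      # hypothetical shield part
}
reference = dict(extracted, **{"HYPO_SH.part_x": (279, 4)})
rate, by_category = compare_to_reference(extracted, reference)
```

Grouping mismatches by category is what makes a pattern such as "core weapons clean, shields noisy" visible at a glance.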
7.5 Empirical Validation (Fallback)
For edge cases or when memory extraction isn’t possible, empirical validation remains an option:
- Collect serials from real game items
- Decode the Part Group ID and part tokens
- Record which weapon/part combinations the tokens represent
- Validate by injecting serials into saves and checking in-game
The parts_database.json file combines memory-extracted mappings with empirically-verified data for comprehensive coverage.
7.6 Extraction Tools
7.6.1 retoc — IoStore Extraction
The essential tool for BL4’s pak format:
cargo install --git https://github.com/trumank/retoc retoc_cli
# List assets in a container
retoc list /path/to/pakchunk0-Windows_0_P.utoc
# Extract all assets
retoc unpack /path/to/pakchunk0-Windows_0_P.utoc ./output/
!!! warning For converting to legacy format, point at the Paks directory, not a single file. The tool needs access to global.utoc for ScriptObjects:
retoc to-legacy /path/to/Paks/ ./output/ --no-script-objects
7.6.2 uextract — Project Tool
The bl4 project’s custom extraction tool:
cargo build --release -p uextract
# List all assets
./target/release/uextract /path/to/Paks --list
# Extract with filtering
./target/release/uextract /path/to/Paks -o ./output --ifilter "BalanceData"
# Use usmap for property resolution
./target/release/uextract /path/to/Paks -o ./output --usmap share/borderlands.usmap
7.7 The Usmap Requirement
UE5 uses “unversioned” serialization. Properties are stored without field names:
Versioned (old): "Damage": 50.0, "Level": 10
Unversioned (new): 0x42480000 0x0000000A
└── Just values, no names
To parse unversioned data, you need a usmap file containing the schema—all class definitions, property names, types, and offsets.
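The role of the schema can be shown with the two values from the diagram above. This is a deliberately simplified sketch: real unversioned property data also carries a header describing which properties are present, which we ignore here, and the `SCHEMA` list is a stand-in for what a usmap would supply.

```python
import struct

# Stand-in for usmap content: property names and their wire types, in order.
SCHEMA = [("Damage", "f"), ("Level", "i")]


def decode_unversioned(raw, schema):
    """Attach names to a run of bare values using an external schema."""
    fmt = "<" + "".join(code for _, code in schema)
    values = struct.unpack(fmt, raw)
    return {name: value for (name, _), value in zip(schema, values)}


# The two raw words from the diagram: 0x42480000 and 0x0000000A.
raw = struct.pack("<II", 0x42480000, 0x0000000A)
decoded = decode_unversioned(raw, SCHEMA)
print(decoded)
```

Without the schema there is no way to tell that the first word is a float and the second an integer, which is the whole reason the usmap is required.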
We generate usmap from memory dumps:
bl4 memory --dump share/dumps/game.dmp dump-usmap
# Output: mappings.usmap
# Names: 64917, Enums: 2986, Structs: 16849, Properties: 58793
The project includes a pre-generated usmap at share/manifest/mappings.usmap.
7.8 Extracting Parts from Memory
Since parts only exist at runtime, memory extraction is the path forward.
7.8.1 Step 1: Create Memory Dump
Follow Chapter 3’s instructions to capture game memory while playing.
7.8.2 Step 2: Extract Part Names
bl4 memory --dump share/dumps/game.dmp dump-parts \
-o share/manifest/parts_dump.json
This scans for strings matching XXX_YY.part_* patterns:
{
"DAD_AR": [
"DAD_AR.part_barrel_01",
"DAD_AR.part_barrel_01_a",
"DAD_AR.part_body"
],
"VLA_SM": [
"VLA_SM.part_barrel_01"
]
}
7.8.3 Step 3: Build Parts Database
bl4 memory --dump share/dumps/game.dmp build-parts-db \
-i share/manifest/parts_dump.json \
-o share/manifest/parts_database.json
The result maps parts to categories and indices:
{
"parts": [
{"category": 2, "index": 7, "name": "DAD_PS.part_barrel_01"},
{"category": 22, "index": 5, "name": "VLA_SM.part_body_a"}
],
"categories": {
"2": {"count": 74, "name": "Daedalus Pistol"},
"22": {"count": 84, "name": "Vladof SMG"}
}
}
!!! important “Index Ordering” Part indices from memory dumps reflect the game’s internal registration order, not alphabetical order. Parts typically register in this order: unique variants, bodies, barrels, shields, magazines, scopes, grips, licensed parts. Alphabetical sorting produces wrong indices.
7.9 Working with Extracted Assets
7.9.1 Asset Structure
Extracted .uasset files follow the Zen package format:
Package
├── Header
├── Name Map (local FNames)
├── Import Map (external dependencies)
├── Export Map (objects defined here)
└── Export Data (serialized properties)
With usmap, these parse into readable JSON:
{
"asset_path": "OakGame/Content/Gear/Weapons/_Shared/BalanceData/WeaponStats/Struct_Weapon_Barrel_Init",
"exports": [
{
"class": "ScriptStruct",
"properties": {
"Damage_Scale": 1.0,
"FireRate_Scale": 1.0,
"Accuracy_Scale": 1.0
}
}
]
}
7.9.2 Finding Specific Data
# Find legendary items
find ./bl4_assets -name "*legendary*" -type f
# Find manufacturer data
find ./bl4_assets -iname "*manufacturer*"
# Search asset contents
grep -r "Linebacker" ./bl4_assets --include="*.uasset" -l
7.9.3 Stat Patterns
Stats follow naming conventions: StatName_ModifierType_Index_GUID
| Modifier | Meaning |
|---|---|
| Scale | Multiplier (×) |
| Add | Flat addition (+) |
| Value | Absolute override |
| Percent | Percentage bonus |
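The naming convention and the modifier table can be combined into a small parser. Treat this as a sketch under assumptions: we assume the GUID contains no underscores, that Percent amounts are expressed in percent points (25 means +25%), and the sample stat name in the usage note is invented.

```python
def parse_stat_name(name):
    """Split 'StatName_ModifierType_Index_GUID' from the right.

    Splitting from the right lets the stat name itself contain underscores.
    """
    stat, modifier, index, guid = name.rsplit("_", 3)
    return {"stat": stat, "modifier": modifier, "index": int(index), "guid": guid}


def apply_modifier(base, modifier, amount):
    """Apply one modifier to a base value, following the table above."""
    if modifier == "Scale":
        return base * amount              # multiplier
    if modifier == "Add":
        return base + amount              # flat addition
    if modifier == "Value":
        return amount                     # absolute override
    if modifier == "Percent":
        return base * (1 + amount / 100)  # assumption: amount is in percent points
    raise ValueError(f"unknown modifier type: {modifier}")
```

For example, `parse_stat_name("Fire_Rate_Scale_0_ABCDEF")` keeps `Fire_Rate` intact as the stat name because only the last three underscores are treated as separators.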
7.10 Oodle Compression
BL4 uses Oodle compression (RAD Game Tools). The retoc tool handles decompression automatically by loading the game’s DLL:
~/.steam/steam/steamapps/common/"Borderlands 4"/Engine/Binaries/ThirdParty/Oodle/
└── oo2core_9_win64.dll
!!! tip If extraction fails with Oodle errors, verify the game is installed and the DLL path is accessible. On Linux, Wine must be able to load the DLL.
7.11 Building a Data Pipeline
An automated extraction script saves time when the game updates:
#!/bin/bash
GAME_DIR="$HOME/.steam/steam/steamapps/common/Borderlands 4"
OUTPUT_DIR="./bl4_data"
USMAP="./share/manifest/mappings.usmap"
# Extract pak files
retoc unpack "$GAME_DIR/OakGame/Content/Paks/pakchunk0-Windows_0_P.utoc" "$OUTPUT_DIR/raw"
# Parse with usmap
./target/release/uextract "$OUTPUT_DIR/raw" -o "$OUTPUT_DIR/parsed" --usmap "$USMAP"
# Generate manifest
bl4-research pak-manifest -e "$OUTPUT_DIR/parsed" -o "$OUTPUT_DIR/manifest"
7.12 Summary: Data Sources
| Data | Source | Extractable? |
|---|---|---|
| Balance/stats | Pak files | Yes |
| Naming strategies | Pak files | Yes |
| Loot pools | Pak files | Yes |
| Body definitions | Pak files | Yes |
| Part definitions | Memory dump | Yes (via UObject array scan) |
| Category mappings | Memory dump | Yes (embedded in part UObjects) |
While parts don’t exist as pak file assets, memory dumps capture the complete runtime state, including all part definitions with their authoritative serial indices. The 0xFFFFFFFF sentinel pattern and UObject offset +0x28 provide reliable extraction paths.
7.13 Exercises
Exercise 1: Extract and Explore
Extract the main pak file. Find balance data for a weapon type you use. What stats does the base template define?
Exercise 2: Search for Part References
Search extracted assets for references to specific parts (like “JAK_PS.part_barrel”). Where do they appear? What references them?
Exercise 3: Compare Manufacturers
Extract assets for two manufacturers (Jakobs vs Maliwan). Compare directory structures. What patterns emerge?
7.14 What’s Next
We’ve covered the full data extraction story—what works, what doesn’t, and why. The bl4 project wraps all these techniques into command-line tools.
Next, we’ll tour those tools: how to decode serials, edit saves, extract data, and more, all from the command line.