The context

You rarely see "Perl" and "libFuzzer" in the same sentence. Perl is a powerful, mature language, known for its text-processing capabilities, extreme flexibility and high information density. Because it's interpreted, many applications use modules written in C for performance-critical tasks (using Perl's XS interface). As we all know, C code is (unfortunately) a prime target for memory corruption bugs, especially in parsers.

This got me curious when I stumbled over the JSON::XS module. How hard would it be to aim a modern C/C++ fuzzer like libFuzzer at its C internals?

It turned out to be not only possible: it led to the discovery of a heap-buffer-overflow that can be triggered by a specially crafted input string.

Here's how I did it.

The setup: Bridging C and Perl

The main challenge is straightforward: libFuzzer is a C/C++ tool. It works by repeatedly calling a single function, LLVMFuzzerTestOneInput(), providing it with mutated data. My target, JSON::XS, is a Perl module.
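
To make that contract concrete, here is a minimal sketch of a fuzz target. The function count_digits is a hypothetical toy stand-in for the code under test and has nothing to do with JSON::XS; only the LLVMFuzzerTestOneInput signature is prescribed by libFuzzer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy function under test: counts ASCII digits in a buffer. */
static size_t count_digits(const uint8_t *data, size_t size) {
    size_t n = 0;
    for (size_t i = 0; i < size; ++i) {
        if (data[i] >= '0' && data[i] <= '9')
            ++n;
    }
    return n;
}

/* libFuzzer calls this entry point once for every input it generates.
   The buffer is not NUL-terminated, so the size must always be honored. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)count_digits(data, size);
    return 0;
}
```

Built with clang and -fsanitize=fuzzer,address, a file like this links against libFuzzer's own main(), which is why the sketch does not define one.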

My first thought was to just test the underlying C implementation (the XS.xs file) directly. However, after reviewing the C code, it became clear this wouldn't work. The code is heavily interwoven with the Perl environment. It includes Perl headers (perl.h, EXTERN.h) and calls many Perl functions.

Stripping out these dependencies to create a "standalone" C function would be challenging and risky. The test case would differ from the original implementation, potentially hiding real bugs or introducing new ones.

Therefore, I decided to test the module from within a running Perl interpreter. This means the harness must embed not only the module but also the entire Perl interpreter, wrapped in a C function.

This so-called fuzz harness is a small C program that does three things in this case:

  1. Initializes an embedded Perl interpreter.
  2. Loads a simple Perl wrapper script to handle errors.
  3. Accepts raw data from libFuzzer and passes it to the Perl function under test.

But before I could start, an extra step was needed.

Step 1: The Perl Wrapper for error handling

Why do I need an "extra step" with a Perl wrapper?

The fuzzer will generate thousands of inputs, and most of them will be invalid JSON. The decode_json function would correctly report an error (an exception) for these inputs. If I don't handle these exceptions, they will interrupt the harness, and the fuzzer will not be able to continue. I only care about memory corruption crashes, not simple parsing errors.

To solve this, I used a tiny Perl script (json_eval.pl) that wraps the call in an eval block. This acts as a simple exception handler, suppressing the expected errors.

use JSON::XS;
sub do_json {
    eval {
        decode_json("@_");
    } or do {
        return;
    };
}

This script provides a new, safe function called do_json that I can call from the C harness.

Step 2: The C Harness for embedding Perl

Now for the main C program. First, the embedded Perl interpreter needs to be initialized, and its cleanup must be ensured. I'm using the __attribute__((constructor)) and __attribute__((destructor)) function attributes so that initialization runs automatically once at startup and cleanup runs once at exit.

This code (based on the perlembed documentation) allocates an interpreter, parses the json_eval.pl script, and runs it. Note that the function xs_init is defined as per documentation and not shown here.

#include <EXTERN.h>
#include <perl.h>

EXTERN_C void xs_init (pTHX);

static PerlInterpreter *my_perl;

// Called once when the program starts
__attribute__((constructor))
static void init_perl_fuzzer() {
    // NULL-terminated argument vector, as in the perlembed examples
    char *my_argv[] = {"", "json_eval.pl", NULL};
    my_perl = perl_alloc();
    perl_construct(my_perl);
    perl_parse(my_perl, xs_init, 2, my_argv, (char **) NULL);
    perl_run(my_perl);
}

// Called once when the program exits
__attribute__((destructor))
static void destroy_perl_fuzzer() {
    perl_destruct(my_perl);
    perl_free(my_perl);
}

Next, I needed a C function to call the Perl do_json function. This involves using the Perl C API to set up the stack, push the JSON string as an argument, and make the call.

static void PerlJson(const char *const json, const STRLEN len) {
    dSP;
    ENTER;
    SAVETMPS;
    PUSHMARK(SP);
    // Push the fuzzer data onto the Perl stack
    XPUSHs(sv_2mortal(newSVpvn(json, len)));
    PUTBACK;
    // Call "do_json": G_EVAL traps exceptions, G_DISCARD drops the return value
    call_pv("do_json", G_EVAL | G_DISCARD);
    SPAGAIN;
    PUTBACK;
    FREETMPS;
    LEAVE;
}

Finally, I implemented the LLVMFuzzerTestOneInput function required by libFuzzer. This is now quite simple: it just shovels the data from the fuzzer directly into my PerlJson function.

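#include <stddef.h>
#include <stdint.h>

// Entry point that libFuzzer calls once per generated input
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Data is not NUL-terminated, so the length is passed along explicitly
    const char *json = (const char *) Data;
    PerlJson(json, Size);
    return 0;
}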

Compiling and Instrumenting

I was almost done, or so I thought. The final step was to compile and link my harness. Getting the compiler flags right for an embedded Perl application is tricky, but Perl ships a utility for it, which got me 90% of the way there:

perl -MExtUtils::Embed -e 'print(ccopts(), ldopts())'

I added Clang's fuzzer flags (-fsanitize=fuzzer,address) to this and compiled the harness.

But when I first ran it, I hit a major problem: the fuzzer was ineffective. It was executing inputs incredibly fast, but it was not exploring the target code at all.

The reason: The C harness was compiled with instrumentation, but the Perl interpreter itself and the JSON::XS module (the code I actually want to test) were not. The fuzzer was essentially "flying blind," with no code-coverage feedback from the target.

To fix this, I rebuilt Perl itself and the module with the same fuzzer instrumentation flags.

After reviewing the documentation for building Perl, I configured it like this:

./Configure -de -Dcc=clang -Dusedevel -Doptimize='-g -O1' \
        -Accflags=-fsanitize=fuzzer-no-link,address \
        -Aldflags=-fsanitize=fuzzer-no-link,address

After compiling and installing this custom Perl, I re-installed JSON::XS. Because the compiler flags are saved in the Perl installation, the module was automatically compiled with the same instrumentation.

Now, the entire stack (libFuzzer, test harness, the Perl interpreter, and the JSON::XS C code) was fully instrumented. The effect was immediate: the fuzzer now received coverage feedback and was effectively exploring the target code.

The discovery

With the setup and compilation complete, and with some generic JSON files as a seed corpus, starting the fuzzer was straightforward:

./fuzzer -jobs=10 -workers=10 corpus

After less than an hour, the fuzzer found a crash.

Figure: AddressSanitizer output for the observed heap-buffer-overflow

AddressSanitizer (ASan) detected a heap-buffer-overflow. The fuzzer had generated a specific combination of bytes that, when passed to decode_json, caused its underlying C code to read memory it shouldn't have.

libFuzzer helpfully saved the exact crashing input into a file, allowing me to easily reproduce the crash and confirm the vulnerability. As no pre-conditions are required for this vulnerability, applications may face availability risks when they accept JSON input and use a vulnerable JSON parser.
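
For replaying a saved input outside of libFuzzer (for example in a plain debugger build), a small helper along the following lines can feed a file to the same entry point. This is only a sketch: replay_stream and record_size_target are hypothetical names introduced here for illustration, and the toy target merely records the input size so the helper can be exercised without Perl.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Signature shared by every libFuzzer entry point. */
typedef int (*fuzz_target_fn)(const uint8_t *data, size_t size);

/* Hypothetical helper: read a whole stream into memory and pass it to the
   target exactly once. Returns the target's result, or -1 on failure. */
static int replay_stream(FILE *in, fuzz_target_fn target) {
    size_t cap = 4096, len = 0;
    uint8_t *buf = malloc(cap);
    if (buf == NULL)
        return -1;

    for (;;) {
        len += fread(buf + len, 1, cap - len, in);
        if (len < cap)
            break;                      /* short read: EOF or error */
        cap *= 2;                       /* buffer full: grow and continue */
        uint8_t *grown = realloc(buf, cap);
        if (grown == NULL) {
            free(buf);
            return -1;
        }
        buf = grown;
    }
    if (ferror(in)) {
        free(buf);
        return -1;
    }

    int rc = target(buf, len);
    free(buf);
    return rc;
}

/* Toy target used only to exercise the helper: remembers the input size. */
static size_t last_size;
static int record_size_target(const uint8_t *data, size_t size) {
    (void)data;
    last_size = size;
    return 0;
}
```

In a real reproducer, the stream would come from fopen() on the crash file written by libFuzzer, and the target would be LLVMFuzzerTestOneInput itself.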

Looking at the reported line, we find a float parsing function with the following while loop:

while (((U8)*s - '0') < 10)
    ++s;

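The while loop is designed to skip through the digits of a floating-point number. It reads the string s, checks whether each character is a digit, and increments the pointer until it finds a non-digit. The range check is flawed, though: the (U8) operand is promoted to int before the subtraction, so any byte below '0' produces a negative result that still satisfies < 10. This includes the terminating NUL byte (0 - 48 = -48), so the loop does not stop at the end of the string and the read can run past the buffer. As such, the main fix is to check that the character is indeed between 0 and 9.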
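
The effect of the integer promotion can be shown in isolation. The sketch below is not the upstream patch: digit_check_promoted mirrors the condition from the loop above, while digit_check_strict and skip_digits are hypothetical names for an explicit range test of the kind described. The U8 typedef is an assumption standing in for Perl's unsigned 8-bit type.

```c
#include <stddef.h>

/* Stand-in for Perl's U8 (an unsigned 8-bit type); assumed for this sketch. */
typedef unsigned char U8;

/* Condition as written in the loop above. The U8 value is promoted to int,
   so the subtraction can become negative and still compare as "< 10". */
static int digit_check_promoted(char c) {
    return ((U8)c - '0') < 10;
}

/* Hypothetical stricter variant: an explicit range test on the character. */
static int digit_check_strict(char c) {
    return c >= '0' && c <= '9';
}

/* Skips leading digits using the strict check; stops at the NUL terminator. */
static const char *skip_digits(const char *s) {
    while (digit_check_strict(*s))
        ++s;
    return s;
}
```

With the promoted check, every byte value up to '9' is accepted, including NUL, '.', '+' and '-'. Only bytes from ':' upward end the loop. The strict variant accepts the ten digit characters and nothing else.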

Disclosure timeline

The vulnerability was fixed in JSON::XS version 4.04. The CPAN Security Team was very responsive and promptly coordinated the fix and the distribution of the patches. They also determined that two more modules, JSON-SIMD and Cpanel::JSON::XS, share the affected code base and are therefore vulnerable as well, so the patch was applied to them too. Once contacted, the module authors reacted quickly and released new, patched versions of their modules.

  1. 15.07.2025: Reported vulnerability to module author.
  2. 03.09.2025: Reported vulnerability to CPAN Security Team.
  3. 04.09.2025: Vulnerability acknowledged and patches prepared by the CPAN team and module authors.
  4. 08.09.2025: Patches published and distros notified.
