Get your FREE 30 page Developing SOLID Applications guide!

An XSS Vulnerability In The Making

Out Of Date Warning

Languages change. Perspectives are different. Ideas move on. This article was published on March 7, 2012 which is more than two years ago. It may be out of date. You should verify that technical information in this article is still current before relying upon it for your own purposes.

Back in September, Socorro received a security bug relating to the method we were using for processing inputs for the duration of certain reports. The vulnerability included a proof of concept, with an alert box popping up on production when the link was followed.

The Vulnerability

I was quite surprised at the root cause of the vulnerability. We had opted to compare the incoming data against a known set of valid values – a common practice when whitelisting certain inputs. Here is a sample of the type of code we were using to check for valid inputs:

<?php

function validateDuration($duration) {

    $valid = array(3, 7, 14, 28);

    if(in_array($duration, $valid)) {
        return $duration;
    } else {
        return false;
    }
}

This snippet of code is contrived for this example (you can see the actual code here, but it’s complexity makes for a bad example); however it’s true to form for what we were doing. As expected, when this code is tested, a string of ’3′ and an integer of 3 work equally well, and a string of ’5′ and an integer of 5 fail equally. Since the $duration value is coming through the query string, it’s especially important that the function work with strings, since all query string values are presented to PHP in this fashion.

However, we quickly discovered something interesting. As long as we passed in a string that started with a valid numerical value (3, 7, 14, 28), anything we pasted on the end was treated as acceptable by the application. We discovered the reason for this is because in_array() cast $duration to an integer before comparing it to an array of integers. As long as the $duration value started with a known numerical value (which was then cast to a valid integer), the system was fooled.

In PHP, if you cast a string to an integer, and the string begins with a numerical value, the string becomes that numerical value and anything beyond that is lost. If you start a string with alphabetical characters, however, the integer you get is zero. For example:

<?php

$string1 = '123abc';
$string2 = 'abc123';

echo (int)$string1; //123
echo (int)$string2; //0

As a result, $duration = ’3 /><img src=x onerror=”alert(1)’ was being found valid, because when cast to an integer it became int(3), but when returned the whole string was returned!

The code could be fixed with just a small tweak: casting the $duration value to an integer and discarding the original string:

<?php

function validateDuration($duration) {

    $valid = array(3, 7, 14, 28);
    $duration = (int)$duration;

    if(in_array($duration, $valid)) {
        return $duration;
    } else {
        return false;
    }
}

This ensures that only the valid values are returned by making sure that only an integer value is possible.

Lessons Learned

At first we thought this surely had to be a bug in PHP. However, Laura Thomson told me “If comparing two values, type juggling is performed first, which means that the string is converted to a number. This is done by taking the first number found in the string (see http://us2.php.net/manual/en/language.types.string.php#language.types.string.conversion ). So this may be confusing/a quirk/a gotcha, but it isn’t a bug.” And she’s right: this isn’t a bug per se, but it’s certainly an interesting “gotcha.”

Were I to write a unit test for this in the future, I would write the test in such a way to pass it both valid and invalid data, including strings that started with valid data but didn’t fit the exact format. The experience also taught me that I cannot implicitly trust non-security functions to provide adequate security protection and that we must completely and comprehensively test their behaviors, because “expected behavior” may not be “secure behavior” for our purposes.

Finally, there’s the lesson that even the most simple web application can have vulnerabilities in it. They’re a part of life. This is important because there’s a bunch of stuff being bandied about right now regarding the supposed insecurity of a significant framework. We can never assume we are secure until we have tested all avenues, and even then, vulnerabilities can crop up.

Learning design patterns doesn't have to suck.

Design patterns open a whole new world of possibilities. So why are you avoiding them? This brand new book will help you finally understand these wonderful programming techiques!

Learn design patterns TODAY »

Mario Awad (@marioawad) wrote at 3/7/2012 8:34 am:

Thank you for reminding us developers that even the simplest code snippets might be vulnerable. While I usually always filter my input and escape my output (FIEO), it’s easy to overlook this rule sometimes, but never again. Cheers!

Brian Powers wrote at 3/7/2012 9:44 am:

Nice tip! It’s little things like this that we often forget about, particularly if you happen to be going back-and-forth between strongly and dynamically typed languages.

eric wrote at 3/7/2012 10:15 am:

OMG, fix the headline immediately!! (then delete this post ;))

Brandon Savage (@brandonsavage) wrote at 3/7/2012 10:20 am:

I don’t follow you Eric. I don’t see anything wrong with the headline, or the post itself.

Sven (@SvenRtbg) wrote at 3/7/2012 10:23 am:

I’d say the fail is later in your process. While it should be considered bad that your function was believed to validate and sanitize the input, the XSS only came to life because outputting $duration later was not properly escaped.

In general, starting a discussion about whether a certain variable has to be escaped or not because “the content is safe” is a bad thing. What if an assumed safe variable gets changed somewhere else and the output now gets unsafe?

Output a variable? Escape it! End of discussion. :)

DrSlump (@drslump) wrote at 3/7/2012 10:45 am:

The best solution to deal with XSS attacks is to properly escape the output. Escaping is much simpler than filtering thus it’s easier to get it done right.

That doesn’t mean that data shouldn’t be filtered but in the event that some filter is buggy, escaping the output will protect you against XSS. This not only applies to HTML, any source of code injection attacks benefits from escaping the values before using them, like in dynamic database queries even after we’ve filtered the params.

Jannik wrote at 3/7/2012 10:50 am:

… and that’s why in_array() has a third parameter called $strict.

Brandon Savage (@brandonsavage) wrote at 3/7/2012 10:56 am:

The $strict argument wouldn’t have helped in this case. Remember, the bug was possible because in_array() was type-juggling the string to an integer before comparing it. If we used $strict = true, we would either have had to cast the $duration value to an integer (which DID solve the issue) OR have explicitly defined the array values as strings, in which case the vulnerability wouldn’t have been possible in the first place.

There are a few ways to solve this particular issue. This isn’t a blog post of “look how clever we are”; it’s about highlighting behavior that’s correct, but unexpected.

Jannik wrote at 3/7/2012 11:00 am:

Well, yes, that is true. There is nothing wrong with being clever, btw. ;-)

But the problem with the current solution is, that you take possibly invalid input, transform it to valid input and keep working like expected (?).
This might work, but it could also produce more serious (= less visible) bugs. If I pass you ?delete=123test, then you *could* delete the entry 123, but you could also assume, that there might be some bugs in there and be more cautious (= do nothing for now).

The casting is a valid solution, but keep in mind, that there are (possible) side effects.

Matthew Weier O'Phinney (@mwop) wrote at 3/7/2012 11:02 am:

Actually, the type juggling is done on each comparison, not once – in other words, if your array has mixed types, it will do the type juggling based on the element currently being compared against.

Also: in_array() accepts a third, optional argument, which turns on strict checks. When used, comparisons are done that not check against the value, but also the type. Of course, if you want to allow either numeric strings or integers, this will fail, and you’ll want to cast first, but it does provide another way to whitelist.

Shay Ben Moshe wrote at 3/7/2012 2:45 pm:

Wow I didn’t know that!
This is actually a very valuable piece of information.

Thank you!

Cameron wrote at 3/7/2012 10:24 pm:

Yet another example of silent type conversion fail!

I don’t mind untyped variables, but silent type conversions seem to be a common trip-up. And yet it’s used as an example of why PHP is so great.

One example I personally hate is the boolean checking of variables in IFs.

if ($x) { .. }

Try it with: “0″ (string with a zero in it)… you’ll get an unexpected result esp. if it’s a valid input from a form! *bangs head*

Andrew wrote at 3/8/2012 4:17 am:

This is a nice example of a very easy mistake to make. You should always be paranoid about user input and ALWAYS escape output.

Or you could always return you known good data, and disregard the input, e.g.

$duration = (int)$duration;

if(in_array($duration, $valid)) {
return $valid[$duration]
}

If you are outputting ANY user entered data it should *always* be escaped.

KelvinJ wrote at 3/8/2012 8:53 am:

Escape, escape, escape!

As @Sven said in his comment, the XSS issue here lies with not escaping your output.

That’s the lesson to take away here: ALWAYS ESCAPE!

What you’ve pointed out is a reminder of how PHP compares strings.

foobar wrote at 3/9/2012 4:37 pm:

why not just

function validateDuration($duration) {
return !preg_match(‘/[^\d]/’, $duration) && //found non-digit
in_array((int)$duration, array(3, 7, 14, 28), true);
}

Jannik wrote at 3/15/2012 12:58 pm:

Quick Note to previous comment:
You can use ctype_digit($duration) as alternative to preg_match(‘/[^\d]/’, $duration) if one is not so in to regex. ;-)

Richard Lynch wrote at 3/19/2012 5:41 pm:

Personally, I would rather use:
switch($duration){
case “3″:
case “7″:
case “14″:
case “28″:
//this block intentionally left blank
break;
default:
die(“Desert Dog”);
break;
}

This is a matter of taste, really.

And, yes, ESCAPE OUTPUT is what should have been done, even if you thought the data was safe.