From 2ce6f325f5a3aaf3757b790e0f156fa328ea7351 Mon Sep 17 00:00:00 2001 From: Tod Beardsley Date: Thu, 29 May 2014 11:52:17 -0500 Subject: [PATCH] Be more specific with Nokogiri check There are still strong reservations about using Nokogiri to parse untrusted XML data. http://www.wireharbor.com/hidden-security-risks-of-xml-parsing-xxe-attack/ It is also believed that many desktop operating systems are still shipping out-of-date and vulnerable libxml2 libraries, which become exposed via Nokogiri. For example: http://stackoverflow.com/questions/18627075/nokogiri-1-6-0-still-pulls-in-wrong-version-of-libxml-on-os-x While this isn't a problem for binary builds of Metasploit (Metasploit Community, Express, or Pro) it can be a problem for development versions or Kali's / Backtrack's version. So, the compromise here is to allow for modules that don't directly expose XML parsing. I can't say for sure that the various libxml2 vulnerabilities (current and future) aren't also exposed via `Nokogiri::HTML` but I also can't come up with a reasonable demo. Metasploit committers should still look at any module that relies on Nokogiri very carefully, and suggest alternatives if there are any. But, it's sometimes going to be required for complex HTML parsing. tl;dr: Use REXML for XML parsing, and Nokogiri for HTML parsing if you absolutely must. --- tools/msftidy.rb | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/tools/msftidy.rb b/tools/msftidy.rb index 8c5d82458d..81e3a5367e 100755 --- a/tools/msftidy.rb +++ b/tools/msftidy.rb @@ -113,16 +113,32 @@ class Msftidy end end + # Updated this check to see if Nokogiri::XML.parse is being called + # specifically. The main reason for this concern is that some versions + # of libxml2 are still vulnerable to XXE attacks. REXML is safer (and + # slower) since it's pure ruby. Unfortunately, there is no pure Ruby + # HTML parser (except Hpricot which is abandonware) -- easy checks + # can avoid Nokogiri (most modules use regex anyway), but more complex + # checks tends to require Nokogiri for HTML element and value parsing. def check_nokogiri - msg = "Requiring Nokogiri in modules can be risky, use REXML instead." + msg = "Using Nokogiri in modules can be risky, use REXML instead." has_nokogiri = false + has_nokogiri_xml_parser = false @source.each_line do |line| if line =~ /^\s*(require|load)\s+['"]nokogiri['"]/ has_nokogiri = true break end end - error(msg) if has_nokogiri + if has_nokogiri + @source.each_line do |line| + if line =~ /Nokogiri::XML.parse/ or line =~ /Nokogiri::XML::Reader/ + has_nokogiri_xml_parser = true + break + end + end + end + error(msg) if has_nokogiri_xml_parser end def check_ref_identifiers