There are still strong reservations about using Nokogiri to parse
untrusted XML data.
http://www.wireharbor.com/hidden-security-risks-of-xml-parsing-xxe-attack/
It is also believed that many desktop operating systems are still
shipping out-of-date and vulnerable libxml2 libraries, which become
exposed via Nokogiri. For example:
http://stackoverflow.com/questions/18627075/nokogiri-1-6-0-still-pulls-in-wrong-version-of-libxml-on-os-x
While this isn't a problem for binary builds of Metasploit (Metasploit
Community, Express, or Pro) it can be a problem for development
versions or Kali's / Backtrack's version.
So, the compromise here is to allow for modules that don't directly
expose XML parsing. I can't say for sure that the various libxml2
vulnerabilities (current and future) aren't also exposed via
`Nokogiri::HTML` but I also can't come up with a reasonable demo.
Metasploit committers should still look at any module that relies on
Nokogiri very carefully, and suggest alternatives if there are any. But,
it's sometimes going to be required for complex HTML parsing.
tl;dr: Use REXML for XML parsing, and Nokogiri for HTML parsing if you
absolutely must.
To be clear, the shell that was tested with was 'windows/shell_reverse_tcp' delivered via 'exploit/windows/smb/psexec'
Additional changes required to fix regex to support the multiline output. Also, InstanceId uses a lower case 'D' on the platforms I tested - PowerShell 2.0 on Windows 2003, Windows 7, Windows 2008 R2 as well as PowerShell 4.0 on Windows 2012 R2.
This method doesn't appear to be used anywhere in the Metasploit codebase currently.
I have a case where on a Windows 2008 R2 host with PowerShell 2.0 the 'have_powershell' method times out. When I interactively run the command I find that the output stops after the PowerShell command and the token from 'cmd_exec' is NOT displayed. When I hit return the shell then processes the '&echo <randomstring>' and generates the token that 'cmd_exec' was looking for. I tried various versions of the PowerShell command string such as 'Get-Host;Exit(0)', '$PSVErsionTable.PSVersion', and '-Command Get-Host' but was unable to change the behavior. I found that adding 'echo. | ' simulated pressing enter and did not disrupt the results on this host or on another host where the 'have_powershell' method functioned as expected.
There may be a better solution, but this was the only one that I could find.