Suggestion & patch for browser identification (user-agent)
I know this is too late to even consider for 0.7.3, but I thought I'd get this out for discussion.

I've been looking at browser statistics on my website, and started thinking about what various browsers do and don't report in their User-Agent identification. All report their name. Many report the rendering engine (Gecko, KHTML, etc.). Many also report the operating system and the type of system (with varying levels of detail), and some even report the language of the user interface.

Here are some examples from my website's server logs:

Mozilla/4.7 [en] (WinNT; U)
Mozilla/5.0 (compatible; Konqueror/3.1; Linux)
Mozilla/4.72 [en] (X11; U; Linux 2.4.18 i686)
Opera/7.11 (Linux 2.4.20-18.9 i686; U) [en]
Opera/6.0 (Macintosh; PPC Mac OS X; U) [en]
Mozilla/4.0 (compatible; MSIE 5.22; Mac_PowerPC)
Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030524
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85 (KHTML, like Gecko) Safari/85

Some interesting notes:
- Netscape and Mozilla always use "X11" for anything X-based, and have a separate field to indicate the actual OS (Linux, Solaris, *BSD, etc.)
- Konqueror lets you configure the amount of detail you report. The example here is the default.
- Most of the *nix-based strings use what you get from uname -sm or uname -srm.
- Most browsers designed for Mac OS X report "PPC Mac OS X"
- There are possible privacy/security implications of revealing too much information, like your system's kernel version or your primary language.

I think that knowing the operating system is interesting, but much more than that probably isn't necessary, so I'd like to suggest one of the following syntaxes to extend the Dillo user-agent string:

Dillo/<version> (<system>)
Dillo/<version> (<platform>; <system>)

Where <system> is the output of "uname -sm" (i.e. Linux PPC, FreeBSD i386, etc) and <platform> is of the form (Windows|Macintosh|X11|etc.) as used by Mozilla (see http://www.mozilla.org/build/revised-user-agent-strings.html ).

This would result in identification like the following:

Dillo/0.8 (Linux i386)
Dillo/0.8 (FreeBSD i686)
Dillo/0.8 (Linux PPC)
Dillo/0.8 (PPC Mac OS X)*

and so on, or perhaps:

Dillo/0.8 (X11; Linux i386)
Dillo/0.8 (X11; CYGWIN_NT-5.0 i686)

to fit Netscape/Mozilla's pattern.

* (Aside from the reversed order, Mac OS X would probably show up as Darwin unless a special case is made. This would probably mean setting the User-Agent string once and keeping it in memory instead of building it with each HTTP request. And for the longer syntax there's the issue that it still runs under X, so the platform would probably be X11 instead of Macintosh.)

I have attached a quick patch which uses the utsname struct to create the first syntax, and I've tested it successfully on Linux and Cygwin on Intel hardware.

So does anyone else think this is a good/bad/interesting idea? And is anyone interested in testing it on *BSD/Solaris/Mac/etc.?

-- Kelson Vibber www.hyperborea.org
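The attached patch itself is not reproduced in this thread. As a rough illustration of the first syntax only, here is a minimal sketch, assuming a POSIX system with uname(2). The names UA_VERSION, ua_format and ua_get are hypothetical and are not taken from the patch or from the Dillo source. It also caches the result, as the footnote above suggests, rather than rebuilding the string for every request.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical version string; a real build would supply its own. */
#define UA_VERSION "0.8"

/*
 * Write "Dillo/<version> (<sysname> <machine>)" into buf.
 * Returns the length written, or -1 if buf is too small.
 */
static int ua_format(char *buf, size_t len,
                     const char *sysname, const char *machine)
{
    assert(buf != NULL && sysname != NULL && machine != NULL);
    int n = snprintf(buf, len, "Dillo/%s (%s %s)",
                     UA_VERSION, sysname, machine);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Build the string once from uname(2) and keep it in memory.
 * Falls back to the bare "Dillo/<version>" form if uname fails
 * or the result does not fit.
 */
static const char *ua_get(void)
{
    static char cached[128];

    if (cached[0] == '\0') {
        struct utsname u;
        if (uname(&u) < 0 ||
            ua_format(cached, sizeof cached, u.sysname, u.machine) < 0)
            snprintf(cached, sizeof cached, "Dillo/%s", UA_VERSION);
    }
    return cached;
}
```

Calling ua_get() when composing the request headers would give the same pointer each time. A special case for Darwin, if wanted, could be added inside ua_get before the call to ua_format.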
On 2003-07-08 at 19:44 -0700, Kelson Vibber wrote:
syntaxes to extend the Dillo user-agent string:
Dillo/<version> (<system>)
Dillo/<version> (<platform>; <system>)
Where <system> is the output of "uname -sm" (i.e. Linux PPC, FreeBSD i386, etc) and <platform> is of the form (Windows|Macintosh|X11|etc.) as used by Mozilla (see http://www.mozilla.org/build/revised-user-agent-strings.html ).
This would result in identification like the following: Dillo/0.8 (Linux i386)
The problem with these is when there are security holes in image rendering libraries (eg, versions of Netscape with custom handlers for extended information in GIF files). If you state the system architecture then it's relatively trivial to use the User-Agent field on the server-side to select the image with the correct shell-code to exploit your system. If you say "FreeBSD" or "Linux" then you're stating which system calls are where. All of which helps a malicious web-site operator target their exploit to your browser.

As you note, Konqueror lets you configure the amount of information available without a source-patch. I'm one of those who reduces the information to the minimum -- I'm happy supplying something saying "Konqueror" or "Dillo" and don't get upset at the version being present. I like to turn it off though -- this is one of the things which I like about Dillo -- it's so fast to compile that trivial local hacks like this become feasible.

I don't supply patches for stuff that removes things that the developers like because it's their baby, but open source means I get to run stuff how I like it; everyone gets what they want (as long as they can make trivial hacks) so everyone's happy.

Since I patch anyway, it makes no difference to me -- I'd just remove the extra information. But I do think that people should think about how easy it is to target shell-code exploits with this sort of information given away.

-- 2001: Blogging invented. Promises to change the way people bore strangers with banal anecdotes about their pets. <http://www.thelemon.net/issues/timeline.php>
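To make the "amount of information" point concrete, here is a minimal sketch of a user-selectable detail level, loosely in the spirit of what Konqueror is described as offering. Everything in it is an assumption for illustration: the ua_detail enum and ua_format_detail function are hypothetical names, and nothing in this thread says Dillo has such a preference.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical detail levels, from least to most revealing. */
typedef enum {
    UA_NAME_ONLY,           /* "Dillo" */
    UA_NAME_VERSION,        /* "Dillo/<version>" */
    UA_NAME_VERSION_SYSTEM  /* "Dillo/<version> (<system>)" */
} ua_detail;

/*
 * Format the user-agent string at the requested detail level.
 * system may be NULL, in which case the system part is omitted.
 * Returns the length written, or -1 if buf is too small.
 */
static int ua_format_detail(char *buf, size_t len, ua_detail level,
                            const char *version, const char *system)
{
    int n;

    assert(buf != NULL && version != NULL);
    switch (level) {
    case UA_NAME_ONLY:
        n = snprintf(buf, len, "Dillo");
        break;
    case UA_NAME_VERSION:
        n = snprintf(buf, len, "Dillo/%s", version);
        break;
    default:
        if (system == NULL)
            n = snprintf(buf, len, "Dillo/%s", version);
        else
            n = snprintf(buf, len, "Dillo/%s (%s)", version, system);
        break;
    }
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

With something like this, a cautious user would pick UA_NAME_ONLY and never send the architecture or OS at all, without needing a local source patch.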
At 06:15 AM 7/9/2003, Phil Pennock wrote:
The problem with these is when there are security holes in image rendering libraries (eg, versions of Netscape with custom handlers for extended information in GIF files). If you state the system architecture then it's relatively trivial to use the User-Agent field on the server-side to select the image with the correct shell-code to exploit your system.
Hmm, I hadn't thought of that one. There are a couple of discussions linked to from the Mozilla page I mentioned, but they're mainly focused on things like revealing the OS version to make follow-up hacking attempts easier. However, it seems to me that it would be just as trivial to put several malicious images on a single page, each targeting a different system. It's not as if multiple images on a page - or even multiple broken images - would raise much suspicion.
open source means I get to run stuff how I like it; everyone gets what they want (as long as they can make trivial hacks) so everyone's happy.
Agreed! Kelson Vibber www.hyperborea.org
On 2003-07-09 at 08:55 -0700, Kelson Vibber wrote:
However, it seems to me that it would be just as trivial to put several malicious images on a single page, each targeting a different system. It's not as if multiple images on a page - or even multiple broken images - would raise much suspicion.
No, because the first one that overflows the buffer will either crash the browser or successfully exploit the issue. People tend to notice when their browser crashes ;^) -- or at least, I do, which is another reason why I like Dillo. :^) My boss too is a convert because of the simplicity, stability and speed, only resorting to other browsers when necessary.

So it's an all-or-nothing attack -- unless you happen to know that on one platform the browser loads images in one order and on another platform it loads them in reverse order, in which case you can conceivably get two attack opportunities for the price of one. You don't see a broken image link if the browser has already been compromised or crashed.

Hence a user-agent saying "I'm this version of this browser running on this OS on this hardware platform" is, uhm, interesting. Knowing Netscape 4.76 tells the attacker which security holes you're vulnerable to; knowing the other details says which exploit is likely to work. Many types of shellcode can to some extent successfully handle different builds using different locations.

But hey, it's trivial to find the place in the code to make such changes in Dillo, so anyone who's bothered by the issue can do something about it. So don't let people like me stop you from doing this -- I just ask that people consider the issues before doing something just because every other browser does it.
I agree with this. Only one detail about the patch: what about using autoconf to fill in a define with the system and platform instead of adding code? Diego.

On Tue, 8 Jul 2003 19:44:50 -0700, Kelson Vibber <kelson@pobox.com> wrote:
I know this is too late to even consider for 0.7.3, but I thought I'd get this out for discussion.
I've been looking at browser statistics on my website, and started thinking about what various browsers do and don't report in their User-Agent identification. All report their name. Many report the rendering engine (Gecko, KHTML, etc.) Many also report the operating system and the type of system (with varying levels of detail), and some even report the language of the user interface.
Here are some examples from my website's server logs:
Mozilla/4.7 [en] (WinNT; U)
Mozilla/5.0 (compatible; Konqueror/3.1; Linux)
Mozilla/4.72 [en] (X11; U; Linux 2.4.18 i686)
Opera/7.11 (Linux 2.4.20-18.9 i686; U) [en]
Opera/6.0 (Macintosh; PPC Mac OS X; U) [en]
Mozilla/4.0 (compatible; MSIE 5.22; Mac_PowerPC)
Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030524
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85 (KHTML, like Gecko) Safari/85
Some interesting notes:
- Netscape and Mozilla always use "X11" for anything X-based, and have a separate field to indicate the actual OS (Linux, Solaris, *BSD, etc.)
- Konqueror lets you configure the amount of detail you report. The example here is the default.
- Most of the *nix-based strings use what you get from uname -sm or uname -srm.
- Most browsers designed for Mac OS X report "PPC Mac OS X"
- There are possible privacy/security implications of revealing too much information, like your system's kernel version or your primary language.
I think that knowing the operating system is interesting, but much more than that probably isn't necessary, so I'd like to suggest one of the following syntaxes to extend the Dillo user-agent string:
Dillo/<version> (<system>)
Dillo/<version> (<platform>; <system>)
Where <system> is the output of "uname -sm" (i.e. Linux PPC, FreeBSD i386, etc) and <platform> is of the form (Windows|Macintosh|X11|etc.) as used by Mozilla (see http://www.mozilla.org/build/revised-user-agent-strings.html ).
This would result in identification like the following:
Dillo/0.8 (Linux i386)
Dillo/0.8 (FreeBSD i686)
Dillo/0.8 (Linux PPC)
Dillo/0.8 (PPC Mac OS X)*
and so on, or perhaps:
Dillo/0.8 (X11; Linux i386)
Dillo/0.8 (X11; CYGWIN_NT-5.0 i686)
to fit Netscape/Mozilla's pattern.
* (Aside from the reversed order, Mac OS X would probably show up as Darwin unless a special case is made. This would probably mean setting the User-Agent string once and keeping it in memory instead of building it with each HTTP request. And for the longer syntax there's the issue that it still runs under X, so the platform would probably be X11 instead of Macintosh.)
I have attached a quick patch which uses the utsname struct to create the first syntax, and I've tested it successfully on Linux and Cygwin on Intel hardware.
So does anyone else think this is a good/bad/interesting idea? And is anyone interested in testing it on *BSD/Solaris/Mac/etc.?
-- Kelson Vibber www.hyperborea.org
At 06:41 AM 7/9/2003, Diego Sáenz wrote:
I agree with this. Only one detail about the patch: what about using autoconf to fill in a define with the system and platform instead of adding code?
I thought about that, but it's a matter of accuracy where systems are compatible, but named differently. The only sure example I can think of is Intel-style processors - you can compile a program on an i386 and run it on an i686, for instance, or the other way around if you don't use Pentium-specific optimizations. If the various *BSDs are binary-compatible, that would be another case (i.e. if you can compile on FreeBSD and run on OpenBSD), but I don't know if that's possible. Kelson Vibber www.hyperborea.org
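The trade-off between the two approaches can be sketched side by side. This is only an illustration under assumptions: UA_BUILD_SYSTEM is a hypothetical macro that a configure script might pass with -D, and both function names are made up here. The compile-time value describes the machine the binary was built on, while the run-time value comes from uname(2) on the machine it is actually running on, which is the accuracy difference described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical macro a configure script could define, for example
 * -DUA_BUILD_SYSTEM='"Linux i386"'. It is fixed when the binary is built.
 */
#ifndef UA_BUILD_SYSTEM
#define UA_BUILD_SYSTEM "unknown"
#endif

/* Compile-time variant: no extra code at run time, but reports the build host. */
static const char *ua_system_compile_time(void)
{
    return UA_BUILD_SYSTEM;
}

/*
 * Run-time variant: asks the running kernel, so an i386 build
 * executed on an i686 machine reports i686.
 * Returns the length written, or -1 on failure.
 */
static int ua_system_run_time(char *buf, size_t len)
{
    struct utsname u;
    int n;

    assert(buf != NULL);
    if (uname(&u) < 0)
        return -1;
    n = snprintf(buf, len, "%s %s", u.sysname, u.machine);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

For a binary that is always built and run on the same machine the two agree, so the define would be the simpler choice there.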
On Tue, 8 Jul 2003 19:44:50 -0700 Kelson Vibber <kelson@pobox.com> wrote:
So does anyone else think this is a good/bad/interesting idea? And is anyone interested in testing it on *BSD/Solaris/Mac/etc.?
Your patch works for me on Solaris. It sets the user-agent to Dillo/0.7.2 (SunOS sun4u).
* Kelson Vibber <kelson@pobox.com>:
So does anyone else think this is a good/bad/interesting idea? And is anyone interested in testing it on *BSD/Solaris/Mac/etc.?
I think it is a good idea. Working nicely on Gentoo Linux ppc. User-agent says Dillo/0.7.2 (Linux ppc). Have a nice day, n.
participants (5)
- Diego Sáenz
- Kelson Vibber
- Nicolas Kaiser
- Phil Pennock
- Todd Carson