Hi,

On Fri, Mar 07, 2008 at 04:21:16PM +0000, place wrote:
Jorge wrote:
On Fri, Mar 07, 2008 at 06:17:17AM +0000, place wrote:
This is no sort of proper fix whatsoever, but I was tired of seeing the "would set charset" message. It's actually surprisingly useful for remote files, since they're generally still at the javascript-and-css point at the end of the first packet.
You mean, the decoder injection usually happens in the javascript-and-css section, right?
Please post some test-case URLs.
Ummm, okay. I'll search for ?????????????????? and see what comes up, looking for pages where meta is setting it to something other than utf-8.
OK, committed as is. I tried to make a "clean" switch by re-starting parsing all over again, but it needs more work (prototype works but unstable).
http://www.nbu.bg/ works.
http://www.biforum.org/ works.
http://bgstudent.8m.com/ works.
http://www.newobjects.com/dict.asp works.
http://www.mastylo.net/ fails that content-type testing in misc.c.
http://bultext.tripod.com/ works. (tripod's still around?)
http://textove.com/ works.
http://bgstories.athost.net/ works.
http://www.bglekar.com/ works even though it sets charset to ISO-8859-1! (it's all entities)
Getting bored, but I'll keep going until one breaks by having body text in the first packet.
http://www.bds-bg.org/ works.
http://www.kursove-neg.com/ works.
http://avast.110mb.com/ fails that content-type testing in misc.c.
http://esperanto.vnvsoft.com/ works.
http://www.dfbulgaria.org/ works.
http://www.bghelsinki.org/ works.
http://www.bcnl.org/ works.
http://www.angelfire.com/ca/canbul/Bcacb.html works.
http://www.ibl.bas.bg/ works.
http://bgjedi.com/ breaks.

So no Bulgarian jedi for dillo yet.
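
For anyone who wants to pre-check a page against the failure mode above (body text already present in the first packet, so it gets decoded before the meta charset takes effect), here is a rough sketch in Python. It is not dillo code and does not mirror anything in misc.c. The function names, the regular expressions, and the idea of treating the first chunk read as "the first packet" are all assumptions made for illustration.

```python
import re

# Assumption: a charset declaration looks like charset=NAME inside a <meta> tag.
# This covers <meta charset="x"> and the http-equiv/content form.
META_CHARSET = re.compile(
    rb"<meta\b[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)

# Assumption: seeing a <body> tag in the chunk means body text may follow in it.
BODY_START = re.compile(rb"<body\b", re.IGNORECASE)


def sniff_meta_charset(chunk):
    """Return the charset named by the first <meta> declaration, or None."""
    match = META_CHARSET.search(chunk)
    return match.group(1).decode("ascii").lower() if match else None


def body_in_first_chunk(chunk):
    """True if the chunk already reaches <body>, i.e. the risky case."""
    return BODY_START.search(chunk) is not None


def classify_first_chunk(chunk):
    """Label a first chunk as 'no-meta', 'likely-ok' or 'likely-broken'."""
    if sniff_meta_charset(chunk) is None:
        return "no-meta"
    return "likely-broken" if body_in_first_chunk(chunk) else "likely-ok"
```

Feeding it the first bytes read from a page gives a guess only: real packet boundaries depend on the server and the network, so a page labelled "likely-ok" here can still break.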
Thanks for the examples.
And of course the titles are all broken, but my window manager can only understand latin1 anyway.
Oh, titles and local files work here (with the unstable prototype).

--
Cheers
Jorge.-