Folks,
I've been using dillo for a number of months now, and it has become my first choice browser--I switch to mozilla or opera only when I have to: java, javascript, http/1.1 authentication, ssl and the like.
Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0. Sometimes it recovers after a while, sometimes I get fed up and close all windows at the WM level. Stopping all the active windows doesn't cure it, nor does closing them. I haven't tried using the `Exit Dillo' menu item, but I'll do so on the next occurrence.
This has been with the CVS version(s) over the last few weeks. Two systems, one running redhat 6.2 (I know, I know...), the other Debian GNU/Linux 3.0r1.
So, two questions:
-- Has anyone else observed this behavior?
-- Can anyone suggest a good way to diagnose the problem so I can submit a decent bug report? As things stand my understanding is horribly vague.
N.B. It seems to happen most when I have opened several links in new windows in rapid succession.
Thanks,
-- David McKee -- dmckee@jlab.org -- (757) 269-7492 (Office)
On Thu, 19 Jun 2003, David McKee wrote:
Folks, I've been using dillo for a number of months now, and it has become my first choice browser--I switch to mozilla or opera only when I have to: java, javascript, http/1.1 authentication, ssl and the like.
That's the idea!
Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0.
Sometimes it recovers after a while, sometimes I get fed up and close all windows at the WM level. Stopping all the active windows doesn't cure it, nor does closing them. I haven't tried using the `Exit Dillo' menu item, but I'll do so on the next occurrence.
Closing the Dillo instance "solves" it.
This has been with the CVS version(s) over the last few weeks. Two systems, one running redhat 6.2 (I know, I know...), the other Debian GNU/Linux 3.0r1.
So, two questions:
-- Has anyone else observed this behavior?
Yes, I've noticed, and have been on it a few times, but I still don't know how to reproduce it reliably (I haven't devoted much time to it though).
-- Can anyone suggest a good way to diagnose the problem so I can submit a decent bug report? As things stand my understanding is horribly vague.
Well, I think it has to do with a socket connection in an "exceptional" state of some kind (every image in a page, and the page itself, opens a socket connection). Attaching GDB to the hogging Dillo instance immediately stops the problem (and "cont" resumes it :-). The interesting thing is that you can't break into a function inside Dillo: it seems to get trapped in a busy wait between GTK+ and the kernel signals. * Sometimes detaching GDB magically solves the problem. !?
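To illustrate the kind of busy wait suspected here, the following is a minimal sketch and not Dillo or GTK+ code. It assumes a POSIX system with poll(); the helper name count_ready_wakeups is hypothetical. The point is that a descriptor stuck in a hung-up or error state is reported on every poll, so an event loop that never removes it from its watch set wakes up again and again instead of sleeping.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper for illustration only (not part of Dillo or GTK+).
 * Polls `fd` `rounds` times with a 1 second timeout and counts how many
 * of those polls reported a condition (data, hang-up or error) on it. */
int count_ready_wakeups(int fd, int rounds)
{
    struct pollfd p;
    int i, ready = 0;

    p.fd = fd;
    p.events = POLLIN;
    for (i = 0; i < rounds; i++) {
        p.revents = 0;
        if (poll(&p, 1, 1000) > 0 &&
            (p.revents & (POLLIN | POLLHUP | POLLERR)))
            ready++;
    }
    return ready;
}
```

For example, the read end of a pipe whose write end has been closed is reported on every round, so a loop like the one above never blocks. Whether this is what actually happens inside the hogging Dillo instance remains unconfirmed in this thread.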
N.B. It seems to happen most when I have opened several links in new windows in rapid succession.
Finding a reliable way to reproduce the problem is KEY to solving it. If you can do that it'd be of great help! I'd try to bring the socket connection into exceptional conditions, and refine from there to find the bug. For instance, what happens when the connection is aborted just before the remote server is contacted? Or maybe what if it is aborted after the remote server is contacted but before Dillo gets notified? It sounds more complex than it is. Just try to find a busy server and play hard on it! Start with the sites you were visiting. I'll check the code for exception handling. Any help is highly appreciated. Cheers Jorge.-
On Thu, 19 Jun 2003, David McKee wrote:
[...] Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0.
[...] N.B. It seems to happen most when I have opened several links in new windows in rapid succession.
I just developed a "blind patch" by checking the IO code for exception handling (as suggested in my previous email). After giving the new code a hard time playing with simultaneous connections, it seems to work OK. The real test, of course, is to prove it against a reliable way to reproduce the problem. Can you guys work on finding a way to reproduce the bug please? Cheers Jorge.-
Hi ya! Just forgot to mention that the patch is not in CVS. Please play a while to find the way to reproduce it! Cheers Jorge.- PS: Did I suggest finding a way to... :-)
On Thu, 19 Jun 2003 13:46:02 -0400 (EDT) David McKee <dmckee@jlab.org> wrote:
Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0.
Sometimes it recovers after a while, sometimes I get fed up and close all windows at the WM level. Stopping all the active windows doesn't cure it, nor does closing them. I haven't tried using the `Exit Dillo' menu item, but I'll do so on the next occurrence.
I have had the same sort of problem - about the only consistency I can see behind it is that I can only remember it happening when I hit reload on a page that seems to have stalled part-way through. However, it doesn't *always* do it. Are you using a proxy server? I am, and from what I can remember when it does this, the reload (often, but not always) doesn't actually start over, so I suspect that my proxy server still has an open connection (which has stalled) and just resends what it's got so far. The 100% cpu usage ends when (if) the reload completes, and I (think) it also ends when the connection times out. Next time it happens, I'll try to get something more concrete. -- Stephen Lewis slewis@paradise.net.nz
* David McKee <dmckee@jlab.org> [2003-06-19 13:46] :
Folks, I've been using dillo for a number of months now, and it has become my first choice browser--I switch to mozilla or opera only when I have to: java, javascript, http/1.1 authentication, ssl and the like.
Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0.
Sometimes it recovers after a while, sometimes I get fed up and close all windows at the WM level. Stopping all the active windows doesn't cure it, nor does closing them. I haven't tried using the `Exit Dillo' menu item, but I'll do so on the next occurrence.
This has been with the CVS version(s) over the last few weeks. Two systems, one running redhat 6.2 (I know, I know...), the other Debian GNU/Linux 3.0r1.
So, two questions:
-- Has anyone else observed this behavior?
Not sure if the problem is the same, but I have noticed this behaviour when I load a big page with an anchor (or is it a reference?) in the address, eg: http://ftp-master.debian.org/testing/update_excuses.html#kdelibs (WARNING: 853 Kb). It eventually finishes loading the page but it can take a HUGE amount of time (several minutes whereas the same page without '#kdelibs' only takes about 10 to 15 seconds). I do not think however that this particular problem is specific to Dillo as I have already seen Mozilla hang with the same test. HTH Fred
On Fri, 20 Jun 2003, Frédéric Bothamy wrote:
* David McKee <dmckee@jlab.org> [2003-06-19 13:46] :
[...] Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0.
[...] So, two questions:
-- Has anyone else observed this behavior?
Not sure if the problem is the same, but I have noticed this behaviour when I load a big page with an anchor (or is it a reference?) in the address, eg:
http://ftp-master.debian.org/testing/update_excuses.html#kdelibs
(WARNING: 853 Kb). It eventually finishes loading the page but it can take a HUGE amount of time (several minutes whereas the same page without '#kdelibs' only takes about 10 to 15 seconds).
This is another problem! IIRC Sebastian posted a comment about this some time ago, but although I searched for it, I didn't find the post :( After reviewing it myself (the huge page with anchor problem), I developed a small patch that more or less solves the problem (makes the anchored URL load two times slower, but it works). -- I'm still working on it. The problem is similar to what we had before incremental rewrapping. The anchor (aka. "fragment") code falls inside an exponential algorithm so it dies hogging (unless you have exponential patience!).
I do not think however that this particular problem is specific to Dillo as I have already seen Mozilla hang with the same test.
Most programs don't care much about hogging the CPU, but we do! So this is a BUG in Dillo and we'll try to work it out. Sebastian: Did you post something about this, or is my mind playing tricks on me? Cheers Jorge.- PS: The search for a way to reproduce the other BUG is still open
On Fri, Jun 20, Jorge Arellano Cid wrote:
On Fri, 20 Jun 2003, Frédéric Bothamy wrote:
Not sure if the problem is the same, but I have noticed this behaviour when I load a big page with an anchor (or is it a reference?) in the address, eg:
http://ftp-master.debian.org/testing/update_excuses.html#kdelibs
(WARNING: 853 Kb). It eventually finishes loading the page but it can take a HUGE amount of time (several minutes whereas the same page without '#kdelibs' only takes about 10 to 15 seconds).
This is another problem!
IIRC Sebastian posted a comment about this some time ago, but although I searched for it, I didn't find the post :(
I don't remember it, the problem is actually new to me. Anyway, I've looked a bit at it, read on.
After reviewing it myself (the huge page with anchor problem), I developed a small patch that more or less solves the problem (makes the anchored URL load two times slower, but it works). -- I'm still working on it.
Profiling shows that dillo spends most of the time in Dw_gtk_viewport_update_anchor_rec (and related functions). To avoid searching recursively for the widget which contains the anchor, I've tried adding a hashtable to the viewport itself, which works quite well. Dw_gtk_viewport_update_anchor itself is called quite a few times (1511 times in my 200k test page), but AFAIS, this is correct.
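The hashtable idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the change that went into Dillo: the types AnchorEntry and AnchorTable and the anchor_table_* functions are hypothetical names, and plain C with a small chained hash table is used here so the example stands alone. It maps an anchor name straight to a stored position, so a lookup no longer has to walk the widget tree.

```c
#include <stdlib.h>
#include <string.h>

#define ANCHOR_BUCKETS 256

/* Hypothetical types for illustration; not Dillo's actual structures. */
typedef struct AnchorEntry {
    char *name;               /* fragment name, e.g. "kdelibs" */
    int y;                    /* stored vertical position of the anchor */
    struct AnchorEntry *next; /* next entry in the same bucket */
} AnchorEntry;

typedef struct {
    AnchorEntry *buckets[ANCHOR_BUCKETS];
} AnchorTable;

/* Simple multiplicative string hash reduced to a bucket index. */
static unsigned anchor_hash(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % ANCHOR_BUCKETS;
}

/* Add an anchor or update its position.
 * Returns 0 on success, -1 if memory allocation fails. */
int anchor_table_set(AnchorTable *t, const char *name, int y)
{
    unsigned i = anchor_hash(name);
    AnchorEntry *e;

    for (e = t->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->name, name) == 0) {
            e->y = y;
            return 0;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) {
        free(e);
        return -1;
    }
    strcpy(e->name, name);
    e->y = y;
    e->next = t->buckets[i];
    t->buckets[i] = e;
    return 0;
}

/* Look up an anchor by name.
 * Returns 1 and stores the position in *y if found, 0 otherwise. */
int anchor_table_get(const AnchorTable *t, const char *name, int *y)
{
    const AnchorEntry *e;

    for (e = t->buckets[anchor_hash(name)]; e != NULL; e = e->next) {
        if (strcmp(e->name, name) == 0) {
            *y = e->y;
            return 1;
        }
    }
    return 0;
}

/* Release every entry and leave the table empty. */
void anchor_table_free(AnchorTable *t)
{
    int i;

    for (i = 0; i < ANCHOR_BUCKETS; i++) {
        AnchorEntry *e = t->buckets[i];
        while (e != NULL) {
            AnchorEntry *next = e->next;
            free(e->name);
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
}
```

With a table like this kept per viewport, each anchor update costs roughly one hash lookup on average, however deep the widget tree is.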
The problem is similar to what we had before incremental rewrapping. The anchor (aka. "fragment") code falls inside an exponential algorithm so it dies hogging (unless you have exponential patience!).
Not exactly, it is quadratic, or cubic (too lazy to be correct), but it's still bad enough. Sebastian
On Thu, Jun 19, David McKee wrote:
Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0 ...
I've just committed a change, which fixes at least the problem with <http://ftp-master.debian.org/testing/update_excuses.html#kdelibs>. Please let me know, if the problems still arise. Sebastian
* Sebastian Geerken <s.geerken@ping.de> [2003-06-29 16:06] :
On Thu, Jun 19, David McKee wrote:
Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0 ...
I've just committed a change, which fixes at least the problem with <http://ftp-master.debian.org/testing/update_excuses.html#kdelibs>. Please let me know, if the problems still arise.
Ok. Since I was the last one talking about this issue, I guess I should try this. Seems to work perfectly to me, loading this page now takes about 15 seconds and jumps to the anchor as soon as it is loaded. Thanks a lot for your work (and thanks to Jorge for his continuous work on this wonderful browser). Fred
Hi there! On Thu, 19 Jun 2003, David McKee wrote:
Folks, I've been using dillo for a number of months now, and it has become my first choice browser--I switch to mozilla or opera only when I have to: java, javascript, http/1.1 authentication, ssl and the like.
I like that part! :) Well, this thread has had a lot of posts, and bugfixes! The problem is that we haven't yet found a way to reproduce the original problem, that is:
Of late dillo has started getting into a state where all progress stops in all open windows, and it sits on the CPU---keeping my load average at 1.0 [...]
N.B. It seems to happen most when I have opened several links in new windows in rapid succession.
(I also hinted how to try to reproduce it in this thread). Now a "blind fix" is in the CVS. That is, new code that fixes all the loose ends about EINTR that I was able to find in IO.c and file.c. If one of the file descriptors, for instance, happened to be interrupted inside the 'close' call (returning EINTR) and remained open, that could be the cause of the CPU HOG (by not being removed from the FDset inside GTK+ code). What does it mean? If someone finds a way to reproduce the HOG (with older code) and it doesn't happen with the CVS, then it's fixed! If someone reproduces the bug with the newest CVS code, then the bug is hiding elsewhere... Cheers Jorge.-
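For readers unfamiliar with EINTR handling, here is a minimal sketch of the usual retry pattern. It is not the code committed to IO.c or file.c; read_retry_eintr and write_retry_eintr are hypothetical helper names. The sketch deliberately covers only read and write: whether 'close' should be retried after EINTR is platform dependent (on Linux the descriptor is released even when close reports EINTR), so that case is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not a function from Dillo's IO.c.
 * Reads up to `count` bytes, restarting the call whenever a signal
 * interrupts it before any data was transferred (errno == EINTR).
 * Returns bytes read, 0 on end of file, or -1 with errno set. */
ssize_t read_retry_eintr(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;
}

/* Same pattern for write: retry only when the call was interrupted.
 * Returns bytes written (possibly fewer than `count`), or -1. */
ssize_t write_retry_eintr(int fd, const void *buf, size_t count)
{
    ssize_t n;

    do {
        n = write(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

Without the retry loop, an interrupted call looks like a failure to the caller, which can leave a descriptor half-handled and still registered with the event loop.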
participants (5)
- David McKee
- Frédéric Bothamy
- Jorge Arellano Cid
- Sebastian Geerken
- Stephen Lewis