[Dillo-dev] hairy bug in IO fixed!

Oct. 31, 2003

      Hi there!

  I spent the last five or six days trying to find (& fix) a bug
that made dillo hog the CPU under very hard to reproduce
conditions. I know that heavy-duty dillo users have noticed that
on rare occasions dillo starts taking all the CPU, and that it's
almost impossible to reproduce with another dillo...

  Well, the first interesting thing is that the CPU hog is "soft"
(i.e. it doesn't block dillo) because it is caused by a polling
loop. The other surprising fact is that the CPU hog occurs
outside of dillo (!). That is, not in any part of its code. So if
you were to attach GDB to find the bug (once it is hogging), you'd
always be pointed outside dillo's source, but with a clue: polling.
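
  For readers who want to see what a "soft" polling hog can look
like, here is a minimal sketch. It is only an illustration under
my own assumptions, not the actual condition inside dillo or glib:
the function name count_event_wakeups and the use of a pipe with
its write end closed are made up for the example. The point is
that poll() keeps reporting the same never-cleared condition, so a
loop around it spins instead of sleeping, while the process stays
responsive.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: calls poll() `rounds` times on a pipe whose
 * writer is gone and counts the calls that returned with an event
 * instead of sleeping until the 1000 ms timeout. */
int count_event_wakeups(int rounds)
{
    int fds[2];
    int wakeups = 0;

    if (pipe(fds) != 0)
        return -1;
    close(fds[1]);  /* no writer left: the read end reports hang-up */

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    for (int i = 0; i < rounds; i++) {
        /* Nothing ever clears the condition, so poll() does not sleep. */
        if (poll(&pfd, 1, 1000) > 0 && (pfd.revents & (POLLHUP | POLLIN)))
            wakeups++;
    }
    close(fds[0]);
    return wakeups;
}
```

  Calling count_event_wakeups(1000) should return almost at once
even though every call was allowed to wait a full second, which is
the kind of busy loop a debugger would show sitting in the polling
code rather than in the application.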

  I was working with the ftp dpi when I finally came across a
way to reproduce the bug, so I took the chance and started
digging. It took a whole revision/tweaking of the IO engine's
close/abort handling, plus a review of glib's sources (gio
channels), to finally find a way to stop the polling loop.

  Short story: several parts were involved: threads, the IO,
external processes, several open connections in parallel,
glib (and the kernel).

  It turned out to be a race condition in the window between the
actual close of the file descriptor, the removal of the glib event
source for the watch, and the kernel's reuse of the FD number.
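
  The "kernel reuse of FD number" part is easy to observe on its
own. The sketch below is an illustration only (the function name
fd_number_is_reused and the use of /dev/null are my assumptions):
POSIX open() returns the lowest descriptor number that is free, so
in a single-threaded program a number that was just closed is
handed out again by the very next open().

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens a descriptor, closes it, opens another
 * one, and reports whether the same number came back.
 * Returns 1 if reused, 0 if not, -1 on error. */
int fd_number_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    close(first);               /* the number is free again */

    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;
    int reused = (second == first);
    close(second);
    return reused;
}
```

  With several connections opening and closing in parallel, this
is what lets a watch that is still registered for the old number
end up pointing at a different, unrelated connection.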

  The source code for gio channels (inside glib) hinted at a way
to remove the event source cleanly, but outside the gio channel's
explicit API.
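
  To make the ordering issue concrete, here is a toy model. It is
a sketch under stated assumptions and not the code in IO.[ch]: the
table `watched` and the helpers watch_add, watch_remove and
stale_watch_after_close are hypothetical stand-ins for an event
loop's watch list, and none of it is glib's API. In real glib code
the counterpart of watch_remove might be g_source_remove() on the
id returned by g_io_add_watch(), but that is my guess and not
something stated above.

```c
#include <fcntl.h>
#include <unistd.h>

#define MAX_WATCHES 64

/* Hypothetical toy watch list, NOT glib: watched[n] != 0 means the
 * event loop still polls descriptor number n. */
static int watched[MAX_WATCHES];

static int watch_add(int fd)
{
    if (fd < 0 || fd >= MAX_WATCHES)
        return -1;
    watched[fd] = 1;
    return 0;
}

static void watch_remove(int fd)
{
    if (fd >= 0 && fd < MAX_WATCHES)
        watched[fd] = 0;
}

/* Opens and watches a descriptor, tears it down in the given order,
 * then opens an unrelated descriptor. Returns 1 if that new
 * descriptor is already "watched" by a stale entry, 0 if not,
 * -1 on error. */
int stale_watch_after_close(int remove_watch_first)
{
    int conn = open("/dev/null", O_RDONLY);
    if (conn < 0)
        return -1;
    if (watch_add(conn) != 0) {
        close(conn);
        return -1;
    }

    if (remove_watch_first)
        watch_remove(conn);   /* clean order: unregister, then close */
    close(conn);              /* the number is now free for reuse */

    int other = open("/dev/null", O_RDONLY);  /* usually gets conn's number */
    if (other < 0) {
        watch_remove(conn);   /* keep the toy table clean on error */
        return -1;
    }
    int stale = (other < MAX_WATCHES) ? watched[other] : 0;

    watch_remove(other);      /* leave the toy table clean */
    close(other);
    return stale;
}
```

  In a single-threaded run, stale_watch_after_close(0) models the
racy order and finds the new descriptor already watched, while
stale_watch_after_close(1) removes the watch first and finds
nothing stale.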

  The details are in IO.[ch] in CVS!

  Cheers
  Jorge.-

PS: Indan wrote:
> ...
> 3) There is no efficient way of (un)registering file descriptor events in
> GTK. (FLTK has a perfect interface).

  Yes there is! (subtle but effective)

> ...
> Did anyone found the bug yet, without reading the diff? If not, then it's
> as harmless as I thought ;-). Still not nice of course, but it's unclear
> where it goes wrong exactly, doesn't seem to be easily fixed without
> changing Dillo a lot.

  I hope this patch and these explanations help you with point number 3.

  More comments later...

  Cheers 2
  Jorge.-

Jorge Arellano Cid