Segfault on sites requiring authentication
Whenever I visit http://puppyisos.org/isos (puppy / linux), Dillo segfaults. The reason is that in some cases the variable Client_bw is NULL. However, some functions depend on it and access its member variables. Client_bw is initialized in Cache_process_queue() (src/cache.c). When authentication is required, this variable is passed over to Cache_auth_entry() which sets a timeout. Its callback (Cache_auth_callback()) opens up a dialog asking for the credentials and then calls a_Nav_reload() with the Client_bw as its first parameter. Finally in a_Nav_cancel_expect() Client_bw's member variable nav_expecting is used and as bw is NULL, Dillo segfaults. I came up with a small patch that removes Cache_auth_callback() and Cache_auth_entry() as I have not seen any advantages in delaying the authentication popup. The next thing I did was to show the dialog on condition that Client_bw is not NULL and that we have finished the transfer (At least I hope CA_Close is the proper flag for doing so). There were various unnecessary checks in src/nav.c. They compared bw with NULL. In my opinion the affected functions (a_Nav_expect_done, a_Nav_push and a_Nav_repush) should never be called if you cannot be sure that bw is not NUL. What is your view on this subject? Unfortunately, my approach does not solve the original problem that caused Client_bw being NULL. Furthermore, the patch introduces a new bug: The dialog pops up when reloading the site but after having entered the correct credentials or having pressed the ``Cancel'' button, it appears again. Any ideas? While digging deeper into the code, I noticed that we make heavy use of timers for popups and reloading pages? Is there any special reason for it? --Tim
Tim wrote:
Whenever I visit http://puppyisos.org/isos (puppy / linux), Dillo segfaults. The reason is that in some cases the variable Client_bw is NULL. However, some functions depend on it and access its member variables.
Client_bw is initialized in Cache_process_queue() (src/cache.c). When authentication is required, this variable is passed over to Cache_auth_entry() which sets a timeout. Its callback (Cache_auth_callback()) opens up a dialog asking for the credentials and then calls a_Nav_reload() with the Client_bw as its first parameter. Finally in a_Nav_cancel_expect() Client_bw's member variable nav_expecting is used and as bw is NULL, Dillo segfaults.
I'm not seeing this crash. Is it going through the for loop and getting set to NULL in Client_bw = ClientWeb->bw;, or is it NULL because there weren't any clients and it didn't execute that code?
I came up with a small patch that removes Cache_auth_callback() and Cache_auth_entry() as I have not seen any advantages in delaying the authentication popup. The next thing I did was to show the dialog on condition that Client_bw is not NULL and that we have finished the transfer (At least I hope CA_Close is the proper flag for doing so).
There were various unnecessary checks in src/nav.c. They compared bw with NULL. In my opinion the affected functions (a_Nav_expect_done, a_Nav_push and a_Nav_repush) should never be called if you cannot be sure that bw is not NUL. What is your view on this subject?
Unfortunately, my approach does not solve the original problem that caused Client_bw being NULL. Furthermore, the patch introduces a new bug: The dialog pops up when reloading the site but after having entered the correct credentials or having pressed the ``Cancel'' button, it appears again. Any ideas?
While digging deeper into the code, I noticed that we make heavy use of timers for popups and reloading pages? Is there any special reason for it?
If the calling code needs to go back to whatever it was doing rather than waiting around for the user to interact with a popup, a timer is used.
I'm not seeing this crash. Is it going through the for loop and getting set to NULL in Client_bw = ClientWeb->bw;, or is it NULL because there weren't any clients and it didn't execute that code?
Yes, it is going through the loop. Probably comparing Client->Url with entry->Url fails. As a result, ClientWeb and Client_bw do not get set. Have you also tried reloading the page? Sometimes it does not get loaded properly here. Anyway, it is odd that Dillo does not crash for you. Are you using the latest trunk? --Tim
Tim wrote:
I'm not seeing this crash. Is it going through the for loop and getting set to NULL in Client_bw = ClientWeb->bw;, or is it NULL because there weren't any clients and it didn't execute that code?
Yes, it is going through the loop. Probably comparing Client->Url with entry->Url fails. As a result, ClientWeb and Client_bw do not get set.
It would be good to find out what it's doing exactly for you. - I'd expect a client to be waiting for it in this case. - I think a client shouldn't ever have a NULL bw. - The case where there are no clients waiting for the URL does sound like a bug, although I think I'd be tempted just not to call Cache_auth_entry() if the bw was NULL.
Have you also tried reloading the page? Sometimes it does not get loaded properly here. Anyway, it is odd that Dillo does not crash for you. Are you using the latest trunk?
Reloading doesn't crash it for me. Yes, I'm using the latest code. You mentioned the other day that you get a lot of crashes in general, and I was surprised because I've had a total of three random crashes since 2.0 was released.
It would be good to find out what it's doing exactly for you. - I'd expect a client to be waiting for it in this case.
The client does not exist anymore because it has been destroyed too early. That is why I removed the timer in the previous patch. Debugging asynchronous code is difficult. I compiled Dillo with debugging symbols and added some printf()'s to see what is going on in detail. Afterwards I ran the binary through gdb. When visiting http://puppyisos.org/isos, everything looks alright in the beginning. The dialog pops up as expected. It is curious that we loop two times over ClientQueue though. Probably because the function is called every x received bytes, right? [New Thread 0xf760bb90 (LWP 14474)] [Thread 0xf760bb90 (LWP 14474) exited] Looping over ClientQueue... Looking for http://puppyisos.org/isos... Found (id=0). Looping over ClientQueue... Looking for http://puppyisos.org/isos... Found (id=0). Removing client (id=0) It seems that after confirming the authentication dialog, the page is being reloaded but without the header that is necessary for the HTTP authentication... Reloading page with authentication credentials... Looping over ClientQueue... Looking for http://puppyisos.org/isos... Found (id=0). Looping over ClientQueue... Looking for http://puppyisos.org/isos... Found (id=0). Removing client (id=0) At least this would explain, why the client for http://puppyisos.org/isos was deleted. Now the page is reloaded again because suddenly there is a client for http://puppyisos.org/isos/ again (with id=0): Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Found (id=0). Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Found (id=0). Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Found (id=0). Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Found (id=0). Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Found (id=0). Removing client (id=0) Looping over ClientQueue... Looking for http://puppyisos.org/miniicons/kfm_home.png... Looking for http://puppyisos.org/miniicons/ark.png... Found (id=1). HTTP warning: unhandled MIME type: "text/html; charset=iso-8859-1" Looping over ClientQueue... Looking for http://puppyisos.org/miniicons/kfm_home.png... Looking for http://puppyisos.org/miniicons/ark.png... Found (id=1). Removing client (id=1) We should add support for charsets within MIME types anytime. As far as I know, Dillo only depends on meta-tags when deciding which encoding to go with. This notation is officially approved, see: http://www.w3.org/International/tutorials/tutorial-char-enc/#Slide0270 Reloading page with authentication credentials... Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Not found! Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Not found! Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Not found! Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Not found! Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Not found! Looping over ClientQueue... Looking for http://puppyisos.org/isos/... Not found! Reloading page with authentication credentials... Why was the site reloaded again? Interestingly, no new client has been created for /isos this time. I am a bit astonished how often Cache_auth_callback() is called even though the dialog only appears once. Everything looks a bit asynchronous. An evidence might be the following lines: Looking for http://puppyisos.org/miniicons/kfm_home.png... Looking for http://puppyisos.org/miniicons/ark.png... In general, there should be a "Found (id=...)." or at least a "Not found!'" but they just follow after each other. And now it crashes: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0xf79bf6d0 (LWP 14340)] 0x0805ba14 in a_Nav_cancel_expect (bw=0x0) at nav.c:250 250 if (bw->nav_expecting) { Here is the back trace: (gdb) bt #0 0x0805ba14 in a_Nav_cancel_expect (bw=0x0) at nav.c:250 #1 0x0805c098 in a_Nav_reload (bw=0x0) at nav.c:474 #2 0x0805dd4e in Cache_auth_callback (vdata=0x8252208) at cache.c:971 #3 0xf7f23e75 in fltk::wait () from /usr/lib/libfltk2.so #4 0xf7f23f88 in fltk::run () from /usr/lib/libfltk2.so #5 0x0804f686 in main (argc=1, argv=0xffdf7b74) at dillo.cc:333
- I think a client shouldn't ever have a NULL bw.
I agree.
- The case where there are no clients waiting for the URL does sound like a bug, although I think I'd be tempted just not to call Cache_auth_entry() if the bw was NULL.
Unfortunately, that did not really help. Well, Dillo does not crash anymore but no images on this page are shown and it reloads several times before anything gets displayed.
Reloading doesn't crash it for me. Yes, I'm using the latest code.
In contrast to /isos, the main page (http://puppyisos.org/) is working alright here. No crashes, all images are displayed properly. There are some other directory listings on that site which are working, too even though there syntax is quite similar to /isos'.
You mentioned the other day that you get a lot of crashes in general, and I was surprised because I've had a total of three random crashes since 2.0 was released.
The situation has improved since I am browsing with the latest trunk. Nevertheless, I still experience non-reproducible crashes sometimes: It happens frequently that dpid ``crashes'' (100% CPU consumption). This happens mostly when domain names cannot be resolved correctly. Trying to reload the page does not solve it. I have to kill the process manually. PS: Having applied the patch, the iterations of the ClientQueue loop will now stop after the URL has matched. Previously, the iteration for the same entry item was then repeated again (if AbortEntry is true). Is there a reason for keeping it as-is? --Tim
Tim wrote:
It is curious that we loop two times over ClientQueue though. Probably because the function is called every x received bytes, right?
a_Cache_process_dbuf calls Cache_process_queue once for each packet and once when it's done (IOClose).
Looking for http://puppyisos.org/miniicons/kfm_home.png... Looking for http://puppyisos.org/miniicons/ark.png... Found (id=1). HTTP warning: unhandled MIME type: "text/html; charset=iso-8859-1" Looping over ClientQueue... Looking for http://puppyisos.org/miniicons/kfm_home.png... Looking for http://puppyisos.org/miniicons/ark.png... Found (id=1). Removing client (id=1)
We should add support for charsets within MIME types anytime. As far as I know, Dillo only depends on meta-tags when deciding which encoding to go with. This notation is officially approved, see: http://www.w3.org/International/tutorials/tutorial-char-enc/#Slide0270
Charsets in MIME types are supported. It's probably complaining because it expected an image/png and got the text/html; charset=iso-8859-1 401 Auth page which can't be displayed as an image.
PS: Having applied the patch, the iterations of the ClientQueue loop will now stop after the URL has matched. Previously, the iteration for the same entry item was then repeated again (if AbortEntry is true). Is there a reason for keeping it as-is?
In case multiple clients were waiting for this data. Now I _am_ getting the crash. Maybe I accidentally typed in the main page instead. Don't think I did, but...whatever...anyway... in Cache_auth_callback(), a_Nav_reload(data->bw) does not sound at all like what we want for images on a page. We'd want something more along the lines of Cache_entry_remove() on the image and a_Nav_repush() on the page, but I think in either case that would remove clients for other images while we're trying to get their data, causing...what we're seeing. (Is dillo turning into a snarled knot?) If the last image comes in while there's no client, I wouldn't know what bw to repush.
participants (2)
-
corvid@lavabit.com
-
tim.nieradzik@gmx.de