Dpi_connect_socket function from src/IO/dpi.c (also duplicated in test/cookies.c) has retry argument. It is set to TRUE on first call, and Dpi_connect_socket calls itself recursively with retry == FALSE in case of error. This retry function is not working. If connection fails, at line 645 sock_fd is set, but ret is still set to -1. Even if the second attempt to connect is successful, -1 is returned and conected sock_fd is lost. It is possible to fix it by adding "ret = sock_fd", but I think it is better to just remove retries completely. It is unlikely that connecting using loopback interface will fail on first try and succeed on retry. Also, comment says "the server closes sock_fd on error". However, sock_fd should still be closed by client to free allocated file descriptor. OS don't know if Dillo is going to read from it (in which case error such as ECONNRESET should happen) so file descriptor can't be reused. File descriptor should be closed in case of error in a_Dpip_build_cmd too. Patch to fix this function and its calls in src/IO/dpi.c and test/cookies.c is attached.
Hi, Finally I had the time to review this issue in more detail... On Sun, Dec 09, 2012 at 02:58:43PM +0400, p37sitdu at lavabit.com wrote:
Dpi_connect_socket function from src/IO/dpi.c (also duplicated in test/cookies.c) has retry argument. It is set to TRUE on first call, and Dpi_connect_socket calls itself recursively with retry == FALSE in case of error.
This retry function is not working. If connection fails, at line 645 sock_fd is set, but ret is still set to -1. Even if the second attempt to connect is successful, -1 is returned and conected sock_fd is lost.
It is possible to fix it by adding "ret = sock_fd", but I think it is better to just remove retries completely. It is unlikely that connecting using loopback interface will fail on first try and succeed on retry.
Yes, this code is obsolete now. It was meant for Unix domain sockets (when dpis/dpid used them). Sometimes the filename lingered in the filesystem after a dpi or server crash. With unlink&retry it was possible to recover then. Details in patchset: 7faa2c7a544f (Nov, 2009 ;) Removed in the repo (see below).
Also, comment says "the server closes sock_fd on error". However, sock_fd should still be closed by client to free allocated file descriptor. OS don't know if Dillo is going to read from it (in which case error such as ECONNRESET should happen) so file descriptor can't be reused.
File descriptor should be closed in case of error in a_Dpip_build_cmd too.
The file descriptor is properly closed by its CCC afterwards (Visible by defining VERBOSE in chain.c). The CCC manages the error propagation/handling, in both the requesting and answer branches.
Patch to fix this function and its calls in src/IO/dpi.c and test/cookies.c is attached.
No need to fix, it works properly. -- Cheers Jorge.-
On Sun, Dec 16, 2012 at 07:43:19PM -0300, Jorge Arellano Cid wrote:
On Sun, Dec 09, 2012 at 02:58:43PM +0400, p37sitdu at lavabit.com wrote:
Also, comment says "the server closes sock_fd on error". However, sock_fd should still be closed by client to free allocated file descriptor. OS don't know if Dillo is going to read from it (in which case error such as ECONNRESET should happen) so file descriptor can't be reused.
File descriptor should be closed in case of error in a_Dpip_build_cmd too.
The file descriptor is properly closed by its CCC afterwards (Visible by defining VERBOSE in chain.c).
The CCC manages the error propagation/handling, in both the requesting and answer branches.
File descriptor is properly closed if there were no errors. But if there is an error in a_Dpip_build_cmd or Dpi_blocking_write, sock_fd should be closed before returning from Dpi_connect_socket. Dpi_connect_socket calls a_Dpip_build_cmd to build command and Dpi_blocking_write to write this command to sock_fd. a_Dpip_build_cmd does not accept sock_fd as its argument so it can't close it. Dpi_blocking_write only writes to fd, but does not close it on error. So if one of these two function fails, Dpi_connect_socket should close sock_fd before returning -1. In the current state it just returns -1 and leave sock_fd open. By the time Dpi_connect_socket returns, sock_fd is lost and will not be closed by any code outside Dpi_connect_socket. Attached patch adds code to close sock_fd in case of error.
Hi, On Thu, Dec 20, 2012 at 06:49:49PM +0400, p37sitdu at lavabit.com wrote:
On Sun, Dec 16, 2012 at 07:43:19PM -0300, Jorge Arellano Cid wrote:
On Sun, Dec 09, 2012 at 02:58:43PM +0400, p37sitdu at lavabit.com wrote:
Also, comment says "the server closes sock_fd on error". However, sock_fd should still be closed by client to free allocated file descriptor. OS don't know if Dillo is going to read from it (in which case error such as ECONNRESET should happen) so file descriptor can't be reused.
File descriptor should be closed in case of error in a_Dpip_build_cmd too.
The file descriptor is properly closed by its CCC afterwards (Visible by defining VERBOSE in chain.c).
The CCC manages the error propagation/handling, in both the requesting and answer branches.
File descriptor is properly closed if there were no errors. But if there is an error in a_Dpip_build_cmd or Dpi_blocking_write, sock_fd should be closed before returning from Dpi_connect_socket.
Dpi_connect_socket calls a_Dpip_build_cmd to build command and Dpi_blocking_write to write this command to sock_fd. a_Dpip_build_cmd does not accept sock_fd as its argument so it can't close it. Dpi_blocking_write only writes to fd, but does not close it on error.
So if one of these two function fails, Dpi_connect_socket should close sock_fd before returning -1. In the current state it just returns -1 and leave sock_fd open.
By the time Dpi_connect_socket returns, sock_fd is lost and will not be closed by any code outside Dpi_connect_socket.
Attached patch adds code to close sock_fd in case of error.
As from the patch you sent (same as before), and the explanation you give (ignoring my explanation), I guess you're making conclusions out of reading the code in Dpi_connect_socket() *only*. Have you checked what you state as fact? (i.e. that the FD is leaked and not closed by the CCC). FWIW, I did check before writing my previous answer. Of course I'm human and do make mistakes, so if this is the case: can you please verify and show how to reproduce the bug? -- Cheers Jorge.-
On Thu, Dec 27, 2012 at 04:37:36PM -0300, Jorge Arellano Cid wrote:
As from the patch you sent (same as before), and the explanation you give (ignoring my explanation), I guess you're making conclusions out of reading the code in Dpi_connect_socket() *only*.
I made conclusions from reading Dpi_connect_socket() only, because in the case of error in Dpi_blocking_write file descriptor is left open and not passed anywhere else. If it was passed somewhere else, I would read the code there and check if file descriptor was closed there.
Have you checked what you state as fact? (i.e. that the FD is leaked and not closed by the CCC). FWIW, I did check before writing my previous answer.
Patch that I did for checking is attached. It replaces Dpi_blocking_write call with -1 to simulate fault in Dpi_blocking_write for Dpi_connect_socket(). Introducing the error in Dpi_blocking_write for all functions doesn't work, in this case Dpi_connect_socket is not called at all (because Dpi_start_dpid fails). With this patch applied, I recompiled dillo and pressed "bookmarks" button several times to trigger dpi code. Each time I pressed "bookmarks" button, new descriptor was leaked: % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 7 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 7 8 Then I applied the patch I sent before (with hg qpush) on top of the fault patch and rebuilt dillo: "make clean && make". I ran dillo again with ./src/dillo and did the same test with bookmarks button: % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5
Hi, On Sat, Dec 29, 2012 at 05:44:22AM +0400, p37sitdu at lavabit.com wrote:
On Thu, Dec 27, 2012 at 04:37:36PM -0300, Jorge Arellano Cid wrote:
As from the patch you sent (same as before), and the explanation you give (ignoring my explanation), I guess you're making conclusions out of reading the code in Dpi_connect_socket() *only*.
I made conclusions from reading Dpi_connect_socket() only, because in the case of error in Dpi_blocking_write file descriptor is left open and not passed anywhere else. If it was passed somewhere else, I would read the code there and check if file descriptor was closed there.
Have you checked what you state as fact? (i.e. that the FD is leaked and not closed by the CCC). FWIW, I did check before writing my previous answer.
Patch that I did for checking is attached. It replaces Dpi_blocking_write call with -1 to simulate fault in Dpi_blocking_write for Dpi_connect_socket().
Introducing the error in Dpi_blocking_write for all functions doesn't work, in this case Dpi_connect_socket is not called at all (because Dpi_start_dpid fails).
With this patch applied, I recompiled dillo and pressed "bookmarks" button several times to trigger dpi code. Each time I pressed "bookmarks" button, new descriptor was leaked:
% ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 7 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 7 8
Then I applied the patch I sent before (with hg qpush) on top of the fault patch and rebuilt dillo: "make clean && make". I ran dillo again with ./src/dillo and did the same test with bookmarks button:
% ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5
diff -r 8e10859b2aab src/IO/dpi.c --- a/src/IO/dpi.c +++ b/src/IO/dpi.c @@ -641,7 +641,7 @@ /* send authentication Key (the server closes sock_fd on error) */ } else if (!(cmd = a_Dpip_build_cmd("cmd=%s msg=%s", "auth", SharedKey))) { MSG_ERR("[Dpi_connect_socket] Can't make auth message.\n"); - } else if (Dpi_blocking_write(sock_fd, cmd, strlen(cmd)) == -1) { + } else if (-1 == -1) { MSG_ERR("[Dpi_connect_socket] Can't send auth message.\n"); } else { ret = sock_fd;
Ah, OK, now I see what you mean. There's a semantic ambiguity, the comment: /* send authentication Key (the server closes sock_fd on error) */ means: "if there's an error with the authentication" (I know because I wrote the comment). (and as the patch removed this comment) I made the same test but with: - } else if (!(cmd = a_Dpip_build_cmd("cmd=%s msg=%s", "auth", SharedKey))) { + } else if (!(cmd = a_Dpip_build_cmd("cmd=%s msg=%s","auth",SharedKey+1))) { and the CCC closes the file descriptor as expected. The error case you noticed exists, and although highly unlikely to happen, requires some thought. As from the test above, the socket is connected, then nothing is written to simulate an error, but the OS never sees an error, as it doesn't exist, and doesn't notify either end (IOW, it leaves the socket passively expecting for data to come). Now, the canonical way to handle the whole process should have been to build the CCC upon connect(), then try to authenticate, and let the CCC handle any errors that may come, but this is probably too much code for little gain, so it was all coded in Dpi_connect_socket(). The problem with closing FDs outside the CCC is that it can introduce race conditions, but in this case the CCC is partially built, so your patch suggestion may be OK. I'll have to give it some more thought and tests... -- Cheers Jorge.-
On Sun, Dec 30, 2012 at 07:11:28PM -0300, Jorge Arellano Cid wrote:
Hi,
On Sat, Dec 29, 2012 at 05:44:22AM +0400, p37sitdu at lavabit.com wrote:
On Thu, Dec 27, 2012 at 04:37:36PM -0300, Jorge Arellano Cid wrote:
As from the patch you sent (same as before), and the explanation you give (ignoring my explanation), I guess you're making conclusions out of reading the code in Dpi_connect_socket() *only*.
I made conclusions from reading Dpi_connect_socket() only, because in the case of error in Dpi_blocking_write file descriptor is left open and not passed anywhere else. If it was passed somewhere else, I would read the code there and check if file descriptor was closed there.
Have you checked what you state as fact? (i.e. that the FD is leaked and not closed by the CCC). FWIW, I did check before writing my previous answer.
Patch that I did for checking is attached. It replaces Dpi_blocking_write call with -1 to simulate fault in Dpi_blocking_write for Dpi_connect_socket().
Introducing the error in Dpi_blocking_write for all functions doesn't work, in this case Dpi_connect_socket is not called at all (because Dpi_start_dpid fails).
With this patch applied, I recompiled dillo and pressed "bookmarks" button several times to trigger dpi code. Each time I pressed "bookmarks" button, new descriptor was leaked:
% ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 7 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 6 7 8
Then I applied the patch I sent before (with hg qpush) on top of the fault patch and rebuilt dillo: "make clean && make". I ran dillo again with ./src/dillo and did the same test with bookmarks button:
% ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5 % ls /proc/$(pidof dillo)/fd 0 1 2 3 4 5
diff -r 8e10859b2aab src/IO/dpi.c --- a/src/IO/dpi.c +++ b/src/IO/dpi.c @@ -641,7 +641,7 @@ /* send authentication Key (the server closes sock_fd on error) */ } else if (!(cmd = a_Dpip_build_cmd("cmd=%s msg=%s", "auth", SharedKey))) { MSG_ERR("[Dpi_connect_socket] Can't make auth message.\n"); - } else if (Dpi_blocking_write(sock_fd, cmd, strlen(cmd)) == -1) { + } else if (-1 == -1) { MSG_ERR("[Dpi_connect_socket] Can't send auth message.\n"); } else { ret = sock_fd;
Ah, OK, now I see what you mean.
There's a semantic ambiguity, the comment:
/* send authentication Key (the server closes sock_fd on error) */
means: "if there's an error with the authentication" (I know because I wrote the comment).
(and as the patch removed this comment) I made the same test but with:
- } else if (!(cmd = a_Dpip_build_cmd("cmd=%s msg=%s", "auth", SharedKey))) { + } else if (!(cmd = a_Dpip_build_cmd("cmd=%s msg=%s","auth",SharedKey+1))) {
and the CCC closes the file descriptor as expected.
The error case you noticed exists, and although highly unlikely to happen, requires some thought. As from the test above, the socket is connected, then nothing is written to simulate an error, but the OS never sees an error, as it doesn't exist, and doesn't notify either end (IOW, it leaves the socket passively expecting for data to come).
Now, the canonical way to handle the whole process should have been to build the CCC upon connect(), then try to authenticate, and let the CCC handle any errors that may come, but this is probably too much code for little gain, so it was all coded in Dpi_connect_socket().
The problem with closing FDs outside the CCC is that it can introduce race conditions, but in this case the CCC is partially built, so your patch suggestion may be OK.
I'll have to give it some more thought and tests...
OK, just committed a slightly modified and bigger patch that also corrects the wrong error error message given (CCC side). Please check how it works, and send feedback. -- Cheers Jorge.-
participants (2)
-
jcid@dillo.org
-
p37sitdu@lavabit.com