Hi,

Is anyone here savvy about using dynamic memory allocation with pthreads?

The problem: while hunting a bug that showed up after the large parser orthogonalization patch, it was finally found between dillo and the file dpi server; it was a critical race. I recoded the file server to use pthreads and it went away, but...

... the new file server started to increase its virtual size dramatically.

The threads are allocating memory and freeing it with the usual g_malloc/g_free pair, but it seems these blocks remain allocated even after the thread has finished.

This may be the same root cause as the reload memory leak pointed out some time ago, do you remember? The IO engine uses a pthread to make the request.

It seems there's an API for "thread specific data" (TSD), using pthread_key_create() and related functions. Does anyone know whether this is the only way, or have a URL for a good tutorial on how to do this?

Note that the file server problem could be solved by using a dpi filter instead of a server, but as pthreads are used in other parts of Dillo, this is very important to know.

-- Cheers Jorge.-
Hi Jorge, Jorge Arellano Cid writes:
Hi,
Is anyone here savvy about using dynamic memory allocation with pthreads?

The problem: while hunting a bug that showed up after the large parser orthogonalization patch, it was finally found between dillo and the file dpi server; it was a critical race. I recoded the file server to use pthreads and it went away, but...

... the new file server started to increase its virtual size dramatically.

The threads are allocating memory and freeing it with the usual g_malloc/g_free pair, but it seems these blocks remain allocated even after the thread has finished.
I don't have time to really look at this, but are you using g_thread_init(), gdk_threads_init(), gdk_threads_enter(), gdk_threads_leave() and company? Here is some info:

http://www.gtk.org/faq/#AEN482

I wouldn't be surprised if GLib gets confused without a call to g_thread_init(). Specifically, if I remember correctly, the locking of GLib's internal variables is only actually done once g_thread_init() has been executed. This _might_ explain your problem... but it might be a long shot, since a race condition is needed to trigger the possible problem.

If you're sure, however, that g_malloc and g_free are the source of your memory leak, then the problem is something else (in GLib 1.2, at least, g_malloc and g_free map almost directly to malloc() and free(), which are thread-safe). The only big caveat with malloc/free is that they are not signal-safe, i.e., you shouldn't use them (along with a horde of other things in libc, system calls, etc.) in signal handlers.
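(As a rough illustration of the initialization sequence those functions imply, here is a minimal sketch of the GTK-era pattern described in the FAQ above; it is illustrative only and not taken from Dillo's sources:)

   #include <gtk/gtk.h>

   int main(int argc, char *argv[])
   {
      g_thread_init(NULL);     /* must run before any other GLib/GTK call */
      gdk_threads_init();      /* set up the global GDK lock              */
      gtk_init(&argc, &argv);

      /* ... build widgets, start worker pthreads ... */

      gdk_threads_enter();     /* the GDK lock must be held around gtk_main() */
      gtk_main();
      gdk_threads_leave();
      return 0;
   }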
This may be the same root cause as the reload memory leak pointed out some time ago, do you remember? The IO engine uses a pthread to make the request.

It seems there's an API for "thread specific data" (TSD), using pthread_key_create() and related functions. Does anyone know whether this is the only way, or have a URL for a good tutorial on how to do this?
Thread specific data is a solution, but I'd recommend avoiding it as much as possible. There should be a way to figure out what is going on with your memory leaks... Hope this helps, -- Livio B. Soares
Livio, Lars, On Tue, Nov 16, 2004 at 06:05:17PM -0200, Livio Baldini Soares wrote:
Hi Jorge,
Hi Livio! Good to know you're still there.
I don't have time to really look at this, but are you using g_thread_init(), gdk_threads_init(), gdk_threads_enter(), gdk_threads_leave() and company? Here is some info:
http://www.gtk.org/faq/#AEN482
I wouldn't be surprised if GLib gets confused without a call to g_thread_init(). Specifically, if I remember correctly, the locking of GLib's internal variables is only actually done once g_thread_init() has been executed.

This _might_ explain your problem... but it might be a long shot, since a race condition is needed to trigger the possible problem. If you're sure, however, that g_malloc and g_free are the source of your memory leak, then the problem is something else (in GLib 1.2, at least, g_malloc and g_free map almost directly to malloc() and free(), which are thread-safe).

The only big caveat with malloc/free is that they are not signal-safe, i.e., you shouldn't use them (along with a horde of other things in libc, system calls, etc.) in signal handlers.
I decided to cut glib out of the problem, then developed a test program and did some research. *Very* interesting results came out of it.
[Lars wrote:] If the threads are allocating from a public heap, it might not be possible to reclaim/remap the heap, so it will look as if a lot of memory is allocated. There is a similar 'bug' in the C++ standard that means that memory allocated by objects might never be reclaimed under some circumstances.

Yes, it looks like this kind of problem.

[Lars wrote:] I DO NOT know if this is related to the problem, but it sounds a bit similar (I suppose you have tried running the whole lot under valgrind?)
I cut glib out and made a test program (attached) that matches mallocs and frees trivially. The results were amazing.

Compile the program with:

   gcc pth_mem.c -o pth_mem -lpthread

run it like this:

   ./pth_mem <j|d|cd>

and observe the memory footprint from another terminal with:

   while [ 1 ]; do ps aux | grep [p]th_mem; sleep 4; done

The program makes 8 iterations, launching 10 pthreads in each one. Each launched thread allocates memory, sleeps and frees it 10 times using malloc/free, then it exits. The interesting part is that:

   ./pth_mem d   ('d' for detached, with pthread_detach())                      leaks tons of memory!!!
   ./pth_mem cd  ('cd' for create detached, with pthread_attr_setdetachstate)   leaks much less memory!
   ./pth_mem j   ('j' for joinable)                                             is almost perfect, but sometimes leaks.

The first two should behave the same; moreover, the manual says:

<q>
`pthread_detach' puts the thread TH in the detached state. This guarantees that the memory resources consumed by TH will be freed immediately when TH terminates. However, this prevents other threads from synchronizing on the termination of TH using `pthread_join'.
</q>

Unless I'm missing a key point, this looks like a bug in pthreads. Would you mind checking it out and pointing out my mistake, or giving me a pointer to the pthreads library maintainer?

-- Cheers Jorge.-
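(The pth_mem.c attachment is not reproduced here; the following is a minimal sketch of what it plausibly looks like, reconstructed from the description above and from the patch quoted later in the thread. The sleep interval and variable names are assumptions.)

   /* pth_mem.c -- sketch reconstructed from the thread; not the original.
    * Build: gcc pth_mem.c -o pth_mem -lpthread
    * Run:   ./pth_mem <j|d|cd>
    */
   #include <pthread.h>
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <unistd.h>

   int mode;                        /* 'j', 'd', or 'c' (for "cd") */

   static void *thr_function(void *data)
   {
      int i;
      pthread_t *p_thrID = data;    /* points into main()'s thrID[] array */
      void *mem;

      if (mode == 'd')
         pthread_detach(*p_thrID);  /* the line later identified as buggy:
                                       thrID[i] may not be set yet */

      for (i = 0; i < 10; ++i) {    /* allocate, sleep, free -- 10 times */
         mem = malloc(1024*1024);
         usleep(50000);
         free(mem);
      }
      return NULL;
   }

   int main(int argc, char **argv)
   {
      pthread_t thrID[10];
      pthread_attr_t attr;
      int i, j;

      if (argc != 2) {
         fprintf(stderr, "usage: %s <j|d|cd>\n", argv[0]);
         return 1;
      }
      mode = (strcmp(argv[1], "cd") == 0) ? 'c' : argv[1][0];

      pthread_attr_init(&attr);
      pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

      for (j = 0; j < 8; ++j) {               /* 8 iterations...       */
         for (i = 0; i < 10; ++i)             /* ...of 10 threads each */
            pthread_create(&thrID[i], (mode == 'c') ? &attr : NULL,
                           thr_function, &thrID[i]);
         if (mode == 'j')
            for (i = 0; i < 10; ++i)
               pthread_join(thrID[i], NULL);
         else
            sleep(2);                         /* let detached threads finish */
      }
      pthread_attr_destroy(&attr);
      return 0;
   }

In this sketch, 'd' detaches from inside the thread, 'cd' creates the thread already detached via the attribute, and 'j' joins each thread; the commented line in thr_function is the one the rest of the thread turns out to hinge on.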
Thanks, I'm so glad I actually was in the ballpark :-) ... Usually when I have a good idea of what's happening, it's wrong :-) ... I'll also look at it, since I debugged some stuff like this a long time ago :-) ... (big C++ project -> bugs -> I rewrote it in C and put up some wrappers ... -> worked (well, it looked good anyhow)). I'm really intrigued now! / Lars ...

On Thu, 18 Nov 2004 13:34:46 -0300 Jorge Arellano Cid <jcid@dillo.org> wrote:
Livio, Lars,
...snip...
[Lars wrote:] If the threads are allocating from a public heap, it might not be possible to reclaim/remap the heap, so it will look as if a lot of memory is allocated. There is a similar 'bug' in the C++ standard that means that memory allocated by objects might never be reclaimed under some circumstances.

Yes, it looks like this kind of problem.
...snip...
On Thu, Nov 18, 2004 at 01:34:46PM -0300, Jorge Arellano Cid wrote:
[...] I decided to cut glib out of the problem, then developed a test program and did some research. *Very* interesting results came out of it.
[...]
[Lars wrote:] If the threads are allocating from a public heap, it might not be possible to reclaim/remap the heap, so it will look as if a lot of memory is allocated. There is a similar 'bug' in the C++ standard that means that memory allocated by objects might never be reclaimed under some circumstances.

Yes, it looks like this kind of problem.

[Lars wrote:] I DO NOT know if this is related to the problem, but it sounds a bit similar (I suppose you have tried running the whole lot under valgrind?)
I cut glib out and made a test program (attached) that matches mallocs and frees trivially.
The results were amazing:
compile the program with:
gcc pth_mem.c -o pth_mem -lpthread
run it like this:
./pth_mem <j|d|cd>
and observe the memory footprint from another terminal with:
while [ 1 ]; do ps aux | grep [p]th_mem; sleep 4; done
Has anyone had the opportunity to run the test program under Solaris? This is an interesting test because Solaris has its own threads implementation. I'd expect the detached pthreads test not to leak memory on Solaris.

BTW, the new threads-based dpi server is in CVS now. This fixes a critical race with large files that reference lots of instances of a small image (a bullet, for instance). It worked OK in my tests (up to a 2MB page), uses joinable threads, and didn't show any leaks while testing.

Strange that no one reported this bug before... Well, I guess almost nobody writes large HTML pages these days, as the very big browsers will surely have a hard time with them! :-)

-- Cheers Jorge.-
On Wed, Nov 24, 2004 at 10:46:30AM -0300, Jorge Arellano Cid wrote:
Has anyone had the opportunity to run the test program under Solaris? This is an interesting test because Solaris has its own threads implementation. I'd expect the detached pthreads test not to leak memory on Solaris.
Did it just now. All three ways leak zero memory. -brian -- "Now you know why I got the everliving hell OUT of Windows administration. Knowing it doesn't make it any easier. It's just broken-as-designed."
I just did a test on my Mac, which runs NetBSD 2.0 RC4, and it also does not leak memory in any of the three cases. I know they use the pth library, but I wonder if they do something differently than Linux does? -brian -- "Now you know why I got the everliving hell OUT of Windows administration. Knowing it doesn't make it any easier. It's just broken-as-designed."
Hi Brian, Brian Hechinger writes:
I just did a test on my Mac, which runs NetBSD 2.0 RC4, and it also does not leak memory in any of the three cases.

I know they use the pth library, but I wonder if they do something differently than Linux does?
Did you get a chance to read my bug report messages?

http://lists.auriga.wearlab.de/pipermail/dillo-dev/2004-November/002448.html

and

http://lists.auriga.wearlab.de/pipermail/dillo-dev/2004-November/002450.html

It is possible that different pthread implementations initialize (or set) the threadID (the first argument passed to pthread_create()) in different orders. On Linux, I have seen different systems behave randomly (probably because it issues a clone(), and from then on lets the OS take over the scheduling, which is usually non-deterministic).

Even if you tried successfully on Solaris and *BSD, that does not mean it is a safe construct to rely on. The OS might happen to schedule the threads in a certain order, or the specific version of the libraries might synchronize thread creation in a specific manner, etc. Bottom line, the POSIX threads specification does *not* require the threadID to be set before the new thread begins execution:
Hi,

Well, I searched for where to report bugs in pthreads, and the process is tedious, to say the least. They want the submitter to test against CVS glibc, which is something I have neither the time nor the machine to do.

FWIW, a nearer target is at least to run the test program on the latest glibc (glibc-2.3.3 at this time), but I have 2.3.2. Can someone please test it with glibc-2.3.3?

If the Solaris test shows no leaks, then that's strong evidence of a _serious_ problem within glibc's pthreads, and the bug report will most probably be welcomed. After I have the results from glibc-2.3.3 and Solaris, I'll send the report.

In the meanwhile I've been playing with some workarounds for Dillo, and the result so far is an impressive memory usage reduction (BTW, the reload memory leak also had its roots in pthreads leaking memory).

What an irony: after years of chasing the most minor memory leaks (down to a few bytes), it turns out that we've been leaking a lot, by the megabyte, because of an external library. Notwithstanding, there's still the possibility that I'm wrong in my understanding of the API, but that chance is growing thinner. It could also be the kernel (or ps) not reporting VSZ and RSS correctly. Who knows? At least the workarounds are improving these stats a lot in my tree.

-- Cheers Jorge.-
I REALLY understand chasing the leaks and then finding something like this stings ... / regards, Lars Segerlund.

On Thu, 25 Nov 2004 12:23:22 -0300 Jorge Arellano Cid <jcid@dillo.org> wrote:
Hi,
Well, I searched for where to report bugs in pthreads, and the process is tedious, to say the least.

They want the submitter to test against CVS glibc, which is something I have neither the time nor the machine to do.

FWIW, a nearer target is at least to run the test program on the latest glibc (glibc-2.3.3 at this time), but I have 2.3.2. Can someone please test it with glibc-2.3.3?

If the Solaris test shows no leaks, then that's strong evidence of a _serious_ problem within glibc's pthreads, and the bug report will most probably be welcomed.

After I have the results from glibc-2.3.3 and Solaris, I'll send the report. In the meanwhile I've been playing with some workarounds for Dillo, and the result so far is an impressive memory usage reduction (BTW, the reload memory leak also had its roots in pthreads leaking memory).

What an irony: after years of chasing the most minor memory leaks (down to a few bytes), it turns out that we've been leaking a lot, by the megabyte, because of an external library.

Notwithstanding, there's still the possibility that I'm wrong in my understanding of the API, but that chance is growing thinner.

It could also be the kernel (or ps) not reporting VSZ and RSS correctly. Who knows? At least the workarounds are improving these stats a lot in my tree.
-- Cheers Jorge.-
On Thu, Nov 25, 2004 at 04:48:07PM +0100, Lars Segerlund wrote:
I REALLY understand chasing the leaks and then finding something like this stings ...
Yes, it does. Note that even knowing it was my fault (I assumed wrongly about the API), the irony remains. ...and the soothing side of it is that now we have a Dillo running in less than 1/5 of the memory it used to. -- Cheers Jorge.-
Hi Jorge,

I just got the chance to look at your little test program. It has a few bugs:

1) In pthread_create() you are passing a pointer to the uninitialized thrdID[i] variable. Pthreads does not guarantee that the threadID argument is initialized before the start_routine is called. If you try printing the value of p_thrID in 'thr_function', you'll see that you get some correct and some random values.

2) Still, it's not a good idea to pass variables from the thread stack. In this particular case it will work, since the main thread outlives all the created ones, but it is usually highly likely to cause you headaches...

A minimal fix for your test code is:

--- pth_mem.orig.c      2004-11-25 11:16:01.000000000 -0500
+++ pth_mem.c   2004-11-25 11:33:34.000000000 -0500
@@ -7,11 +7,10 @@ int mode;
 static void *thr_function(void *data)
 {
    int i;
-   pthread_t *p_thrID = data;
    void *mem;
 
    if (mode == 'd')
-      pthread_detach(*p_thrID);
+      pthread_detach(pthread_self());
 
    for (i = 0; i < 10; ++i) {
       mem = malloc(1024*1024);

On my box, this gets rid of any leaks.

best regards,

-- Livio B. Soares
Hi (again), Livio Baldini Soares writes:
Hi Jorge,
I just got the chance to look at your little test program. It has a few bugs:
1) In pthread_create() you are passing a pointer to the uninitialized thrdID[i] variable. Pthreads does not guarantee that the threadID argument is initialized before the start_routine is called. If you try printing the value of p_thrID in 'thr_function', you'll see that you get some correct and some random values.
I just looked, and there are three occurrences of this bug in Dillo:

dpi/file.c
src/IO/IO.c
src/dns.c

All three of them should have the current calls to 'pthread_detach(uninitialized-variable)' changed to use 'pthread_detach(pthread_self())', as in all three cases the variable used has some likelihood (on my system, a high likelihood) of not yet being initialized. In some cases (like struct DnsServer) the threadID variable can optionally be removed altogether.

The final patch is left as an exercise to the reader ;-)

hope this helps,

-- Livio B. Soares
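(A sketch of the change being suggested; the function and variable names are illustrative, not the actual Dillo code:)

   /* Before: the thread detaches itself through a pthread_t that the
    * creating thread may not have written yet. */
   static void *worker(void *data)
   {
      pthread_t *self_id = data;     /* &thrID passed by the creator  */
      pthread_detach(*self_id);      /* race: may read an unset value */
      /* ... do the work ... */
      return NULL;
   }

   /* After: pthread_self() is always valid inside the running thread. */
   static void *worker_fixed(void *data)
   {
      pthread_detach(pthread_self());
      /* ... do the work ... */
      return NULL;
   }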
On Thu, Nov 25, 2004 at 02:39:01PM -0200, Livio Baldini Soares wrote:
Hi Jorge,
I just got the chance to look at your little test program. It has a few bugs:
1) In pthread_create() you are passing a pointer to the uninitialized thrdID[i] variable. Pthreads does not guarantee that the threadID argument is initialized before the start_routine is called. If you try printing the value of p_thrID in 'thr_function', you'll see that you get some correct and some random values.
Gotcha! This explains most of it. So the crux is that pthreads does not guarantee that the threadID argument is initialized before the start_routine is called. My fault!

Even though some other pthread implementations seem to do it, the document from the Open Group that you cite is clear:

<q>
APPLICATION USAGE

There is no requirement on the implementation that the ID of the created thread be available before the newly created thread starts executing. The calling thread can obtain the ID of the created thread through the return value of the pthread_create() function, and the newly created thread can obtain its ID by a call to pthread_self().
</q>

Now, even after patching the test program to use:
-      pthread_detach(*p_thrID);
+      pthread_detach(pthread_self());
it still leaks some memory on my machine (nearly half a MB)! It is interesting to note that, as Brian reports:
[Brian wrote:]
[Jorge wrote:]
Has anyone had the opportunity to run the test program under Solaris? This is an interesting test because Solaris has its own threads implementation. I'd expect the detached pthreads test not to leak memory on Solaris.
Did it just now. All three ways leak zero memory.
Zero leak on Solaris. On my machine, the three ways can still leak, but sometimes they don't. For instance, running the test should give:

jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd

but it usually ends with:

jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19027 0.0 0.4 2676 408 pts/10 S+ 08:14 0:00 ./pth_mem2 cd

(the same amount leaked whether in "d", "cd" or "j" mode). At some point "cd" seemed to be the "better" one, but surely more testing is required. Or better, finding the bug. ;)

-- Cheers Jorge.-
Errata:
but it usually ends with:
jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19027 0.0 0.4 2676 408 pts/10 S+ 08:14 0:00 ./pth_mem2 cd
Oh, the PID and time in the above example are quoted wrong; it should say:

jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19262 0.0 0.4 2676 408 pts/10 S+ 08:18 0:00 ./pth_mem2 cd

-- Cheers Jorge.-
Hi Jorge! Jorge Arellano Cid writes:
Errata:
but it usually ends with:
jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19027 0.0 0.4 2676 408 pts/10 S+ 08:14 0:00 ./pth_mem2 cd

Oh, the PID and time in the above example are quoted wrong; it should say:

jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19262 0.0 0.4 2676 408 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
I can't really tell if that's a leak or not. On my system, there is an increase in memory use between the first iteration and the second, but from then on things become _very_ stable (box with Linux 2.4 and libc 2.3.2 from Debian/unstable):

livio 20097 0.0 0.0 1664 336 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 32556 468 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 1796 396 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 32556 476 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 1796 396 pts/21 S+ 10:25 0:00 ./pth_mem cd

So there is certainly an increase in memory use (in my case a lot smaller than yours). But look, there is no leak of memory, because with each iteration the memory is equal to the previous phase. Or do you get an increase of 0.5-1 MiB for _each_ phase?

To make my point clearer, look at how my home box handles the fixed test program (I'm using a Linux 2.6 kernel with NPTL, and libc 2.3.2 from Debian/unstable):

livio 4300 0.0 0.0 1440 316 pts/7 S+ 10:40 0:00 ./pth_mem d
livio 4300 0.0 0.0 93812 432 pts/7 Sl+ 10:40 0:00 ./pth_mem d
livio 4300 0.0 0.0 34356 372 pts/7 S+ 10:40 0:00 ./pth_mem d
livio 4300 0.0 0.0 93812 436 pts/7 Sl+ 10:40 0:00 ./pth_mem d

etc. Even though it uses a humongous amount of memory, I'm pretty sure it's not leaking (since the numbers are the same after the first iteration).

Looking at straces, I see that libc is doing mmap2() on anonymous memory for allocation, and in the first phase it actually allocates 8 MiB worth of stack for each thread! (I know it's stack because it sets PROT_EXEC, which it doesn't do for regular mallocs().) Subsequently, each thread mmap2()s _2_ MiB of anonymous memory (instead of only one).

Anyway, not wanting to get too much into the details, it's hard to predict what your libc is doing for you. I can see that libc correctly munmap()s all the mapped segments (effectively freeing the memory). _And_ the VSZ stays the same throughout iterations.

REMINDER: In this case a big VSZ does not mean big memory usage. It means a _potential_ to use a lot of memory (I've seen rare cases where a thread actually makes use of more than 1 MiB of stack, for example, let alone 8 MiB). If the mmap2()'d memory is never touched, a page never gets assigned to the process' page table.

hope this helps,

-- Livio B. Soares
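(Since most of that VSZ is reserved thread stack, here is a minimal sketch of how the per-thread reservation could be capped with pthread_attr_setstacksize(); it is illustrative only and not something done in the thread or in Dillo:)

   #include <pthread.h>
   #include <limits.h>

   static void *worker(void *arg)
   {
      return arg;
   }

   int create_small_stack_thread(pthread_t *tid)
   {
      pthread_attr_t attr;
      size_t stack = 128 * 1024;     /* 128 KiB instead of the 8 MiB default */
      int rc;

      if (stack < PTHREAD_STACK_MIN) /* respect the portable lower bound */
         stack = PTHREAD_STACK_MIN;

      pthread_attr_init(&attr);
      pthread_attr_setstacksize(&attr, stack);
      rc = pthread_create(tid, &attr, worker, NULL);
      pthread_attr_destroy(&attr);
      return rc;                     /* 0 on success */
   }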
On Fri, Nov 26, 2004 at 02:14:53PM -0200, Livio Baldini Soares wrote:
Hi Jorge!
Jorge Arellano Cid writes:
Errata:
but it usually ends with:
jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19027 0.0 0.4 2676 408 pts/10 S+ 08:14 0:00 ./pth_mem2 cd

Oh, the PID and time in the above example are quoted wrong; it should say:

jcid 19262 0.0 0.4 1652 400 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
[...]
jcid 19262 0.0 0.4 2676 408 pts/10 S+ 08:18 0:00 ./pth_mem2 cd
I can't really tell if that's a leak or not. On my system, there is an increase in memory use between the first iteration and the second, but from then on things become _very_ stable (box with Linux 2.4 and libc 2.3.2 from Debian/unstable):
I understand. On my box the "leak" usually occurs near the fifth iteration, if at all...
livio 20097 0.0 0.0 1664 336 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 32556 468 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 1796 396 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 32556 476 pts/21 S+ 10:24 0:00 ./pth_mem cd
livio 20097 0.0 0.0 1796 396 pts/21 S+ 10:25 0:00 ./pth_mem cd
So there is certainly an increase in memory use (in my case a lot smaller than yours). But look, there is no leak of memory, because with each iteration the memory is equal to the previous phase.
Or do you get an increase of 0.5-1 MiB for _each_ phase?
No, but I get a "random" one in one of them. For instance, I just ran this one and there was no leak:

<q>
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
jcid 26194 0.0 0.3 1640 336 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 476 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 484 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 484 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 484 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 484 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 484 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 484 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.5 32412 484 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
jcid 26194 0.0 0.4 1652 404 pts/29 S+ 17:50 0:00 ./pth_mem2 cd
</q>

Then I ran this other one (w/leak):

<q>
jcid 26878 0.0 0.3 1640 336 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 32412 472 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.4 1652 400 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 32412 480 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.4 1652 400 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 32412 480 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.4 1652 400 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 33436 484 pts/29 S+ 18:27 0:00 ./pth_mem2 d  *
jcid 26878 0.0 0.4 2676 404 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 33436 484 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.4 2676 404 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 33436 484 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.4 2676 404 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 33436 484 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.4 2676 404 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.5 33436 484 pts/29 S+ 18:27 0:00 ./pth_mem2 d
jcid 26878 0.0 0.4 2676 404 pts/29 S+ 18:27 0:00 ./pth_mem2 d
</q>

I can easily get results with and without leaks for the three ways, though. From my tests, it seems the "leak" can appear on any iteration.
To make my point clearer, look at how my home box handles the fixed test program (I'm using a Linux 2.6 kernel with NPTL, and libc 2.3.2 from Debian/unstable):

livio 4300 0.0 0.0 1440 316 pts/7 S+ 10:40 0:00 ./pth_mem d
livio 4300 0.0 0.0 93812 432 pts/7 Sl+ 10:40 0:00 ./pth_mem d
livio 4300 0.0 0.0 34356 372 pts/7 S+ 10:40 0:00 ./pth_mem d
livio 4300 0.0 0.0 93812 436 pts/7 Sl+ 10:40 0:00 ./pth_mem d

etc. Even though it uses a humongous amount of memory, I'm pretty sure it's not leaking (since the numbers are the same after the first iteration).
That's not what's happening on my box.
Looking at straces, I see that libc is doing mmap2() on anonymous memory for allocation, and in the first phase it actually allocates 8 MiB worth of stack for each thread! (I know it's stack because it sets PROT_EXEC, which it doesn't do for regular mallocs().) Subsequently, each thread mmap2()s _2_ MiB of anonymous memory (instead of only one).

Anyway, not wanting to get too much into the details, it's hard to predict what your libc is doing for you. I can see that libc correctly munmap()s all the mapped segments (effectively freeing the memory). _And_ the VSZ stays the same throughout iterations.

REMINDER: In this case a big VSZ does not mean big memory usage. It means a _potential_ to use a lot of memory (I've seen rare cases where a thread actually makes use of more than 1 MiB of stack, for example, let alone 8 MiB). If the mmap2()'d memory is never touched, a page never gets assigned to the process' page table.
BTW, is the VSZ reserved somehow, on swap at least? Or is it just a potential size that we can happily ignore?

I'm adding six more test results, this time with Dillo and a small page with a hundred tiny images. There are three samples for each technique. Note: I'm using the file dpi server, so dillo uses a pthread for each IO.

Using the 'd' technique (VSZ, RSS, COMMAND):

<q>
13820 11168 ./dillo
15492 12624 ./dillo
15600 12736 ./dillo
</q>

Using the 'cd' technique (VSZ, RSS, COMMAND):

<q>
15496 12640 ./dillo
 9272  6888 ./dillo
10408  7940 ./dillo
</q>

Here the RSS is smaller for 'cd', so I opted for this one in the interim patch just committed to CVS.

-- Cheers Jorge.-
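(For reference, a sketch of the 'cd' pattern, i.e. creating the thread already detached via attributes; the function name is illustrative and this is not the actual interim patch:)

   #include <pthread.h>

   /* Create a thread that is detached from birth: no pthread_detach()
    * afterwards, and it must not be joined. */
   int spawn_detached(void *(*func)(void *), void *data)
   {
      pthread_t tid;
      pthread_attr_t attr;
      int rc;

      pthread_attr_init(&attr);
      pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
      rc = pthread_create(&tid, &attr, func, data);
      pthread_attr_destroy(&attr);
      return rc;    /* 0 on success */
   }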
In article <20041125152322.GH1676@dillo.org>, Jorge Arellano Cid <jcid@dillo.org> writes
What an irony: after years of chasing the most minor memory leaks (down to a few bytes), it turns out that we've been leaking a lot, by the megabyte, because of an external library.
Well, I _did_ say it didn't appear to be leaking on a PlayStation 2... -- robert w hall
Participants (5):

- Brian Hechinger
- Jorge Arellano Cid
- Lars Segerlund
- Livio Baldini Soares
- robert w hall