On Sun, Jun 22, 2008 at 08:49:17PM +0200, Johannes Hofmann wrote:
Hi Jorge,
On Sun, Jun 22, 2008 at 10:11:30AM -0400, Jorge Arellano Cid wrote:
Hi there,
I've been reviewing memory usage these days, and found some weird numbers that I'd like to understand.
Taking the huge mysql page (13MB), I made the following memory measurements (with ps): load a bare dillo, load the simple directory listing where the huge page is located, then the huge page, then back, then forward, back, forward and back.
I did this for dillo2.19Jun, dillo2.21Jun, dillo-fltk (with zone allocators) and dillo1. All in GNU/Linux with image_off.
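(As an aside, rather than running ps by hand, a small helper along these lines could be wired into dillo to log the same VmSize/VmRSS figures at each step. This is just a sketch assuming Linux's /proc; the function name print_mem_usage is made up for illustration:)

   /* Sketch: print VmSize/VmRSS of the current process (Linux /proc only). */
   #include <stdio.h>
   #include <string.h>

   static void print_mem_usage(const char *label)
   {
      char line[256];
      FILE *f = fopen("/proc/self/status", "r");

      if (!f)
         return;
      while (fgets(line, sizeof(line), f)) {
         if (!strncmp(line, "VmSize:", 7) || !strncmp(line, "VmRSS:", 6))
            printf("%s %s", label, line);   /* line already ends in '\n' */
      }
      fclose(f);
   }

   int main(void)
   {
      print_mem_usage("bare");
      return 0;
   }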
Here are the numbers:
----------- Memory test -----------
 %MEM    VSZ    RSS TTY    STAT START TIME COMMAND
  0.1   9672   4144 pts/10 S+   09:37 0:00 ./dillo-fltk.19Jun  bare
  0.2  10376   4756 pts/10 S+   09:37 0:00 ./dillo-fltk.19Jun  dir list
  6.9 176176 144776 pts/10 S+   09:37 0:02 ./dillo-fltk.19Jun  huge page
  5.4 123556 113216 pts/10 S+   09:37 0:02 ./dillo-fltk.19Jun  back
  7.0 176176 146240 pts/10 S+   09:37 0:05 ./dillo-fltk.19Jun  forward
  5.6 127020 117720 pts/10 S+   09:37 0:05 ./dillo-fltk.19Jun  back
  7.0 176172 146240 pts/10 S+   09:37 0:07 ./dillo-fltk.19Jun  forward
  5.6 127016 117716 pts/10 S+   09:37 0:08 ./dillo-fltk.19Jun  back

 %MEM    VSZ    RSS TTY    STAT START TIME COMMAND
  0.1   9672   4140 pts/10 S+   09:26 0:00 ./dillo-fltk.21Jun  bare
  0.2  10376   4760 pts/10 S+   09:26 0:00 ./dillo-fltk.21Jun  dir list
  5.9 147904 123988 pts/10 S+   09:31 0:02 ./dillo-fltk.21Jun  huge page
  4.9 111668 101944 pts/10 S+   09:31 0:02 ./dillo-fltk.21Jun  back
  6.0 147780 124792 pts/10 S+   09:31 0:05 ./dillo-fltk.21Jun  forward
  5.0 115008 105780 pts/10 S+   09:31 0:05 ./dillo-fltk.21Jun  back
  6.0 147780 124856 pts/10 S+   09:31 0:07 ./dillo-fltk.21Jun  forward
  5.1 115008 105844 pts/10 S+   09:31 0:07 ./dillo-fltk.21Jun  back

 %MEM    VSZ    RSS TTY    STAT START TIME COMMAND
  0.1   9676   4084 pts/10 S+   11:16 0:00 ./dillo-fltk  bare
  0.2  10464   4768 pts/10 S+   11:17 0:00 ./dillo-fltk  dir list
  5.4 136960 112920 pts/10 S+   11:17 0:04 ./dillo-fltk  huge page
  4.3 100724  90864 pts/10 S+   11:17 0:04 ./dillo-fltk  back
  5.4 136952 113756 pts/10 S+   11:17 0:09 ./dillo-fltk  forward
  4.5 104180  94744 pts/10 S+   11:17 0:09 ./dillo-fltk  back
  5.4 136888 113776 pts/10 S+   11:17 0:13 ./dillo-fltk  forward
  4.5 104116  94764 pts/10 S+   11:17 0:14 ./dillo-fltk  back

 %MEM    VSZ    RSS TTY    STAT START TIME COMMAND
  0.1   8408   3524 pts/10 S+   10:59 0:00 ./dillo1  bare
  0.1  24788   3564 pts/10 S+   11:01 0:00 ./dillo1  dir list
  8.7 226996 182220 pts/10 S+   11:01 0:07 ./dillo1  huge page
  6.8 165880 141792 pts/10 S+   11:01 0:09 ./dillo1  back
  8.8 226952 184176 pts/10 S+   11:01 0:17 ./dillo1  forward
  6.9 166264 144024 pts/10 S+   11:01 0:18 ./dillo1  back
  8.8 226968 184216 pts/10 S+   11:01 0:26 ./dillo1  forward
  6.9 166264 144060 pts/10 S+   11:01 0:27 ./dillo1  back
Besides clearly showing a sane pattern of memory-usage reduction, there's a strange fact I don't understand: once back from the huge page, I'd expect memory usage to drop to what it was before, plus the 13MB of cached data and some KB of overhead, but the RSS stays near 90MB.
This happens with dillo1 too.
My thoughts:
 * If this memory is not freed, it'd be great to find the huge leak.
 * If it's freed but not returned to the OS, dillo would consume as much
   memory as the biggest page it has loaded. Clearly not a desirable
   situation (see the sketch below).
 * It may be that the OS knows it can claim this memory back when
   necessary, but prefers to keep it assigned to the same process until
   it finds a better use for it.
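(For illustration, a minimal sketch of the second case on glibc; a toy program, not dillo code, with made-up sizes. glibc's main arena grows with brk() and can only shrink from the top, so a single live allocation near the top of the heap keeps all the freed memory below it assigned to the process:)

   /* Sketch: freed chunks below a live block at the top of the brk heap
    * are not returned to the OS, so the RSS stays high. */
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <unistd.h>

   #define NBLOCKS 100000
   #define BLOCKSZ 128          /* small, so blocks come from the brk heap */

   int main(void)
   {
      static char *blocks[NBLOCKS];
      int i;

      for (i = 0; i < NBLOCKS; i++) {
         blocks[i] = malloc(BLOCKSZ);
         memset(blocks[i], 1, BLOCKSZ);   /* touch it so it counts in RSS */
      }
      /* Free everything except the last block; the heap top stays pinned,
       * so roughly 12MB of freed memory is not given back to the OS. */
      for (i = 0; i < NBLOCKS - 1; i++)
         free(blocks[i]);

      printf("check RSS now, e.g.: ps -o vsz,rss -p %d\n", (int)getpid());
      getchar();                           /* pause so RSS can be inspected */
      free(blocks[NBLOCKS - 1]);
      return 0;
   }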
I think this is related to the malloc implementation in Linux/glibc. Here on DragonFly I get:
USER      PID %CPU %MEM    VSZ    RSS TT STAT STARTED    TIME COMMAND
hofmann  3331  0.0  0.3   6732   4436 p3 IL+  8:37PM  0:00.14 ./dillo-fltk  dir list
hofmann  3331 26.9  7.6 140116 118580 p3 SL+  8:37PM  0:04.13 ./dillo-fltk  huge page
hofmann  3331 10.3  1.2  25488  19064 p3 SL+  8:37PM  0:05.38 ./dillo-fltk  back
hofmann  3331 79.2  7.6 140112 118584 p3 SL+  8:37PM  0:09.90 ./dillo-fltk  forward
hofmann  3331 11.1  1.3  26252  19840 p3 SL+  8:37PM  0:11.29 ./dillo-fltk  back
hofmann  3331 91.1  7.6 140548 119232 p3 SL+  8:37PM  0:15.67 ./dillo-fltk  forward
hofmann  3331 27.8  1.3  26816  20392 p3 SL+  8:37PM  0:16.98 ./dillo-fltk  back
This is not perfect, but reasonable. I think DragonFly changed its malloc implementation some time ago; if I remember correctly, the behaviour was more Linux-like before.
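(A related detail worth noting: glibc does return memory to the OS for allocations served by mmap(), i.e. those at or above M_MMAP_THRESHOLD, 128KB by default and tunable with mallopt(). A small sketch of that behaviour, again a toy program rather than dillo code:)

   /* Sketch: with glibc, blocks >= M_MMAP_THRESHOLD are mmap()ed and are
    * unmapped (returned to the OS) as soon as they are freed. */
   #include <malloc.h>   /* mallopt, M_MMAP_THRESHOLD (glibc) */
   #include <stdlib.h>
   #include <string.h>

   int main(void)
   {
      char *big;

      /* Lower the threshold so even 64KB blocks go through mmap(). */
      mallopt(M_MMAP_THRESHOLD, 64 * 1024);

      big = malloc(13 * 1024 * 1024);      /* roughly the mysql page */
      memset(big, 1, 13 * 1024 * 1024);    /* RSS goes up by ~13MB */
      free(big);                           /* munmap(): RSS drops again */
      return 0;
   }

(Presumably the parsed page lives in many small allocations rather than one big block, so it is the main-arena behaviour above, not the mmap path, that dominates in dillo's case.)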
Well, this explains a lot of things, including why on your system there's a speed improvement when leaving the huge page. There are still 6MB of VSZ (2MB RSS) I can't explain, considering the page in the cache is 13MB. I've seen leaks even with plain text, but this is not huge. Thanks for the info!

--
Cheers
Jorge.-