Hi there,

I've been reviewing memory usage these days, and found some weird numbers that I'd like to understand.

Taking the huge mysql page (13MB), I made the following memory measurements (with ps): load a bare dillo, load the simple directory listing where the huge page is located, then the huge page, then back, then forward, back, forward and back.

I did this for dillo2.19Jun, dillo2.21Jun, dillo-fltk (with zone allocators) and dillo1, all on GNU/Linux with image_off.

Here are the numbers:

-----------
Memory test
-----------

%MEM    VSZ    RSS TTY    STAT START  TIME COMMAND
 0.1   9672   4144 pts/10 S+   09:37  0:00 ./dillo-fltk.19Jun  bare
 0.2  10376   4756 pts/10 S+   09:37  0:00 ./dillo-fltk.19Jun  dir list
 6.9 176176 144776 pts/10 S+   09:37  0:02 ./dillo-fltk.19Jun  huge page
 5.4 123556 113216 pts/10 S+   09:37  0:02 ./dillo-fltk.19Jun  back
 7.0 176176 146240 pts/10 S+   09:37  0:05 ./dillo-fltk.19Jun  forward
 5.6 127020 117720 pts/10 S+   09:37  0:05 ./dillo-fltk.19Jun  back
 7.0 176172 146240 pts/10 S+   09:37  0:07 ./dillo-fltk.19Jun  forward
 5.6 127016 117716 pts/10 S+   09:37  0:08 ./dillo-fltk.19Jun  back

%MEM    VSZ    RSS TTY    STAT START  TIME COMMAND
 0.1   9672   4140 pts/10 S+   09:26  0:00 ./dillo-fltk.21Jun  bare
 0.2  10376   4760 pts/10 S+   09:26  0:00 ./dillo-fltk.21Jun  dir list
 5.9 147904 123988 pts/10 S+   09:31  0:02 ./dillo-fltk.21Jun  huge page
 4.9 111668 101944 pts/10 S+   09:31  0:02 ./dillo-fltk.21Jun  back
 6.0 147780 124792 pts/10 S+   09:31  0:05 ./dillo-fltk.21Jun  forward
 5.0 115008 105780 pts/10 S+   09:31  0:05 ./dillo-fltk.21Jun  back
 6.0 147780 124856 pts/10 S+   09:31  0:07 ./dillo-fltk.21Jun  forward
 5.1 115008 105844 pts/10 S+   09:31  0:07 ./dillo-fltk.21Jun  back

%MEM    VSZ    RSS TTY    STAT START  TIME COMMAND
 0.1   9676   4084 pts/10 S+   11:16  0:00 ./dillo-fltk  bare
 0.2  10464   4768 pts/10 S+   11:17  0:00 ./dillo-fltk  dir list
 5.4 136960 112920 pts/10 S+   11:17  0:04 ./dillo-fltk  huge page
 4.3 100724  90864 pts/10 S+   11:17  0:04 ./dillo-fltk  back
 5.4 136952 113756 pts/10 S+   11:17  0:09 ./dillo-fltk  forward
 4.5 104180  94744 pts/10 S+   11:17  0:09 ./dillo-fltk  back
 5.4 136888 113776 pts/10 S+   11:17  0:13 ./dillo-fltk  forward
 4.5 104116  94764 pts/10 S+   11:17  0:14 ./dillo-fltk  back

%MEM    VSZ    RSS TTY    STAT START  TIME COMMAND
 0.1   8408   3524 pts/10 S+   10:59  0:00 ./dillo1  bare
 0.1  24788   3564 pts/10 S+   11:01  0:00 ./dillo1  dir list
 8.7 226996 182220 pts/10 S+   11:01  0:07 ./dillo1  huge page
 6.8 165880 141792 pts/10 S+   11:01  0:09 ./dillo1  back
 8.8 226952 184176 pts/10 S+   11:01  0:17 ./dillo1  forward
 6.9 166264 144024 pts/10 S+   11:01  0:18 ./dillo1  back
 8.8 226968 184216 pts/10 S+   11:01  0:26 ./dillo1  forward
 6.9 166264 144060 pts/10 S+   11:01  0:27 ./dillo1  back

Besides clearly showing a sane pattern of memory usage reduction, there's a strange fact I don't understand. Once back from the huge page, I'd expect memory usage to drop to what it was before, plus the 13MB of cached data, plus some KB of overhead, BUT the RSS stays near 90MB.

This happens with dillo1 too.

My thoughts:

 * If this memory is not freed, it'd be great to find the huge leak.
 * If it's freed but not returned to the OS, dillo would consume as much
   memory as the biggest page it has ever loaded. Clearly not a desirable
   situation.
 * It may be that the OS knows it can claim this memory back when
   necessary, but prefers to keep it assigned to the same process until
   it finds a better use for it.
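As an aside on methodology: the VSZ/RSS numbers above can also be sampled from inside the process by reading /proc/self/statm, which reports sizes in pages and avoids racing against ps between steps. A minimal stand-alone sketch, Linux-specific and not dillo code:

  // meminfo.cc -- print VSZ and RSS of the current process (Linux only).
  // Stand-alone sketch, not dillo code. Build: g++ meminfo.cc -o meminfo
  #include <cstdio>
  #include <unistd.h>

  static void print_mem(const char *label)
  {
     long vsz_pages = 0, rss_pages = 0;
     FILE *f = fopen("/proc/self/statm", "r");
     if (f) {
        // first two fields of statm: total program size, resident set (pages)
        fscanf(f, "%ld %ld", &vsz_pages, &rss_pages);
        fclose(f);
     }
     long page_kb = sysconf(_SC_PAGESIZE) / 1024;
     printf("%-10s VSZ %6ld KB   RSS %6ld KB\n",
            label, vsz_pages * page_kb, rss_pages * page_kb);
  }

  int main()
  {
     print_mem("bare");
     /* ...load the huge page, go back, etc., calling print_mem()
        after each step... */
     return 0;
  }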
-------------------------------------------
Huge page with different memory allocators:
-------------------------------------------

%MEM    VSZ    RSS TTY    STAT START  TIME COMMAND
 5.4 134912 113740 pts/17 S+   18:56  0:02 ./dillo-fltk  step: 2x
 5.4 134916 113752 pts/17 S+   18:57  0:02 ./dillo-fltk  step: 2x
 5.4 134828 113660 pts/17 S+   18:58  0:02 ./dillo-fltk  step: 2x

 5.0 114940 105432 pts/17 S+   18:53  0:02 ./dillo-fltk  step: 1
 5.0 111976 105488 pts/17 S+   18:54  0:03 ./dillo-fltk  step: 1
 5.0 114872 105416 pts/17 S+   18:54  0:02 ./dillo-fltk  step: 1

 5.1 117796 106436 pts/17 S+   18:59  0:02 ./dillo-fltk  step: cst
 5.1 117568 106404 pts/17 S+   19:00  0:02 ./dillo-fltk  step: cst
 5.1 117820 106460 pts/17 S+   19:00  0:02 ./dillo-fltk  step: cst

This data comes from a memory usage reduction experiment: tuning the SimpleVector memory allocator. The first is the current one, which allocates in chunks of 2*numAlloc; the second allocates one element at a time; and the third is a custom one I made.

The 1-allocator reduces memory the most, but it's slower (roughly 20%). The cst-allocator reduces a bit less, but is almost as fast as the 2x-allocator (2% slower or so).

Memory reduction:

                  VSZ    RSS
-----------------------------
1-allocator       15%   7.3%
cst-allocator     13%   6.5%
-----------------------------

Well, compared with the 1000% "leak" above... :-)
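To make the three step variants above concrete: the only thing that differs between them is how much to reserve when the vector must grow. The sketch below is not the actual SimpleVector code, just an illustration of the three policies; the constant step of 1024 elements for "cst" is a guess, since the post doesn't say which constant was used.

  // Hypothetical illustration of the three growth policies compared above.
  // Given the current allocation size, return the new one.
  enum GrowthPolicy { STEP_2X, STEP_1, STEP_CST };

  static int newAllocSize(GrowthPolicy p, int numAlloc)
  {
     switch (p) {
     case STEP_2X:  return numAlloc ? 2 * numAlloc : 1; // double each time
     case STEP_1:   return numAlloc + 1;                // one element at a time
     case STEP_CST: return numAlloc + 1024;             // constant step (a guess)
     }
     return numAlloc + 1;
  }

The trade-off: doubling can leave up to half of the allocation unused but needs only O(log n) reallocs; a step of 1 wastes nothing but reallocs on every single push.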
--
Cheers
Jorge.-


Hi Jorge,

On Sun, Jun 22, 2008 at 10:11:30AM -0400, Jorge Arellano Cid wrote:
> [...]
> * It may be that the OS knows it can claim this memory back when
>   necessary, but prefers to keep it assigned to the same process until
>   it finds a better use for it.
I think this is related to the malloc implementation in Linux/glibc. Here on DragonFly I get:

USER      PID %CPU %MEM    VSZ    RSS  TT STAT STARTED     TIME COMMAND
hofmann  3331  0.0  0.3   6732   4436  p3 IL+  8:37PM   0:00.14 ./dillo-fltk  dir list
hofmann  3331 26.9  7.6 140116 118580  p3 SL+  8:37PM   0:04.13 ./dillo-fltk  huge page
hofmann  3331 10.3  1.2  25488  19064  p3 SL+  8:37PM   0:05.38 ./dillo-fltk  back
hofmann  3331 79.2  7.6 140112 118584  p3 SL+  8:37PM   0:09.90 ./dillo-fltk  forward
hofmann  3331 11.1  1.3  26252  19840  p3 SL+  8:37PM   0:11.29 ./dillo-fltk  back
hofmann  3331 91.1  7.6 140548 119232  p3 SL+  8:37PM   0:15.67 ./dillo-fltk  forward
hofmann  3331 27.8  1.3  26816  20392  p3 SL+  8:37PM   0:16.98 ./dillo-fltk  back

Which is not perfect, but reasonable. I think DragonFly changed its malloc implementation some time ago; if I remember correctly, there was a more Linux-like behaviour before.
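One way to test whether glibc is merely holding on to freed chunks (Jorge's third hypothesis) is malloc_trim(), a glibc extension that asks the allocator to return free pages to the kernel; the related threshold can also be tuned with mallopt(M_TRIM_THRESHOLD, ...). A sketch, glibc-specific and not dillo code:

  // Sketch: does freed memory go back to the OS? (glibc-specific)
  // Watch the RSS of this process with ps at each getchar() pause.
  #include <cstdio>
  #include <cstdlib>
  #include <malloc.h>   // malloc_trim() -- glibc extension

  int main()
  {
     const int N = 100000;
     static char *blocks[N];

     // Allocate and touch ~100MB in small chunks, like a big page would.
     for (int i = 0; i < N; i++) {
        blocks[i] = (char *) malloc(1024);
        blocks[i][0] = 'x';
     }
     getchar();   // RSS is high now

     for (int i = 0; i < N; i++)
        free(blocks[i]);
     getchar();   // RSS often stays high: freed, but kept by the allocator

     malloc_trim(0);   // ask glibc to return free pages to the kernel
     getchar();   // RSS should drop noticeably here
     return 0;
  }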
Cheers,
Johannes
On Sun, Jun 22, 2008 at 08:49:17PM +0200, Johannes Hofmann wrote:
> I think this is related to the malloc implementation in Linux/glibc.
> [...]
Well, this explains a lot of things, including why on your system there's a speed improvement when leaving the huge page. There are still 6MB VSZ (2MB RSS) I can't explain, considering the page in the cache is 13MB. I've seen leaks even with plain text, but this is not huge.

Thanks for the info!

--
Cheers
Jorge.-
On Sun, Jun 22, 2008 at 10:11:30AM -0400, Jorge Arellano Cid wrote:
> [...]
After some more study I got to a second cst2-allocator that reduces almost as much as the 1-allocator, but with the same speed as the 2x one (or better)!

%MEM    VSZ    RSS TTY    STAT START  TIME COMMAND
 5.0 114940 105432 pts/17 S+   18:53  0:02 ./dillo-fltk  step: 1
 5.0 111976 105488 pts/17 S+   18:54  0:03 ./dillo-fltk  step: 1
 5.0 114872 105416 pts/17 S+   18:54  0:02 ./dillo-fltk  step: 1

 5.1 113224 105820 pts/17 S+   13:57  0:02 ./dillo-fltk  step: cst2
 5.0 112948 105740 pts/17 S+   13:58  0:02 ./dillo-fltk  step: cst2
 5.0 115528 105628 pts/17 S+   13:59  0:02 ./dillo-fltk  step: cst2

So now we have:

Memory reduction:

                  VSZ    RSS
-----------------------------
1-allocator       15%   7.3%
cst-allocator     13%   6.5%
cst2-allocator    15%   7.3%
-----------------------------

Committed.

--
Cheers
Jorge.-
On Mon, Jun 23, 2008 at 05:44:11PM -0400, Jorge Arellano Cid wrote:
> [...]
> Committed.
Here are some results from DragonFly. BTW, I found out that by switching off speedstep, and thereby locking the CPU at 600MHz, I get more consistent timings.

Old SimpleVector:

real    0m9.284s
user    0m7.921s
sys     0m0.336s

USER      PID %CPU %MEM    VSZ    RSS  TT STAT STARTED     TIME COMMAND
hofmann 18621 19.5  7.7 141288 119732  p2 SL+  4:39PM   0:04.01 ./dillo-fltk

New SimpleVector from cvs:

real    0m10.012s
user    0m8.005s
sys     0m0.585s

USER      PID %CPU %MEM    VSZ    RSS  TT STAT STARTED     TIME COMMAND
hofmann 19344 41.6  7.6 126796 119464  p2 SL+  4:41PM   0:04.39 ./dillo-fltk

So there is a considerable memory reduction, but also a noticeably higher CPU (especially system) load.
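For timing the growth policies without loading the huge page on each run, a stand-alone micro-benchmark can help separate allocator cost from rendering cost. A sketch of mine, not dillo code; the policies mirror the earlier step: 2x / 1 / cst naming, the 1024-element constant step and the workload size are arbitrary choices:

  // Sketch: micro-benchmark of the three growth policies via realloc.
  // Error checking omitted for brevity. Build: g++ -O2 growbench.cc
  #include <cstdio>
  #include <cstdlib>
  #include <ctime>

  static int newAllocSize(int policy, int numAlloc)   // 0 = 2x, 1 = +1, 2 = cst
  {
     if (policy == 0) return numAlloc ? 2 * numAlloc : 1;
     if (policy == 1) return numAlloc + 1;
     return numAlloc + 1024;   // constant step (a guess, as before)
  }

  int main()
  {
     const int n = 5 * 1000 * 1000;   // arbitrary workload
     const char *names[] = { "2x", "1", "cst" };

     for (int p = 0; p < 3; p++) {
        clock_t t0 = clock();
        int *data = NULL;
        int numAlloc = 0;
        for (int i = 0; i < n; i++) {
           if (i >= numAlloc) {
              numAlloc = newAllocSize(p, numAlloc);
              data = (int *) realloc(data, numAlloc * sizeof(int));
           }
           data[i] = i;
        }
        free(data);
        printf("step %-3s: %.2f s CPU\n", names[p],
               (double) (clock() - t0) / CLOCKS_PER_SEC);
     }
     return 0;
  }

Locking the CPU frequency, as described above, matters just as much for this kind of micro-benchmark.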
Cheers,
Johannes


Hi Jorge,

On Mon, Jun 23, 2008 at 05:44:11PM -0400, Jorge Arellano Cid wrote:
> [...]
> Committed.
I have some concerns regarding the new SimpleVector implementation:

 * It introduces two arbitrary numbers (100 as a limit to switch to a
   different behaviour, and the 1/10 factor) which are optimized for a
   specific page on a specific platform. The original 2x method feels
   more general to me, even though it may be a bit less memory efficient.
 * It probably only works fast because most malloc libraries use some
   heuristics to avoid a memcpy on every realloc. Depending on this
   behaviour may create problems on more exotic platforms.
 * Rather than ignoring the initAlloc parameter, I would propose to
   reconsider the places where SimpleVector() is initialized with
   initAlloc > 1. There is also an interesting comment about initAlloc
   in the Textblock constructor.

Having said all this, it's not a big issue for me. If CPU usage should really become a problem on some platform, we could easily change the code. It's just 2 or 3 lines after all.
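For readers without the cvs tree at hand: going by the description above (the limit of 100 and the 1/10 factor), the committed growth step presumably looks something like the sketch below. This is a reconstruction from Johannes's summary, not a verbatim copy of the commit:

  // Hypothetical reconstruction of the committed "cst2" growth step:
  // double while the vector is small, then grow by 10% at a time.
  static int newAllocSize(int numAlloc)
  {
     if (numAlloc < 100)
        return numAlloc ? 2 * numAlloc : 1;   // small: double (cheap, simple)
     return numAlloc + numAlloc / 10;         // large: +10% per step
  }

Both regimes keep the amortized copy cost linear: growth by a factor f costs about 1/(f-1) extra copies per element if realloc always moves the block (roughly 1 copy for f=2, but 10 for f=1.1), which is exactly why the cheap-realloc heuristic mentioned above matters for the 1/10 step. In exchange, the worst-case unused slack of a large vector drops to about 10% of its size instead of up to 100%.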
Cheers,
Johannes


Participants (2): jcid@dillo.org, Johannes.Hofmann@gmx.de