Hi,
attached is a patch to reduce the memory usage of dw2:
* Styles are now reused if they are equal. This brings down the number of Style objects on the mysql page from around 180000 to 23000 and reduces memory usage by about 25M at the cost of an additional HashTable. Maybe the hash table should be part of fltkplatform, like the hash tables for Color and Font? Even more sharing of styles would be possible if x_link were not part of the style. On the other hand, about 3M for styles on the huge mysql page is pretty OK.
* The size of struct Word is reduced by 4 integers by using shorts. I know that this is questionable, but since struct Word can be allocated a lot, I think it is worth the hassle. Please check whether my assumption that space lengths and word lengths can be stored in shorts is valid in general.
* SimpleVector no longer preallocates memory if it's not necessary. This actually saves some megabytes here!
With these changes memory usage on the mysql page is 179M here.
Cheers, Johannes
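The style reuse described in the first bullet is a form of interning: equal attribute sets map to one shared object. What follows is only a minimal sketch of that idea, not the patch itself. `StyleAttrs`, `StyleAttrsHash` and `StylePool` are hypothetical names, the three fields are a stand-in for the real Style attributes, and `std::unordered_set` replaces the HashTable the patch adds.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical, heavily reduced stand-in for a Style's attributes.
struct StyleAttrs {
    int fontSize;
    int color;
    int xLink;   // per-link value; differing values prevent sharing

    bool operator==(const StyleAttrs &o) const {
        return fontSize == o.fontSize && color == o.color && xLink == o.xLink;
    }
};

// Simple combining hash over all attributes that take part in equality.
struct StyleAttrsHash {
    std::size_t operator()(const StyleAttrs &a) const {
        std::size_t h = std::hash<int>()(a.fontSize);
        h = h * 31 + std::hash<int>()(a.color);
        h = h * 31 + std::hash<int>()(a.xLink);
        return h;
    }
};

// Pool that hands out one shared object per distinct attribute set.
class StylePool {
public:
    // Returns the pooled object equal to attrs, inserting it if it is new.
    // Element addresses in std::unordered_set stay valid across rehashing.
    const StyleAttrs *intern(const StyleAttrs &attrs) {
        return &*pool.insert(attrs).first;
    }

    std::size_t size() const { return pool.size(); }

private:
    std::unordered_set<StyleAttrs, StyleAttrsHash> pool;
};
```

Two calls to `intern` with equal attributes return the same pointer, while a differing `xLink` alone forces a second object, which is why taking x_link out of the style would allow more sharing. A real implementation would also need reference counting or some other way to drop unused entries.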
On Sat, May 24, 2008 at 02:58:08PM +0200, Johannes Hofmann wrote:
Hi,
attached is a patch to reduce the memory usage of dw2:
Excellent! (AFAIU this is near 30% memory usage reduction in dw2) Committed.
* Styles are now reused if they are equal. This brings down the number of Style objects on the mysql page from around 180000 to 23000 and reduces memory usage by about 25M at the cost of an additional HashTable. Maybe the hash table should be part of fltkplatform as the hash tables for Color and Font? There would be even more sharing of styles possible, if x_link would not be part of style. On the other hand about 3M for styles on the huge mysql page is pretty ok.
* The size of struct Word is reduced by 4 integers by using shorts. I know that this is questionable, but since struct Word can be allocated a lot, I think it is worth the hassle. Please check whether my assumption that space lengths and word lengths can be stored in shorts is valid in general.
It looks OK at first sight, but to be sure it'd be good to know what happens when the limit is surpassed. BTW, "unsigned short" looks like a good idea for origSpace and effSpace.
Also memory usage gets pretty high (250M with images disabled). I'm experimenting with reducing the size of struct Word in textblock.hh by moving the highlight stuff out of struct Word and into Textblock. This brings memory usage down to 199M. However, I don't know the highlighting stuff well enough to be sure I don't break anything. This probably falls into the "premature optimization" category anyway.
hlStart and hlEnd use -1 as a value, so they can't be unsigned. I don't know whether they can be taken out, and how much impact it can have (alignment issues are strange sometimes).
* SimpleVector no longer preallocates memory if it's not necessary. This actually saves some megabytes here!
With these changes memory usage on the mysql page is 179M here.
Great. -- Cheers Jorge.-
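The SimpleVector change can be pictured as deferring the first allocation until an element is actually stored. This is a minimal sketch under that assumption, not the real SimpleVector code. `LazyVector` and its members are hypothetical names, and it is restricted to trivially copyable element types so that `realloc` is safe.

```cpp
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Hypothetical growable array that allocates nothing until the first push.
template <typename T>
class LazyVector {
    static_assert(std::is_trivially_copyable<T>::value,
                  "realloc-based growth needs trivially copyable elements");

public:
    // firstAlloc is only a hint for the first real allocation.
    explicit LazyVector(std::size_t firstAlloc)
        : array(nullptr), num(0), numAlloc(0),
          initAlloc(firstAlloc ? firstAlloc : 1) {}

    ~LazyVector() { std::free(array); }

    LazyVector(const LazyVector &) = delete;
    LazyVector &operator=(const LazyVector &) = delete;

    void push(const T &value) {
        if (num == numAlloc)
            grow();
        array[num++] = value;
    }

    const T &get(std::size_t i) const { return array[i]; }
    std::size_t size() const { return num; }
    std::size_t allocated() const { return numAlloc; }

private:
    // First growth uses the hint, later growths double the capacity.
    void grow() {
        std::size_t newAlloc = numAlloc ? numAlloc * 2 : initAlloc;
        void *p = std::realloc(array, newAlloc * sizeof(T));
        if (!p)
            std::abort();
        array = static_cast<T *>(p);
        numAlloc = newAlloc;
    }

    T *array;
    std::size_t num, numAlloc, initAlloc;
};
```

With many small vectors that stay empty, nothing is allocated at all. The price is one extra branch on the first push.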
On Sat, May 24, 2008 at 04:39:42PM -0400, Jorge Arellano Cid wrote:
hlStart and hlEnd use -1 as a value, so they can't be unsigned. I don't know whether they can be taken out, and how much impact it can have (alignment issues are strange sometimes).
I have a version here where hlStart and hlEnd are taken out of struct Word. Instead I have just one hlStart and hlEnd in Textblock, which also contain the word index. This is under the assumption that highlighting is only possible for one contiguous region per layer. Unfortunately this patch is still in the "works almost" state.
Cheers, Johannes
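The idea of keeping a single highlight range in Textblock, instead of offsets in every Word, might look roughly like this. It is only a sketch under the stated assumption of one contiguous region (one such range would exist per layer). `HighlightRange` and `wordHighlight` are hypothetical names and not taken from the unfinished patch.

```cpp
// Hypothetical per-Textblock highlight state for one layer.
// startWord == -1 means that nothing is highlighted.
struct HighlightRange {
    int startWord, startChar;   // first highlighted word and char offset in it
    int endWord, endChar;       // last highlighted word and end offset (exclusive)
};

// Computes the highlighted character range [*from, *to) inside the word with
// index wordIndex and length wordLen. Returns false if the word has no
// highlighted characters.
inline bool wordHighlight(const HighlightRange &hl, int wordIndex, int wordLen,
                          int *from, int *to) {
    if (hl.startWord < 0 || wordIndex < hl.startWord || wordIndex > hl.endWord)
        return false;
    *from = (wordIndex == hl.startWord) ? hl.startChar : 0;
    *to = (wordIndex == hl.endWord) ? hl.endChar : wordLen;
    return *from < *to;
}
```

Words strictly between the start and end word are highlighted completely, so only the two boundary words need character offsets, and those offsets can stay plain ints without costing memory per word.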
Hi, On Sat, May 24, 2008 at 04:39:42PM -0400, Jorge Arellano Cid wrote:
It looks OK at first sight, but to be sure it'd be good to know what happens when the limit is surpassed.
Just tried it on a page with a 40000-char-long word. With short for hlStart/hlEnd, selection is - obviously - no longer possible. That's clearly a regression. Please revert with the attached patch.
BTW, "unsigned short" looks like a good idea for origSpace and effSpace.
Adjusted in patch. Cheers, Johannes
On Sun, May 25, 2008 at 10:08:12AM +0200, Johannes Hofmann wrote:
Just tried it on a page with a 40000-char-long word. With short for hlStart/hlEnd, selection is - obviously - no longer possible. That's clearly a regression. Please revert with the attached patch.
A max of 32K for a single word looks quite reasonable to me. I mean, if the browser doesn't crash and fails gracefully, why not.
BTW, it can be extended to 64K with unsigned, by using 65535 as sentinel value (instead of -1), but again, is it worth it?
It'd be good to know how much "struct Word" is reduced in size (after compiler alignment), and how much it impacts overall memory usage. I mean, if struct Word is reduced 20% and that yields a 4% overall memory reduction, it may not be worth complicating the data type.
OTOH, ...
BTW, "unsigned short" looks like a good idea for origSpace and effSpace.
Committed the "unsigned short" part. -- Cheers Jorge.-
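The 65535-as-sentinel idea, together with the question of what happens past the limit, can be sketched as below. This is illustrative only: `HL_NONE`, `packOffset` and `unpackOffset` are hypothetical names, and the graceful-failure behaviour (rejecting offsets that do not fit) is an assumption about what a careful implementation would do.

```cpp
#include <cstdint>

// Sentinel replacing -1 once the offset is stored as an unsigned 16-bit value.
const std::uint16_t HL_NONE = 65535;

// Stores an int offset (-1 meaning "no highlight") in 16 bits.
// Returns false instead of silently wrapping when the offset does not fit.
inline bool packOffset(int offset, std::uint16_t *out) {
    if (offset == -1) {
        *out = HL_NONE;
        return true;
    }
    if (offset < 0 || offset >= HL_NONE)
        return false;   // out of range: caller has to fail gracefully
    *out = static_cast<std::uint16_t>(offset);
    return true;
}

// Converts the stored value back to the int convention used elsewhere.
inline int unpackOffset(std::uint16_t stored) {
    return stored == HL_NONE ? -1 : stored;
}
```

Offsets from 0 to 65534 round-trip, so a 40000-character word would be selectable again, but anything longer is still rejected. The explicit range check is the complication in the data type that the message weighs against the memory saved.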
On Sun, May 25, 2008 at 09:12:11AM -0400, Jorge Arellano Cid wrote:
A max of 32K for a single word, looks quite reasonable to me. I meant, if the browser doesn't crash, and fails gracefully, why not.
BTW, it can be extended to 64K with unsigned, by using 65535 as sentinel value (instead of -1), but again, is it worth it?
It'd be good to know how much "struct Word" is reduced in size (after compiler alignment), and how much it impacts overall memory usage. I mean, if struct Word is reduced 20% and that yields a 4% overall memory reduction, it may not be worth complicating the data type.
OTOH, ...
On the very long mysql page, addWord is called 1192256 times. But due to the allocation in 2^n steps in SimpleVector, memory for some more of them could be allocated. Structs on 32bit systems are 8 byte aligned. struct Word with the highlight stuff as int is 48 bytes. With short it's 40 bytes. On the mysql page I observe a total memory reduction of about 9M by using shorts for the highlight stuff.
Apart from the highlight stuff, dillo can handle those long words just fine. So we would add this restriction just to save 9M of memory on this rather extreme page.
I would keep ints for now and try to get hlStart and hlEnd out of struct Word and into Textblock, which would bring struct Word down to 32 bytes - hopefully without any restrictions.
Cheers, Johannes
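The size figures above can be checked with `sizeof` rather than by hand. The sketch below uses a hypothetical field layout, not the real struct Word from textblock.hh, so the absolute sizes it reports will differ from 48 and 40 bytes. Only the int-versus-short difference in the four highlight fields is the point. `Word<HlT>` and `savedBytes` are illustrative names.

```cpp
#include <cstddef>

// Hypothetical layout; HlT is the type used for the highlight offsets.
template <typename HlT>
struct Word {
    int width, ascent, descent;
    unsigned short origSpace, effSpace;
    HlT hlStart[2], hlEnd[2];   // assumed: one entry per highlight layer
    void *content;
    void *style;
};

// Total bytes saved when `count` structs each shrink by `bytesPerStruct`.
inline std::size_t savedBytes(std::size_t count, std::size_t bytesPerStruct) {
    return count * bytesPerStruct;
}
```

Printing `sizeof(Word<int>)`, `sizeof(Word<short>)` and `alignof(Word<int>)` shows the sizes after padding for a given compiler and ABI. With the 8-byte difference reported above, `savedBytes(1192256, 8)` gives 9538048 bytes, which is in line with the roughly 9M observed on the mysql page.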
On Sun, May 25, 2008 at 06:47:11PM +0200, Johannes Hofmann wrote:
On the very long mysql page, addWord is called 1192256 times. But due to the allocation in 2^n steps in SimpleVector, memory for some more of them could be allocated. Structs on 32bit systems are 8 byte aligned. struct Word with the highlight stuff as int is 48 bytes. With short it's 40 bytes. On the mysql page I observe a total memory reduction of about 9M by using shorts for the highlight stuff.
Apart from the highlight stuff, dillo can handle those long words just fine. So we would add this restriction just to save 9M of memory on this rather extreme page.
I would keep ints for now and try to get hlStart and hlEnd out of struct Word and into Textblock, which would bring struct Word down to 32 bytes - hopefully without any restrictions.
OK, patch reverted. Sorry for asking for this important but boring-to-collect data. :)
Let's see: if it was 250M with the old scheme, and is down to 225M with the style reuse, then the 9M saved by these shorts is roughly a 4% memory reduction.
-- Cheers Jorge.-
On Sun, May 25, 2008 at 06:47:11PM +0200, Johannes Hofmann wrote:
On the very long mysql page, addWord is called 1192256 times. But due to the allocation in 2^n steps in SimpleVector, memory for some more of them could be allocated. Structs on 32bit systems are 8 byte aligned. struct Word with
Argh, structs are 4byte aligned of course! Cheers, Johannes
On Mon, May 26, 2008 at 05:01:34PM +0200, Johannes Hofmann wrote:
On the very long mysql page, addWord is called 1192256 times. But due to the allocation in 2^n steps in SimpleVector, memory for some more of them could be allocated. Structs on 32bit systems are 8 byte aligned. struct Word with
Argh, structs are 4byte aligned of course!
On most sane architectures a struct has the same alignment as the strictest type in it. There's one important exception -- ARM, where the minimal alignment is 16 Byte (IIRC).
Joerg
participants (3): jcid@dillo.org, joerg.sonnenberger@web.de, Johannes.Hofmann@gmx.de