Re: [patch] improve scroll performance of dillo-fltk
Hi Matthias,

I just made an xtrace (http://packages.debian.org/unstable/x11/xtrace) while scrolling the http://www.w3.org/TR/xslt page with the patched dillo (result below). I think this is pretty good already: only the newly exposed glyphs get rendered. Each word is sent to the X server separately, which might be an area for improvement.

Could you try the same on your system, so that we can find out whether the patch does not work for you or whether scrolling is still too CPU-consuming even with the working patch?

Here is what I did to get the trace:

    xtrace > /tmp/xtrace.out
    export DISPLAY=:9
    dillo-fltk http://www.w3.org/TR/xslt

Then scroll using the triangle of the scrollbar for a while and stop dillo. The resulting file is huge, but I am only interested in the repeating sections that result from the scrolling and start with CopyArea.

Gruss, Johannes

000:<:c474: 16: Request(56): ChangeGC gc=0x01c00009 values={foreground=0x00000000}
000:<:c475: 28: Request(62): CopyArea src-drawable=0x01c00008 dst-drawable=0x01c00008 gc=0x01c00009 src-x=0 src-y=93 dst-x=0 dst-y=73 width=765 height=467
000:>:c475: Event NoExposure(14) drawable=0x01c00008 minor-opcode=0x0000 major-opcode=0x3e
000:<:c476: 20: Request(59): SetClipRectangles ordering=YXBanded(0x03) gc=0x01c00009 clip-x-origin=0 clip-y-origin=0 rectangles ={x=0 y=540 w=765 h=20};
000:<:c477: 20: RENDERRequest(154): RenderSetPictureClipRectangles picture=0x01c00036 xOrigin=0 yOrigin=0 rectangles={x=0 y=540 w=765 h=20};
000:<:c478: 20: Request(59): SetClipRectangles ordering=YXBanded(0x03) gc=0x01c00009 clip-x-origin=0 clip-y-origin=0 rectangles ={x=0 y=540 w=765 h=20};
000:<:c479: 16: Request(56): ChangeGC gc=0x01c00009 values={foreground=0x00e0e0e0}
000:<:c47a: 20: Request(70): PolyFillRectangle drawable=0x01c00008 gc=0x01c00009 rectangles={x=0 y=0 w=780 h=580};
000:<:c47b: 20: Request(59): SetClipRectangles ordering=YXBanded(0x03) gc=0x01c00009 clip-x-origin=0 clip-y-origin=0 rectangles ={x=0 y=540 w=765 h=20};
000:<:c47c: 16: Request(56): ChangeGC gc=0x01c00009 values={foreground=0x00dcd1ba}
000:<:c47d: 20: Request(70): PolyFillRectangle drawable=0x01c00008 gc=0x01c00009 rectangles={x=0 y=540 w=765 h=20};
000:<:c47e: 60: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00063 xSrc=0 ySrc=0 glyphcmds={deltax=5 deltay=530 glyphs=0x1f,0x5b,0x56,0x4f,0x1d,0x44,0x53,0x53,0x4f,0x5c,0x10,0x57,0x48,0x50,0x53,0x4f,0x44,0x57,0x48,0x56,0x12,0x21; };
000:<:c47f: 40: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=5 deltay=554 glyphs=0x5a,0x4b,0x48,0x51; };
000:<:c480: 40: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=38 deltay=554 glyphs=0x4c,0x57; };
000:<:c481: 44: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=47 deltay=554 glyphs=0x5a,0x44,0x51,0x57,0x56; };
000:<:c482: 40: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=82 deltay=554 glyphs=0x57,0x52; };
000:<:c483: 40: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=95 deltay=554 glyphs=0x46,0x52,0x53,0x5c; };
000:<:c484: 40: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=124 deltay=554 glyphs=0x57,0x4b,0x48; };
000:<:c485: 44: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00063 xSrc=0 ySrc=0 glyphcmds={deltax=144 deltay=554 glyphs=0x5b,0x50,0x4f,0x1d,0x4f,0x44,0x51,0x4a; };
000:<:c486: 48: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=203 deltay=554 glyphs=0x44,0x57,0x57,0x55,0x4c,0x45,0x58,0x57,0x48,0x11; };
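The repeating scroll sections described above can be pulled out of a file like /tmp/xtrace.out with a small script. What follows is only a sketch in Python, not part of dillo or xtrace: the regular expression is an assumption based on the line layout visible in the trace above, and the function names are invented for illustration. It groups requests into sections that each start at a CopyArea and counts the RenderCompositeGlyphs8 requests per section, which gives a rough idea of how many separate word-sized glyph requests each scroll step costs.

```python
import re

# Assumed layout of a client request line in xtrace output, e.g.
# "000:<:c475: 28: Request(62): CopyArea src-drawable=..."
# Events ("000:>:...") deliberately do not match.
REQUEST_RE = re.compile(
    r"^\d+:<:[0-9a-f]+:\s*\d+:\s*\w*Request\(\d+\):\s*(\w+)"
)


def scroll_sections(lines):
    """Group request names into sections, each starting at a CopyArea.

    Requests seen before the first CopyArea are ignored.
    """
    sections = []
    current = None
    for line in lines:
        match = REQUEST_RE.match(line.strip())
        if match is None:
            continue  # event, reply, or unrelated text
        name = match.group(1)
        if name == "CopyArea":
            current = []
            sections.append(current)
        if current is not None:
            current.append(name)
    return sections


def glyph_requests_per_section(lines):
    """Number of RenderCompositeGlyphs8 requests in each scroll section."""
    return [
        section.count("RenderCompositeGlyphs8")
        for section in scroll_sections(lines)
    ]
```

To use it on a real trace, pass an open file object, for example `glyph_requests_per_section(open("/tmp/xtrace.out"))`. The count says nothing about how expensive each request is on the server side; it only makes the per-word pattern visible.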
On Fri, Oct 12, 2007 at 07:59:34PM +0200, Johannes Hofmann wrote:
Could you try the same on your system, so that we find out, whether the patch does not work for you or whether scrolling is still too CPU consuming even with the working patch?
Hi Johannes,

here is my trace:

000:<:877c: 28: Request(62): CopyArea src-drawable=0x00800008 dst-drawable=0x00800008 gc=0x00800009 src-x=0 src-y=74 dst-x=0 dst-y=73 width=1009 height=631
000:>:877c: Event NoExposure(14) drawable=0x00800008 minor-opcode=0x0000 major-opcode=0x3e
000:<:877d: 20: Request(59): SetClipRectangles ordering=YXBanded(0x03) gc=0x00800009 clip-x-origin=0 clip-y-origin=0 rectangles ={x=0 y=704 w=1009 h=1};
000:<:877e: 20: RENDERRequest(151): RenderSetPictureClipRectangles picture=0x00800057 xOrigin=0 yOrigin=0 rectangles={x=0 y=704 w=1009 h=1};
000:<:877f: 20: Request(59): SetClipRectangles ordering=YXBanded(0x03) gc=0x00800009 clip-x-origin=0 clip-y-origin=0 rectangles ={x=0 y=704 w=1009 h=1};
000:<:8780: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00e0e0e0}
000:<:8781: 20: Request(70): PolyFillRectangle drawable=0x00800008 gc=0x00800009 rectangles={x=0 y=0 w=1024 h=725};
000:<:8782: 20: Request(59): SetClipRectangles ordering=YXBanded(0x03) gc=0x00800009 clip-x-origin=0 clip-y-origin=0 rectangles ={x=0 y=704 w=1009 h=1};
000:<:8783: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00dcd1ba}
000:<:8784: 20: Request(70): PolyFillRectangle drawable=0x00800008 gc=0x00800009 rectangles={x=0 y=704 w=1009 h=1};
000:<:8785: 44: RENDERRequest(151): RenderCompositeGlyphs8 op=Over(0x03) src=0x0080000f dst=0x00800057 maskFormat=0x00000024 glyphset=0x0080003a xSrc=0 ySrc=0 glyphcmds={deltax=5 deltay=711 glyphs=0x36,0x57,0x44,0x57,0x58,0x56; };
000:<:8786: 40: RENDERRequest(151): RenderCompositeGlyphs8 op=Over(0x03) src=0x0080000f dst=0x00800057 maskFormat=0x00000024 glyphset=0x0080003a xSrc=0 ySrc=0 glyphcmds={deltax=65 deltay=711 glyphs=0x52,0x49; };
000:<:8787: 40: RENDERRequest(151): RenderCompositeGlyphs8 op=Over(0x03) src=0x0080000f dst=0x00800057 maskFormat=0x00000024 glyphset=0x0080003a xSrc=0 ySrc=0 glyphcmds={deltax=88 deltay=711 glyphs=0x57,0x4b,0x4c,0x56; };
000:<:8788: 44: RENDERRequest(151): RenderCompositeGlyphs8 op=Over(0x03) src=0x0080000f dst=0x00800057 maskFormat=0x00000024 glyphset=0x0080003a xSrc=0 ySrc=0 glyphcmds={deltax=125 deltay=711 glyphs=0x47,0x52,0x46,0x58,0x50,0x48,0x51,0x57; };
000:<:8789: 20: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00000000 clip-mask=None(0x00000000)}
000:<:878a: 16: RENDERRequest(151): RenderChangePicture picture=0x00800057 mask=0x00000040 values={repeat=false(0x00)}
000:<:878b: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00ababab}
000:<:878c: 20: Request(70): PolyFillRectangle drawable=0x00800008 gc=0x00800009 rectangles={x=1009 y=88 w=15 h=602};
000:<:878d: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00000000}
000:<:878e: 28: Request(66): PolySegment drawable=0x00800008 gc=0x00800009 segments={x1=1009 y1=102 x2=1023 y2=102},{x1=1023 y1=88 x2=1023 y2=101};
000:<:878f: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00f9f9f9}
000:<:8790: 28: Request(66): PolySegment drawable=0x00800008 gc=0x00800009 segments={x1=1009 y1=88 x2=1022 y2=88},{x1=1009 y1=89 x2=1009 y2=101};
000:<:8791: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00ababab}
000:<:8792: 28: Request(66): PolySegment drawable=0x00800008 gc=0x00800009 segments={x1=1010 y1=101 x2=1022 y2=101},{x1=1022 y1=89 x2=1022 y2=100};
000:<:8793: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00e0e0e0}
000:<:8794: 20: Request(70): PolyFillRectangle drawable=0x00800008 gc=0x00800009 rectangles={x=1010 y=89 w=12 h=12};
000:<:8795: 16: Request(128): unknown
000:<:8796: 16: Request(56): ChangeGC gc=0x00800009 values={foreground=0x00000000}

Cheers,
--
Matthias Franz
On Fri, Oct 12, 2007 at 07:59:34PM +0200, Johannes Hofmann wrote:
000:<:c47f: 40: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=5 deltay=554 glyphs=0x5a,0x4b,0x48,0x51; };
Hi Johannes,

two additional observations:

1) I've had a look at the xtrace output of dillo1. It seems not to use the RENDER extension (apart from two requests

000:<:0029: 12: RENDERRequest(151): RenderQueryVersion majorVersion=0 minorVersion=10
000:<:002a: 4: RENDERRequest(151): RenderQueryPictFormats

at the beginning).

2) I've played around with 2D acceleration and compared CPU load while scrolling. Result:

dillo1: XAA: excellent (almost no load), EXA: very bad (100%, delays)
dillo2: XAA: bad (100%), EXA: slightly better (75%)

I've read that the EXA implementation of my Xorg server (7.1.1) is very bad, at least for my video driver (savage). On the other hand, RENDER support is said to be better in EXA than in XAA.

So could it be that it has to do with RENDER and that most of you have decent hardware acceleration for that?

Cheers,
--
Matthias Franz
Hi Matthias, On Fri, Oct 12, 2007 at 09:26:57PM +0200, Matthias Franz wrote:
On Fri, Oct 12, 2007 at 07:59:34PM +0200, Johannes Hofmann wrote:
000:<:c47f: 40: RENDERRequest(154): RenderCompositeGlyphs8 op=Over(0x03) src=0x01c0000f dst=0x01c00036 maskFormat=0x00000026 glyphset=0x01c00037 xSrc=0 ySrc=0 glyphcmds={deltax=5 deltay=554 glyphs=0x5a,0x4b,0x48,0x51; };
Hi Johannes,
two additional observations:
1) I've had a look at the xtrace output of dillo1. It seems not to use the RENDER extension (apart from two requests
000:<:0029: 12: RENDERRequest(151): RenderQueryVersion majorVersion=0 minorVersion=10
000:<:002a: 4: RENDERRequest(151): RenderQueryPictFormats
at the beginning).
2) I've played around with 2D-acceleration and compared CPU load while scrolling. Result:
dillo1: XAA: excellent (almost no load), EXA: very bad (100%, delays)
dillo2: XAA: bad (100%), EXA: slightly better (75%)
I've read that the EXA implementation of my Xorg server (7.1.1) is very bad, at least for my video driver (savage). On the other hand, RENDER support is said to be better in EXA than in XAA.
So could it be that it has to do with RENDER and that most of you have decent hardware acceleration for that?
Yes, I guess that's the reason. You can compile fltk with the --disable-xft flag. In this case you will get the artefacts you mentioned before, but the performance should be similar to dillo1. We should try to get rid of those artefacts for users that can't use xft. Cheers, Johannes
On Fri, Oct 12, 2007 at 10:28:38PM +0200, Johannes Hofmann wrote:
So could it be that it has to do with RENDER and that most of you have decent hardware acceleration for that?
Yes, I guess that's the reason. You can compile fltk with the --disable-xft flag. In this case you will get the artefacts you mentioned before, but the performance should be similar to dillo1.
I've compiled fltk with "--disable-xft". Now it doesn't use RENDER any more (checked with xtrace). But I don't notice any performance difference to dillo2 with xft. In particular, performance is far from being that of dillo1. So, xft / RENDER does not seem to be the reason. Maybe a detailed comparison of the xtrace logs of dillo1 and dillo2-noxft could help? I've seen that dillo2 uses many "ChangeGC", "SetClipRectangles" and "PolyFillRectangle" requests, which are not used by dillo1. I don't know which requests are particularly expensive.
We should try to get rid of those artefacts for users that can't use xft.
I agree. As Johannes has found out, this is an fltk bug. Cheers, -- Matthias Franz
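A first pass at the comparison of the dillo1 and dillo2-noxft traces suggested above could simply tally request names in each log and list those that occur in only one of them. The following Python sketch does that; it is an illustration under assumptions, not an existing tool: the regular expression is guessed from the trace lines quoted in this thread, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumed shape of a request line in xtrace output:
# "000:<:c479: 16: Request(56): ChangeGC gc=..."
REQUEST_RE = re.compile(
    r":<:[0-9a-f]+:\s*\d+:\s*\w*Request\(\d+\):\s*(\w+)"
)


def request_counts(lines):
    """Count how often each request name appears in a trace."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def only_in_second(first, second):
    """Requests that occur in `second` but never in `first`, with counts."""
    return {name: n for name, n in second.items() if name not in first}
```

With two trace files saved under names of your choosing, `only_in_second(request_counts(open(a)), request_counts(open(b)))` lists the requests that only the second program sends. This shows which requests differ, not which of them are expensive; that still has to be measured separately.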
On Sat, Oct 13, 2007 at 05:04:19PM +0200, Matthias Franz wrote:
On Fri, Oct 12, 2007 at 10:28:38PM +0200, Johannes Hofmann wrote:
So could it be that it has to do with RENDER and that most of you have decent hardware acceleration for that?
Yes, I guess that's the reason. You can compile fltk with the --disable-xft flag. In this case you will get the artefacts you mentioned before, but the performance should be similar to dillo1.
I've compiled fltk with "--disable-xft". Now it doesn't use RENDER any more (checked with xtrace). But I don't notice any performance difference to dillo2 with xft. In particular, performance is far from being that of dillo1. So, xft / RENDER does not seem to be the reason. Maybe a detailed comparison of the xtrace logs of dillo1 and dillo2-noxft could help? I've seen that dillo2 uses many "ChangeGC", "SetClipRectangles" and "PolyFillRectangle" requests, which are not used by dillo1. I don't know which requests are particularly expensive.
AFAIR, dillo2 was passing through the whole widget tree instead of just the to-be-rendered part. We were testing with Sebastian and optimizations were postponed. I also see a huge difference between dillo1 and dillo2. When Sebastian gets some free time he'll probably comment on that issue.

If a simple but long FLTK2 widget can be scrolled with the same technique as in dillo2, with the same performance as dillo1 (or close to it), we'd know it is a bug within dw2. If not, we may ask in FLTK-dev.

BTW, I preferred to take patches for dw2 and to commit them to CVS instead of pushing them for Sebastian to examine. That way he can catch up later while the patches improve.

--
Cheers
Jorge.-
Hi,
We should try to get rid of those artefacts for users that can't use xft.
I agree. As Johannes has found out, this is an fltk bug.
BTW, we should try FLTK2-r5941 before asking in FLTK-dev. I'll try that one ASAP. -- Cheers Jorge.-
Hi,

I've started comparing the xtraces of dillo1 and dillo2 while scrolling a long, _empty_ page (just 100 times "<br>").

The dillo2 trace contains the request

000:<:0392: 28: Request(62): CopyArea src-drawable=0x00a0000a dst-drawable=0x00a0000a gc=0x00a0000b src-x=0 src-y=74 dst-x=0 dst-y=73 width=1009 height=631

(A similar line appears in Johannes's and my previous traces.)

For dillo1, I get

000:<:05b6: 28: Request(62): CopyArea src-drawable=0x00a00087 dst-drawable=0x00a0006d gc=0x00a00008 src-x=0 src-y=645 dst-x=0 dst-y=645 width=990 height=10

I don't have a detailed understanding of what's going on, but I find the difference in the "height" parameter remarkable. It seems that dillo2 copies an area - roughly the whole screen - which is about 60 (!) times larger than that copied by dillo1. A factor of 60 could also be the performance difference we observe.

Cheers,
--
Matthias Franz
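The size difference between the two CopyArea requests can be checked by multiplying width and height from each line. The Python snippet below is a small sketch for that arithmetic; the helper name is made up, and the regular expression assumes the field order seen in the two requests quoted above.

```python
import re


def copy_area_pixels(line):
    """Return width * height of a CopyArea request line from xtrace."""
    match = re.search(r"CopyArea\b.*?\bwidth=(\d+)\s+height=(\d+)", line)
    if match is None:
        raise ValueError("not a CopyArea request line")
    return int(match.group(1)) * int(match.group(2))


DILLO2 = ("000:<:0392: 28: Request(62): CopyArea src-drawable=0x00a0000a "
          "dst-drawable=0x00a0000a gc=0x00a0000b src-x=0 src-y=74 "
          "dst-x=0 dst-y=73 width=1009 height=631")
DILLO1 = ("000:<:05b6: 28: Request(62): CopyArea src-drawable=0x00a00087 "
          "dst-drawable=0x00a0006d gc=0x00a00008 src-x=0 src-y=645 "
          "dst-x=0 dst-y=645 width=990 height=10")

ratio = copy_area_pixels(DILLO2) / copy_area_pixels(DILLO1)
print(f"dillo2 copies {ratio:.1f} times as many pixels as dillo1")
```

For these two lines the areas are 1009 * 631 = 636679 and 990 * 10 = 9900 pixels, a ratio of about 64, in line with the rough factor of 60 mentioned above.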
Hi, On Sat, Oct 13, 2007 at 07:58:41PM +0200, Matthias Franz wrote:
Hi,
I've started comparing the xtraces of dillo1 and dillo2 during scrolling a long, _empty_ page (just 100 times "<br>").
The dillo2 trace contains the request
000:<:0392: 28: Request(62): CopyArea src-drawable=0x00a0000a dst-drawable=0x00a0000a gc=0x00a0000b src-x=0 src-y=74 dst-x=0 dst-y=73 width=1009 height=631
(A similar line appears in the previous traces of Johannes and mine.)
For dillo1, I get
000:<:05b6: 28: Request(62): CopyArea src-drawable=0x00a00087 dst-drawable=0x00a0006d gc=0x00a00008 src-x=0 src-y=645 dst-x=0 dst-y=645 width=990 height=10
I don't have a detailed understanding of what's going on, but I find the difference in the "height" parameter remarkable. It seems that dillo2 copies an area - roughly the whole screen - which is 60 (!) times larger than that copied by dillo1. 60 could also be the performance difference we observe.
That's interesting. dillo2 (with patch) scrolls by copying the screen content that was visible before scrolling and still is after scrolling, so the large CopyArea is expected. Then any freshly exposed area is redrawn from scratch.

dillo1 / gtk1 seems to have a more optimized way of doing that, which does not need to copy that much of the screen. There is a link in the gtk code to this document:

http://www.gtk.org/~otaylor/whitepapers/guffaw-scrolling.txt

which might be related.

Cheers, Johannes
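The copy-then-redraw scheme described above can be written down as plain rectangle arithmetic. The sketch below is in Python and is not dillo or FLTK code: `Rect` and `plan_scroll_down` are hypothetical names, and the viewport (x=0, y=73, w=765, h=487) is inferred from the numbers in the first trace of this thread rather than stated anywhere.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def plan_scroll_down(view: Rect, dy: int) -> Tuple[Rect, int, Rect]:
    """Plan a scroll that moves the content up by dy pixels.

    Returns (copy_src, copy_dst_y, exposed):
      copy_src   - the part of the viewport that stays visible,
      copy_dst_y - the y position it is copied to,
      exposed    - the strip at the bottom that must be redrawn.
    """
    if not 0 < dy < view.h:
        raise ValueError("dy must be between 1 and view.h - 1")
    copy_src = Rect(view.x, view.y + dy, view.w, view.h - dy)
    exposed = Rect(view.x, view.y + view.h - dy, view.w, dy)
    return copy_src, view.y, exposed


# Viewport inferred from the first trace; scrolling by 20 pixels.
src, dst_y, exposed = plan_scroll_down(Rect(0, 73, 765, 487), 20)
print(src, dst_y, exposed)
```

With these inputs the plan is a copy from y=93 to y=73 with height 467 and an exposed strip at y=540 with height 20, matching the CopyArea and SetClipRectangles values in that trace. The copied area always covers nearly the whole viewport, no matter how small the scroll step is.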
Cheers, -- Matthias Franz