On Mon, Mar 02, 2009 at 01:06:52PM -0300, Jorge Arellano Cid wrote:
On Sun, Mar 01, 2009 at 12:31:18PM +0100, Hofmann Johannes wrote:
On Sat, Feb 28, 2009 at 12:22:32AM +0100, Hofmann Johannes wrote:
On Fri, Feb 27, 2009 at 07:06:27PM +0100, Hofmann Johannes wrote:
On Fri, Feb 27, 2009 at 03:29:55PM +0000, corvid wrote:
On Fri, Feb 27, 2009 at 02:26:03PM +0000, corvid wrote: > > Committed. I hoped to be able to make that code a bit shorter, but > > could not find a reasonable solution... > > Yeah. In any case, it's just going to get worse/rearranged again > for negative numbers...
Exactly. I started my simplification attempts, added negative number support - and blew it all up :) I'm really considering a flex based scanner now. What do you think?
I only touched flex very briefly for a class at school long ago, and I don't remember anything from the experience, but the idea interests me since it has to be less trouble than trying to do it by hand...
This turned out to be easier than expected. Attached patch adds a flex based scanner for CSS data. It's not optimized or polished but seems to work here.
To make reviewing easier I created a repo for the flex experiment: http://freehg.org/u/dillo/flex/ I think the use of flex makes the code simpler and more maintainable. For some reason rendering is a bit different, e.g. on wikipedia.org maybe because negative numbers are supported. It would be nice if people could test it regarding the performance impact.
What do you think. Is it worth to add the flex dependency?
I'm a bit worried about the flex dependency.
It's being a long time since I saw/used flex/bison/yacc & friends. AFAIR, there was a myriad of slightly incompatible flavours.
I think what the patch uses is pretty standard and automake takes care of the detection and correct parameters.
It may be safer to generate a C-source parser with the tool, and to include it as a source file.
The problem is that we also need to (statically) link libfl.a, so we actually need lex/flex installed on the build system.
AFAIU the patch feeds a CSS grammar to libflex, and it acts as an interpreter that parses.
In it's current form only the scanner/tokenizer is replaced by a flex-generated one. The parsing is still done using the hand-written code.
You should consider the performance, the complexity (against tunning our current parser), and if the dependency can be avoided.
Until recently Css_next_token() was small and simple enough so that it was certainly not worth the additional dependency. However even adding support for floats that start with '.' increased it's complexity. Next is negative numbers, urls and maybe more. Does anyone know whether there are still compatibility/integration issues with flex on exotic platforms? Cheers, Johannes