[Dillo-dev] Re: css parsing numbers

March 2, 2009

      On Mon, Mar 02, 2009 at 01:06:52PM -0300, Jorge Arellano Cid wrote:
...
On Sun, Mar 01, 2009 at 12:31:18PM +0100, Hofmann Johannes wrote:
...
On Sat, Feb 28, 2009 at 12:22:32AM +0100, Hofmann Johannes wrote:
...
On Fri, Feb 27, 2009 at 07:06:27PM +0100, Hofmann Johannes wrote:
...
On Fri, Feb 27, 2009 at 03:29:55PM +0000, corvid wrote:
...
...
On Fri, Feb 27, 2009 at 02:26:03PM +0000, corvid wrote:
> > Committed. I hoped to be able to make that code a bit shorter, but
> > could not find a reasonable solution...
> 
> Yeah. In any case, it's just going to get worse/rearranged again
> for negative numbers...
Exactly. I started my simplification attempts, added negative number
support - and blew it all up :)
I'm really considering a flex based scanner now. What do you think?
I only touched flex very briefly for a class at school long ago,
and I don't remember anything from the experience, but the
idea interests me since it has to be less trouble than trying
to do it by hand...
This turned out to be easier than expected. Attached patch adds a flex
based scanner for CSS data.
It's not optimized or polished but seems to work here.
To make reviewing easier I created a repo for the flex experiment:
http://freehg.org/u/dillo/flex/
I think the use of flex makes the code simpler and more
maintainable. For some reason rendering is a bit different, e.g. on
wikipedia.org maybe because negative numbers are supported.
It would be nice if people could test it regarding the performance
impact.
What do you think. Is it worth to add the flex dependency?
I'm a bit worried about the flex dependency.
It's  being  a  long  time  since  I saw/used flex/bison/yacc &
friends.  AFAIR,  there  was  a  myriad  of slightly incompatible
flavours.
I think what the patch uses is pretty standard and automake takes
care of the detection and correct parameters.
...
It  may  be  safer to generate a C-source parser with the tool,
and to include it as a source file.
The problem is that we also need to (statically) link libfl.a, so we
actually need lex/flex installed on the build system.
...
AFAIU  the patch feeds a CSS grammar to libflex, and it acts as
an interpreter that parses.
In it's current form only the scanner/tokenizer is replaced by a
flex-generated one. The parsing is still done using the hand-written
code.
...
You  should  consider  the performance, the complexity (against
tunning  our  current  parser),  and  if  the  dependency  can be
avoided.
Until recently Css_next_token() was small and simple enough so that
it was certainly not worth the additional dependency.
However even adding support for floats that start with '.' increased
it's complexity. Next is negative numbers, urls and maybe more.

Does anyone know whether there are still compatibility/integration
issues with flex on exotic platforms?

Cheers,
Johannes

[Dillo-dev] Re: css parsing numbers

Johannes.Hofmann＠gmx.de