On Tue, Mar 03, 2009 at 08:34:09AM -0300, Jorge Arellano Cid wrote:
On Mon, Mar 02, 2009 at 09:54:43PM +0100, Hofmann Johannes wrote:
On Mon, Mar 02, 2009 at 01:06:52PM -0300, Jorge Arellano Cid wrote:
On Sun, Mar 01, 2009 at 12:31:18PM +0100, Hofmann Johannes wrote:
On Sat, Feb 28, 2009 at 12:22:32AM +0100, Hofmann Johannes wrote:
On Fri, Feb 27, 2009 at 07:06:27PM +0100, Hofmann Johannes wrote:
On Fri, Feb 27, 2009 at 03:29:55PM +0000, corvid wrote:
> > On Fri, Feb 27, 2009 at 02:26:03PM +0000, corvid wrote:
> > > > Committed. I hoped to be able to make that code a bit shorter, but
> > > > could not find a reasonable solution...
> > >
> > > Yeah. In any case, it's just going to get worse/rearranged again
> > > for negative numbers...
> >
> > Exactly. I started my simplification attempts, added negative number
> > support - and blew it all up :)
> > I'm really considering a flex based scanner now. What do you think?
>
> I only touched flex very briefly for a class at school long ago,
> and I don't remember anything from the experience, but the
> idea interests me since it has to be less trouble than trying
> to do it by hand...
>
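For readers following along, here is a rough sketch of what "doing it by hand" can look like for signed numbers. It is not the Dillo code under discussion. The function name css_number_len is hypothetical, and it assumes a number is an optional sign, then digits, then an optional fraction (a simplification of what CSS allows). Even this small case needs care around a lone sign or a trailing dot.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only (not taken from Dillo). */
static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/*
 * Return the length of the number at the start of s, or 0 if s does
 * not start with one. Assumed grammar: [+-]? digits* ('.' digits+)?
 * with at least one digit overall.
 */
size_t css_number_len(const char *s)
{
    const char *p = s;
    size_t digits = 0;

    /* Optional sign. */
    if (*p == '+' || *p == '-')
        p++;

    /* Integer part (may be empty, as in ".5"). */
    while (is_digit(*p)) {
        p++;
        digits++;
    }

    /* Fraction: consume the dot only if a digit follows it. */
    if (*p == '.' && is_digit(p[1])) {
        p++;
        while (is_digit(*p)) {
            p++;
            digits++;
        }
    }

    /* A lone sign or a lone dot is not a number. */
    return digits ? (size_t)(p - s) : 0;
}
```

With these assumptions, css_number_len("-12.5px") consumes the five characters "-12.5", while "-x" and "." yield 0. A scanner generator would derive the same checks from a single regular expression, which is the appeal of the tools discussed in this thread.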
This turned out to be easier than expected. The attached patch adds a flex-based scanner for CSS data. It's not optimized or polished, but it seems to work here.

To make reviewing easier, I created a repo for the flex experiment:
http://freehg.org/u/dillo/flex/

I think the use of flex makes the code simpler and more maintainable. For some reason rendering is a bit different, e.g. on wikipedia.org, maybe because negative numbers are now supported. It would be nice if people could test it and report on the performance impact.

What do you think? Is it worth adding the flex dependency?
I'm a bit worried about the flex dependency.
It's been a long time since I saw/used flex/bison/yacc & friends. AFAIR, there was a myriad of slightly incompatible flavours.
It may be safer to generate a C-source parser with the tool, and to include it as a source file.
I'm currently checking re2c (http://re2c.org/) as a flex alternative and it looks pretty cool. It does not need a runtime library and produces pretty clean C code, so we could ship the generated code in the release tarballs.
I'll post a prototype when I'm ready.
Great. C code looks like the more portable way to go.
BTW, quoting from the description of Flex ('aptitude show flex'):
<q> [...] The behaviour of Flex has undergone a major change since version 2.5.4a. Flex scanners are now reentrant, and it is now possible to have multiple scanners in the same program with differing sets of defaults, and the scanners play nicer with modern C and C++ compilers. The Flip side is that Flex no longer conforms to the POSIX lex behaviour, and the scanners require conforming implementations when flex is used in ANSI C mode. The package flex-old provides the older behaviour. </q>
OTOH, re2c looks like a suitable tool we may use in other areas too. Go ahead!
The re2c version now more or less works (escaped characters are not yet supported). You can find the code at
http://freehg.org/u/dillo/flex/

Cheers,
Johannes