Johannes wrote:
On Fri, Jul 24, 2009 at 07:51:46PM +0200, Joerg Sonnenberger wrote:
On Fri, Jul 24, 2009 at 11:33:28AM -0400, Jorge Arellano Cid wrote:
* Ad blocker (this also has to do with privacy). A full RE-based one may have significant overhead (testing required), but simple string matching may do most of the trick. I've also thought for a long time that a "don't load from other sites" preference may help a lot. The whole topic needs some thought, but it's not hard to implement.
I'd be very careful with that assumption. A good regex engine needs a single pass over the input string, independent of the number of strings to search for. Note that the OS regex might not work that well, but e.g. TRE isn't that big.
Yes, here is a nice article about regexp matching: http://swtch.com/~rsc/regexp/regexp1.html
IIRC, corvid already did some testing with regexps for adblocking.
So far as I can recall,
- using a small number of rules and some fnmatch() didn't give any obvious slowdown, although I didn't gprof it.
- trying to make one big rule naively for regexec() used a huge amount of memory.
And that's as far as I got.
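
For anyone who wants to play with the two layouts mentioned above (a handful of glob rules checked one by one, versus one big combined pattern), here is a rough sketch in Python rather than C. The rule list and function names are made up for illustration and are not taken from any real block list or from the tests described above. Python's `re` is a backtracking engine, so this only shows the shape of the two approaches and says nothing about how fnmatch(), regexec(), or TRE would perform.

```python
import fnmatch
import re

# Hypothetical rules, invented for this example only.
RULES = ["*/ads/*", "*doubleclick.net/*", "*/banner_*.gif"]


def blocked_fnmatch(url, rules=RULES):
    """Check the URL against each glob rule in turn (one match attempt per rule)."""
    return any(fnmatch.fnmatchcase(url, rule) for rule in rules)


# Translate every glob into a regex and join them into a single alternation,
# so the URL is tested against one compiled pattern instead of a rule loop.
COMBINED = re.compile(
    "|".join("(?:%s)" % fnmatch.translate(rule) for rule in RULES)
)


def blocked_regex(url):
    """Check the URL against the single combined pattern."""
    return COMBINED.match(url) is not None
```

Both functions should agree on any URL, which makes it easy to time them against each other on a larger rule set before deciding whether the combined pattern is worth its memory cost.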