Hi,

On Wed, Jun 19, 2024 at 10:03:20AM +1000, Kevin Koster wrote:
Notice that the HTTP content can be compressed. So, for example, this simple rewrite script:
    #!/bin/sh
    sed 's_www.youtube.com_inv.vern.cc_g'
Would only work well in the SED node *after* the HTTP content is uncompressed and the headers removed. The rewrite rules should indicate in which position of the chain they apply.
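To illustrate the ordering, here is a small sketch, assuming gzip compression and a body whose headers have already been removed. The function name rewrite_hosts and the sample HTML are made up for the example, and the dots in the pattern are escaped so they only match literal dots.

```sh
#!/bin/sh
# Hypothetical wrapper around the sed rewrite, with the dots escaped.
rewrite_hosts() {
    sed 's_www\.youtube\.com_inv.vern.cc_g'
}

# Made-up sample body, standing in for the HTML of a response.
body='<a href="https://www.youtube.com/watch?v=abc">video</a>'

# Simulate a compressed body (gzip -c), then run the chain in the order
# that works: decompress first (gzip -dc), rewrite second.
printf '%s\n' "$body" | gzip -c | gzip -dc | rewrite_hosts
```

If the rewrite ran before the decompression step, sed would see the gzip byte stream instead of the HTML text, so the rule needs a way to say where in the chain it sits.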
This mechanism might suit an idea I've had to do remote downscaling of extremely large images, which are increasingly being included in web pages. The script would send a list of URLs in all <img> tags within the HTML to a remote server (e.g. on a VPS), or ideally just the ones for large image files, then rewrite the URLs in the HTML to point to the remote server where the converted images are available over HTTP/S.
You can create a script that rewrites the <img> src attribute

    <img src="https://foo.com/img1.png">

To point to an endpoint of your server:

    <img src="https://yourserver.com/downscale?url=https://foo.com/img1.png">

And then in the server you simply downscale it. Here is how you could do it with rules:

    # Script that would rewrite images to a server for downscaling
    action downscale filter 'rewrite-img.sh'

    define mime header 'Content-Type'

    match mime 'text/html' action downscale
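As a rough idea of what 'rewrite-img.sh' could contain, here is a minimal sketch. It assumes the filter receives the uncompressed HTML on stdin and writes the rewritten HTML to stdout; the function name rewrite_img is made up, and the /downscale?url= endpoint is just the placeholder from the example above. It only handles double-quoted absolute http(s) URLs with the whole tag on one line, and it does not percent-encode the original URL, which a real version probably should.

```sh
#!/bin/sh
# Sketch of a possible 'rewrite-img.sh' (hypothetical contents).
# Prefixes the src of each <img> tag with the downscaling endpoint.
rewrite_img() {
    sed -E 's#(<img[^>]*[[:space:]]src=")(https?://[^"]*)"#\1https://yourserver.com/downscale?url=\2"#g'
}

# Demonstration on a made-up snippet of HTML.
printf '%s\n' '<p><img src="https://foo.com/img1.png"></p>' | rewrite_img
```

When used as a real filter, the last line would simply be a call to rewrite_img so that it processes whatever arrives on stdin.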
Or a deeper approach would be to apply the same approach as this rewrite engine to binary content as well, and have Dillo do it transparently via 'rewrite'/convert rules for image MIME types. Then the HTML would stay the same and Dillo would trigger a command that requested a downscaled image from the converter server instead of the original image's server. That would be more elegant, but expands the scope of your proposed system a little.
Rewriting the binary image directly would be possible, but then you would have wasted the bandwidth bringing it to Dillo, and now you have to send it to the server to downscale it. You probably want to use the previous approach for this case.

In any case, imagine you want to downscale it locally anyway. Here is how I would approach it:

    # Script that would downscale an image and write to stdout
    action downscale filter 'downscale-img.sh'

    # Define headers from the HTTP content with shorter names
    define mime header 'Content-Type'
    define size header 'Content-Length'

    # Downscale big images
    match mime =~ 'image/.*' and size > 10K action downscale

Notice that this can be triggered for any image, not only ones provided via HTTP/HTTPS, but also via other protocols like gemini that are adapted to speak HTTP and also provide a Content-Length header.

I added the =~ and > operators: the former matches against a regex and the latter does a numeric comparison. You can assume that if the header is not present, any comparison on it fails.

I have also added the "define" keyword to define properties like "mime" or "size", which are parsed from the HTTP headers and are shorter and easier to write.
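To make the intended semantics of that match line concrete, here is a sketch of how it could be evaluated, written as a shell function. This is not how Dillo would implement it; the name should_downscale is made up, and it assumes that 10K means 10 * 1024 bytes and that =~ is anchored at the start of the value, neither of which is decided yet.

```sh
#!/bin/sh
# Sketch of: match mime =~ 'image/.*' and size > 10K
# $1 is the Content-Type value, $2 the Content-Length value.
# Returns 0 (true) when the downscale action should run.
should_downscale() {
    mime=$1
    size=$2
    # A missing header makes any comparison on it fail.
    [ -n "$mime" ] || return 1
    # A missing or non-numeric size also fails the comparison.
    case $size in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Regex match (assumed to be anchored at the start).
    printf '%s\n' "$mime" | grep -Eq '^image/.*' || return 1
    # Numeric comparison, assuming 10K = 10240 bytes.
    [ "$size" -gt 10240 ]
}

# Demonstration with made-up header values.
should_downscale 'image/png' 250000 && echo 'image/png 250000: downscale'
should_downscale 'text/html' 250000 || echo 'text/html 250000: skip'
should_downscale 'image/png' '' || echo 'image/png, no size: skip'
```

The third case shows the default mentioned above: with no Content-Length header the size comparison fails, so the image is passed through untouched.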
Maybe since it still requires a remote Web server this problem would be better solved via a Web proxy (I did look into Squid before, but drowned in confusing documentation). But I just thought I'd mention it as an example of a more complex usage for this proposed rewrite system.
But then you will need to pass all the traffic through the server so it performs the substitution there.

Another solution, which may be better, is to have Dillo mark which requests come from img elements (filtering them before they go to the network). If Dillo marks those requests in the HTTP headers, for example, then you could do:

    # Script that transforms image HTTP requests to a server that
    # downscales the image
    action downscale filter 'downscale-req.sh'

    define source header 'Dillo-Request-Source'

    # Downscale images coming from <img> elements
    match source 'img' action downscale

This would have the benefit that Dillo already performs the parsing of the HTML for you, and only the images that are actually loaded are passed to the downscaling server. Additionally, cookies would be sent in the HTTP request, so you can access login-protected images this way too.

Working on these examples is very helpful for designing the rule system, so feel free to mention more cases.

Best,
Rodrigo.