As mentioned in the discussion about link handlers and webpage content rewriting, I once looked into using Web proxy software to do the latter. One complication was that Dillo doesn't support a feature equivalent to setting the https_proxy environment variable for GNU Wget, whereby the proxy acts as the HTTPS endpoint for encrypted Web server connections and sends the data back to the browser unencrypted. This allows the proxy to manipulate page content without needing to transform HTTP requests into HTTPS requests and rewrite https:// in page URLs as http://.

Besides content-rewriting proxies, on old computers where performance or the available TLS library versions prevent using modern encryption, this also allows the encrypted communications to be offloaded to a more modern computer, or to a different encryption library that's optimised for old computers, as described here:

http://oldvcr.blogspot.com/2020/11/fun-with-crypto-ancienne-tls-for.html

For use with old computers, it would be ideal if the HTTPS proxy feature were available even when Dillo is compiled without linking to a TLS library. Also, the https_proxy environment variable should be set to the HTTPS proxy hostname when running Wget for downloads, so that it applies to those as well.
Hi,

On Wed, Jun 19, 2024 at 01:15:52PM +1000, Kevin Koster wrote:
As mentioned in the discussion about link handlers and webpage content rewriting, I once looked into using Web proxy software to do the latter. One complication was that Dillo doesn't support a feature equivalent to setting the https_proxy environment variable for GNU Wget, whereby the proxy acts as the HTTPS endpoint for encrypted Web server connections and sends the data back to the browser unencrypted. This allows the proxy to manipulate page content without needing to transform HTTP requests into HTTPS requests and rewrite https:// in page URLs as http://.
This should be easy to do, as Dillo can forward the http_proxy variable to the wget command. However, if you start Dillo from a shell where the http_proxy variable is set in the environment, I would imagine it is already picked up by Dillo and the internal wget command. The dpid daemon has to start from this shell, so you should run "dpidc stop" beforehand to make sure it does.
Besides proxies, for old computers where performance or available TLS library versions prevent using modern encryption, this also allows the encrypted communications to be offloaded to a more modern computer. Or using a different encryption library that's optimised for old computers, as described here:
http://oldvcr.blogspot.com/2020/11/fun-with-crypto-ancienne-tls-for.html
For using this with old computers, it would be ideal if the HTTPS proxy feature was available even if Dillo is compiled without linking to a TLS library. Also the https_proxy environment variable should be set to the HTTPS proxy hostname when running Wget for downloads so that it applies to those as well.
Sounds doable, but I would have to review how the HTTPS traffic is handled in Dillo when a proxy is present.

Best,
Rodrigo.
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
On Wed, Jun 19, 2024 at 01:15:52PM +1000, Kevin Koster wrote:
As mentioned in the discussion about link handlers and webpage content rewriting, I once looked into using Web proxy software to do the latter. One complication was that Dillo doesn't support a feature equivalent to setting the https_proxy environment variable for GNU Wget, whereby the proxy acts as the HTTPS endpoint for encrypted Web server connections and sends the data back to the browser unencrypted. This allows the proxy to manipulate page content without needing to transform HTTP requests into HTTPS requests and rewrite https:// in page URLs as http://.
This should be easy to do, as Dillo can forward the http_proxy variable to the wget command.
However, if you start Dillo from a shell where the http_proxy variable is set in the environment, I would imagine it would be already picked by Dillo and the internal wget command. The dpid daemon has to start from this shell, so you should do a "dpidc stop" command before to ensure it.
I'm not sure we're on the same page. Wget supports both the http_proxy and https_proxy environment variables. The former may or may not tunnel HTTPS requests through a Web proxy, but that's not what I want, since then the proxy can't manipulate the data and the client needs to handle the encryption. https_proxy with Wget uses the proxy to do all the encryption, and therefore those issues don't apply. More browsers support the HTTPS tunneling proxy feature, but that doesn't achieve anything that I'm interested in personally.

Anyway, Wget's https_proxy is only a common example of the feature (which is described in more detail at the blog link). I'm not so much interested in using it when Dillo launches Wget for downloads as for data retrieved by Dillo itself to display within the browser.
Besides proxies, for old computers where performance or available TLS library versions prevent using modern encryption, this also allows the encrypted communications to be offloaded to a more modern computer. Or using a different encryption library that's optimised for old computers, as described here:
http://oldvcr.blogspot.com/2020/11/fun-with-crypto-ancienne-tls-for.html
For using this with old computers, it would be ideal if the HTTPS proxy feature was available even if Dillo is compiled without linking to a TLS library. Also the https_proxy environment variable should be set to the HTTPS proxy hostname when running Wget for downloads so that it applies to those as well.
Sounds doable, but I would have to review how the HTTPS traffic is handled in Dillo when a proxy is present.
I did look into this with Dillo 3.0.5 and I'm confident this HTTPS proxy feature wasn't supported already there, but I never worked out exactly where the feature would fit into that code (which is likely quite different now anyway since moving out of the DPI environment).
On Thu, 20 Jun 2024 10:03:59 +1000 "Kevin Koster" <dillo@ombertech.com> wrote:
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
However, if you start Dillo from a shell where the http_proxy variable is set in the environment, I would imagine it would be already picked by Dillo and the internal wget command. The dpid daemon has to start from this shell, so you should do a "dpidc stop" command before to ensure it.
I'm not sure if we're on the same page, Wget supports http_proxy and https_proxy environment variables. The former may or may not tunnel HTTPS requests through a Web proxy, but that's not what I want since then the proxy can't manipulate the data and the client needs to handle the encryption. https_proxy with Wget uses the proxy to do all the encryption and therefore those issues don't apply. More browsers support the HTTPS tunneling proxy feature, but that doesn't achieve anything that I'm interested in personally.
I did some tests and yes the https_proxy environment variable is picked up by Wget for the downloads DPI. It doesn't affect Dillo itself. The http_proxy environment variable is picked up by Dillo, but as I say that doesn't do what I want with Dillo or Wget.

It appears that GNU Wget's behaviour has changed. Newer versions _do_ try to tunnel the HTTPS connection through the proxy set to https_proxy (not what I want), whereas older versions ask that proxy to establish the HTTPS connection and send data back unencrypted (which is what I want). So I picked a bad example with Wget. See that blog post for other examples like Lynx, but I haven't tried them myself.

I can't find this change in the Wget changelog. One version that works as I intend it to is GNU Wget 1.9, circa 2003. Looking through the Wget mailing list archives it seems like many people either considered the old behaviour a bug or a useless feature, so I may be swimming against the tide.

My content-modifying proxy application might be very niche, but I still think it would be more widely useful for people running Dillo on very old computers. Let me know whether it's something that might be accepted for Dillo even if I decide to try implementing it myself.

Sorry for the confusion.
Hi Kevin,

On Thu, Jun 20, 2024 at 04:07:27PM +1000, Kevin Koster wrote:
On Thu, 20 Jun 2024 10:03:59 +1000 "Kevin Koster" <dillo@ombertech.com> wrote:
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
However, if you start Dillo from a shell where the http_proxy variable is set in the environment, I would imagine it would be already picked by Dillo and the internal wget command. The dpid daemon has to start from this shell, so you should do a "dpidc stop" command before to ensure it.
I'm not sure if we're on the same page, Wget supports http_proxy and https_proxy environment variables. The former may or may not tunnel HTTPS requests through a Web proxy, but that's not what I want since then the proxy can't manipulate the data and the client needs to handle the encryption. https_proxy with Wget uses the proxy to do all the encryption and therefore those issues don't apply. More browsers support the HTTPS tunneling proxy feature, but that doesn't achieve anything that I'm interested in personally.
I did some tests and yes the https_proxy environment variable is picked up by Wget for the downloads DPI. It doesn't affect Dillo itself. The http_proxy environment variable is picked up by Dillo, but as I say that doesn't do what I want with Dillo or Wget.
It appears that GNU Wget's behaviour has changed. Newer versions _do_ try to tunnel the HTTPS connection through the proxy set to https_proxy (not what I want), whereas older versions ask that proxy to establish the HTTPS connection and send data back unencrypted (which is what I want). So I picked a bad example with Wget. See that blog post for other examples like Lynx, but I haven't tried them myself.
I can't find this change in the Wget changelog. One version that works as I intend it to is GNU Wget 1.9, circa 2003. Looking through the Wget mailing list archives it seems like many people either considered the old behaviour a bug or a useless feature, so I may be swimming against the tide.
My content-modifying proxy application might be very niche, but I still think it would be more widely useful for people running Dillo on very old computers. Let me know whether it's something that might be accepted for Dillo even if I decide to try implementing it myself.
Sorry for the confusion.
Provided that those machines cannot deal with HTTPS traffic, I think it is a reasonable use case. I haven't had time to dig further into this issue to see if there is a way to make those old machines use TLS instead, which I think would be preferred.

I won't oppose it, but we probably want to ensure that we don't send plain traffic to a proxy by accident when browsing HTTPS pages. We should add an option like "allow_insecure_proxy" (false by default) or similar, so it is harder to leak HTTPS traffic accidentally. In any case, support for https_proxy should be added to Dillo first.

Best,
Rodrigo.
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
On Thu, Jun 20, 2024 at 04:07:27PM +1000, Kevin Koster wrote:
My content-modifying proxy application might be very niche, but I still think it would be more widely useful for people running Dillo on very old computers. Let me know whether it's something that might be accepted for Dillo even if I decide to try implementing it myself.
Provided that those machines cannot deal with HTTPS traffic, I think it is a reasonable use case. I haven't had time to dig a bit more into this issue to see if there is a way to make those old machines use TLS instead
Oh there is, the Crypto Ancienne TLS library is designed for this:

https://github.com/classilla/cryanc

I started work on supporting it for HTTPS in Dillo, and got far enough that I'm sure the library would work for it, though I didn't finish debugging my attempt. That was while Dillo 3.1 looked abandoned, so I did it as an HTTPS DPI plug-in, which makes my work obsolete now.

But the proxy approach allows old versions/binaries to keep working too. If Dillo 3.0.5 had supported it, then the HTTPS issues from lacking SNI support could have been worked around by running an 'old-style' HTTPS proxy with SNI support on localhost.
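To make the 'old-style' HTTPS proxy idea concrete, here is a minimal Python sketch of such a proxy, under several assumptions. The names EndpointProxy, make_proxy and rewrite are hypothetical, the port 8765 is simply borrowed from the Mosaic example in the blog post, and the TLS work is done by Python's standard library rather than Crypto Ancienne. It only handles GET, forwards only the User-Agent header, and lets urllib follow redirects itself, so it illustrates the mechanism rather than being a usable proxy.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Connect to origin servers directly, ignoring any *_proxy variables that
# happen to be set in the proxy's own environment.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class EndpointProxy(BaseHTTPRequestHandler):
    """Hypothetical 'old-style' proxy: plain HTTP in, HTTP or HTTPS out."""

    def do_GET(self):
        url = self.path  # a proxy-style request carries the absolute URL here
        if not url.startswith(("http://", "https://")):
            self.send_error(400, "expected an absolute http:// or https:// URL")
            return
        request = urllib.request.Request(
            url,
            headers={"User-Agent": self.headers.get("User-Agent", "endpoint-sketch")},
        )
        try:
            # For https:// URLs the TLS handshake happens here, on the proxy.
            with _opener.open(request, timeout=30) as upstream:
                status, headers, body = upstream.status, upstream.headers, upstream.read()
        except urllib.error.HTTPError as err:
            # Pass error pages (404 and so on) through to the client.
            status, headers, body = err.code, err.headers, err.read()
        except OSError as err:  # includes URLError, timeouts, TLS failures
            self.send_error(502, "upstream fetch failed: %s" % err)
            return
        body = self.rewrite(body)
        self.send_response(status)
        self.send_header("Content-Type",
                         headers.get("Content-Type", "application/octet-stream"))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # sent back to the client unencrypted

    def rewrite(self, body):
        """Hook for content manipulation; this sketch returns the body unchanged."""
        return body

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_proxy(host="127.0.0.1", port=8765):
    """Create (but do not start) the proxy; call serve_forever() on the result."""
    return ThreadingHTTPServer((host, port), EndpointProxy)
```

Calling make_proxy().serve_forever() would start it on localhost. A client would then need to send "GET https://..." requests to that port in plain HTTP, which is the behaviour being requested for Dillo in this thread.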
I won't oppose it, but we probably want to ensure that we don't send plain traffic to a proxy when browsing HTTPS pages by accident. We should add an option like "allow_insecure_proxy" (false by default) or similar, so it is harder to leak HTTPS traffic accidentally.
Sure, or actually I'd suggest using a separate variable/option name to set that proxy server because using "https_proxy" for both proxy systems is what's already confused me now with Wget. Something like "https_endpoint=[hostname]" maybe?
Hi,

On Sat, Jun 22, 2024 at 09:43:00AM +1000, Kevin Koster wrote:
But the proxy approach allows old versions/binaries to keep working too. If Dillo 3.0.5 had supported it, then the HTTPS issues from lacking SNI support could have been worked around by running an 'old-style' HTTPS proxy with SNI support on localhost.
A related term seems to be "reverse proxy":

https://en.wikipedia.org/wiki/Reverse_proxy

From the "Fun with Crypto Ancienne" post I understand that you want Dillo to get an HTTP or HTTPS URL and always perform an HTTP GET towards your proxy, as the Mosaic configuration suggests:
  https 127.0.0.1 8765 http
  http 127.0.0.1 8765 http
Other than those old browsers, I don't think you can do this with any (relatively) modern tool.

This is what Dillo is currently doing for https and http URLs:

  hop% http_proxy=http://localhost:1234 dillo http://www.google.com

  hop% nc -l 1234
  GET http://www.google.com/ HTTP/1.1
  Host: www.google.com
  User-Agent: Dillo/3.1.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Encoding: gzip, deflate
  DNT: 1
  Referer: http://www.google.com/
  Connection: keep-alive

  hop% http_proxy=http://localhost:1234 dillo https://www.google.com

  hop% nc -l 1234
  CONNECT www.google.com:443 HTTP/1.1
  Host: www.google.com:443

While for the latter you'll want:

  GET https://www.google.com/ HTTP/1.1
  Host: www.google.com
  User-Agent: Dillo/3.1.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Encoding: gzip, deflate
  DNT: 1
  Referer: https://www.google.com/
  Connection: keep-alive

You'll need to instruct Dillo to ignore the HTTPS handling and just treat it as an HTTP GET request to the proxy (using the HTTPS url), so no CONNECT is used.
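The difference between the two kinds of request can be sketched in a few lines of Python. This is only an illustration of the request lines shown above, not Dillo's code: the function name first_request_to_proxy and the endpoint_mode flag are made up for the example, and only the request line and Host header are produced.

```python
from urllib.parse import urlsplit


def first_request_to_proxy(url, endpoint_mode=False):
    """Return the first request a client would send to the proxy for `url`.

    endpoint_mode is a hypothetical switch: when True, https:// URLs are sent
    as a plain GET with the absolute URL instead of opening a CONNECT tunnel.
    """
    parts = urlsplit(url)
    if parts.scheme == "https" and not endpoint_mode:
        # Tunnel: the proxy only relays bytes, the client performs TLS itself.
        authority = "%s:%d" % (parts.hostname, parts.port or 443)
        return "CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (authority, authority)
    # Plain proxy request: absolute URL in the request line, no TLS on this hop.
    target = parts._replace(path=parts.path or "/").geturl()
    return "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (target, parts.netloc)


if __name__ == "__main__":
    print(first_request_to_proxy("https://www.google.com"))
    print(first_request_to_proxy("https://www.google.com", endpoint_mode=True))
```

With endpoint_mode left off, the https URL produces a CONNECT as Dillo does today. With it on, the same URL produces the GET form that the proxy can read and rewrite.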
Sure, or actually I'd suggest using a separate variable/option name to set that proxy server because using "https_proxy" for both proxy systems is what's already confused me now with Wget. Something like "https_endpoint=[hostname]" maybe?
I think I would have to see it implemented to determine what would be a good name that prevents users from accidentally misconfiguring it.

Best,
Rodrigo.
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
On Sat, Jun 22, 2024 at 09:43:00AM +1000, Kevin Koster wrote:
But the proxy approach allows old versions/binaries to keep working too. If Dillo 3.0.5 had supported it, then the HTTPS issues from lacking SNI support could have been worked around by running an 'old-style' HTTPS proxy with SNI support on localhost.
A related term seems to be "reverse proxy":
Yes, though in practice that tends to imply a caching proxy run by website operators, hence "reverse" because a normal proxy is run by the person accessing the content on other people's servers, but these are run by the server operators. But the mechanism is the same as my usage. Real-world "reverse proxies" just tend to be geared away from internet-wide usage to serve requests to any server, not least because HTTPS requires the proxy to have the target website's certificate. Or at least based on looking at Squid and Nginx, internet-wide "reverse" proxy usage isn't documented clearly enough for me to spot the easy way of doing it (while also manipulating the content being retrieved). I did see a very hard way to do it with Squid and an abandoned 3rd-party library that wouldn't even compile for me.
From the "Fun with Crypto Ancienne" post I understand that you want Dillo to get a HTTP or HTTPS URL and always perform a HTTP GET towards your proxy, as the Mosaic configuration suggests:
  https 127.0.0.1 8765 http
  http 127.0.0.1 8765 http
Yes. Which, if it helps to decode my earlier posts, is equivalent to this Wget command before they secretly changed its https_proxy behaviour:

  http_proxy=127.0.0.1:8765 https_proxy=127.0.0.1:8765 wget https://example.com
Other than those old browsers, I don't think you can do this with any (relatively) modern tool.
Which is annoying because I didn't need it until (relatively) recently. It looks like current Lynx is designed to use HTTPS tunneling through proxies now, based on use of the do_connect variable in WWW/Library/Implementation/HTTP.c of its source code. Though with Lynx 2.8.9rel.1 and 2.9.0dev.6 I can't actually get it to talk to anything (inc. Netcat on localhost) with http_proxy or https_proxy, even though setting each variable causes page loads to fail over the respective protocol. Not a firewall issue.
This is what Dillo is currently doing for https and http URLs:
  hop% http_proxy=http://localhost:1234 dillo http://www.google.com

  hop% nc -l 1234
  GET http://www.google.com/ HTTP/1.1
  Host: www.google.com
  User-Agent: Dillo/3.1.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Encoding: gzip, deflate
  DNT: 1
  Referer: http://www.google.com/
  Connection: keep-alive

  hop% http_proxy=http://localhost:1234 dillo https://www.google.com

  hop% nc -l 1234
  CONNECT www.google.com:443 HTTP/1.1
  Host: www.google.com:443
While for the latter you'll want:
  GET https://www.google.com/ HTTP/1.1
  Host: www.google.com
  User-Agent: Dillo/3.1.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Encoding: gzip, deflate
  DNT: 1
  Referer: https://www.google.com/
  Connection: keep-alive
You'll need to instruct Dillo to ignore the HTTPS handling and just treat it as an HTTP GET request to the proxy (using the HTTPS url), so no CONNECT is used.
Yes, though ideally like in Wget this behaviour wouldn't be caused by setting 'http_proxy'. You'd have a variable/setting like Wget's 'https_proxy' which only causes HTTPS connections to try and use the proxy, while HTTP connections go through directly. Then you could set 'http_proxy' as well if you wanted to use a proxy for HTTP connections too.

For example (with 'https_endpoint' instead of the confusing 'https_proxy' variable/setting name I proposed earlier):

  http_proxy=127.0.0.1:1234 dillo http://www.google.com
  [ 127.0.0.1:1234 ] GET http://www.google.com/ HTTP/1.1

  http_proxy=127.0.0.1:1234 dillo https://www.google.com
  [ 127.0.0.1:1234 ] CONNECT www.google.com:443 HTTP/1.1

  https_endpoint=127.0.0.1:5678 dillo http://www.google.com
  [ www.google.com:80 ] GET http://www.google.com/ HTTP/1.1

  https_endpoint=127.0.0.1:5678 dillo https://www.google.com
  [ 127.0.0.1:5678 ] GET https://www.google.com/ HTTP/1.1

  http_proxy=127.0.0.1:1234 https_endpoint=127.0.0.1:5678 dillo http://www.google.com
  [ 127.0.0.1:1234 ] GET http://www.google.com/ HTTP/1.1

  http_proxy=127.0.0.1:1234 https_endpoint=127.0.0.1:5678 dillo https://www.google.com
  [ 127.0.0.1:5678 ] GET https://www.google.com/ HTTP/1.1

So the behaviour of 'http_proxy' alone isn't changed.

Also I don't think that 'https_endpoint' actually needs to be set from an environment variable, just a dillorc setting would do.
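The six example cases above amount to a small decision table, which could be sketched in Python roughly as follows. The function name route and the mode labels are invented for this illustration, 'https_endpoint' is the setting name proposed in this message rather than anything Dillo implements, and the case with neither setting (a direct connection) is an assumption added for completeness.

```python
from urllib.parse import urlsplit


def route(url, http_proxy=None, https_endpoint=None):
    """Return (host:port to connect to, mode) for the proposed settings."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        if https_endpoint:
            # Proposed behaviour: plain GET with the https:// URL, no CONNECT.
            return https_endpoint, "endpoint GET"
        if http_proxy:
            # Unchanged behaviour: tunnel through the HTTP proxy.
            return http_proxy, "CONNECT tunnel"
        return "%s:%d" % (parts.hostname, parts.port or 443), "direct"
    if http_proxy:
        return http_proxy, "proxy GET"
    # https_endpoint alone never affects http:// URLs.
    return "%s:%d" % (parts.hostname, parts.port or 80), "direct"


if __name__ == "__main__":
    P, E = "127.0.0.1:1234", "127.0.0.1:5678"
    for settings in ({"http_proxy": P}, {"https_endpoint": E},
                     {"http_proxy": P, "https_endpoint": E}):
        for url in ("http://www.google.com", "https://www.google.com"):
            print(settings, url, route(url, **settings))
```

The point of keeping the endpoint setting separate is visible in the first branch: it only ever changes what happens to https:// URLs, so existing 'http_proxy' users see no difference.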
Hi,

On Sun, Jun 23, 2024 at 11:02:54AM +1000, Kevin Koster wrote:
Or at least based on looking at Squid and Nginx, internet-wide "reverse" proxy usage isn't documented clearly enough for me to spot the easy way of doing it (while also manipulating the content being retrieved). I did see a very hard way to do it with Squid and an abandoned 3rd-party library that wouldn't even compile for me.
Yeah, using carl is probably much easier to set up.
Also I don't think that 'https_endpoint' actually needs to be set from an environment variable, just a dillorc setting would do.
I also think it is good if it is only a dillorc option. I'm not sure whether I would prefer something like:

  https_proxy=http://localhost:1234
  # Setting this to TRUE sends HTTPS traffic unencrypted to the proxy.
  # Don't change unless you know what you are doing.
  #https_proxy_use_http=FALSE (default)
  https_proxy_use_http=TRUE

For normal uses, setting the https proxy with https_proxy would make more sense than using http_proxy for HTTPS traffic anyway.

Best,
Rodrigo.
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Also I don't think that 'https_endpoint' actually needs to be set from an environment variable, just a dillorc setting would do.
I also think it is good if it is only a dillorc option. I'm not sure if I like more something like:
https_proxy=http://localhost:1234
Ahh, that's what I was doing wrong with Lynx! I didn't include the "http://" part (which Wget doesn't require, though I see now that the Wget manual does say it should be set to a URL, not a hostname). I really do make a meal of this proxy stuff.
  # Setting this to TRUE sends HTTPS traffic unencrypted to the proxy.
  # Don't change unless you know what you are doing.
  #https_proxy_use_http=FALSE (default)
  https_proxy_use_http=TRUE
For normal uses, setting the https proxy with https_proxy would make more sense than using http_proxy for HTTPS traffic anyway.
If the 'http_proxy' behaviour changes to only apply to HTTP like in other software, everyone using the 'http_proxy' setting will have to change their configuration to set 'https_proxy' as well. That'd be an annoying behaviour change.

But otherwise using the setting name 'https_proxy' makes it unclear how that relates to the 'http_proxy' setting that applies to both HTTP and HTTPS.

If I set:

  http_proxy=http://localhost:1234
  https_proxy=http://localhost:5678

It at least needs a comment explaining the difference between the settings, if indeed you intend there to be one and the second setting doesn't override the first entirely.

Since the 'http_proxy' environment variable overrides the dillorc 'http_proxy' setting, 'https_proxy' implies that the same will apply to that, and it's anyone's guess which will apply if both are set since other software (Wget, Lynx) doesn't apply the 'http_proxy' environment variable to https.

This is why I now think it best to avoid the 'https_proxy' name, because the behaviour with it will never be consistent with other software without changing how Dillo already handles the 'http_proxy' variable/setting.
Hi,

On Wed, Jun 26, 2024 at 08:52:37AM +1000, Kevin Koster wrote:
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Also I don't think that 'https_endpoint' actually needs to be set from an environment variable, just a dillorc setting would do.
I also think it is good if it is only a dillorc option. I'm not sure if I like more something like:
https_proxy=http://localhost:1234
Ahh, that's what I was doing wrong with Lynx! I didn't include the "http://" part (which Wget doesn't require, though I see now that the Wget manual does say it should be set to a URL, not a hostname). I really do make a meal of this proxy stuff.
Yeah, it is a bit confusing :-)
  # Setting this to TRUE sends HTTPS traffic unencrypted to the proxy.
  # Don't change unless you know what you are doing.
  #https_proxy_use_http=FALSE (default)
  https_proxy_use_http=TRUE
For normal uses, setting the https proxy with https_proxy would make more sense than using http_proxy for HTTPS traffic anyway.
If the 'http_proxy' behaviour changes to only apply to HTTP like in other software, everyone using the 'http_proxy' setting will have to change their configuration to set 'https_proxy' as well. That'd be an annoying behaviour change.
But otherwise using the setting name 'https_proxy' makes it unclear how that relates to the 'http_proxy' setting that applies to both HTTP and HTTPS.
Sadly, there is no standard (AFAIK) that specifies how to use those variables, but most software tools seem to use them in a more or less coherent way. And I think new users would expect Dillo to follow the same usage as curl, wget or other browsers.

It doesn't have to be annoying. We can slowly deprecate the current mechanism by displaying a warning, when only http_proxy is set, that https_proxy should be set too, and then change the behavior in a new release that is clearly documented in the release notes.
If I set:
  http_proxy=http://localhost:1234
  https_proxy=http://localhost:5678
It at least needs a comment explaining the difference between the settings, if indeed you intend there to be one and the second setting doesn't override the first entirely.
Since the 'http_proxy' environment variable overrides the dillorc 'http_proxy' setting, 'https_proxy' implies that the same will apply to that, and it's anyone's guess which will apply if both are set since other software (Wget, Lynx) doesn't apply the 'http_proxy' environment variable to https.
Exactly, I think the logical behavior makes sense: having both http_proxy and https_proxy as environment variables *and* dillorc options, each controlling the HTTP and HTTPS proxy independently. The same could apply in the future to ftp_proxy, to route FTP traffic.
This is why I now think it best to avoid the 'https_proxy' name, because the behaviour with it will never be consistent with other software without changing how Dillo already handles the 'http_proxy' variable/setting.
I understand, but I think this is a good moment to plan how to change how Dillo handles the http_proxy variable, so it only affects HTTP traffic and not HTTPS too.

What about this:

Setting the env vars as:

  http_proxy=http://localhost:1234
  https_proxy=

Now causes a warning (maybe also in the UI):

  $ dillo https://www.google.com
  WARNING: Setting http_proxy to specify the HTTPS proxy is deprecated, use https_proxy to define it instead.

But still makes Dillo use the http_proxy both for HTTP and HTTPS for now.

Also:

  http_proxy=http://localhost:1234
  https_proxy=http://localhost:1235

No warning:

  $ dillo https://www.google.com

But Dillo uses both proxies independently (like curl or wget), instead of the current implementation that would use http_proxy for HTTP and HTTPS.

Same with the dillorc options http_proxy and https_proxy (new), keeping the preference for the env vars if defined.

The only unexpected change may be for a user that had previously set https_proxy that was not being used by Dillo, and now it would be. But I think that 1) it is unlikely they use a different value for HTTP and HTTPS and 2) it is probably intended to use the https_proxy for HTTPS anyway. We could emit a warning about this too, but it may not be needed if we explain that Dillo will now handle the https_proxy variable.

In any case, this should be properly explained in the release notes, leaving one release with the deprecation warning so users have time to change their configuration, before we stop using http_proxy as the HTTPS proxy.

I think this has the benefit of being the implementation most users would expect, as it is consistent with other tools, while causing the least amount of pain for current users of proxies, as they would have time to notice the deprecation warning.

That would allow you to specify *only* the https_proxy like this:

  http_proxy=
  https_proxy=http://localhost:1235

  $ dillo https://www.google.com

No warning and Dillo uses https_proxy for HTTPS and HTTP doesn't use a proxy.
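As a rough illustration of the transition plan described above, the selection logic for the deprecation release might look like the following Python sketch. The function names effective_setting and choose_proxy are hypothetical, and treating an empty environment variable the same as an unset one is an assumption based on the "https_proxy=" example, not something decided in this thread.

```python
DEPRECATION_WARNING = (
    "WARNING: Setting http_proxy to specify the HTTPS proxy is deprecated, "
    "use https_proxy to define it instead."
)


def effective_setting(name, environ, dillorc):
    """Prefer the environment variable over the dillorc option.

    Assumption: an empty value counts as "not set" in both places.
    """
    return environ.get(name) or dillorc.get(name) or None


def choose_proxy(scheme, http_proxy, https_proxy):
    """Return (proxy or None, warning or None) during the deprecation release."""
    if scheme == "http":
        return http_proxy, None
    if https_proxy:
        return https_proxy, None  # both proxies are used independently
    if http_proxy:
        # Old behaviour kept for one release, but with a warning.
        return http_proxy, DEPRECATION_WARNING
    return None, None


if __name__ == "__main__":
    env = {"http_proxy": "http://localhost:1234", "https_proxy": ""}
    rc = {}
    http_p = effective_setting("http_proxy", env, rc)
    https_p = effective_setting("https_proxy", env, rc)
    print(choose_proxy("https", http_p, https_p))
```

In the release after the deprecation period, the branch that falls back to http_proxy for HTTPS would simply be removed.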
And on top of this change, you can add your specific option, in dillorc only, to change the way Dillo communicates with the HTTPS proxy to avoid the CONNECT, with "https_proxy_use_http=TRUE" or a similar option.

Lastly, a similar option was mentioned by Daniel (curl lead developer) some time ago:

https://stackoverflow.com/a/34268925

Best,
Rodrigo
Rodrigo Arias wrote:
On Wed, Jun 26, 2024 at 08:52:37AM +1000, Kevin Koster wrote:
Rodrigo Arias <rodarima-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> wrote:
This is why I now think it best to avoid the 'https_proxy' name, because the behaviour with it will never be consistent with other software without changing how Dillo already handles the 'http_proxy' variable/setting.
I understand, but I think this is a good moment to plan how to change how Dillo handles the http_proxy variable, so it only affects HTTP traffic and not HTTPS too.
What about this:
Setting the env vars as:

  http_proxy=http://localhost:1234
  https_proxy=
Now causes a warning (maybe also in the UI):
  $ dillo https://www.google.com
  WARNING: Setting http_proxy to specify the HTTPS proxy is deprecated, use https_proxy to define it instead.
But still makes Dillo use the http_proxy both for HTTP and HTTPS for now.
Also:
  http_proxy=http://localhost:1234
  https_proxy=http://localhost:1235
No warning:
$ dillo https://www.google.com
But Dillo uses both proxies independently (like curl or wget), instead of the current implementation that would use http_proxy for HTTP and HTTPS.
Same with the dillorc options http_proxy and https_proxy (new), keeping the preference for the env vars if defined.
Yes that sounds like a good way to bring Dillo into line with other software, and also means that the environment variables apply consistently to downloads since they use Wget.
The only unexpected change may be for a user that had previously set https_proxy that was not being used by Dillo, and now it would be. But I think that 1) it is unlikely they use a different value for HTTP and HTTPS and 2) it is probably intended to use the https_proxy for HTTPS anyway. We could emit a warning about this too, but maybe not needed if we explain that Dillo will now handle the https_proxy variable.
Affected users would have already been seeing inconsistent behaviour with file downloads since Wget is already using https_proxy.
In any case, this should be properly explained in the release notes, leaving one release with the deprecation warning for users to take time to change the behavior, before we stop using http_proxy as the HTTPS proxy.
Yep, and obviously in the changelog too, to be better than Wget and its undocumented past switch to tunneling HTTPS connections through https_proxy.
I think this has the benefit of being the implementation most users would expect, as it is consistent with other tools, while having the least amount of pain for current users of proxies as they would have time to realize the deprecation notice.
Yes I agree.
That would allow you to specify *only* the https_proxy like this:
  http_proxy=
  https_proxy=http://localhost:1235
$ dillo https://www.google.com
No warning and Dillo uses https_proxy for HTTPS and HTTP doesn't use a proxy.
And over this change, you can add your specific option in dillorc only to change the way Dillo communicates with the HTTPS proxy to avoid the CONNECT, with the "https_proxy_use_http=TRUE" or similar option.
That would be fine.
Lastly, a similar option was mentioned by Daniel (curl lead developer) some time ago: https://stackoverflow.com/a/34268925
It doesn't appear that anyone implemented it in Curl though.
participants (2): Kevin Koster, Rodrigo Arias