patch: multipart/form-data encoding for form submission
Currently, dillo always submits forms using url encoding regardless of the encoding specified by the form. Wikipedia, for example, specifies multipart/form-data because the user may submit huge blocks of text in languages where url encoding will be very inefficient.

- This patch always uses the same boundary, but a different one REALLY should be used for every form submission. At the least, the string should be scrambled up somehow.
- It adds a URL_MultipartEnc flag.
- For the Content-type header, I pull the boundary out of URL_DATA because I didn't want to clutter up the URL with a field for something not commonly used.

-----------------------------------------------------------

For multipart/form-data, a submission goes like:

[headers]
Content-type: multipart/form-data; boundary="[boundary]"

--[boundary]
Content-Disposition: form-data; name="[name1]"

[value1]
--[boundary]
Content-Disposition: form-data; name="[name2]"

[value2]
--[boundary]
...and eventually...
Content-Disposition: form-data; name="[namen]"

[valuen]
--[boundary]--
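For reference, the body layout described above can be sketched in plain C. This is only an illustration and not the code from the patch: the FormField type and multipart_encode() are hypothetical names, it handles text fields only (no file inputs), and it assumes the boundary has already been chosen so that it does not occur in any name or value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration: one text field of a form. */
typedef struct {
   const char *name;
   const char *value;
} FormField;

/*
 * Build a multipart/form-data body for n text fields.
 * Returns a malloc()ed, NUL-terminated string that the caller must
 * free(), or NULL if the allocation fails.
 */
char *multipart_encode(const FormField *fields, size_t n,
                       const char *boundary)
{
   size_t i, size, used = 0;
   char *body;

   assert(boundary != NULL);

   /* First pass: upper bound on the size. Each part needs 49 bytes of
    * fixed text plus boundary, name and value; 64 leaves some slack. */
   size = strlen(boundary) + 8;   /* closing "--boundary--\r\n" + NUL */
   for (i = 0; i < n; i++)
      size += strlen(boundary) + strlen(fields[i].name) +
              strlen(fields[i].value) + 64;

   body = malloc(size);
   if (!body)
      return NULL;

   /* Second pass: one delimiter line, header and value per field. */
   for (i = 0; i < n; i++)
      used += (size_t) snprintf(body + used, size - used,
                 "--%s\r\n"
                 "Content-Disposition: form-data; name=\"%s\"\r\n"
                 "\r\n"
                 "%s\r\n",
                 boundary, fields[i].name, fields[i].value);

   /* The closing delimiter carries two extra dashes at the end. */
   snprintf(body + used, size - used, "--%s--\r\n", boundary);
   return body;
}
```

The same boundary string would also go into the Content-type request header, as shown in the layout above.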
On Sun, Feb 10, 2008 at 11:42:10PM +0000, place wrote:
Currently, dillo always submits forms using url encoding regardless of the encoding specified by the form. Wikipedia, for example, specifies multipart/form-data because the user may submit huge blocks of text in languages where url encoding will be very inefficient.
OK, committed.
- This patch always uses the same boundary, but a different one REALLY should be used for every form submission. At the least, the string should be scrambled up somehow.
Yes, please finish that part. Probably checking the boundary string against URL_DATA, with strstr is a good idea too.
- It adds a URL_MultipartEnc flag.
- For the Content-type header, I pull the boundary out of URL_DATA because I didn't want to clutter up the URL with a field for something not commonly used.
OK.

--
Cheers
Jorge.-
Jorge wrote:
- This patch always uses the same boundary, but a different one REALLY should be used for every form submission. At the least, the string should be scrambled up somehow.
Yes, please finish that part.
Probably checking the boundary string against URL_DATA, with strstr is a good idea too.
Attached. I'm not proud of this code, but special cases were getting in the way of attempts at cleanup.
On Fri, Feb 29, 2008 at 06:19:31PM +0000, place wrote:
Jorge wrote:
- This patch always uses the same boundary, but a different one REALLY should be used for every form submission. At the least, the string should be scrambled up somehow.
Yes, please finish that part.
Probably checking the boundary string against URL_DATA, with strstr is a good idea too.
Attached.
I'm not proud of this code, but special cases were getting in the way of attempts at cleanup.
Have you tried something like?:

</pseudocode>
Html_submit_form2()
...
if (ENC_MULTIPART) {
   append every single "name" "value" pair into a Dstr
   do {
      make boundary
   } while (strstr(Dstr, boundary) || maxtries);
}
... encode as usual.
</pseudocode>

If yes, why does it fail?

--
Cheers
Jorge.-

PS: I usually manually delete the CC of my replies to you, thinking that you'll get it from dillo-dev. Do you want the extra copy?
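One way to read the loop suggested above is "keep making boundaries while the candidate collides with the data and tries remain". A minimal C sketch of that reading follows. The names make_boundary(), choose_boundary(), BOUNDARY_LEN and MAX_TRIES are made up for this example and are not from the patch; rand() stands in for whatever random source the real code would use, and the caller is assumed to have seeded it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BOUNDARY_LEN 32   /* arbitrary length chosen for this sketch */
#define MAX_TRIES 10      /* arbitrary retry limit */

/* Fill buf (BOUNDARY_LEN + 1 bytes) with random alphanumerics. */
static void make_boundary(char *buf)
{
   static const char set[] =
      "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
   int i;

   for (i = 0; i < BOUNDARY_LEN; i++)
      buf[i] = set[rand() % (sizeof(set) - 1)];
   buf[BOUNDARY_LEN] = '\0';
}

/*
 * Pick a boundary that does not occur in data, where data is every
 * name and value of the form appended into one string.
 * Returns 1 on success, 0 if all MAX_TRIES candidates collided.
 * strstr() stops at the first NUL byte, so this suits text data only.
 */
int choose_boundary(const char *data, char *buf)
{
   int tries;

   assert(data != NULL && buf != NULL);
   for (tries = 0; tries < MAX_TRIES; tries++) {
      make_boundary(buf);
      if (!strstr(data, buf))
         return 1;
   }
   return 0;
}
```

With 32 random alphanumeric characters a collision is very unlikely, so the retry limit mostly guards against a broken random source.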
Jorge wrote:
On Fri, Feb 29, 2008 at 06:19:31PM +0000, place wrote:
Jorge wrote:
- This patch always uses the same boundary, but a different one REALLY should be used for every form submission. At the least, the string should be scrambled up somehow.
Yes, please finish that part.
Probably checking the boundary string against URL_DATA, with strstr is a good idea too.
Attached.
I'm not proud of this code, but special cases were getting in the way of attempts at cleanup.
Have you tried something like?:
</pseudocode>
Html_submit_form2()
...
if (ENC_MULTIPART) {
   append every single "name" "value" pair into a Dstr
   do {
      make boundary
   } while (strstr(Dstr, boundary) || maxtries);
}
... encode as usual.
</pseudocode>
If yes, why does it fail?
I hadn't looked at that specifically. Let's see...

To avoid duplicating the big case statement, I'd want to make a

char *Html_get_input_value(input)
{
   switch (input->type) {
   ...
   }
   return value;
}

if (ENC_MULTIPART) {
   for (each input in the form) {
      value = Html_get_input_value(input);
      dStr_append(data, name);
      dStr_append(data, value);
   }
   ...boundary... etc.
   ...
}

But for a file input, it could have embedded nulls, so make Html_get_input_value() return a Dstr* instead. Okay.

But for an image input, thanks to clickpos, I'd have to make it return a list of name/value pairs. Maybe that's not too bad, but it does put me in a bit of a "stupid HTML and its stupid special cases!" mood.

Random thought: I wonder whether url->data gets eradicated after sending somehow, or if copies of it stick around in copies of the url on, I don't know, the nav stack or whatever. If there are copies floating around, that'll be no fun if they were each to contain a big file input in their data.
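To make the image-input special case concrete: an image input submits its click position as two fields, name.x and name.y, so a helper that returns one value per input is not enough. The sketch below is only an illustration of a helper returning a short list of pairs. Pair, InputType and get_input_pairs() are hypothetical names, not dillo's, and only two input types are covered.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PAIRS 2    /* an image input yields two pairs */
#define FIELD_LEN 64   /* arbitrary fixed size for this sketch */

/* Hypothetical types for illustration only. */
typedef struct {
   char name[FIELD_LEN];
   char value[FIELD_LEN];
} Pair;

typedef enum { INPUT_TEXT, INPUT_IMAGE } InputType;

/*
 * Write the name/value pairs that one input contributes into out[].
 * text is used for INPUT_TEXT; click_x and click_y for INPUT_IMAGE.
 * Returns the number of pairs written. Long strings are truncated.
 */
int get_input_pairs(InputType type, const char *name, const char *text,
                    int click_x, int click_y, Pair out[MAX_PAIRS])
{
   assert(name != NULL && out != NULL);

   switch (type) {
   case INPUT_IMAGE:
      /* Click position is sent as "name.x" and "name.y". */
      snprintf(out[0].name, FIELD_LEN, "%s.x", name);
      snprintf(out[0].value, FIELD_LEN, "%d", click_x);
      snprintf(out[1].name, FIELD_LEN, "%s.y", name);
      snprintf(out[1].value, FIELD_LEN, "%d", click_y);
      return 2;
   case INPUT_TEXT:
   default:
      snprintf(out[0].name, FIELD_LEN, "%s", name);
      snprintf(out[0].value, FIELD_LEN, "%s", text ? text : "");
      return 1;
   }
}
```

A file input would need a value type that carries an explicit length instead of a NUL-terminated string, which is the reason given above for returning a Dstr*.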
PS: I usually manually delete the CC of my replies to you, thinking that you'll get it from dillo-dev. Do you want the extra copy?
I know someone who, when cc:ed, tends to say "I read the list" :)
Hi,

On Fri, Feb 29, 2008 at 10:37:04PM +0000, place wrote:
Have you tried something like?:
</pseudocode>
Html_submit_form2()
...
if (ENC_MULTIPART) {
   append every single "name" "value" pair into a Dstr
   do {
      make boundary
   } while (strstr(Dstr, boundary) || maxtries);
}
... encode as usual.
</pseudocode>
If yes, why does it fail?
I hadn't looked at that specifically. Let's see...
To avoid duplicating the big case statement, I'd want to make a

char *Html_get_input_value(input)
{
   switch (input->type) {
   ...
   }
   return value;
}

if (ENC_MULTIPART) {
   for (each input in the form) {
      value = Html_get_input_value(input);
      dStr_append(data, name);
      dStr_append(data, value);
   }
   ...boundary... etc.
   ...
}
OK.
But for a file input, it could have embedded nulls, so make Html_get_input_value() return a Dstr* instead. Okay.
But for an image input, thanks to clickpos, I'd have to make it return a list of name/value pairs. Maybe that's not too bad, but it does put me in a bit of a "stupid HTML and its stupid special cases!" mood.
Don't worry about clickpos. Provided the current boundary skeleton it doesn't need to be checked. A comment stating this should be added.
Random thought: I wonder whether url->data gets eradicated after sending somehow, or if copies of it stick around in copies of the url on, I don't know, the nav stack or whatever. If there are copies floating around, that'll be no fun if they were each to contain a big file input in their data.
Yes, it should be checked when file input is implemented.
PS: I usually manually delete the CC of my replies to you, thinking that you'll get it from dillo-dev. Do you want the extra copy?
I know someone who, when cc:ed, tends to say "I read the list" :)
Wise guy! ;-)

--
Cheers
Jorge.-
Jorge wrote:
On Fri, Feb 29, 2008 at 10:37:04PM +0000, place wrote:
Have you tried something like?:
</pseudocode>
Html_submit_form2()
...
if (ENC_MULTIPART) {
   append every single "name" "value" pair into a Dstr
   do {
      make boundary
   } while (strstr(Dstr, boundary) || maxtries);
}
... encode as usual.
</pseudocode>
I gave it a try and ended up with the attached compiles-but-hardly-tested code. I had somehow forgotten earlier that I was going to have to use Dlists of values due to SELECT elements.

This runs through the inputs once for boundary creation, and a second time for url->data creation.

I was tempted to try

- go through once, making a Dlist of name/value pairs
- if (MULTIPART) determine a boundary, strstr()ing names and values
- give each pair in the Dlist to Html_append_input

but that's no good because I wouldn't know whether I had data from a file input when making url->data (Content-Disposition is different).

Thoughts?
Hi,

On Sun, Mar 02, 2008 at 01:25:28AM +0000, place wrote:
Jorge wrote:
On Fri, Feb 29, 2008 at 10:37:04PM +0000, place wrote:
Have you tried something like?:
</pseudocode>
Html_submit_form2()
...
if (ENC_MULTIPART) {
   append every single "name" "value" pair into a Dstr
   do {
      make boundary
   } while (strstr(Dstr, boundary) || maxtries);
}
... encode as usual.
</pseudocode>
I gave it a try and ended up with the attached compiles-but-hardly-tested code. I had somehow forgotten earlier that I was going to have to use Dlists of values due to SELECT elements.
This runs through the inputs once for boundary creation, and a second time for url->data creation.
I was tempted to try

- go through once, making a Dlist of name/value pairs
- if (MULTIPART) determine a boundary, strstr()ing names and values
- give each pair in the Dlist to Html_append_input

but that's no good because I wouldn't know whether I had data from a file input when making url->data (Content-Disposition is different).
Thoughts?
Yes. As form-submitting is not a frequent operation, much less with multipart, there's no compelling need to optimize it further. This version of the patch is easier to read and understand. Please comment the Html_get_input_values() function. For:
+ // BUG: strstr stops at NULL. Cannot use for file inputs.
+ success = !strstr(DataStr->str, boundary->str);
please use memcmp().

Some basic testing would be welcome, but the real test will come from usage once it is committed.

--
Cheers
Jorge.-
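For reference, memcmp() only compares two buffers of equal length, so a replacement for the strstr() check has to slide over the data and compare at each offset. A minimal sketch of such a length-aware search is below. contains_bytes() is a made-up name for this example, and it assumes both buffers come with an explicit length, as a length-tracking string type would provide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the nlen bytes at needle occur anywhere within the
 * hlen bytes at hay, else 0. Lengths are explicit, so embedded NUL
 * bytes in hay (for example file contents) do not end the search.
 */
int contains_bytes(const char *hay, size_t hlen,
                   const char *needle, size_t nlen)
{
   size_t i;

   assert(hay != NULL && needle != NULL);
   if (nlen == 0)
      return 1;   /* an empty needle matches trivially */
   if (nlen > hlen)
      return 0;

   for (i = 0; i + nlen <= hlen; i++) {
      /* Cheap first-byte check before the full comparison. */
      if (hay[i] == needle[0] && memcmp(hay + i, needle, nlen) == 0)
         return 1;
   }
   return 0;
}
```

The boundary check would then become something like success = !contains_bytes(data, data_len, boundary, boundary_len), with the lengths taken from wherever the data and boundary buffers record them.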
participants (2): jcid@dillo.org, place@gobigwest.com