This makes url_normalize take care of whitespace in a fairly useful way,
consistent with other browsers:
- Leading and trailing whitespace is trimmed
- Internal whitespace is URL-escaped (percent-encoded)
For example,
" http://www.google.co.uk/search?q=hello world "
becomes
"http://www.google.co.uk/search?q=hello%20world"
Explicitly escaped trailing whitespace, e.g. "...hello world%20", is left alone.
The upshot is that if you sloppily copy-paste a URL from IRC or whatnot
into the address bar, NetSurf no longer silently ignores you if you
caught some adjacent whitespace.
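
A minimal sketch of the trim-and-escape behaviour described above. It is not
the actual url_normalize code: the function name sketch_escape_whitespace, the
helper is_url_space, and the exact set of bytes treated as whitespace are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: these six ASCII bytes count as whitespace. Checked by hand
 * rather than with isspace(), so the result does not depend on the locale. */
static int is_url_space(unsigned char c)
{
	return c == ' ' || c == '\t' || c == '\n' ||
			c == '\r' || c == '\f' || c == '\v';
}

/* Hypothetical helper: trims leading and trailing whitespace and
 * percent-encodes internal whitespace. Returns a malloc()ed string that
 * the caller must free(), or NULL if memory is exhausted. */
char *sketch_escape_whitespace(const char *in)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t len, i, o = 0;
	char *out;

	assert(in != NULL);

	/* Trim leading whitespace */
	while (is_url_space((unsigned char) *in))
		in++;

	/* Trim trailing whitespace; an already-escaped "%20" is not
	 * whitespace at this level, so it survives untouched */
	len = strlen(in);
	while (len > 0 && is_url_space((unsigned char) in[len - 1]))
		len--;

	/* Worst case: every remaining byte expands to "%XX" */
	out = malloc(len * 3 + 1);
	if (out == NULL)
		return NULL;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char) in[i];

		if (is_url_space(c)) {
			out[o++] = '%';
			out[o++] = hex[c >> 4];
			out[o++] = hex[c & 0xf];
		} else {
			out[o++] = (char) c;
		}
	}
	out[o] = '\0';

	return out;
}
```

Fed the example input above, this should produce the example output; a
trailing "%20" passes through unchanged because only literal whitespace bytes
are trimmed.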
svn path=/trunk/netsurf/; revision=4198
This fixes all those sites that brokenly assume that an Accept header is required and thus break when the client doesn't send one (here's looking at you, royalmail.com).
libcurl's default Accept header is "*/*", which is semantically equivalent to not sending a header at all (no header implies the client accepts all content types).
svn path=/trunk/netsurf/; revision=4196
We always assumed that the keycode type was 32 bits wide, anyway. wchar_t isn't guaranteed to be that big, so using it isn't remotely portable.
svn path=/trunk/netsurf/; revision=4165
Do not change the locale globally, else things will break in weird and
wonderful ways.
Introduce utils/locale.[ch], which provide locale-specific wrappers for various
functions (currently just the <ctype.h> ones).
Fix up the few places I can see that actually require that the underlying
locale is paid attention to.
Some notes:
1) The GTK frontend code has not been touched. It is possible that reading of
numeric values (e.g. from the preferences dialogue) may break with this
change, particularly in locales that use something other than '.' as their
decimal separator.
2) The search code is left unchanged (i.e. assuming a locale of "C").
This may break case insensitive matching of non-ASCII characters.
I doubt that ever actually worked, anyway. In future, it should use
Unicode case conversion to achieve the same effect.
3) The text input handling in the core makes use of isspace() to detect
word boundaries. This is fine for western languages (even in the C locale,
which it's currently assuming). It will, however, break for CJK et al.
(This has always been the case, rather than being a new issue.)
4) text-transform uses locale-specific variants of to{lower,upper}. In future
this should probably be performing Unicode case conversion. This is the
only part of the core code that makes use of locale information.
In future, if you require locale-specific behaviour, do the following:
setlocale(LC_<whatever>, "");
<your operation(s) here>
setlocale(LC_<whatever>, "C");
The first setlocale will change the current locale to the native environment.
The second setlocale will reset the current locale to "C".
Any value other than "" or "C" is probably a bug, unless there's a really
good reason for it.
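
As a concrete instance of the pattern above, here is a minimal sketch that
uppercases a buffer under the native LC_CTYPE rules and then restores "C".
The function name sketch_native_toupper is hypothetical; it is not the
interface provided by utils/locale.[ch].

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: uppercase len bytes of s in place using the native
 * environment's character classification, leaving the process in the "C"
 * locale afterwards. */
void sketch_native_toupper(char *s, size_t len)
{
	size_t i;

	assert(s != NULL);

	/* Switch to the native environment. If this call fails it returns
	 * NULL and the locale is left unchanged, so we simply carry on. */
	setlocale(LC_CTYPE, "");

	/* Cast through unsigned char: passing a negative char to toupper()
	 * is undefined behaviour */
	for (i = 0; i < len; i++)
		s[i] = (char) toupper((unsigned char) s[i]);

	/* Reset, so no other code observes the native locale */
	setlocale(LC_CTYPE, "C");
}
```

Note that setlocale() changes process-wide state, so a wrapper like this is
only safe if no other thread depends on the locale while it runs.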
In the long term, it is expected that all locale-dependent code will reside in
platform frontends -- the core being wholly locale-agnostic (though assuming
"C" for things like decimal separators).
svn path=/trunk/netsurf/; revision=4153