Split documentation into files by topic and convert to AsciiDoc format.

svn path=/trunk/netsurf/; revision=2975
James Bursa 2006-10-02 21:58:33 +00:00
parent 118f435133
commit c88b268f84
5 changed files with 224 additions and 0 deletions

58
Docs/00-overview Normal file

@ -0,0 +1,58 @@
NetSurf Documentation for Developers
====================================
The documents in this directory describe how the NetSurf code works, and any
other information useful to developers.
Directory Structure
-------------------
The source is split at top level as follows:
content:: Fetching, managing, and converting content
render:: HTML processing and layout
css:: CSS parser
image:: Image conversion
desktop:: Non-platform specific front-end
riscos:: RISC OS specific code
debug:: Unix debug build specific code
gtk:: GTK specific code
utils:: Misc. useful functions
Other Documentation
-------------------
RISC OS specific protocols:
- Plugin http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/funcspec.html[]
http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/browse-plugins.html[]
- URI http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/uri.html[]
- URL http://www.vigay.com/inet/inet_url.html[]
- Nested WIMP http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/nested.html[]
Specifications:
- HTML 4.01 http://www.w3.org/TR/html401/[]
(see also http://www.w3.org/MarkUp/[])
- XHTML 1.0 http://www.w3.org/TR/xhtml1/[]
- CSS 2.1 http://www.w3.org/TR/CSS21/[]
- HTTP/1.1 http://www.w3.org/Protocols/rfc2616/rfc2616.html[]
and errata http://purl.org/NET/http-errata[]
(see also http://www.w3.org/Protocols/[])
- HTTP Authentication http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2617.html[]
- PNG http://www.w3.org/Graphics/PNG/[]
- URI http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2396.html[]
(see also http://www.w3.org/Addressing/[] and RFC 2616)
- Cookies http://wp.netscape.com/newsref/std/cookie_spec.html[] and
http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2109.html[]
Libraries
---------
Get these compiled for RISC OS with headers from
http://netsurf.strcprstskrzkrk.co.uk/developer/[]
- libxml (XML and HTML parser) http://www.xmlsoft.org/[]
- libcurl (HTTP, FTP, etc) http://curl.haxx.se/libcurl/[]
- OSLib (C interface to RISC OS SWIs) http://ro-oslib.sourceforge.net/[]
- libmng (PNG, JNG, MNG support) http://www.libmng.com/[]
- libjpeg (JPEG support) http://www.ijg.org/[]
- zlib http://www.gzip.org/zlib/[]
- OpenSSL (HTTPS support) http://www.openssl.org/[]

24
Docs/01-content Normal file

@ -0,0 +1,24 @@
Fetching, managing, and converting content
==========================================
The modules in the content directory provide the infrastructure for fetching
data, managing it in memory, and converting it for display.
Struct Content
--------------
The data for each URL is stored in a struct ::content. This structure contains the
content_type and a union with fields for each type of data (HTML, CSS,
images). The content_* functions provide a general interface for handling these
structures. For example, content_redraw() calls html_redraw() or
nsjpeg_redraw(), etc., depending on the type of content. See content.h and
content.c.
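
As a rough illustration of this pattern, the sketch below shows a type tag, a union, and a generic function that dispatches on the tag. It is not the real definition from content.h: the type list, the union fields, the stub handlers, and the name content_redraw_sketch() are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down type list; the real one lives in content.h. */
typedef enum { CONTENT_HTML, CONTENT_CSS, CONTENT_JPEG } content_type;

struct content {
	content_type type;	/* selects which union member is valid */
	union {
		struct { int box_count; } html;		/* placeholder fields */
		struct { int rule_count; } css;
		struct { int width, height; } jpeg;
	} data;
};

/* Stand-ins for the per-type handlers such as html_redraw(). */
static int html_redraw_stub(const struct content *c)
{
	return c->data.html.box_count;
}

static int jpeg_redraw_stub(const struct content *c)
{
	return c->data.jpeg.width * c->data.jpeg.height;
}

/* Generic entry point: dispatch on the type tag.
 * Returns the handler's result, or -1 if the type cannot be drawn. */
int content_redraw_sketch(const struct content *c)
{
	assert(c != NULL);
	switch (c->type) {
	case CONTENT_HTML:
		return html_redraw_stub(c);
	case CONTENT_JPEG:
		return jpeg_redraw_stub(c);
	default:
		return -1;	/* in this sketch, CSS has nothing to draw */
	}
}
```

Callers only ever see the generic function, so supporting a new type in this sketch means adding a union member and one case to the switch.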
Fetching
--------
A high-level interface for fetching and converting a URL is provided by the
fetchcache functions, which check the memory cache for a URL and, if it is not
present, fetch, convert, and cache it. See fetchcache.h and fetchcache.c.
The fetch module provides a low-level URL fetching interface. See fetch.h and
fetch.c.
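
The check-then-fetch behaviour described for the fetchcache functions can be sketched as follows. This is a toy model only: the fixed-size array, the struct cache_entry type, the stub fetcher, and the name fetchcache_sketch() are assumptions for illustration, and the real interface in fetchcache.h is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8

/* Hypothetical cache entry: a URL and its converted data. */
struct cache_entry {
	const char *url;
	const char *data;
};

static struct cache_entry cache[CACHE_SLOTS];
static int cache_used = 0;
static int fetch_count = 0;	/* counts calls to the low-level fetch */

/* Stand-in for the low-level fetch plus the conversion step. */
static const char *fetch_and_convert_stub(const char *url)
{
	(void) url;
	fetch_count++;
	return "converted content";
}

/* Return cached data for url, fetching and caching it on a miss.
 * Returns NULL if the cache is full. */
const char *fetchcache_sketch(const char *url)
{
	int i;

	assert(url != NULL);
	for (i = 0; i != cache_used; i++)
		if (strcmp(cache[i].url, url) == 0)
			return cache[i].data;	/* cache hit */
	if (cache_used == CACHE_SLOTS)
		return NULL;
	cache[cache_used].url = url;	/* caller keeps url alive */
	cache[cache_used].data = fetch_and_convert_stub(url);
	return cache[cache_used++].data;
}
```

Requesting the same URL twice in this sketch calls the stub fetcher only once; the second request is served from the array.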

31
Docs/02-layout Normal file

@ -0,0 +1,31 @@
HTML processing and layout
==========================
The modules in the render directory process and lay out HTML pages.
Overview
--------
This is the process to render an HTML document:

First the HTML is parsed to a tree of xmlNodes using the HTML parser in libxml.
This happens simultaneously with the fetch [html_process_data()].

Any stylesheets which the document depends on are fetched and parsed.

The tree is converted to a 'box tree' by xml_to_box(). The box tree contains a
node for each block, inline element, table, etc. The aim of this stage is to
determine the 'display' or 'float' CSS property of each element, and create the
corresponding node in the box tree. At this stage the style for each element is
also calculated (from CSS rules and element attributes). The tree is normalised
so that each node only has children of permitted types (e.g. TABLE_CELLs must be
within TABLE_ROWs) by adding missing boxes.

The box tree is passed to the layout engine [layout_document()], which finds the
space required by each element and assigns coordinates to the boxes, based on
the style of each element and the available width. This includes formatting
inline elements into lines, laying out tables, and positioning floats. The
layout engine can be invoked again on an already laid out box tree to reformat it
to a new width. Coordinates in the box tree are relative to the position of the
parent node.

The box tree can then be rendered using each node's coordinates.
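
Because coordinates are relative to the parent, a renderer has to add up the offsets along the parent chain to find where a box ends up on the page. The sketch below shows that calculation. The struct box_sketch type and the function name are assumptions for illustration; the real box structure has many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down box. */
struct box_sketch {
	int x, y;			/* position relative to parent */
	struct box_sketch *parent;	/* NULL for the root box */
};

/* Sum the relative offsets up the parent chain to get the position of
 * a box relative to the root of the tree. */
void box_absolute_position(const struct box_sketch *box, int *x, int *y)
{
	assert(box != NULL && x != NULL && y != NULL);
	*x = 0;
	*y = 0;
	for (; box != NULL; box = box->parent) {
		*x += box->x;
		*y += box->y;
	}
}
```

One consequence of relative coordinates in this sketch is that moving a parent box moves its whole subtree without touching the children.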

81
Docs/03-css Normal file

@ -0,0 +1,81 @@
CSS parser
==========
CSS is tokenised by a re2c-generated scanner (scanner.l), and then parsed into a
memory representation by a lemon-generated parser (parser.y, ruleset.c).
Styles are retrieved using css_get_style(). They can be cascaded by
css_cascade().
Implementing a new CSS property
-------------------------------
In this section I go through adding a CSS property to NetSurf, using the
'white-space' property as an example. -- James Bursa
First read and understand the description of the property in the CSS
specification (I have worked from CSS 2, but now 2.1 is probably better).
Add the property to css_enums. This file is used to generate css_enum.h and
css_enum.c:
css_white_space inherit normal nowrap pre
(I'm not doing pre-wrap and pre-line for now.)
Add fields to struct css_style to represent the property:
css_white_space white_space;
Add a parser function for the property to ruleset.c. Declare a new function:
static void parse_white_space(struct css_style * const s, const struct css_node * const v);
and add it to property_table:
{ "white-space", parse_white_space },
This will cause the function to be called when the parser comes to a rule giving
a value for white-space. The function is passed a linked list of struct
::css_node, each of which corresponds to a token in the CSS source, and must
update s to correspond to that rule. For white-space, the implementation is
simply:
  void parse_white_space(struct css_style * const s, const struct css_node * const v)
  {
      css_white_space z;
      if (v->type != CSS_NODE_IDENT || v->next != 0)
          return;
      z = css_white_space_parse(v->data, v->data_length);
      if (z != CSS_WHITE_SPACE_UNKNOWN)
          s->white_space = z;
  }
First we check that the value consists of exactly one identifier, as described
in the specification. If it is not, we ignore it, since it may be some future
CSS. The css_white_space_parse() function is generated in css_enum.c, and
converts a string giving a value to a constant. If the conversion succeeds, the
style s is updated.
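
To make the string-to-constant step concrete, here is a hand-written stand-in for what the generated function might do. The enum order, the UNKNOWN constant's position, and the name css_white_space_parse_sketch() are assumptions; the real code is generated from css_enums and may differ. The comparison here is case-sensitive for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed shape of the generated enum, based on the css_enums line. */
typedef enum {
	CSS_WHITE_SPACE_INHERIT,
	CSS_WHITE_SPACE_NORMAL,
	CSS_WHITE_SPACE_NOWRAP,
	CSS_WHITE_SPACE_PRE,
	CSS_WHITE_SPACE_UNKNOWN
} css_white_space;

/* Map a string of the given length (not necessarily NUL-terminated)
 * to a constant, or CSS_WHITE_SPACE_UNKNOWN if nothing matches. */
css_white_space css_white_space_parse_sketch(const char *s, size_t length)
{
	/* names[] is in the same order as the enum above */
	static const char * const names[] = {
		"inherit", "normal", "nowrap", "pre"
	};
	size_t i;

	assert(s != NULL);
	for (i = 0; i != sizeof names / sizeof names[0]; i++)
		if (strlen(names[i]) == length &&
				strncmp(names[i], s, length) == 0)
			return (css_white_space) i;
	return CSS_WHITE_SPACE_UNKNOWN;
}
```

Values that were left out of css_enums, such as pre-wrap, come back as the UNKNOWN constant, so a parser function like the one above leaves the style untouched for them.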
Add defaults for the style to css_base_style, css_empty_style, and
css_blank_style in css.c. The value in css_base_style should be the one given as
'Initial' in the spec, and the value in css_empty_style should be inherit. If
'Inherited' is yes in the spec, the value in css_blank_style should be inherit,
otherwise it should be the one given as 'Initial'. Thus for white-space, which
has "Initial: normal, Inherited: yes" in the spec, we use CSS_WHITE_SPACE_NORMAL
in css_base_style and CSS_WHITE_SPACE_INHERIT in the other two.
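
The rule for choosing the three defaults can be written down as a small function. This is only a way of restating the rule; the type and function names are hypothetical, and in css.c the values are written directly into the three style initialisers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical short names for the white-space constants. */
typedef enum { WS_INHERIT, WS_NORMAL, WS_NOWRAP, WS_PRE } ws_value;

struct ws_defaults {
	ws_value base;		/* value for css_base_style */
	ws_value empty;		/* value for css_empty_style */
	ws_value blank;		/* value for css_blank_style */
};

/* Apply the rule from the text: base gets 'Initial', empty gets inherit,
 * and blank gets inherit only if the property is inherited. */
struct ws_defaults ws_defaults_for(ws_value initial, bool inherited)
{
	struct ws_defaults d;

	assert(initial != WS_INHERIT);	/* 'Initial' is a concrete value */
	d.base = initial;
	d.empty = WS_INHERIT;
	d.blank = inherited ? WS_INHERIT : initial;
	return d;
}
```

For white-space, ws_defaults_for(WS_NORMAL, true) gives normal for the base style and inherit for the other two, matching the values listed above.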
Edit css_cascade() and css_merge() in css.c to handle the property. In both
cases for white-space this looks like:
  if (apply->white_space != CSS_WHITE_SPACE_INHERIT)
      style->white_space = apply->white_space;
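
The effect of this test is easiest to see on a one-field style. The sketch below uses a hypothetical struct style_sketch and function name rather than the real struct css_style and css_cascade(), which handle every property.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical short names for the white-space constants. */
typedef enum { CWS_INHERIT, CWS_NORMAL, CWS_NOWRAP, CWS_PRE } cws_value;

struct style_sketch {
	cws_value white_space;
};

/* Overlay 'apply' onto 'style': an inherit value leaves the existing
 * value in place, anything else overrides it. */
void cascade_sketch(struct style_sketch *style,
		const struct style_sketch *apply)
{
	assert(style != NULL && apply != NULL);
	if (apply->white_space != CWS_INHERIT)
		style->white_space = apply->white_space;
}
```

Starting from normal, applying a style whose value is inherit leaves normal in place, while applying one whose value is pre replaces it.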
Add the property to css_dump_style() (not essential).
Now the box, layout, and/or redraw code needs to be changed to use the new
style property. How much work this takes varies considerably from property to
property.
For white-space, convert_xml_to_box() was changed to split text at newlines if
white-space was pre, and to replace spaces with hard spaces for nowrap.
Additionally, calculate_inline_container_widths() was changed to give the
appropriate minimum width for pre and nowrap.
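
The nowrap change can be illustrated with a small helper that swaps ordinary spaces for hard spaces. This is a sketch of the idea, not the code in convert_xml_to_box(): the function name is hypothetical, and the hard space is assumed to be the Latin-1 no-break space (0xa0), which may not match the encoding NetSurf uses internally.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: hard space is the Latin-1 no-break space. */
#define HARD_SPACE ((char) 0xa0)

/* Replace every ordinary space in text[0..length) with a hard space,
 * so that a line breaker which only breaks at ' ' never splits it.
 * Returns the number of spaces replaced. */
size_t nowrap_spaces_sketch(char *text, size_t length)
{
	size_t i, replaced = 0;

	assert(text != NULL || length == 0);
	for (i = 0; i != length; i++) {
		if (text[i] == ' ') {
			text[i] = HARD_SPACE;
			replaced++;
		}
	}
	return replaced;
}
```

The appeal of this approach is that the line-breaking code needs no special case for nowrap: the text simply contains no break opportunities.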

30
Docs/04-errors Normal file

@ -0,0 +1,30 @@
Error handling
==============
This section describes error handling in the code.
The most common serious error is memory exhaustion. If malloc(), strdup(), or a
similar function fails, clean up and free any partially complete structures,
leaving data in a consistent state, and return a value which indicates failure,
e.g. 0 for functions which return a pointer (document the value in the function
documentation). The caller should then propagate the failure up in the same way.
At some point, the error should stop being passed up and be reported to the user
using
warn_user("NoMemory", 0);
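
A minimal sketch of the clean-up-and-return-0 convention is given below. The struct named_buffer type and both functions are hypothetical and exist only to show the pattern; they are not part of NetSurf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure with two separately allocated members. */
struct named_buffer {
	char *name;
	char *data;
};

/* Create a named_buffer. Returns 0 on memory exhaustion, in which case
 * nothing is left allocated. */
struct named_buffer *named_buffer_create(const char *name, size_t size)
{
	struct named_buffer *b;

	assert(name != NULL);
	b = malloc(sizeof *b);
	if (!b)
		return 0;
	b->name = malloc(strlen(name) + 1);
	b->data = malloc(size ? size : 1);
	if (!b->name || !b->data) {
		/* free whatever was allocated; free(0) is a no-op */
		free(b->name);
		free(b->data);
		free(b);
		return 0;
	}
	strcpy(b->name, name);
	return b;
}

/* Free a named_buffer; accepts 0. */
void named_buffer_destroy(struct named_buffer *b)
{
	if (!b)
		return;
	free(b->name);
	free(b->data);
	free(b);
}
```

A caller would test the result for 0 and either return its own failure value or, at the level where the user should be told, call warn_user("NoMemory", 0).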
The other common error is one returned by a RISC OS SWI. Always use "X" SWIs,
something like this:
  os_error *error;
  error = xwimp_get_pointer_info(&pointer);
  if (error) {
      LOG(("xwimp_get_pointer_info: 0x%x: %s\n",
              error->errnum, error->errmess));
      warn_user("WimpError", error->errmess);
      return false;
  }
If an error occurs during initialisation, in most cases exit immediately using
die(), since this indicates that there is already insufficient memory, or a
resource file is corrupted, etc.