Fetching, managing, and converting content
==========================================

The modules in the content directory provide the infrastructure for fetching
data, managing it in memory, and converting it for display.

Contents
--------
The data related to each URL used by NetSurf is stored in a 'struct content'
(known as a "content"). A content contains:

* a 'content type' which corresponds to the MIME type of the URL (for example
  CONTENT_HTML, CONTENT_JPEG, or CONTENT_OTHER)
* a status (for example LOADING, DONE, or ERROR)
* type-independent data such as the URL and raw source bytes
* a union of structs for type-dependent data (for example 'struct
  content_html_data')

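As a rough illustration of the layout described above, a cut-down content
structure might look like the following. The field names, the status enum
spellings, and the members of the per-type structs are assumptions made for
this sketch; the real definitions live in content.h and hold much more.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the content types named in the text. */
typedef enum { CONTENT_HTML, CONTENT_JPEG, CONTENT_OTHER } content_type;

/* Assumed spellings for the three statuses listed above. */
typedef enum {
	CONTENT_STATUS_LOADING,
	CONTENT_STATUS_DONE,
	CONTENT_STATUS_ERROR
} content_status;

/* Placeholder type-dependent data (hypothetical members). */
struct content_html_data { int box_count; };
struct content_jpeg_data { int width, height; };

struct content {
	content_type type;        /* derived from the MIME type */
	content_status status;
	const char *url;          /* type-independent data */
	const char *source_data;  /* raw source bytes */
	size_t source_size;
	union {                   /* type-dependent data */
		struct content_html_data html;
		struct content_jpeg_data jpeg;
	} data;
	struct content *next;     /* link in the global content_list */
};

/* Return 1 if the JPEG member of the union is the one in use. */
int content_is_jpeg(const struct content *c)
{
	assert(c != NULL);
	return c->type == CONTENT_JPEG;
}
```

The type field acts as the tag for the union: code should only touch
data.jpeg after checking that the type is CONTENT_JPEG.
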
Contents are stored in a global linked list 'content_list', also known as the
"memory cache".

The content_* functions provide a general interface for handling these
structures. They use a table of handlers to call type-specific code
('handler_map'). For example, content_redraw() may call html_redraw() or
nsjpeg_redraw() depending on the type of content.

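The dispatch through 'handler_map' can be sketched as a table of function
pointers indexed by content type. The struct layout, the simplified redraw
signature, and the redraw_count field are assumptions for illustration only;
treating a missing handler as success is also a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
	CONTENT_HTML,
	CONTENT_JPEG,
	CONTENT_OTHER,
	CONTENT_TYPE_COUNT   /* hypothetical sentinel: number of types */
} content_type;

struct content {
	content_type type;
	int redraw_count;    /* hypothetical: lets the example observe calls */
};

/* One entry per type; NULL means the type does not provide the function. */
struct handler_entry {
	bool (*redraw)(struct content *c, int x, int y);
};

/* Stand-ins for the real type-specific redraw functions. */
bool html_redraw(struct content *c, int x, int y)
{
	(void)x; (void)y;
	c->redraw_count += 1;
	return true;
}

bool nsjpeg_redraw(struct content *c, int x, int y)
{
	(void)x; (void)y;
	c->redraw_count += 10;
	return true;
}

const struct handler_entry handler_map[CONTENT_TYPE_COUNT] = {
	[CONTENT_HTML]  = { html_redraw },
	[CONTENT_JPEG]  = { nsjpeg_redraw },
	[CONTENT_OTHER] = { NULL },
};

/* General interface: look up the handler for this type and call it. */
bool content_redraw(struct content *c, int x, int y)
{
	assert(c != NULL && c->type < CONTENT_TYPE_COUNT);
	if (handler_map[c->type].redraw == NULL)
		return true;   /* nothing to draw for this type */
	return handler_map[c->type].redraw(c, x, y);
}
```
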
Each content has a list of users. A user is a callback function which is sent a
message (called) when something interesting happens to the content (for example,
it's ready to be displayed). Examples of users are browser windows (of HTML
contents) and HTML contents (of JPEG contents).

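A minimal sketch of the user mechanism follows. The callback signature, the
fixed-size array (used here instead of a list to avoid allocation), and the
function names are assumptions for illustration, not the actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
	CONTENT_MSG_READY,
	CONTENT_MSG_DONE,
	CONTENT_MSG_ERROR
} content_msg;

struct content;

/* A user is a callback plus a private word handed back on each call. */
typedef void (*content_callback)(content_msg msg, struct content *c, void *pw);

#define MAX_USERS 8   /* hypothetical limit for this sketch */

struct content_user {
	content_callback callback;
	void *pw;
};

struct content {
	struct content_user users[MAX_USERS];
	size_t user_count;
};

/* Register a user; returns false if there is no room left. */
bool content_add_user(struct content *c, content_callback cb, void *pw)
{
	assert(c != NULL && cb != NULL);
	if (c->user_count == MAX_USERS)
		return false;
	c->users[c->user_count].callback = cb;
	c->users[c->user_count].pw = pw;
	c->user_count++;
	return true;
}

/* Send a message to every user of the content. */
void content_broadcast(struct content *c, content_msg msg)
{
	assert(c != NULL);
	for (size_t i = 0; i < c->user_count; i++)
		c->users[i].callback(msg, c, c->users[i].pw);
}

/* Example user: counts the messages it receives via its private word. */
void counting_user(content_msg msg, struct content *c, void *pw)
{
	(void)msg; (void)c;
	*(int *)pw += 1;
}
```

In this sketch a browser window and a parent HTML content would each register
their own callback, and both would be called for every broadcast message.
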
Some content types may not be shared among users: an HTML content is dependent
on the width of the window, so sharing by two or more windows wouldn't work.
Thus there may be more than one content with the same URL in memory.

Content status
--------------
The status of a content follows a fixed order. Certain content functions change
the status, and each change of status results in a message to all users of the
content:

- content_create() creates a content in status TYPE_UNKNOWN
- content_set_type() takes a content from TYPE_UNKNOWN to one of
  * LOADING (sends optional MSG_NEWPTR followed by MSG_LOADING)
  * ERROR (sends MSG_ERROR)
- content_process_data() takes a content from LOADING to one of
  * LOADING (no message)
  * ERROR (MSG_ERROR)
- content_convert() takes a content from LOADING to one of
  * READY (MSG_READY)
  * DONE (MSG_READY, MSG_DONE)
  * ERROR (MSG_ERROR)
- a content can move from READY to DONE by itself, for example HTML contents
  become DONE when all images are fetched and the document is reformatted
  (MSG_DONE)
- content_stop() aborts loading of a READY content and results in status DONE
  (MSG_DONE)

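The transitions listed above can be summarised as a small predicate. This is
only a sketch: the enum spellings for TYPE_UNKNOWN, LOADING, and ERROR are
assumed, and DONE and ERROR are treated as terminal because the list gives no
transitions out of them.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
	CONTENT_STATUS_TYPE_UNKNOWN,
	CONTENT_STATUS_LOADING,
	CONTENT_STATUS_READY,
	CONTENT_STATUS_DONE,
	CONTENT_STATUS_ERROR
} content_status;

/* Return true if the list above allows a content to go from 'from' to 'to'. */
bool content_status_may_follow(content_status from, content_status to)
{
	switch (from) {
	case CONTENT_STATUS_TYPE_UNKNOWN:   /* content_set_type() */
		return to == CONTENT_STATUS_LOADING ||
		       to == CONTENT_STATUS_ERROR;
	case CONTENT_STATUS_LOADING:        /* process_data() or convert() */
		return to == CONTENT_STATUS_LOADING ||
		       to == CONTENT_STATUS_READY ||
		       to == CONTENT_STATUS_DONE ||
		       to == CONTENT_STATUS_ERROR;
	case CONTENT_STATUS_READY:          /* by itself, or content_stop() */
		return to == CONTENT_STATUS_DONE;
	case CONTENT_STATUS_DONE:
	case CONTENT_STATUS_ERROR:
		return false;                   /* terminal in this sketch */
	}
	assert(0 && "unknown content status");
	return false;
}
```
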
Type functions
--------------
[[typefunc]]
The type-specific functions for a content are as follows (where 'type' is
replaced by a type-specific prefix such as html or nsjpeg):

type_create():: called to initialise type-specific fields in the content
  structure. Optional.
type_process_data():: called when some data arrives. Optional.
type_convert():: called when data has finished arriving. The content needs to be
  converted for display. Must set the status to one of
  CONTENT_STATUS_READY or CONTENT_STATUS_DONE if no error occurs.
  Optional, but probably required for non-trivial types.
type_reformat():: called when, for example, the window has been resized, and the
  content needs reformatting for the new size. Optional.
type_destroy():: called when the content is being destroyed. Free all resources.
  Optional.
type_redraw():: called to plot the content to screen.
type_redraw_tiled():: called to plot the content tiled across the screen.
  Optional.
type_stop():: called when the user interrupts in status CONTENT_STATUS_READY.
  Must stop any processing and set the status to CONTENT_STATUS_DONE.
  Required if and only if the status can be CONTENT_STATUS_READY.
type_open():: called when a window containing the content is opened. Probably
  only makes sense if no_share is set for the content type in
  handler_map. Optional.
type_close():: called when the window containing the content is closed.
  Optional.

If an error occurs in type_create(), type_process_data(), or type_convert(),
the function must broadcast CONTENT_MSG_ERROR and return false. The
type_destroy() function will be called soon after.

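The error convention can be sketched with a type_convert() for a made-up
content type called "example". Everything here other than the names
CONTENT_MSG_ERROR, CONTENT_STATUS_READY, and CONTENT_STATUS_DONE is an
assumption: the struct fields, the broadcast stand-in, and the rule that empty
source data counts as an error are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
	CONTENT_STATUS_LOADING,
	CONTENT_STATUS_READY,
	CONTENT_STATUS_DONE,
	CONTENT_STATUS_ERROR
} content_status;

typedef enum { CONTENT_MSG_ERROR } content_msg;

struct content {
	content_status status;
	size_t source_size;   /* number of raw source bytes received */
	int error_messages;   /* hypothetical: stands in for the user list */
};

/* Stand-in for the real broadcast: just records that a message was sent. */
void content_broadcast(struct content *c, content_msg msg)
{
	if (msg == CONTENT_MSG_ERROR)
		c->error_messages++;
}

/* Hypothetical type_convert() for the "example" type. */
bool example_convert(struct content *c)
{
	assert(c != NULL);
	if (c->source_size == 0) {
		/* Tell the users about the failure, then tell the caller. */
		content_broadcast(c, CONTENT_MSG_ERROR);
		return false;
	}
	/* Nothing further to fetch, so go straight to DONE. */
	c->status = CONTENT_STATUS_DONE;
	return true;
}
```

Because this type never enters CONTENT_STATUS_READY, it would not need a
type_stop() function.
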
Memory allocation
-----------------
Each content structure is allocated using talloc, and all data related to a
content should be allocated as a child block of the content structure using
talloc. This will ensure that all memory used by a content is freed.

Contents must keep an estimate of non-talloc allocations in the total_size
attribute. This is used to control the size of the memory cache.

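How total_size might feed into cache control can be sketched as follows. The
function names and the simple "sum and compare against a limit" policy are
assumptions for illustration; the sketch also ignores the talloc-managed part
of each content and says nothing about which contents would be evicted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct content {
	size_t total_size;      /* estimate kept up to date by the type code */
	struct content *next;   /* link in the memory cache list */
};

/* Add up the size estimates of every content in the list. */
size_t content_list_usage(const struct content *list)
{
	size_t used = 0;
	for (const struct content *c = list; c != NULL; c = c->next)
		used += c->total_size;
	return used;
}

/* Return true if the cache holds more than 'limit' bytes by estimate. */
bool content_list_over_limit(const struct content *list, size_t limit)
{
	assert(limit > 0);   /* a zero limit would reject every content */
	return content_list_usage(list) > limit;
}
```
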
Creating and fetching contents
------------------------------
A high-level interface for starting the process of fetching and converting a URL
is provided by the fetchcache functions, which check the memory cache for a URL
and fetch, convert, and cache it if not present.

The fetch module provides a low-level URL fetching interface.

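The "check the cache, otherwise create" step can be sketched like this. The
function names are hypothetical and are not the fetchcache API; the per-content
no_share flag is a simplification of the per-type setting, and the caller
passes in the new content instead of the sketch allocating one and starting a
fetch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct content {
	const char *url;
	bool no_share;          /* e.g. HTML, which depends on window width */
	struct content *next;
};

/* The memory cache: a global linked list of contents. */
struct content *content_list = NULL;

/* Find a cached content for url that may be shared, or return NULL. */
struct content *cache_lookup(const char *url)
{
	assert(url != NULL);
	for (struct content *c = content_list; c != NULL; c = c->next)
		if (!c->no_share && strcmp(c->url, url) == 0)
			return c;
	return NULL;
}

/* Reuse a cached content if possible; otherwise link 'fresh' into the
 * cache. In a real implementation the fetch would be started here. */
struct content *fetchcache_sketch(const char *url, struct content *fresh)
{
	assert(fresh != NULL);
	struct content *c = cache_lookup(url);
	if (c != NULL)
		return c;
	fresh->url = url;
	fresh->next = content_list;
	content_list = fresh;
	return fresh;
}
```

A second request for a shareable URL returns the cached content, while a
no_share content is never returned from the cache, so each request for it
produces a separate entry.
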
Adding support for a new content type
-------------------------------------
Adding support for a new content type is fairly simple. The process is as
follows:

- Implement, or at least stub out, the new content type handler. See the
  <<typefunc,Type Functions>> section above for details of the type handler API.
- Add a type value to the 'content_type' enumeration (content_type.h)
- Add an entry for the new type's private data in the 'data' union within
  'struct content' (content.h)
- Add appropriate mappings in the 'mime_map' table from MIME type strings to
  the 'content_type' value created. (content.c)
- Add a textual name for the new content type to 'content_type_name'. This
  array is indexed by 'content_type'. (content.c)
- Add an entry for the new content type's handler in the 'handler_map' array.
  This array is indexed by 'content_type'. (content.c)

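The 'mime_map' and 'content_type_name' tables mentioned above might be shaped
roughly as follows. The struct layout, the table contents, the name strings,
and the lookup function are assumptions for illustration; the sketch uses a
linear search with an exact string match, which may differ from content.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { CONTENT_HTML, CONTENT_JPEG, CONTENT_OTHER } content_type;

/* One mapping from a MIME type string to a content_type value. */
struct mime_entry {
	const char *mime_type;
	content_type type;
};

const struct mime_entry mime_map[] = {
	{ "image/jpeg", CONTENT_JPEG },
	{ "text/html",  CONTENT_HTML },
};

/* Textual names, indexed by content_type (hypothetical strings). */
const char *const content_type_name[] = {
	[CONTENT_HTML]  = "HTML",
	[CONTENT_JPEG]  = "JPEG",
	[CONTENT_OTHER] = "OTHER",
};

/* Map a MIME type to a content_type, falling back to CONTENT_OTHER. */
content_type lookup_content_type(const char *mime_type)
{
	assert(mime_type != NULL);
	for (size_t i = 0; i < sizeof mime_map / sizeof mime_map[0]; i++)
		if (strcmp(mime_map[i].mime_type, mime_type) == 0)
			return mime_map[i].type;
	return CONTENT_OTHER;
}
```

Supporting a new type in this sketch would mean adding an enum value, one or
more rows in mime_map, and a name in content_type_name, mirroring the steps in
the list above.
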
For examples of content type handlers, consult the image/ directory. The JPEG
handler is fairly self-explanatory.