Merge branch 'vince/llcache'

Vincent Sanders 2014-05-13 15:55:24 +01:00
commit 6c466c985f
24 changed files with 2780 additions and 247 deletions


@ -0,0 +1,194 @@
Source Object (low level) cache backing store
=============================================
Introduction
------------
The source object cache provides a system to extend the life of source
objects (html files, images etc.) after they are no longer immediately
being used.
Only fetch types where we have well defined rules on caching are
considered; in practice this limits us to HTTP(S). The section in
RFC2616 [1] on caching specifies these rules.
To further extend the objects' lifetime they can be pushed into a
backing store where the objects are available for reuse less quickly
than from RAM but faster than retrieving from the network again.
The backing store implementation provides a key:value infrastructure
with a simple store, retrieve and invalidate interface.
Generic filesystem backing store
--------------------------------
Although the backing store interface is fully pluggable, a generic
implementation is provided which stores objects on the filesystem in a
hierarchy of directories.
The option to alter the backing store format exists and is controlled
by a version field. It is implementation defined what happens if a
version mismatch occurs.
As the backing store only holds cache data one should not expect a
great deal of effort to be expended converting formats (i.e. the cache
may simply be discarded).
Layout version 1
----------------
An object has an identifier value generated from the url (NetSurf
backing stores use the url as the unique key). The value used is
obtained using nsurl_hash() which is currently a 32 bit FNV hash so is
directly usable.
This identifier is adequate to ensure the collision rate for the
hashed url values (a collision for every 2^16 urls added) is
sufficiently low that the overhead of returning the wrong object (which
backing stores are permitted to do) is not significant.
An entry list is maintained which contains all the metadata about a
given identifier. This list is limited in length to constrain the
resources necessary to maintain it. It is made persistent to avoid the
overhead of reconstructing it at initialisation and to keep the data
used to improve the eviction decisions.
Each object is stored in and retrieved directly from the filesystem
using a filename generated from a base64url encoding of an address
value. The object's address is derived from the identifier by cropping
it to a shorter length.
A mapping between the object address and its entry is maintained which
uses storage directly proportional to the size of the address length.
The cropping length is stored in the control file with the default
values set at compile time. This allows existing backing stores to
continue operating with existing data independently of new default
settings. This setting gives some ability to tune the default cache
index size to values suitable for a specific host operating system.
E.g. Linux based systems can easily cope with several megabytes of
mmapped index but RISC OS might want to limit this to a few megabytes
of heap at most.
The files are stored on disc using their base64url address value.
By creating a directory for each character of the encoded filename
(except the last which is of course the leafname) we create a
directory structure where no directory has more than 64 entries.
E.g. A 19bit address of 0x1 would be base64url encoded into AAAB
resulting in the data being stored in a file path of
"/store/prefix/data/B/A/A/BAAAAA".
An address of 0x00040001 encodes to BAAB and a file path of
"/store/prefix/meta/B/A/A/BAABAA"
Control files
~~~~~~~~~~~~~
control
+++++++
A control file is used to hold a list of values describing how the
other files in the backing store should be used.
entries
+++++++
This file contains a table of entries describing the files held on the
filesystem.
Each control file table entry is 28 bytes and consists of
- signed 64 bit value for last use time
- 32bit full url hash allowing for index reconstruction and
additional collision detection. Also the possibility of increasing
the ADDRESS_LENGTH although this would require renaming all the
existing files in the cache and is not currently implemented.
- unsigned 32bit length for data
- unsigned 32bit length for metadata
- unsigned 16bit value for number of times used.
- unsigned 16bit value for flags
- unsigned 16bit value for data block index (unused)
- unsigned 16bit value for metadata block index (unused)
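The 28 byte figure can be checked by adding up the field widths listed
above. The following is only a sketch: the struct and field names are
hypothetical and are not taken from the implementation, only the widths
and their order come from the list. Note that a compiler is free to pad
such a struct (commonly to 32 bytes because of the 64 bit member), so
the in-memory size need not equal the 28 bytes of a stored entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the field widths and order follow the list. */
struct demo_store_entry {
	int64_t last_used;    /* signed 64 bit last use time */
	uint32_t url_hash;    /* full 32 bit url hash */
	uint32_t data_len;    /* length of data */
	uint32_t meta_len;    /* length of metadata */
	uint16_t use_count;   /* number of times used */
	uint16_t flags;       /* entry flags */
	uint16_t data_block;  /* data block index (unused) */
	uint16_t meta_block;  /* metadata block index (unused) */
};

/* Sum of the field widths, i.e. the size of one entry without padding. */
size_t demo_entry_packed_size(void)
{
	return sizeof(int64_t) + 3 * sizeof(uint32_t) + 4 * sizeof(uint16_t);
}
```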
Address to entry index
~~~~~~~~~~~~~~~~~~~~~~
An entry index is held in RAM that allows looking up the address to
map to an entry in the control file.
The index is the only data structure whose size is directly dependent
on the length of the hash, specifically:
(2 ^ (ADDRESS_BITS - 3)) * ENTRY_BITS in bytes
where ADDRESS_BITS is how long the address is in bits and ENTRY_BITS
is how many bits are needed to number the entries the control file
(and hence the whole cache) may hold.
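As a worked example of the sizing, the sketch below evaluates the index
formula and the size of the entries list (2 ^ ENTRY_BITS entries of 28
bytes each). The function names are hypothetical and chosen for
illustration only. The 512 kilobyte and 8 megabyte index figures quoted
in this document are what the formula gives when ENTRY_BITS is 16.

```c
#include <stdint.h>

/* Index size in bytes: (2 ^ (ADDRESS_BITS - 3)) * ENTRY_BITS.
 * Assumes address_bits is at least 3. */
uint64_t demo_index_bytes(unsigned int address_bits, unsigned int entry_bits)
{
	return (UINT64_C(1) << (address_bits - 3)) * entry_bits;
}

/* Entries list size in bytes: 2 ^ ENTRY_BITS entries of 28 bytes each. */
uint64_t demo_entries_bytes(unsigned int entry_bits)
{
	return (UINT64_C(1) << entry_bits) * 28;
}
```

For instance demo_index_bytes(18, 16) is 524288 bytes (512 kilobytes)
and demo_entries_bytes(14) is 458752 bytes (448 kilobytes).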
RISC OS values
++++++++++++++
By limiting the ENTRY_BITS size to 14 (16,384 entries) the entries
list is limited to 448 kilobytes.
The typical values for RISC OS would set ADDRESS_BITS to 18. This
spreads the entries over 262144 hash values which uses 512 kilobytes
for the index. Limiting the hash space like this reduces the
effectiveness of the cache.
A small ADDRESS_LENGTH causes a collision (two urls with the same
address) to happen roughly for every 2 ^ (ADDRESS_BITS / 2) = 2 ^ 9 =
512 objects stored. This roughly translates to a cache miss due to
collision every ten pages navigated to.
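The collision estimate is the usual birthday bound approximation. A
minimal sketch of the arithmetic follows; the function name is
hypothetical and the result is the rough rule of thumb used in this
document rather than an exact probability.

```c
#include <stdint.h>

/* Rough number of objects stored per expected collision:
 * 2 ^ (ADDRESS_BITS / 2). Integer division is used for odd widths. */
uint64_t demo_collision_interval(unsigned int address_bits)
{
	return UINT64_C(1) << (address_bits / 2);
}
```

With this, 18 address bits give 512 objects per collision and 22
address bits give 2048.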
Larger systems
++++++++++++++
In general ENTRY_BITS is set to 16 as this limits the store to 65536
objects which, given the average size of an object at 8 kilobytes,
yields half a gigabyte of disc used which is judged to be sufficient.
For larger systems, e.g. those using the GTK frontend, we would most
likely select ADDRESS_BITS as 22 resulting in a collision every 2048
objects but the index using some 8 megabytes.
Typical values
--------------
Example 1
~~~~~~~~~
For a store with 1034 objects generated from a random navigation of
pages linked from the about:welcome page.
Metadata total size is 593608 bytes, an average of 574 bytes. The
majority of the storage is used to hold the urls and headers.
Data total size is 9180475 bytes, a mean of 8879 bytes. 1648726 bytes
are in the largest 10 entries which, if excluded, gives 7355 bytes
average size.
Example 2
~~~~~~~~~
355 pages navigated in 80 minutes from about:welcome page and a
handful of additional sites (google image search and reddit)
2018 objects in cache at quit. 400 objects from news.bbc.co.uk alone
Metadata total 987,439 bytes mean of 489 bytes
data total 33,127,831 bytes mean of 16,416 bytes
with one single 5,000,811 byte gif
data totals without gif is 28,127,020 mean 13,945
[1] http://tools.ietf.org/html/rfc2616#section-13


@ -86,6 +86,11 @@ NETSURF_HOMEPAGE := "about:welcome"
# Valid options: YES, NO # Valid options: YES, NO
NETSURF_USE_LIBICONV_PLUG := YES NETSURF_USE_LIBICONV_PLUG := YES
# Enable building the source object cache filesystem based backing store
# implementation.
# Valid options: YES, NO
NETSURF_FS_BACKING_STORE := NO
# Initial CFLAGS. Optimisation level etc. tend to be target specific. # Initial CFLAGS. Optimisation level etc. tend to be target specific.
CFLAGS := CFLAGS :=


@ -5294,7 +5294,7 @@ int main(int argc, char** argv)
if (ami_locate_resource(messages, "Messages") == false) if (ami_locate_resource(messages, "Messages") == false)
die("Cannot open Messages file"); die("Cannot open Messages file");
ret = netsurf_init(messages); ret = netsurf_init(messages, NULL);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");
} }


@ -1126,7 +1126,7 @@ int main(int argc, char** argv)
/* common initialisation */ /* common initialisation */
LOG(("Initialising core...")); LOG(("Initialising core..."));
ret = netsurf_init(messages); ret = netsurf_init(messages, NULL);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");
} }


@ -1062,7 +1062,7 @@ int main(int argc, char** argv)
/* common initialisation */ /* common initialisation */
BPath messages = get_messages_path(); BPath messages = get_messages_path();
ret = netsurf_init(messages.Path()); ret = netsurf_init(messages.Path(), NULL);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");
} }
@ -1115,7 +1115,7 @@ int gui_init_replicant(int argc, char** argv)
/* common initialisation */ /* common initialisation */
BPath messages = get_messages_path(); BPath messages = get_messages_path();
ret = netsurf_init(messages.Path()); ret = netsurf_init(messages.Path(), NULL);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
// FIXME: must not die when in replicant! // FIXME: must not die when in replicant!
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");


@ -217,7 +217,7 @@ int main( int argc, char **argv )
nsoption_commandline(&argc, argv, NULL); nsoption_commandline(&argc, argv, NULL);
/* common initialisation */ /* common initialisation */
error = netsurf_init(messages); error = netsurf_init(messages, NULL);
if (error != NSERROR_OK) { if (error != NSERROR_OK) {
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");
} }


@ -1,6 +1,11 @@
# Content sources # Content sources
S_CONTENT := content.c content_factory.c dirlist.c fetch.c hlcache.c \ S_CONTENT := content.c content_factory.c dirlist.c fetch.c hlcache.c \
llcache.c mimesniff.c urldb.c llcache.c mimesniff.c urldb.c no_backing_store.c
S_CONTENT := $(addprefix content/,$(S_CONTENT)) # Make filesystem backing store available
ifeq ($(NETSURF_FS_BACKING_STORE),YES)
S_CONTENT += fs_backing_store.c
endif
S_CONTENT := $(addprefix content/,$(S_CONTENT))

100
content/backing_store.h Normal file

@ -0,0 +1,100 @@
/*
* Copyright 2014 Vincent Sanders <vince@netsurf-browser.org>
*
* This file is part of NetSurf, http://www.netsurf-browser.org/
*
* NetSurf is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; version 2 of the License.
*
* NetSurf is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/** \file
* Low-level source data cache backing store interface
*/
#ifndef NETSURF_CONTENT_LLCACHE_PRIVATE_H_
#define NETSURF_CONTENT_LLCACHE_PRIVATE_H_
#include "content/llcache.h"
/** storage control flags */
enum backing_store_flags {
BACKING_STORE_NONE = 0, /**< no special processing */
BACKING_STORE_META = 1, /**< data is metadata */
BACKING_STORE_MMAP = 2, /**< when data is retrieved this indicates the
* returned buffer may be memory mapped,
* flag must be cleared if the storage is
* allocated and is not memory mapped.
*/
};
/** low level cache backing store operation table
*
* The low level cache (source objects) has the capability to make
* objects and their metadata (headers etc) persistent by writing to a
* backing store using these operations.
*/
struct gui_llcache_table {
/**
* Initialise the backing store.
*
* @param parameters to configure backing store.
* @return NSERROR_OK on success or error code on failure.
*/
nserror (*initialise)(const struct llcache_store_parameters *parameters);
/**
* Finalise the backing store.
*
* @return NSERROR_OK on success or error code on failure.
*/
nserror (*finalise)(void);
/**
* Place an object in the backing store.
*
* @param url The url is used as the unique primary key for the data.
* @param flags The flags to control how the object is stored.
* @param data The objects data.
* @param datalen The length of the \a data.
* @return NSERROR_OK on success or error code on failure.
*/
nserror (*store)(struct nsurl *url, enum backing_store_flags flags,
const uint8_t *data, const size_t datalen);
/**
* Retrieve an object from the backing store.
*
* @param url The url is used as the unique primary key for the data.
* @param flags The flags to control how the object is retrieved.
* @param data The objects data.
* @param datalen The length of the \a data retrieved.
* @return NSERROR_OK on success or error code on failure.
*/
nserror (*fetch)(struct nsurl *url, enum backing_store_flags *flags,
uint8_t **data, size_t *datalen);
/**
* Invalidate a source object from the backing store.
*
* The entry (if present in the backing store) must no longer
* be returned as a result to the fetch or meta operations.
*
* @param url The url is used as the unique primary key to invalidate.
* @return NSERROR_OK on success or error code on failure.
*/
nserror (*invalidate)(struct nsurl *url);
};
extern struct gui_llcache_table* null_llcache_table;
extern struct gui_llcache_table* filesystem_llcache_table;
#endif

1197
content/fs_backing_store.c Normal file

File diff suppressed because it is too large


@ -339,9 +339,10 @@ static nserror hlcache_migrate_ctx(hlcache_retrieval_ctx *ctx,
ctx->migrate_target = true; ctx->migrate_target = true;
if (effective_type != NULL && if ((effective_type != NULL) &&
hlcache_type_is_acceptable(effective_type, hlcache_type_is_acceptable(effective_type,
ctx->accepted_types, &type)) { ctx->accepted_types,
&type)) {
error = hlcache_find_content(ctx, effective_type); error = hlcache_find_content(ctx, effective_type);
if (error != NSERROR_OK && error != NSERROR_NEED_DATA) { if (error != NSERROR_OK && error != NSERROR_NEED_DATA) {
if (ctx->handle->cb != NULL) { if (ctx->handle->cb != NULL) {
@ -524,9 +525,7 @@ hlcache_initialise(const struct hlcache_parameters *hlcache_parameters)
return NSERROR_NOMEM; return NSERROR_NOMEM;
} }
ret = llcache_initialise(hlcache_parameters->cb, ret = llcache_initialise(&hlcache_parameters->llcache);
hlcache_parameters->cb_ctx,
hlcache_parameters->limit);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
free(hlcache); free(hlcache);
hlcache = NULL; hlcache = NULL;


@ -23,11 +23,12 @@
#ifndef NETSURF_CONTENT_HLCACHE_H_ #ifndef NETSURF_CONTENT_HLCACHE_H_
#define NETSURF_CONTENT_HLCACHE_H_ #define NETSURF_CONTENT_HLCACHE_H_
#include "content/content.h"
#include "content/llcache.h"
#include "utils/errors.h" #include "utils/errors.h"
#include "utils/nsurl.h" #include "utils/nsurl.h"
#include "content/content.h"
#include "content/llcache.h"
/** High-level cache handle */ /** High-level cache handle */
typedef struct hlcache_handle hlcache_handle; typedef struct hlcache_handle hlcache_handle;
@ -44,18 +45,10 @@ typedef struct {
} hlcache_event; } hlcache_event;
struct hlcache_parameters { struct hlcache_parameters {
llcache_query_callback cb; /**< Query handler for llcache */
void *cb_ctx; /**< Pointer to llcache query handler data */
/** How frequently the background cache clean process is run (ms) */ /** How frequently the background cache clean process is run (ms) */
unsigned int bg_clean_time; unsigned int bg_clean_time;
/** The target upper bound for the cache size */ struct llcache_parameters llcache;
size_t limit;
/** The hysteresis allowed round the target size */
size_t hysteresis;
}; };
/** /**
@ -67,13 +60,13 @@ struct hlcache_parameters {
* \return NSERROR_OK on success, appropriate error otherwise. * \return NSERROR_OK on success, appropriate error otherwise.
*/ */
typedef nserror (*hlcache_handle_callback)(hlcache_handle *handle, typedef nserror (*hlcache_handle_callback)(hlcache_handle *handle,
const hlcache_event *event, void *pw); const hlcache_event *event, void *pw);
/** Flags for high-level cache object retrieval */ /** Flags for high-level cache object retrieval */
enum hlcache_retrieve_flag { enum hlcache_retrieve_flag {
/* Note: low-level cache retrieval flags occupy the bottom 16 bits of /* Note: low-level cache retrieval flags occupy the bottom 16 bits of
* the flags word. High-level cache flags occupy the top 16 bits. * the flags word. High-level cache flags occupy the top 16 bits.
* To avoid confusion, high-level flags are allocated from bit 31 down. * To avoid confusion, high-level flags are allocated from bit 31 down.
*/ */
/** It's permitted to convert this request into a download */ /** It's permitted to convert this request into a download */
HLCACHE_RETRIEVE_MAY_DOWNLOAD = (1 << 31), HLCACHE_RETRIEVE_MAY_DOWNLOAD = (1 << 31),
@ -84,7 +77,7 @@ enum hlcache_retrieve_flag {
/** /**
* Initialise the high-level cache, preparing the llcache also. * Initialise the high-level cache, preparing the llcache also.
* *
* \param hlcache_parameters Settings to initialise cache with * \param hlcache_parameters Settings to initialise cache with
* \return NSERROR_OK on success, appropriate error otherwise. * \return NSERROR_OK on success, appropriate error otherwise.
*/ */
nserror hlcache_initialise(const struct hlcache_parameters *hlcache_parameters); nserror hlcache_initialise(const struct hlcache_parameters *hlcache_parameters);
@ -133,7 +126,7 @@ nserror hlcache_poll(void);
nserror hlcache_handle_retrieve(nsurl *url, uint32_t flags, nserror hlcache_handle_retrieve(nsurl *url, uint32_t flags,
nsurl *referer, llcache_post_data *post, nsurl *referer, llcache_post_data *post,
hlcache_handle_callback cb, void *pw, hlcache_handle_callback cb, void *pw,
hlcache_child_context *child, hlcache_child_context *child,
content_type accepted_types, hlcache_handle **result); content_type accepted_types, hlcache_handle **result);
/** /**
@ -169,13 +162,13 @@ nserror hlcache_handle_replace_callback(hlcache_handle *handle,
* \param handle Cache handle to dereference * \param handle Cache handle to dereference
* \return Pointer to content object, or NULL if there is none * \return Pointer to content object, or NULL if there is none
* *
* \todo This may not be correct. Ideally, the client should never need to * \todo This may not be correct. Ideally, the client should never need to
* directly access a content object. It may, therefore, be better to provide a * directly access a content object. It may, therefore, be better to provide a
* bunch of veneers here that take a hlcache_handle and invoke the * bunch of veneers here that take a hlcache_handle and invoke the
* corresponding content_ API. If there's no content object associated with the * corresponding content_ API. If there's no content object associated with the
* hlcache_handle (e.g. because the source data is still being fetched, so it * hlcache_handle (e.g. because the source data is still being fetched, so it
* doesn't exist yet), then these veneers would behave as a NOP. The important * doesn't exist yet), then these veneers would behave as a NOP. The important
* thing being that the client need not care about this possibility and can * thing being that the client need not care about this possibility and can
* just call the functions with impunity. * just call the functions with impunity.
*/ */
struct content *hlcache_handle_get_content(const hlcache_handle *handle); struct content *hlcache_handle_get_content(const hlcache_handle *handle);

File diff suppressed because it is too large


@ -76,7 +76,7 @@ typedef struct {
} data; /**< Event data */ } data; /**< Event data */
} llcache_event; } llcache_event;
/** /**
* Client callback for low-level cache events * Client callback for low-level cache events
* *
* \param handle Handle for which event is issued * \param handle Handle for which event is issued
@ -84,18 +84,18 @@ typedef struct {
* \param pw Pointer to client-specific data * \param pw Pointer to client-specific data
* \return NSERROR_OK on success, appropriate error otherwise. * \return NSERROR_OK on success, appropriate error otherwise.
*/ */
typedef nserror (*llcache_handle_callback)(llcache_handle *handle, typedef nserror (*llcache_handle_callback)(llcache_handle *handle,
const llcache_event *event, void *pw); const llcache_event *event, void *pw);
/** Flags for low-level cache object retrieval */ /** Flags for low-level cache object retrieval */
enum llcache_retrieve_flag { enum llcache_retrieve_flag {
/* Note: We're permitted a maximum of 16 flags which must reside in the /* Note: We're permitted a maximum of 16 flags which must reside in the
* bottom 16 bits of the flags word. See hlcache.h for further details. * bottom 16 bits of the flags word. See hlcache.h for further details.
*/ */
/** Force a new fetch */ /** Force a new fetch */
LLCACHE_RETRIEVE_FORCE_FETCH = (1 << 0), LLCACHE_RETRIEVE_FORCE_FETCH = (1 << 0),
/** Requested URL was verified */ /** Requested URL was verified */
LLCACHE_RETRIEVE_VERIFIABLE = (1 << 1), LLCACHE_RETRIEVE_VERIFIABLE = (1 << 1),
/**< No error pages */ /**< No error pages */
LLCACHE_RETRIEVE_NO_ERROR_PAGES = (1 << 2), LLCACHE_RETRIEVE_NO_ERROR_PAGES = (1 << 2),
/**< Stream data (implies that object is not cacheable) */ /**< Stream data (implies that object is not cacheable) */
@ -149,13 +149,81 @@ typedef nserror (*llcache_query_response)(bool proceed, void *cbpw);
* \param cbpw Opaque value to pass into \a cb * \param cbpw Opaque value to pass into \a cb
* \return NSERROR_OK on success, appropriate error otherwise * \return NSERROR_OK on success, appropriate error otherwise
* *
* \note This callback should return immediately. Once a suitable answer to * \note This callback should return immediately. Once a suitable answer to
* the query has been obtained, the provided response callback should be * the query has been obtained, the provided response callback should be
* called. This is intended to be an entirely asynchronous process. * called. This is intended to be an entirely asynchronous process.
*/ */
typedef nserror (*llcache_query_callback)(const llcache_query *query, void *pw, typedef nserror (*llcache_query_callback)(const llcache_query *query, void *pw,
llcache_query_response cb, void *cbpw); llcache_query_response cb, void *cbpw);
/**
* Parameters to configure the low level cache backing store.
*/
struct llcache_store_parameters {
const char *path; /**< The path to the backing store */
size_t limit; /**< The backing store upper bound target size */
size_t hysteresis; /**< The hysteresis around the target size */
/** log2 of the default maximum number of entries the cache
* can track.
*
* If unset this defaults to 16 (65536 entries). The cache
* control file takes precedence so cache data remains
* portable between builds with differing defaults.
*/
unsigned int entry_size;
/** log2 of the default number of entries in the mapping between
* the url and cache entries.
*
* @note This is exposing an internal implementation detail of
* the filesystem based default backing store implementation.
* However it is likely any backing store implementation will
* need some way to map url to cache entries so it is a
* generally useful configuration value.
*
* Too small a value will cause unnecessary collisions and
* cache misses and larger values cause proportionally larger
* amounts of memory to be used.
*
* The "birthday paradox" means that the hash will experience
* a collision in every 2^(address_size/2) urls the cache
* stores.
*
* A value of 20 means one object stored in every 1024 will
* cause a collision and a cache miss while using two megabytes
* of storage.
*
* If unset this defaults to 20 (1048576 entries using two
* megabytes). The cache control file takes precedence so cache
* data remains portable between builds with differing
* defaults.
*/
unsigned int address_size;
};
/**
* Parameters to configure the low level cache.
*/
struct llcache_parameters {
llcache_query_callback cb; /**< Query handler for llcache */
void *cb_ctx; /**< Pointer to llcache query handler data */
size_t limit; /**< The target upper bound for the RAM cache size */
size_t hysteresis; /**< The hysteresis around the target size */
int minimum_lifetime; /**< The minimum lifetime to consider
* sending objects to backing store.
*/
size_t bandwidth; /**< The maximum bandwidth to allow the
* backing store to use.
*/
struct llcache_store_parameters store;
};
/** /**
* Initialise the low-level cache * Initialise the low-level cache
* *
@ -163,7 +231,7 @@ typedef nserror (*llcache_query_callback)(const llcache_query *query, void *pw,
* \param pw Pointer to query handler data * \param pw Pointer to query handler data
* \return NSERROR_OK on success, appropriate error otherwise. * \return NSERROR_OK on success, appropriate error otherwise.
*/ */
nserror llcache_initialise(llcache_query_callback cb, void *pw, uint32_t llcache_limit); nserror llcache_initialise(const struct llcache_parameters *parameters);
/** /**
* Finalise the low-level cache * Finalise the low-level cache
@ -280,12 +348,12 @@ const uint8_t *llcache_handle_get_source_data(const llcache_handle *handle,
* \return Header value, or NULL if header does not exist * \return Header value, or NULL if header does not exist
* *
* \todo Make the key an enumeration, to avoid needless string comparisons * \todo Make the key an enumeration, to avoid needless string comparisons
* \todo Forcing the client to parse the header value seems wrong. * \todo Forcing the client to parse the header value seems wrong.
* Better would be to return the actual value part and an array of * Better would be to return the actual value part and an array of
* key-value pairs for any additional parameters. * key-value pairs for any additional parameters.
* \todo Deal with multiple headers of the same key (e.g. Set-Cookie) * \todo Deal with multiple headers of the same key (e.g. Set-Cookie)
*/ */
const char *llcache_handle_get_header(const llcache_handle *handle, const char *llcache_handle_get_header(const llcache_handle *handle,
const char *key); const char *key);
/** /**
@ -295,7 +363,7 @@ const char *llcache_handle_get_header(const llcache_handle *handle,
* \param b Second handle * \param b Second handle
* \return True if handles reference the same object, false otherwise * \return True if handles reference the same object, false otherwise
*/ */
bool llcache_handle_references_same_object(const llcache_handle *a, bool llcache_handle_references_same_object(const llcache_handle *a,
const llcache_handle *b); const llcache_handle *b);
#endif #endif


@ -0,0 +1,68 @@
/*
* Copyright 2014 Vincent Sanders <vince@netsurf-browser.org>
*
* This file is part of NetSurf, http://www.netsurf-browser.org/
*
* NetSurf is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; version 2 of the License.
*
* NetSurf is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/** \file
* Low-level resource cache null persistent storage implementation.
*/
#include "utils/nsurl.h"
#include "content/backing_store.h"
/* default to disabled backing store */
static nserror initialise(const struct llcache_store_parameters *parameters)
{
return NSERROR_OK;
}
static nserror finalise(void)
{
return NSERROR_OK;
}
static nserror store(nsurl *url,
enum backing_store_flags flags,
const uint8_t *data,
const size_t datalen)
{
return NSERROR_SAVE_FAILED;
}
static nserror fetch(nsurl *url,
enum backing_store_flags *flags,
uint8_t **data_out,
size_t *datalen_out)
{
return NSERROR_NOT_FOUND;
}
static nserror invalidate(nsurl *url)
{
return NSERROR_NOT_FOUND;
}
static struct gui_llcache_table llcache_table = {
.initialise = initialise,
.finalise = finalise,
.store = store,
.fetch = fetch,
.invalidate = invalidate,
};
struct gui_llcache_table *null_llcache_table = &llcache_table;
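
The null table above accepts initialisation but never stores anything.
For contrast, the following is a minimal sketch of how store, fetch and
invalidate operations with the same general shape might behave for a
tiny in-RAM store. It is not NetSurf code: the demo_ names, the
demo_error type and the use of plain C strings as keys are all
assumptions made so the example is self-contained, and the flags
argument of the real interface is omitted.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for nserror; these are not the NetSurf definitions. */
typedef enum { DEMO_OK, DEMO_NOMEM, DEMO_NOT_FOUND } demo_error;

#define DEMO_SLOTS 64

struct demo_slot {
	char *key;      /* copy of the url string, NULL when the slot is free */
	uint8_t *data;  /* copy of the object data */
	size_t datalen; /* length of the object data */
};

static struct demo_slot demo_slots[DEMO_SLOTS];

/* Find the slot holding key, or return NULL if it is not stored. */
static struct demo_slot *demo_find(const char *key)
{
	size_t i;
	for (i = 0; i < DEMO_SLOTS; i++) {
		if (demo_slots[i].key != NULL &&
		    strcmp(demo_slots[i].key, key) == 0) {
			return &demo_slots[i];
		}
	}
	return NULL;
}

/* Remove an object so later fetches no longer return it. */
demo_error demo_invalidate(const char *key)
{
	struct demo_slot *slot = demo_find(key);
	if (slot == NULL) {
		return DEMO_NOT_FOUND;
	}
	free(slot->key);
	free(slot->data);
	slot->key = NULL;
	slot->data = NULL;
	slot->datalen = 0;
	return DEMO_OK;
}

/* Place a copy of the object in the store, replacing any previous one. */
demo_error demo_store(const char *key, const uint8_t *data, size_t datalen)
{
	struct demo_slot *slot = NULL;
	char *keycopy;
	uint8_t *datacopy;
	size_t i;

	(void)demo_invalidate(key);
	for (i = 0; i < DEMO_SLOTS; i++) {
		if (demo_slots[i].key == NULL) {
			slot = &demo_slots[i];
			break;
		}
	}
	if (slot == NULL) {
		return DEMO_NOMEM; /* store is full; a real store would evict */
	}
	keycopy = malloc(strlen(key) + 1);
	datacopy = malloc(datalen > 0 ? datalen : 1);
	if (keycopy == NULL || datacopy == NULL) {
		free(keycopy);
		free(datacopy);
		return DEMO_NOMEM;
	}
	strcpy(keycopy, key);
	if (datalen > 0) {
		memcpy(datacopy, data, datalen);
	}
	slot->key = keycopy;
	slot->data = datacopy;
	slot->datalen = datalen;
	return DEMO_OK;
}

/* Return a pointer to the stored data; the store keeps ownership. */
demo_error demo_fetch(const char *key, uint8_t **data_out, size_t *datalen_out)
{
	struct demo_slot *slot = demo_find(key);
	if (slot == NULL) {
		return DEMO_NOT_FOUND;
	}
	*data_out = slot->data;
	*datalen_out = slot->datalen;
	return DEMO_OK;
}
```

A real backing store would additionally need to bound its size and
decide what to evict, which this sketch deliberately leaves out.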


@ -69,6 +69,7 @@ struct hlcache_handle;
struct download_context; struct download_context;
struct nsurl; struct nsurl;
struct gui_file_table; struct gui_file_table;
struct gui_llcache_table;
typedef struct nsnsclipboard_styles { typedef struct nsnsclipboard_styles {
size_t start; /**< Start of run */ size_t start; /**< Start of run */
@ -520,7 +521,6 @@ struct gui_browser_table {
}; };
/** /**
* NetSurf operation function table * NetSurf operation function table
* *
@ -572,6 +572,15 @@ struct netsurf_table {
* Provides routines for the interactive text search on a page. * Provides routines for the interactive text search on a page.
*/ */
struct gui_search_table *search; struct gui_search_table *search;
/**
* Low level cache table.
*
* Used by the low level cache to push objects to persistent
* storage. The table is optional and may be NULL which
* uses the default implementation.
*/
struct gui_llcache_table *llcache;
}; };


@ -17,6 +17,8 @@
*/ */
#include "content/hlcache.h" #include "content/hlcache.h"
#include "content/backing_store.h"
#include "desktop/download.h" #include "desktop/download.h"
#include "desktop/gui_factory.h" #include "desktop/gui_factory.h"
#include "utils/file.h" #include "utils/file.h"
@ -25,7 +27,6 @@
struct netsurf_table *guit = NULL; struct netsurf_table *guit = NULL;
static void gui_default_window_set_title(struct gui_window *g, const char *title) static void gui_default_window_set_title(struct gui_window *g, const char *title)
{ {
} }
@ -400,6 +401,34 @@ static nserror verify_search_register(struct gui_search_table *gst)
return NSERROR_OK; return NSERROR_OK;
} }
/** verify low level cache persistent backing store table is valid */
static nserror verify_llcache_register(struct gui_llcache_table *glt)
{
/* check table is present */
if (glt == NULL) {
return NSERROR_BAD_PARAMETER;
}
/* mandatory operations */
if (glt->store == NULL) {
return NSERROR_BAD_PARAMETER;
}
if (glt->fetch == NULL) {
return NSERROR_BAD_PARAMETER;
}
if (glt->invalidate == NULL) {
return NSERROR_BAD_PARAMETER;
}
if (glt->initialise == NULL) {
return NSERROR_BAD_PARAMETER;
}
if (glt->finalise == NULL) {
return NSERROR_BAD_PARAMETER;
}
return NSERROR_OK;
}
static nsurl *gui_default_get_resource_url(const char *path) static nsurl *gui_default_get_resource_url(const char *path)
{ {
return NULL; return NULL;
@ -622,6 +651,16 @@ nserror gui_factory_register(struct netsurf_table *gt)
return err; return err;
} }
/* llcache table */
if (gt->llcache == NULL) {
/* set default backing store table */
gt->llcache = null_llcache_table;
}
err = verify_llcache_register(gt->llcache);
if (err != NSERROR_OK) {
return err;
}
guit = gt; guit = gt;
return NSERROR_OK; return NSERROR_OK;


@ -67,11 +67,23 @@
*/ */
#define SPECULATE_SMALL 4096 #define SPECULATE_SMALL 4096
/* the time between cache clean runs in ms */ /** the time between image cache clean runs in ms. */
#define IMAGE_CACHE_CLEAN_TIME (10 * 1000) #define IMAGE_CACHE_CLEAN_TIME (10 * 1000)
/** default time between content cache cleans. */
#define HL_CACHE_CLEAN_TIME (2 * IMAGE_CACHE_CLEAN_TIME) #define HL_CACHE_CLEAN_TIME (2 * IMAGE_CACHE_CLEAN_TIME)
/** default minimum object time before object is pushed to backing store. */
#define LLCACHE_MIN_DISC_LIFETIME (60 * 30)
/** default maximum bandwidth for backing store writeout. */
#define LLCACHE_MAX_DISC_BANDWIDTH (128 * 1024)
/** ensure there is a minimal amount of memory for source objects and
* decoded bitmaps.
*/
#define MINIMUM_MEMORY_CACHE_SIZE (2 * 1024 * 1024)
bool netsurf_quit = false; bool netsurf_quit = false;
static void netsurf_lwc_iterator(lwc_string *str, void *pw) static void netsurf_lwc_iterator(lwc_string *str, void *pw)
@ -108,8 +120,6 @@ static nserror netsurf_llcache_query_handler(const llcache_query *query,
return NSERROR_OK; return NSERROR_OK;
} }
#define MINIMUM_MEMORY_CACHE_SIZE (2 * 1024 * 1024)
/* exported interface documented in desktop/netsurf.h */ /* exported interface documented in desktop/netsurf.h */
nserror netsurf_register(struct netsurf_table *table) nserror netsurf_register(struct netsurf_table *table)
{ {
@ -118,14 +128,17 @@ nserror netsurf_register(struct netsurf_table *table)
} }
/* exported interface documented in desktop/netsurf.h */ /* exported interface documented in desktop/netsurf.h */
nserror netsurf_init(const char *messages) nserror netsurf_init(const char *messages, const char *store_path)
{ {
nserror error; nserror ret;
struct utsname utsname; struct utsname utsname;
nserror ret = NSERROR_OK;
struct hlcache_parameters hlcache_parameters = { struct hlcache_parameters hlcache_parameters = {
.bg_clean_time = HL_CACHE_CLEAN_TIME, .bg_clean_time = HL_CACHE_CLEAN_TIME,
.cb = netsurf_llcache_query_handler, .llcache = {
.cb = netsurf_llcache_query_handler,
.minimum_lifetime = LLCACHE_MIN_DISC_LIFETIME,
.bandwidth = LLCACHE_MAX_DISC_BANDWIDTH,
}
}; };
struct image_cache_parameters image_cache_parameters = { struct image_cache_parameters image_cache_parameters = {
.bg_clean_time = IMAGE_CACHE_CLEAN_TIME, .bg_clean_time = IMAGE_CACHE_CLEAN_TIME,
@ -155,75 +168,86 @@ nserror netsurf_init(const char *messages)
messages_load(messages); messages_load(messages);
/* corestrings init */ /* corestrings init */
error = corestrings_init(); ret = corestrings_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
/* set up cache limits based on the memory cache size option */ /* set up cache limits based on the memory cache size option */
hlcache_parameters.limit = nsoption_int(memory_cache_size); hlcache_parameters.llcache.limit = nsoption_int(memory_cache_size);
if (hlcache_parameters.limit < MINIMUM_MEMORY_CACHE_SIZE) { if (hlcache_parameters.llcache.limit < MINIMUM_MEMORY_CACHE_SIZE) {
hlcache_parameters.limit = MINIMUM_MEMORY_CACHE_SIZE; hlcache_parameters.llcache.limit = MINIMUM_MEMORY_CACHE_SIZE;
LOG(("Setting minimum memory cache size to %d", LOG(("Setting minimum memory cache size %d",
hlcache_parameters.limit)); hlcache_parameters.llcache.limit));
} }
/* image cache is 25% of total memory cache size */ /* image cache is 25% of total memory cache size */
image_cache_parameters.limit = (hlcache_parameters.limit * 25) / 100; image_cache_parameters.limit = (hlcache_parameters.llcache.limit * 25) / 100;
/* image cache hysteresis is 20% of the image cache size */ /* image cache hysteresis is 20% of the image cache size */
image_cache_parameters.hysteresis = (image_cache_parameters.limit * 20) / 100; image_cache_parameters.hysteresis = (image_cache_parameters.limit * 20) / 100;
/* account for image cache use from total */ /* account for image cache use from total */
hlcache_parameters.limit -= image_cache_parameters.limit; hlcache_parameters.llcache.limit -= image_cache_parameters.limit;
/* set backing store target limit */
hlcache_parameters.llcache.store.limit = nsoption_int(disc_cache_size);
/* set backing store hysteresis to 20% */
hlcache_parameters.llcache.store.hysteresis = (hlcache_parameters.llcache.store.limit * 20) / 100;
/* set the path to the backing store */
hlcache_parameters.llcache.store.path = store_path;
/* image handler bitmap cache */ /* image handler bitmap cache */
error = image_cache_init(&image_cache_parameters); ret = image_cache_init(&image_cache_parameters);
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
/* content handler initialisation */ /* content handler initialisation */
error = nscss_init(); ret = nscss_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
error = html_init(); ret = html_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
error = image_init(); ret = image_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
error = textplain_init(); ret = textplain_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
error = mimesniff_init(); ret = mimesniff_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
url_init(); url_init();
setlocale(LC_ALL, "C"); setlocale(LC_ALL, "C");
/* initialise the fetchers */ /* initialise the fetchers */
error = fetch_init(); ret = fetch_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
/* Initialise the hlcache and allow it to init the llcache for us */ /* Initialise the hlcache and allow it to init the llcache for us */
hlcache_initialise(&hlcache_parameters); ret = hlcache_initialise(&hlcache_parameters);
if (ret != NSERROR_OK)
return ret;
/* Initialize system colours */ /* Initialize system colours */
error = ns_system_colour_init(); ret = ns_system_colour_init();
if (error != NSERROR_OK) if (ret != NSERROR_OK)
return error; return ret;
js_initialise(); js_initialise();
return ret; return NSERROR_OK;
} }
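
The limit arithmetic in `netsurf_init()` can be checked in isolation. The sketch below reproduces the percentages stated in the comments above: the total is clamped to a 2 MiB minimum, 25% goes to the image cache, 20% hysteresis applies to the image cache and to the disc store, and the remainder is left for source objects. The struct and function names are hypothetical, and the inputs stand in for the `memory_cache_size` and `disc_cache_size` options.

```c
#include <stddef.h>

/* Same value as MINIMUM_MEMORY_CACHE_SIZE in the diff. */
#define SKETCH_MINIMUM_MEMORY_CACHE_SIZE ((size_t)2 * 1024 * 1024)

/* Hypothetical container for the derived limits, all in bytes. */
struct sketch_cache_limits {
	size_t source_limit;     /* memory left for source objects */
	size_t image_limit;      /* 25% of the total memory cache */
	size_t image_hysteresis; /* 20% of the image cache limit */
	size_t store_limit;      /* disc backing store target size */
	size_t store_hysteresis; /* 20% of the backing store limit */
};

static struct sketch_cache_limits
sketch_compute_limits(size_t memory_cache_size, size_t disc_cache_size)
{
	struct sketch_cache_limits l;
	size_t total = memory_cache_size;

	/* never run with less than the minimum memory cache */
	if (total < SKETCH_MINIMUM_MEMORY_CACHE_SIZE) {
		total = SKETCH_MINIMUM_MEMORY_CACHE_SIZE;
	}

	/* image cache is 25% of the total, with 20% hysteresis */
	l.image_limit = (total * 25) / 100;
	l.image_hysteresis = (l.image_limit * 20) / 100;

	/* the source object cache gets what remains */
	l.source_limit = total - l.image_limit;

	/* backing store target with 20% hysteresis */
	l.store_limit = disc_cache_size;
	l.store_hysteresis = (disc_cache_size * 20) / 100;

	return l;
}
```

For example, a 12 MiB memory cache setting yields a 3 MiB image cache and leaves 9 MiB for source objects. All divisions are integer divisions, so results round down.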


@ -43,7 +43,7 @@ nserror netsurf_register(struct netsurf_table *table);
* @param messages path to translation message file. * @param messages path to translation message file.
* @return NSERROR_OK on success or error code on failure. * @return NSERROR_OK on success or error code on failure.
*/ */
nserror netsurf_init(const char *messages); nserror netsurf_init(const char *messages, const char *store_path);
/** /**
* Run event loop. * Run event loop.


@ -1837,7 +1837,7 @@ main(int argc, char** argv)
/* common initialisation */ /* common initialisation */
messages = filepath_find(respaths, "Messages"); messages = filepath_find(respaths, "Messages");
ret = netsurf_init(messages); ret = netsurf_init(messages, NULL);
free(messages); free(messages);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");


@ -21,6 +21,9 @@ NETSURF_USE_NSSVG := AUTO
# Valid options: YES, NO, AUTO # Valid options: YES, NO, AUTO
NETSURF_USE_ROSPRITE := AUTO NETSURF_USE_ROSPRITE := AUTO
# Enable building the source object cache filesystem based backing store.
NETSURF_FS_BACKING_STORE := YES
# Configuration overrides for Mac OS X # Configuration overrides for Mac OS X
ifeq ($(HOST),macosx) ifeq ($(HOST),macosx)
NETSURF_USE_LIBICONV_PLUG := NO NETSURF_USE_LIBICONV_PLUG := NO

gtk/gui.c

@ -44,6 +44,7 @@
#include "content/fetchers/resource.h" #include "content/fetchers/resource.h"
#include "content/hlcache.h" #include "content/hlcache.h"
#include "content/urldb.h" #include "content/urldb.h"
#include "content/backing_store.h"
#include "desktop/browser.h" #include "desktop/browser.h"
#include "desktop/gui.h" #include "desktop/gui.h"
#include "desktop/netsurf.h" #include "desktop/netsurf.h"
@ -1100,11 +1101,111 @@ static nserror create_config_home(char **config_home_out)
/* strip the trailing separator */ /* strip the trailing separator */
config_home[strlen(config_home) - 1] = 0; config_home[strlen(config_home) - 1] = 0;
LOG(("\"%s\"", config_home));
*config_home_out = config_home; *config_home_out = config_home;
return NSERROR_OK; return NSERROR_OK;
} }
/**
* Get the path to the cache directory.
*
* @param cache_home_out Path to cache directory.
* @return NSERROR_OK on success and \a cache_home_out updated else error code.
*/
static nserror get_cache_home(char **cache_home_out)
{
nserror ret;
char *xdg_cache_dir;
char *cache_home;
char *home_dir;
/* $XDG_CACHE_HOME defines the base directory relative to
* which user specific non-essential data files should be
* stored.
*/
xdg_cache_dir = getenv("XDG_CACHE_HOME");
if ((xdg_cache_dir == NULL) || (*xdg_cache_dir == 0)) {
/* If $XDG_CACHE_HOME is either not set or empty, a
* default equal to $HOME/.cache should be used.
*/
home_dir = getenv("HOME");
/* the HOME envvar is required */
if (home_dir == NULL) {
return NSERROR_NOT_DIRECTORY;
}
ret = check_dirname(home_dir, ".cache/netsurf", &cache_home);
if (ret != NSERROR_OK) {
return ret;
}
} else {
ret = check_dirname(xdg_cache_dir, "netsurf", &cache_home);
if (ret != NSERROR_OK) {
return ret;
}
}
LOG(("\"%s\"", cache_home));
*cache_home_out = cache_home;
return NSERROR_OK;
}
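/**
 * Create the cache directory, including any missing parent directories.
 *
 * @param cache_home_out Path to the created cache directory.
 * @return NSERROR_OK on success and \a cache_home_out updated else error code.
 */
static nserror create_cache_home(char **cache_home_out)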
{
char *cache_home = NULL;
char *home_dir;
char *xdg_cache_dir;
nserror ret;
LOG(("Attempting to create cache directory"));
/* $XDG_CACHE_HOME defines the base directory
* relative to which user specific cache files
* should be stored.
*/
xdg_cache_dir = getenv("XDG_CACHE_HOME");
if ((xdg_cache_dir == NULL) || (*xdg_cache_dir == 0)) {
home_dir = getenv("HOME");
if ((home_dir == NULL) || (*home_dir == 0)) {
return NSERROR_NOT_DIRECTORY;
}
ret = netsurf_mkpath(&cache_home, NULL, 4, home_dir, ".cache", "netsurf", "/");
if (ret != NSERROR_OK) {
return ret;
}
} else {
ret = netsurf_mkpath(&cache_home, NULL, 3, xdg_cache_dir, "netsurf", "/");
if (ret != NSERROR_OK) {
return ret;
}
}
/* ensure all elements of path exist (the trailing / is required) */
ret = filepath_mkdir_all(cache_home);
if (ret != NSERROR_OK) {
free(cache_home);
return ret;
}
/* strip the trailing separator */
cache_home[strlen(cache_home) - 1] = 0;
LOG(("\"%s\"", cache_home));
*cache_home_out = cache_home;
return NSERROR_OK;
}
static nserror nsgtk_option_init(int *pargc, char** argv) static nserror nsgtk_option_init(int *pargc, char** argv)
{ {
nserror ret; nserror ret;
@ -1162,6 +1263,7 @@ static struct gui_browser_table nsgtk_browser_table = {
int main(int argc, char** argv) int main(int argc, char** argv)
{ {
char *messages; char *messages;
char *cache_home = NULL;
nserror ret; nserror ret;
struct netsurf_table nsgtk_table = { struct netsurf_table nsgtk_table = {
.browser = &nsgtk_browser_table, .browser = &nsgtk_browser_table,
@ -1170,6 +1272,7 @@ int main(int argc, char** argv)
.download = nsgtk_download_table, .download = nsgtk_download_table,
.fetch = nsgtk_fetch_table, .fetch = nsgtk_fetch_table,
.search = nsgtk_search_table, .search = nsgtk_search_table,
.llcache = filesystem_llcache_table,
}; };
ret = netsurf_register(&nsgtk_table); ret = netsurf_register(&nsgtk_table);
@ -1210,9 +1313,20 @@ int main(int argc, char** argv)
/* Obtain path to messages */ /* Obtain path to messages */
messages = filepath_find(respaths, "Messages"); messages = filepath_find(respaths, "Messages");
/* Locate the correct user cache directory path */
ret = get_cache_home(&cache_home);
if (ret == NSERROR_NOT_FOUND) {
/* no cache directory exists yet so try to create one */
ret = create_cache_home(&cache_home);
}
if (ret != NSERROR_OK) {
LOG(("Unable to locate a cache directory."));
}
/* core initialisation */ /* core initialisation */
ret = netsurf_init(messages); ret = netsurf_init(messages, cache_home);
free(messages); free(messages);
free(cache_home);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
fprintf(stderr, "NetSurf core failed to initialise (%s)\n", fprintf(stderr, "NetSurf core failed to initialise (%s)\n",
messages_get_errorcode(ret)); messages_get_errorcode(ret));
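
The directory selection performed by `get_cache_home()` and `create_cache_home()` follows the XDG rule quoted in their comments: use `$XDG_CACHE_HOME/netsurf` when the variable is set and non-empty, otherwise `$HOME/.cache/netsurf`. A minimal sketch of only the path choice is shown below. It takes the two values as arguments so it can be exercised without touching the environment; `sketch_cache_path` is a hypothetical helper, and it does not check or create the directory as the real functions do.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build the cache directory path from the two environment values.
 *
 * Returns a malloc'd string the caller must free, or NULL when neither
 * value is usable or allocation fails.
 */
static char *sketch_cache_path(const char *xdg_cache_home, const char *home)
{
	const char *base;
	const char *suffix;
	size_t len;
	char *path;

	if (xdg_cache_home != NULL && *xdg_cache_home != '\0') {
		/* $XDG_CACHE_HOME is set and non-empty */
		base = xdg_cache_home;
		suffix = "/netsurf";
	} else if (home != NULL && *home != '\0') {
		/* fall back to the default of $HOME/.cache */
		base = home;
		suffix = "/.cache/netsurf";
	} else {
		/* no usable base directory */
		return NULL;
	}

	len = strlen(base) + strlen(suffix) + 1;
	path = malloc(len);
	if (path == NULL) {
		return NULL;
	}
	snprintf(path, len, "%s%s", base, suffix);

	return path;
}
```

A caller would typically pass `getenv("XDG_CACHE_HOME")` and `getenv("HOME")`. A NULL result corresponds to the case where the diff logs "Unable to locate a cache directory." and hands a NULL store path to `netsurf_init()`.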


@ -155,7 +155,7 @@ main(int argc, char **argv)
/* common initialisation */ /* common initialisation */
messages = filepath_find(respaths, "Messages"); messages = filepath_find(respaths, "Messages");
ret = netsurf_init(messages); ret = netsurf_init(messages, NULL);
free(messages); free(messages);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");


@ -2542,7 +2542,7 @@ int main(int argc, char** argv)
} }
/* common initialisation */ /* common initialisation */
ret = netsurf_init(path); ret = netsurf_init(path, NULL);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
die("NetSurf failed to initialise"); die("NetSurf failed to initialise");
} }


@ -164,7 +164,7 @@ WinMain(HINSTANCE hInstance, HINSTANCE hLastInstance, LPSTR lpcli, int ncmd)
/* common initialisation */ /* common initialisation */
messages = filepath_find(respaths, "messages"); messages = filepath_find(respaths, "messages");
ret = netsurf_init(messages); ret = netsurf_init(messages, NULL);
free(messages); free(messages);
if (ret != NSERROR_OK) { if (ret != NSERROR_OK) {
free(options_file_location); free(options_file_location);