/*
 * Copyright 2011 Haiku, Inc. All rights reserved.
 * Distributed under the terms of the MIT License.
 *
 * Authors:
 *		Axel Dörfler, axeld@pinc-software.de
 *		Adrien Destugues <pulkomandy@pulkomandy.ath.cx>
 *		John Scipione, jscipione@gmail.com
 *
 * Corresponds to:
 *		headers/os/locale/Collator.h rev 42274
 *		src/kits/locale/Collator.cpp rev 42274
 */


/*!
\file Collator.h
\ingroup locale
\ingroup libbe
\brief Provides the BCollator class.
*/


/*!
\class BCollator
\ingroup locale
\ingroup libbe
\brief Class for handling locale-aware collation (sorting) of strings.

BCollator is designed to handle collation (sorting) of strings. Unlike
string sorting using strcmp() or similar functions that compare raw bytes,
the collation is done using a set of rules that changes from one locale
to another. For example, in traditional Spanish sorting, 'ch' is treated
as a single letter and is sorted between 'c' and 'd'. This class is also
able to perform natural number sorting so that 2 is sorted before 10,
unlike byte-based
sorting.
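
As an illustration of natural number ordering only, here is a minimal sketch
in standard C++ that does not use BCollator or ICU. The helper names
\c IsAsciiDigit, \c DigitRunEnd and \c NaturalLess are hypothetical, and only
ASCII digits are handled; real locale-aware collation involves many more
rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; not part of the Locale Kit.
static bool
IsAsciiDigit(char c)
{
	return c >= '0' && c <= '9';
}

// Returns the index just past the run of ASCII digits starting at pos.
static std::size_t
DigitRunEnd(const std::string& text, std::size_t pos)
{
	while (pos < text.size() && IsAsciiDigit(text[pos]))
		pos++;
	return pos;
}

// Returns true if s1 sorts before s2 when runs of ASCII digits are
// compared by numeric value and all other characters byte by byte.
bool
NaturalLess(const char* s1, const char* s2)
{
	assert(s1 != nullptr && s2 != nullptr);
	const std::string a(s1);
	const std::string b(s2);
	std::size_t i = 0;
	std::size_t j = 0;

	while (i < a.size() && j < b.size()) {
		if (IsAsciiDigit(a[i]) && IsAsciiDigit(b[j])) {
			const std::size_t endA = DigitRunEnd(a, i);
			const std::size_t endB = DigitRunEnd(b, j);
			// Leading zeros do not change the numeric value.
			while (i < endA && a[i] == '0')
				i++;
			while (j < endB && b[j] == '0')
				j++;
			// Without leading zeros, the longer run is the larger number.
			if (endA - i != endB - j)
				return endA - i < endB - j;
			// Equal length: the digits compare like ordinary text.
			const int result = a.compare(i, endA - i, b, j, endB - j);
			if (result != 0)
				return result < 0;
			i = endA;
			j = endB;
		} else {
			if (a[i] != b[j])
				return (unsigned char)a[i] < (unsigned char)b[j];
			i++;
			j++;
		}
	}

	// At least one string is exhausted: the one with nothing left sorts
	// first, and two exhausted strings are equivalent.
	return a.size() - i < b.size() - j;
}
```

With this helper "file2" sorts before "file10", whereas a byte-wise
comparison places "file10" first because '1' is less than '2'.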

\warning This class is not thread-safe, as Compare() changes the strength
of the underlying ICU collator. If you want to use a BCollator from
more than one thread you need to protect it with a lock.
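
A minimal sketch of the locking described in this warning, written in
standard C++ so that it stands alone. \c UnsafeComparator is a hypothetical
stand-in for an object whose comparison method changes internal state; it is
not BCollator. \c LockedComparator is likewise hypothetical. On Haiku, a
BLocker could play the role of the mutex.

```cpp
#include <cassert>
#include <cstring>
#include <mutex>

// Hypothetical stand-in for a comparator that is not thread-safe. It only
// wraps strcmp() so that the sketch compiles without the Locale Kit.
class UnsafeComparator {
public:
	int Compare(const char* s1, const char* s2)
	{
		assert(s1 != nullptr && s2 != nullptr);
		fCalls++;
			// unsynchronized state change, the reason a lock is needed
		return std::strcmp(s1, s2);
	}

	int Calls() const
	{
		return fCalls;
	}

private:
	int fCalls = 0;
};

// Hypothetical wrapper that serializes every access with a mutex.
class LockedComparator {
public:
	int Compare(const char* s1, const char* s2)
	{
		std::lock_guard<std::mutex> guard(fLock);
		return fComparator.Compare(s1, s2);
	}

	int Calls()
	{
		std::lock_guard<std::mutex> guard(fLock);
		return fComparator.Calls();
	}

private:
	std::mutex fLock;
	UnsafeComparator fComparator;
};
```

Every thread goes through LockedComparator::Compare(), so only one thread at
a time reaches the unsynchronized object.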
*/


/*!
\fn BCollator::BCollator()
\brief Construct a collator with the default locale and strength.

\attention The default collator should be constructed by BLocale
instead, since it is aware of the currently defined locale.

This constructor uses \c B_COLLATE_PRIMARY strength.
*/


/*!
\fn BCollator::BCollator(const char* locale,
	int8 strength = B_COLLATE_PRIMARY, bool ignorePunctuation = false)
\brief Construct a collator for the given \a locale and \a strength.

This constructor loads the data for the given locale. You can also
set the \a strength and choose whether the collator should take
punctuation into account or not.

\param locale The \a locale to build the collator for.
\param strength The collator class provides four levels of \a strength:
\li \c B_COLLATE_PRIMARY doesn't differentiate e from é,
\li \c B_COLLATE_SECONDARY takes letter accents into account,
\li \c B_COLLATE_TERTIARY is case sensitive,
\li \c B_COLLATE_QUATERNARY is very strict. Most of the time you
	shouldn't need to go this far.
\param ignorePunctuation Ignore punctuation during sorting.
*/


/*!
\fn BCollator::BCollator(BMessage* archive)
\brief Unarchive a collator from a message.

\param archive The message to unarchive the BCollator object from.
*/


/*!
\fn BCollator::BCollator(const BCollator& other)
\brief Copy constructor.

Copies a BCollator object from another BCollator object.

\param other The BCollator to copy from.
*/


/*!
\fn BCollator::~BCollator()
\brief Destructor.

Deletes the BCollator object, freeing the resources it consumes.
*/


/*!
\fn BCollator& BCollator::operator=(const BCollator& other)
\brief Assignment operator.

\param other The BCollator object to assign from.
*/


/*!
\fn void BCollator::SetDefaultStrength(int8 strength)
\brief Set the \a strength of the collator.

Note that the \a strength can also be chosen on a case-by-case basis
when calling other methods.

\param strength The collator class provides four levels of \a strength:
\li \c B_COLLATE_PRIMARY doesn't differentiate e from é,
\li \c B_COLLATE_SECONDARY takes letter accents into account,
\li \c B_COLLATE_TERTIARY is case sensitive,
\li \c B_COLLATE_QUATERNARY is very strict. Most of the time you
	shouldn't need to go this far.
*/


/*!
\fn int8 BCollator::DefaultStrength() const
\brief Get the current strength of this collator.

\returns The current strength of the collator.
*/


/*!
\fn void BCollator::SetIgnorePunctuation(bool ignore)
\brief Set whether punctuation is ignored when sorting.

If \a ignore is \c true, punctuation is not taken into account when
strings are compared.

\param ignore Boolean indicating whether or not punctuation should
	be ignored.
*/


/*!
\fn bool BCollator::IgnorePunctuation() const
\brief Gets the behavior of the collator with regard to punctuation.

\returns \c true if the collator ignores punctuation when sorting,
	\c false otherwise.
*/


/*!
\fn status_t BCollator::GetSortKey(const char* string, BString* key,
	int8 strength) const
\brief Compute the sortkey of a \a string.

The sortkey is a modified version of the input \a string that you can use
to perform faster comparisons with other sortkeys using strcmp() or a
similar comparison function. If you need to compare one string with others
many times, storing the sortkey will allow you to perform the comparisons
faster.
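
The same idea exists in the C++ standard library, which allows a
self-contained sketch: std::collate::transform() returns a key whose plain
lexicographic comparison gives the same result as the locale-aware
comparison of the original strings. The code below does not use BCollator;
\c MakeSortKey and \c SortByKey are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: computes the sort key of text for the given locale.
std::string
MakeSortKey(const std::string& text, const std::locale& locale)
{
	const std::collate<char>& facet
		= std::use_facet<std::collate<char> >(locale);
	return facet.transform(text.data(), text.data() + text.size());
}

// Hypothetical helper: sorts strings by computing each key once and then
// comparing only the keys.
std::vector<std::string>
SortByKey(const std::vector<std::string>& strings, const std::locale& locale)
{
	// Pair every string with its key; the key comes first so that it
	// decides the order.
	std::vector<std::pair<std::string, std::string> > keyed;
	keyed.reserve(strings.size());
	for (std::size_t i = 0; i < strings.size(); i++) {
		keyed.push_back(
			std::make_pair(MakeSortKey(strings[i], locale), strings[i]));
	}

	// Pairs compare by key first; the original string only breaks ties.
	std::sort(keyed.begin(), keyed.end());

	std::vector<std::string> sorted;
	sorted.reserve(keyed.size());
	for (std::size_t i = 0; i < keyed.size(); i++)
		sorted.push_back(keyed[i].second);

	assert(sorted.size() == strings.size());
	return sorted;
}
```

With std::locale::classic() the keys are ordered by byte value, so the
sketch shows the mechanism rather than any language-specific rule.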

\param string String from which to compute the sortkey.
\param key The resulting sortkey.
\param strength The \a strength to use to compute the sortkey.

\retval B_OK if everything went well.
\retval B_ERROR if an error occurred generating the sortkey.
*/


/*!
\fn int BCollator::Compare(const char* s1, const char* s2,
	int8 strength) const
\brief Returns the difference between the two strings according to the
	collation defined by the \a strength parameter.

This method should be used in place of the strcmp() function to perform
locale-aware comparisons.
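
For comparison, standard C++ offers an analogous facility through the
std::collate facet. The sketch below is self-contained and does not use
BCollator; \c CompareWithLocale and \c SortWithLocale are hypothetical names.
It shows a three-way comparison with the same sign convention as strcmp(),
and a sort that uses the locale itself as the ordering predicate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: returns a negative value, zero or a positive value
// if s1 sorts before, equal to or after s2 in the given locale.
int
CompareWithLocale(const char* s1, const char* s2, const std::locale& locale)
{
	assert(s1 != nullptr && s2 != nullptr);
	const std::collate<char>& facet
		= std::use_facet<std::collate<char> >(locale);
	return facet.compare(s1, s1 + std::strlen(s1), s2, s2 + std::strlen(s2));
}

// Hypothetical helper: sorts strings in the collation order of the locale.
void
SortWithLocale(std::vector<std::string>& strings, const std::locale& locale)
{
	// std::locale::operator() compares two strings with the collate facet,
	// so a locale object can be passed directly as the predicate.
	std::sort(strings.begin(), strings.end(), locale);
}
```

Both helpers take the locale as a parameter, so the caller decides which
collation rules apply.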

\param s1 The first string to compare.
\param s2 The second string to compare.
\param strength The \a strength to use for the string comparison.

\retval 0 if the strings are equal.
\retval <0 if s1 is less than s2.
\retval >0 if s1 is greater than s2.
*/


/*!
\fn bool BCollator::Equal(const char* s1, const char* s2,
	int8 strength) const
\brief Compares two strings for equality.

Note that strings that are not byte-by-byte identical may end up being
treated as equal by this method. For example, two strings may be
considered equal if the only differences between them are in case and
punctuation, depending on the \a strength used. Using
\c B_COLLATE_QUATERNARY will force this method to return \c true only
if the strings are byte-for-byte identical.

\param s1 The first string to compare.
\param s2 The second string to compare.
\param strength The \a strength to use for the string comparison.

\returns \c true if the strings are considered equal at the given
	\a strength, \c false otherwise.
*/


/*!
\fn bool BCollator::Greater(const char* s1, const char* s2,
	int8 strength = B_COLLATE_DEFAULT) const
\brief Determine if a string is greater than another.

\note !Greater(s1, s2) is the same as GreaterOrEqual(s2, s1). This means
there is no need for Lesser(s1, s2) and LesserOrEqual(s1, s2) methods.
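
The identities in this note can be sketched as follows. To keep the example
self-contained, the free functions \c Greater and \c GreaterOrEqual below are
strcmp()-based stand-ins rather than the BCollator methods, and \c Lesser and
\c LesserOrEqual are hypothetical wrappers that only swap the arguments.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for BCollator::Greater(), based on byte-wise comparison.
bool
Greater(const char* s1, const char* s2)
{
	assert(s1 != nullptr && s2 != nullptr);
	return std::strcmp(s1, s2) > 0;
}

// Stand-in for BCollator::GreaterOrEqual(), based on byte-wise comparison.
bool
GreaterOrEqual(const char* s1, const char* s2)
{
	assert(s1 != nullptr && s2 != nullptr);
	return std::strcmp(s1, s2) >= 0;
}

// "s1 is less than s2" is the same statement as "s2 is greater than s1".
bool
Lesser(const char* s1, const char* s2)
{
	return Greater(s2, s1);
}

// "s1 is less than or equal to s2" is the same statement as
// "s2 is greater than or equal to s1".
bool
LesserOrEqual(const char* s1, const char* s2)
{
	return GreaterOrEqual(s2, s1);
}
```

The wrappers hold for any consistent ordering, not only for the byte-wise
comparison used by the stand-ins.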

\param s1 The first string to compare.
\param s2 The second string to compare.
\param strength The \a strength to use for the string comparison.

\returns \c true if s1 is greater than, but not equal to, s2.
*/


/*!
\fn bool BCollator::GreaterOrEqual(const char* s1, const char* s2,
	int8 strength = B_COLLATE_DEFAULT) const
\brief Determines if one string is greater than or equal to another.

\note !GreaterOrEqual(s1, s2) is the same as Greater(s2, s1).

\param s1 The first string to compare.
\param s2 The second string to compare.
\param strength The \a strength to use for the string comparison.

\returns \c true if s1 is greater than or equal to s2.
*/


/*!
\fn static BArchivable* BCollator::Instantiate(BMessage* archive)
\brief Unarchive the collator.

This method allows you to restore a collator that you previously
archived. It is faster to archive and unarchive a collator than it is
to create a new one each time you need a BCollator object with the
same settings.

\param archive The message to restore the collator from.

\returns A pointer to a BArchivable object containing the BCollator or
	\c NULL if an error occurred restoring the \a archive.
*/