Diffstat (limited to 'libs/hunspell/docs/ChangeLog')
-rw-r--r-- libs/hunspell/docs/ChangeLog | 1993
1 files changed, 0 insertions, 1993 deletions
diff --git a/libs/hunspell/docs/ChangeLog b/libs/hunspell/docs/ChangeLog
deleted file mode 100644
index 1f6e774a63..0000000000
--- a/libs/hunspell/docs/ChangeLog
+++ /dev/null
@@ -1,1993 +0,0 @@
-2016-04-29 Caolán McNamara <caolanm at LibO>:
- * deprecate old api and add new one
- old one remains implemented in terms of new one
- and will eventually be removed
- * shrink exposed api down to just hunspell.hxx
- * next major release is likely to require C++11
-
-2016-04-15 Caolán McNamara <caolanm at LibO>:
- * generally using std::string and std::vector internally
-
-2016-04-13 Caolán McNamara <caolanm at LibO>:
- * gh#371 drop experimental code
-
-2015-09-11 Caolán McNamara <caolanm at LibO>:
- * rhbz#1261421 crash on mashing hangul korean keyboard
-
-2014-12-03 Németh László <nemeth at numbertext dot org>:
- * tools/hunspell.cxx: security fixes of the Hunspell executable
- - secure file name handling, the problem (checking
- OpenDocument files with malicious file names)
- reported by Eric Sesterhenn
- - using tmpnam() only with system("mkdir tempname && ...")
-
-2014-10-17 Caolán McNamara <caolanm at LibO>:
- * sf#245 Feature from Anish Patil -S mode
- to show suggestions for completion of
- correctly spelled words
- * sf#248 Fix manpage about how to include
-
-2014-10-16 Caolán McNamara <caolanm at LibO>:
- * rhbz#915448, sf#57, sf#185 report character offset
- and not byte offset in ispell mode
- * sf#56 segv in experimental mode
- * sf#228 don't translate init string
-
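The byte-offset versus character-offset distinction from the 2014-10-16 entry can be illustrated with a minimal sketch. It assumes UTF-8 input; the helper name `byte_to_char_offset` and the sample line are hypothetical and are not taken from the Hunspell sources.

```python
def byte_to_char_offset(line: bytes, byte_offset: int) -> int:
    """Convert a byte offset into a UTF-8 encoded line to a character offset."""
    # Decode only the prefix in front of the offset; the number of code
    # points in that prefix is the character offset.
    return len(line[:byte_offset].decode("utf-8"))


# Hypothetical input line: two accented letters take two bytes each in UTF-8.
line = "művész teszt".encode("utf-8")
byte_offset = line.find(b"teszt")
char_offset = byte_to_char_offset(line, byte_offset)
```

With multi-byte characters in front of a word the two offsets differ, which is why an ispell-style client that counts characters would otherwise highlight the wrong position.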
-2014-09-22 Németh László <nemeth at numbertext dot org>:
- * fix crash in morphological analysis of the Hungarian
- compound word 'művészegyéniség', reported by Gáspár Sinai
-
-2014-08-26 Németh László <nemeth at numbertext dot org>:
- * unmunch separates flags of prefixes from the word,
- bug reported by Daniel Naber
-
-2014-08-05 Németh László <nemeth at numbertext dot org>:
- * moz#318040 Mozilla accepts abbreviations without dots
- * myfopen(): add _wfullpath to expand relative parts of absolute paths
-
-2014-07-16 Caolán McNamara <caolanm at LibO>:
- * moz#675553 Switch from PRBool to bool
- * moz#690892 replace PR_TRUE/PR_FALSE with true/false
- * Silence the warning about empty while body loop in clang
- * moz#777292 Make nsresult an enum
- * moz#579517 Use stdint types in gecko
- * moz#784776 consistently use FLAG_NULL
- * moz#927728 Convert PRUnichar to char16_t
- * moz#943268 Remove nsCharsetAlias and nsCharsetConverterManager
- * Don't include config.h in license.hunspell if MOZILLA_CLIENT is set
-
-2014-06-26 Caolán McNamara <caolanm at LibO>:
- * clang scan-build: Allocator sizeof operand mismatch
- * clang scan-build: other low hanging warnings
- * clang scan-build: significant warnings
-
-2014-06-02 Németh László <nemeth at numbertext dot org>:
- * escape spaces in paths of ODF files
-
-2014-05-28 Németh László <nemeth at numbertext dot org>:
- * add long path/Unicode path support in WIN32 environment:
- - hunspell#233 (reported by mahak gark) and LibreOffice fdo#48017
- * flat ODF support, eg.:
- hunspell doc.fodt
- cat doc.fodt | hunspell -l -O
- * new options:
- - -X (XML) input format
- - -O (ODF or flat ODF) input format
- - --check-apostrophe: check and force Unicode apostrophe usage
- (ASCII or Unicode apostrophe has to be in the
- WORDCHARS section of the affix file)
- * fix ODF support:
- - break 1-line XML of ODT documents at </style:style>, too,
- not only at </text:p> (limiting tokenization problems, when
- fgets stops within an XML tag)
- - show ODF file path on the UI instead of the temporary file
- * fix XML support:
- - ', ", &, < and > in replacements converted to XML entities
- - recognize &apos; at tokenization, depending on WORDCHARS
- - &apos; in tokens converted to ' before spell checking and
- in the output of the pipe interface
- * better apostrophe usage:
- - WORDCHARS only with one of the Unicode or ASCII apostrophe
- results extended word tokenization: both of them will be part of
- the words (if they are inside: eg. word's, but not words').
- - convert Unicode apostrophes to ASCII ones for 8-bit dictionaries
- (eg. English dictionaries), or for UTF-8 dictionaries only
- with ASCII apostrophe supports (eg. French dictionaries).
- * updated manual:
- - hunspell.4 renamed to hunspell.5, see
- hunspell#241 reported by Cristopher Yeleighton
- - updated translations
- - note about long/Unicode paths in WIN32 (hunspell.3)
-
-2014-04-25 Németh László <nemeth at numbertext dot org>:
- * OpenDocument support, eg.
- hunspell *.odt
- hunspell -l *.odt
- * always load default personal dictionary (fix
- filtering bad words - reduce this word list - using
- it as a personal dictionary workflow)
- * fix parsing/URL recognition problem (bad tokens
- with apostrophes)
-
-2013-07-25 pchang9@cs.wisc.edu
- * moz#897255 Wasted work in line_uniq
- * moz#897780 Wasted work in SuggestMgr::twowords
-
-2013-07-25 Caolán McNamara <caolanm at LibO>:
- * hunspell#167 layout problems with long lines
- - based on the original fix by xorho
- adapted to HEAD
- * rhbz#925562 upgrade config.guess for aarch64
-
-2013-07-24 pchang9@cs.wisc.edu
- * moz#896301 Wasted work in SfxEntry::checkword
- * moz#896844 Wasted work in AffixMgr::defcpd_check
-
-2013-06-13 Konstantin Khlebniko
- * #49 HashMgr::add_word computes wrong size for struct hentry
-
-2013-06-13 Ville Skyttä
- * #53 Man page syntax fixes
-
-2013-04-19 John Thomson <john thomson at SIL>
- * win_api: add remove() of Hunspell API (hun#3606435)
-
-2013-04-19 Rouslan Solomokhin <at sf.net>
- * fix crash in suggestions for 99-character long words
- by extending arrays of SuggestMgr::forgotchar_*
- (hun#3595024, also http://crbug.com/130128),
- thanks also to Paweł Hajdan for reporting the patch
-
-2013-04-01 Caolán McNamara <caolanm at LibO>:
- * hunspell: -Werror=undef
-
-2013-03-13 Caolán McNamara <caolanm at LibO>:
- * rhbz#918938 crash in interaction with danish thesaurus
-
-2012-09-18 Németh László <nemeth at numbertext dot org>:
- * src/hunspell/affixmgr.*: - fix morphological analysis of
- compound words (hun#3544994, reported by Dávid Nemeskey, fdo#55045)
-
-2012-06-29 Caolán McNamara <caolanm at LibO>:
- * fix various coverity warnings
-
-2012-01-10 Ehsan Akhgari <ehsan at mozilla dot com>
- * moz#710940 Firefox Crash [@ AffixMgr::parse_file(char const*, char
- const*) ]
-
-2011-12-16 Jared Wein <jwein at mozilla dot com>
- * moz#710967 Incorrect argument passed to strncmp in
- AffixMgr::parse_convtable
-
-2011-12-06 Caolán McNamara <caolanm at LibO>:
- * rhbz#759647 fixed tempname of hunSPELL.bak collides with other users
- when multiple edits in one dir
-
-2011-10-13 Caolán McNamara <caolanm at LibO>:
- * moz#694002 crash in hunspell affixmgr on exit with bad .aff
- * leak in hunspell affixmgr with bad .aff
-
-2011-09-19 Caolán McNamara <caolanm at LibO>:
- * make libparsers.a not installed thanks to Tomáš Chvátal
-
-2011-06-23 Caolán McNamara <caolanm at LibO>:
- * fix some windows compiler warnings
-
-2011-05-24 Németh László <nemeth at numbertext dot org>:
- * src/hunspell/affixmgr.*: allow twofold suffixes in compounds
- by extended version of Arno Teigseth's patch, see hun#3288562.
- - new option for this feature: COMPOUNDMORESUFFIXES
-
-2011-02-16 Németh László <nemeth at numbertext dot org>:
- * src/*/Makefile.am: fix library versioning, the problem reported by
- Rene Engerhald and Simon Brouwer.
-
- * man/hunspell.4: new version based on the revised version of Ruud Baars
-
-2011-02-02 Németh László <nemeth at OOo>:
- * suggestmgr.cxx: fix ngram PHONE suggestion for input words with
- diacritics using UTF-8 encoded dictionaries (add byte length to the
- 8-bit phonet() argument instead of character length)
-
- * suggestmgr.cxx: fix missing csconv problem with UTF-8 encoding
- dictionaries, when the input contains non-BMP characters
- - tests/utf8_nonbmp.sug: test file
-
- * suggestmgr.cxx: mixed and keyboard based character suggestions
- don't forbid ngram suggestion search (optimized tests/suggestiontest)
-
- * affixmgr.cxx: fix hun#2999225: interfering compounding mechanisms,
- tested on Dutch word list and reported by Ruud Baars
-
- * affixmgr.cxx: allomorph fix for hun#2970240 (Hungarian
- compound "vadász+gép" was analyzed as vad+ász+gép, and rejected
- by the ss->s rep rule (verb "vadássz"), but the analysis
- didn't continue for the longer word parts (vadász+gép).
-
- * csutil.cxx: add lang code "az_AZ", "hu_HU", "tr_TR" for back
- compatibility (fixing Azeri and Turkish casing conversion, also
- Hungarian compound handling)
-
- * affixmgr.cxx: fix morphological analysis
-
-2011-01-26 Németh László <nemeth at OOo>:
- * affixmgr.cxx: fix for moz#626195 (memcheck problem with FULLSTRIP).
-
- * affixmgr.*, suggestmgr.cxx: FORBIDWARN parameter (see manual)
-
-2011-01-24 Németh László <nemeth at OOo>:
- * suffixmgr.cxx: fix bad suggestion of forbidden compound words, eg.
- "termijndoel" with the Dutch dictionary. Reported by Ruud Baars.
-
- * latexparser.cxx: fix double apostrophe TeX quotation mark tokenization
- (hun#3119776), reported by Wybodekker at SF.net.
-
- * tests/suggestiontest/*: multilanguage and single Hunspell version, see README
- * tests/suggestiontest/prepare2: for make -f Makefile.orig single
-
-2011-01-22 Németh László <nemeth at OOo>:
- * affixmgr.*, suggestmgr.*: new features
- ONLYMAXDIFF: remove all bad ngram suggestions (default mode keeps one)
- NONGRAMSUGGEST: similar to NOSUGGEST, but it forbids using the word
- in ngram-based (more than 1-character distance) suggestions.
-
-2011-01-21 Németh László <nemeth at OOo>:
- * suggestmgr.*: limit wild suggestions (hun#2970237 by Ruud Baars)
- - limited compound word suggestions
- - improved and limited ngram based suggestions
- * tests/*.sug: modified test files
- - feature MAXCPDSUGS:
- MAXCPDSUGS 0 : no compound suggestion, suggested by
- Finn Gruwier Larsen in hunfeat#2836033
- MAXCPDSUGS n : max. ~n compound suggestions
- - feature MAXDIFF: difference limit for ngram suggestions: 0-10
- eg. MAXDIFF 5: normal (default) limit
- MAXDIFF 0: only one ngram suggestion
- MAXDIFF 10: ~maxngramsugs ngram suggestions
-
- * affixmgr.*, hunspell.*: add flag FORCEUCASE (hun#2999228) to force
- capitalization of compound words (see Hunspell 4 manual),
- suggested by Ruud Baars
- test/forceucase.*: test files
-
- * affixmgr.*, hunspell.*: add flag WARN (hun#1808861), optional warning feature
- for rare words, suggested by Ruud Baars
- tests/warn: test files
- * tools/hunspell.cxx: add option -r for optional filtering of rare words
-
- * affixmgr.cxx: fix hun#3161359 (gcc warnings) reported by Ryan VanderMeulen.
-
-2011-01-17 Németh László <nemeth at OOo>:
- * suggestmgr.cxx: fix hun#3158994 and hun#3159027 (missing csconv table
- using awkward 8bit capitalization of UTF-8 encoded dictionary words with PHONE
- suggestion, reported by benjarobin and dicollecte at SF.net).
-
-2011-01-13 Németh László <nemeth at OOo>:
- * affixmgr.cxx: ONLYINCOMPOUND fix for hun#2999224 (fogemorphene
- was allowed in end position of compoundings). Reported by Ruud Baars.
- * tests/onlyincompound2.*: test files
-
-2011-01-10 Ingo H. de Boer <idb_winshell at SF.net>:
- * win_api/{hunspell,libhunspell, testparser}.vcproj: updated project
- files for the library and the executables. Compiling problem
- also reported by Don Walker.
-
-2011-01-06 Németh László <nemeth at OOo>:
- * affixmgr.cxx: fix freedesktop#32850 (program halt during Hungarian
- spell checking of the word "6csillagocska6", reported by András Tímár)
-
- * tools/hunspell.cxx: add Mac OS X Hunspell dictionary paths, asked by
- Vidar Gundersen in hunfeat#3142010
-
-2011-01-05 Caolán McNamara <cmc at OOo>:
- * moz#620626 NS_UNICHARUTIL_CID doesn't support
- case conversion
-
-2011-01-03 Németh László <nemeth at OOo>:
- * NEWS and THANKS: update for release 1.2.13
-
-2010-12-20 Németh László <nemeth at OOo>:
- * affixmgr.cxx: hun#3140784
-
-2010-12-16 Németh László <nemeth at OOo>:
- * affixmgr.cxx:
- - improved fix of hun#2970242 (supporting
- zero affixes, reported by Ruud Baars
- - tests/opentaal_cpdpat{,2}: test files
-
- - switching off default BREAK parameters by BREAK 0,
- reported by Ruud Baars
-
- - hun#2999225: interfering compounding mechanisms, reported by Ruud Baars
-
-2010-12-11 Németh László <nemeth at OOo>:
- * affixmgr.cxx: fix hun#2970242 (CHECKCOMPOUNDPATTERN only with flags),
- the bug reported by Ruud Baars
- * tests/2970242.*: test files
-
- * tests/2970240.*: test files for CHECKCOMPOUNDPATTERN fix (check all
- boundaries in compound words, fixed by the previous CHECKCOMPOUNDREP
- fix), the bug reported by Ruud Baars
-
- * win_api/Makefile.cygwin: update
-
-2010-12-09 Caolán McNamara <cmc at OOo>:
- * moz#617953 fix leak
-
-2010-11-08 Caolán McNamara <cmc at OOo>:
- * rhbz#650503 crash in arabic dictionary
-
-2010-11-05 Caolán McNamara <cmc at OOo>:
- * rhbz#648740 don't warn on empty flagvector
-
-2010-11-03 Caolán McNamara <cmc at OOo>:
- * logically we shouldn't need a csconv table in utf-8 mode
-
-2010-10-27 Németh László <nemeth at OOo>:
- * hun#3000055 (requested by Ruud Baars) add REP boundary specification:
- REP ^word$ xxxx
- REP ^wordstarting xxxx
- REP wordending$ xxxx
-
- * hun#3008434 (requested by Adrián Chaves Fernández) and
- hun#3018929 (requested by Ruud Baars): REP with more than 2 words:
- REP morethantwo more_than_two
-
- * suggestmgr.cxx: fix incomplete suggestion list for capitalized words,
- eg. missing Machtstrijd->Machtsstrijd in the Dutch dictionary
- (reported by Ruud Baars)
-
- * tests, man: related updates
-
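The REP boundary markers from the 2010-10-27 entry can be sketched as follows. This is a simplified illustration, not the suggestmgr.cxx implementation: the function name `apply_rep` is hypothetical, and the mapping of underscores in the replacement to spaces is an assumption based on the `more_than_two` example above.

```python
def apply_rep(word: str, pattern: str, replacement: str) -> list:
    """Return the candidates one REP rule yields, honouring ^ and $ anchors."""
    at_start = pattern.startswith("^")
    at_end = pattern.endswith("$")
    core = pattern[1:] if at_start else pattern
    core = core[:-1] if at_end else core
    # Assumption: "_" in the replacement stands for a space.
    replacement = replacement.replace("_", " ")

    candidates = []
    i = word.find(core)
    while i != -1:
        starts_ok = not at_start or i == 0
        ends_ok = not at_end or i + len(core) == len(word)
        if starts_ok and ends_ok:
            candidates.append(word[:i] + replacement + word[i + len(core):])
        i = word.find(core, i + 1)
    return candidates
```

An unanchored rule fires at every occurrence of the pattern, `^` restricts it to the start of the word, `$` to the end, and `^...$` to the whole word.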
-2010-10-12 Caolán McNamara <cmc at OOo>:
- * moz#603311 HashMgr::load_tables leaks dict when decode_flags fails
- * fix mem leak found with new tests
- * hun#3084340 allow underscores in html entity names
-
-2010-10-07 Németh László <nemeth at OOo>:
- * affixmgr.cxx:
- - hun#2970239 fix bad suggestion of forbidden compound words
- - hun#2999224 fix keepcase feature on compound words (only partial
- fix for COMPOUNDRULE based compounding)
- - fix checkcompoundrep feature in compound words (check all boundaries,
- not only the last one)
- Problems reported by Ruud Baars.
-
- * tests/opentaal_forbiddenword[12]*, tests/opentaal_keepcase*:
- new test files for the previous fixes
- * tests/checkcompoundrep: extended test file.
-
-2010-09-05 Caolán McNamara <cmc at OOo>:
- * moz#583582 fix double buffer gcc fortify issue
-
-2010-08-13 Caolán McNamara <cmc at OOo>:
- * moz#586671 AffixMgr::parse_convtable leaks pattern/pattern2 if it
- can't create both
- * moz#586686 tidy up get_xml_list and friends
-
-2010-08-10 Caolán McNamara <cmc at OOo>:
- * hun#3022860 fix remove duplicate code
-
-2010-07-17 Caolán McNamara <cmc at OOo>:
- * remove unused get_default_enc and avoid potential misrecognition of
- three letter language ids
- * normalize encoding names before lookup
-
-2010-07-05 Caolán McNamara <cmc at OOo>:
- * hun#2286060 add Hangul syllables to unicode tables
-
-2010-06-26 Caolán McNamara <cmc at OOo>:
- * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz
- case
-
-2010-06-13 Caolán McNamara <cmc at OOo>:
- * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz
- case
-
-2010-06-02 Caolán McNamara <cmc at OOo>:
- * moz#569611 compile cleanly under win64
-
-2010-05-22 Caolán McNamara <cmc at OOo>:
- * moz#525581 apply mozilla's current preferred get_current_cs impl
-
-2010-05-17 Németh László <nemeth at OOo>:
- * affixmgr.cxx: fix bad limitation of parenthesized flags at
- COMPOUNDRULEs. Windows crash reported by Ruud Baars and Simon Brouwer.
-
-2010-05-05 Caolán McNamara <cmc at OOo>:
- * rhbz#589326 malloc of int that should have been of char**
- * hun#2997388 fix ironic misspellings
-
-2010-04-28 Caolán McNamara <cmc at OOo>:
- * moz#550942 get_xml_list doesn't handle failure from get_xml_par
-
-2010-04-27 Caolán McNamara <cmc at OOo>:
- * moz#465612 mozilla-specific code leaks
- * moz#430900 phone is dereferenced before oom check
- * moz#418348 ckey_utf alloc is used unchecked in SuggestMgr::badcharkey_utf
- * CID#1487 pointer "rl" dereferenced before NULL check
- * CID#1464 Returned without freeing storage "ptr"
- * CID#1459 Avoid duplicate strchr
- * CID#1443 Avoid any chance of dereferencing *slst
- * CID#1442 Unsafe to have a null morph
- * CID#1440 Avoid null filenames
- * CID#1302 Dereferencing NULL value "apostrophe"
- * CID#1441 Avoid dereferencing null ppfx
-
-2010-04-16 Caolán McNamara <cmc at OOo>:
- * hun#2344123 fix U)ncap in utf-8 locale
- * fix up hunspell text UI and lines wider than terminal
-
-2010-04-15 Caolán McNamara <cmc at OOo>:
- * hun#2613701 fix small leak in FileMgr::FileMgr
- * fix small leak in tools/hunspell
- * hun#2871300 avoid crash if def and words are NULL
- * hun#2904479 fix length of hzip file
- * hun#2986756 mingw build fix
- * hun#2986756 fix double-free
- * hun#2059896 fix crash in interactive mode without nls
- * hun#2917914 add some extra words to the latexparser
- * make some structs static
- * C-api has duped symbol names
- * regenerate gettext/intl with recent version
- * hun#2796772 build a .dll under MinGW
- * rhbz#502387 allow cross-compiling for MinGW target
- * hun#2467643 update .vcproj files to include replist.?xx
- * unify visibility/dll_export support across platforms
- * hun#2831289 sizeof(short) typo
- * hun#2986756 add -u3 gcc style output
-
-2010-04-14 Caolán McNamara <cmc at OOo>:
- * hun#2813804 fix segfault on hu_HU stemming
-
-2010-04-13 Caolán McNamara <cmc at OOo>:
- * hun#2806689 fix ironic misspellings
- * hun#2836240 add Italian translations
-
-2010-04-09 Caolán McNamara <cmc at OOo>:
- * fix titchy possible leak in command-line spellchecker
-
-2010-04-07 Caolán McNamara <cmc at OOo>:
- * hun#2973827 apply win64 patch
- * hun#2005643 fix broken mystrdup
-
-2010-03-04 Caolán McNamara <cmc at OOo>:
- * ooo#107768 fix crash in long strings in spellml mode
- * hun#1999737 add some malloc checks
- * hun#1999769 drop old buffer on realloc failure
- * hun#2005643 tidy string functions
- * hun#2005643 micro-opt
- * hun#2006077 free strings on failed dict parse
- * hun#2110783 ispell-alike verbose mode implementation
-
-2010-03-03 Németh László <nemeth at OOo>:
- * hunspell/(affixmgr, suggestmgr).cxx: add character sequence
- support for MAP suggestion, using parenthesized character groups
- in the syntax, eg. MAP ß(ss).
- * man/hunspell.4, tests/map*: documentation and test files
-
-2010-02-25 Németh László <nemeth at OOo>:
- * hunspell/hunspell.cxx: add recursion limit for BREAK (fix OOo Issue 106267)
-
- * hunspell/hunspell.cxx: fix crash in morphological analysis of
- capitalized words with ending dashes
-
- * affixmgr.cxx: fix morphological analysis of long numbers combined with dash,
- eg. 45-00000045 (reported by a@freeblog.hu).
-
-2010-02-23 Caolán McNamara <cmc at OOo>:
- * hun#2314461 improve ispell-alike mode
- * hun#2784983 improve default language detection
- * hun#2812045 fix some compiler warnings
- * hun#2910695 survive missing HOME dir
- * hun#2934195 fix suggestmgr crash
- * hun#2921129 remove unused variables
- * hun#2826164 make sure make check uses the in-tree libhunspell
- * bump toolchain to support --disable-rpath
- * hun#2843984 fix coverity warning
- * hun#2843986 fix coverity warning
- * hun#2077630 add iconv lib
- * make gcc strict-aliasing warning free
- * make cppcheck warning free
-
-2008-11-01 Németh László <nemeth at OOo>:
- * replist.*, hunspell.cxx, affixmgr.cxx: new input and output
- conversion support, see ICONV and OCONV keywords in the Hunspell(4)
- manual page and the test examples. The input/output conversion
- problem of syllabic languages reported by Daniel Yacob and
- Shewangizaw Gulilat.
- - tests/{iconv,oconv}.*: test examples
-
- * tools/wordforms: word generation script for dictionary developers
- (Hunspell version of the unmunch program)
-
- * hunspell/hunspell.cxx: extended BREAK feature: ^ and $ mean in break
- patterns the beginning and end of the word.
- - tests/BREAK.*: modified examples.
-
- * hunspell/hunspell.cxx: set default break at hyphen characters.
- The associated problem reported by S Page in Hunspell Bug 2174061.
- See Mozilla Bug ID 355178 and OOo Issue 64400, too.
- - tests/breakdefault.*: test data
- The following definition is equivalent to the default word break:
-
- BREAK 3
- BREAK -
- BREAK ^-
- BREAK -$
-
- * affixmgr.cxx: SIMPLIFIEDTRIPLE is a new affix file keyword to allow
- simplified forms of the compound words with triple repeating letters.
- It is useful for Swedish and Norwegian languages.
-
- * affixmgr.cxx: extend CHECKCOMPOUNDPATTERN to support
- alternations of compound words for example by sandhi
- feature of Indian and other languages. The problem reported
- by Kiran Chittella associated with Telugu writing system
- (see Telugu example in tests/checkcompoundpattern4.test).
- The new optional field of CHECKCOMPOUNDPATTERN definition is the
- replacement of the compound boundary defined by the previous fields:
- CHECKCOMPOUNDPATTERN ff f ff
- means ff|f compound boundary has been replaced by "ff", like in
- the (prereform) German Schiffahrt (Schiff+fahrt).
- - CHECKCOMPOUNDPATTERN supports also optional flag conditions now:
- CHECKCOMPOUNDPATTERN ff/A f/B ff
- means that the first word of the compound needs flag "A" and
- the second word of the compound needs flag "B" to the operation.
-
- * tools/hunspell.cxx: add empty lines as separators to the output of
- the stemming and morphological analysis.
-
- * affixmgr.cxx: fix condition checking algorithm. Bad suggestion
- generation reported by Mehmet Akin in SF.net Bug 2124186 with help of
- Eleonora Goldman.
-
- * affixmgr.cxx: fix COMPOUNDWORDMAX feature. The problem and its
- code details reported by Göran Andersson under SF.net Bug ID 2138001.
-
- * csutil.cxx: fix bad conditional code for Mozilla compilation.
- Patch by Serge Gautherie. The problem reported by Ryan VanderMeulen.
-
- * hunspell/hunspell.cxx: add missing ngram suggestion for HUHINITCAP
- (capitalized mixed case) words.
-
- * w_char.hxx: use GCC conditions for GCC related code. Patch by
- Ryan VanderMeulen.
-
- * affixmgr.cxx: check morphological description in morphgen()
- (fix potential program fault by incomplete morphological
- description of affix rules)
-
- * src/win_api: config.h: switch on warning messages on Windows
-
- * tools/affixcompress: extended help for -h (use LC_ALL=C sort
- for input word list)
-
- * man/hunspell.4: updated manual:
- - new and modified features (SIMPLIFIEDTRIPLE, ICONV, OCONV,
- BREAK, CHECKCOMPOUNDPATTERN).
- - note about costs of zero affixes, suggested by Olivier Ronez.
-
- * hunspell/hunspell.cxx: remove deprecated word breaking codes.
-
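The default word break from the 2008-11-01 entry (`BREAK -`, `BREAK ^-`, `BREAK -$`) can be sketched as a small recursive check. This is a simplified model under stated assumptions, not the hunspell.cxx code: `DICTIONARY`, `accepts` and the depth limit of 10 are hypothetical, and the sketch assumes a plain pattern only breaks inside the word while `^` and `$` strip the pattern from the word's beginning or end.

```python
DICTIONARY = {"foo", "bar"}   # hypothetical word list
BREAKS = ["-", "^-", "-$"]    # the default definition quoted in the entry


def accepts(word: str, depth: int = 0) -> bool:
    """Accept a word if it, or all of its parts at break points, is known."""
    if word in DICTIONARY:
        return True
    if not word or depth > 10:  # recursion limit, the value is an assumption
        return False
    for pattern in BREAKS:
        if pattern.startswith("^"):
            literal = pattern[1:]
            if word.startswith(literal) and accepts(word[len(literal):], depth + 1):
                return True
        elif pattern.endswith("$"):
            literal = pattern[:-1]
            if word.endswith(literal) and accepts(word[:-len(literal)], depth + 1):
                return True
        else:
            # Plain pattern: only break strictly inside the word.
            i = word.find(pattern, 1)
            while i != -1 and i + len(pattern) < len(word):
                if accepts(word[:i], depth + 1) and accepts(word[i + len(pattern):], depth + 1):
                    return True
                i = word.find(pattern, i + 1)
    return False
```

Under these assumptions "foo-bar", "-foo" and "foo-" are accepted, while "foo-baz" and "foobar" are not.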
-2008-08-15 Németh László <nemeth at OOo>:
- * affentry.cxx: add FULLSTRIP option. With FULLSTRIP, affix rules can
- strip whole words, not just all but one character. Suggested by
- Davide Prina and other developers in OOo Issue 80145.
- * tests/fullstrip.*: Test data based on Davide Prina's example.
- * tools/unmunch.cxx: modified for FULLSTRIP.
-
- * affixmgr.cxx: COMPOUNDRULE now works with long and numerical flag
- types by parenthesized flags. Syntax: (flag)*, (flag)(flag)?(flag)*.
- * tests/compoundrule[78].*: tests with parenthesized COMPOUNDRULE
- definitions.
-
- * suggestmgr.cxx: modified badchar*(), forgotchar*() and extrachar*()
- 1-character distance suggestion algorithms: search a TRY character
- in all position instead of all TRY characters in a character position
- (it can give more readable suggestion order, also better suggestions
- in the first positions, when TRY characters are sorted by frequency.)
- For example, suggestions for "moze":
- ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6),
- maze, more, mote, ooze, mole etc. (Hunspell 1.2.7).
-
- * suggestmgr.cxx: extended compound word checking for better COMPOUNDRULE
- related suggestions, for example English ordinal numbers: 121323th ->
- 121323rd (it needs also a th->rd REP definition).
-
- * phonet.cxx: cast unsigned char parameter of isdigit() and fix
- isalpha by myisalpha() (potential problems in Windows environment).
- Reported by Thomas Lange in OOo Issue 92736.
-
- * hunspell/csutil.*,hunspell/{affentry,affixmgr,hunspell,suggestmgr}.cxx:
- fix potential buffer overflow during morphological analysis by the
- new mystrcat() function. Reported by Molnár Andor (dolhpy at true
- dot hu) in SF.net Bug 2026203.
-
- * affixmgr.cxx: add recursion limit to defcpd(). Fix OOo Issue 76067:
- crash-like slowdown when checking hexadecimal numbers with a long FFF
- sequence (combinatorial explosion caused by the en_US words "f" and "ff").
- Missing fix reported by Mathias Bauer.
-
- * affixmgr.cxx: fix the difference in the Unicode and non-Unicode
- parts of cpdcase_check(). Bug report by Brett Wilson.
-
- * filemgr.*, affixmgr.cxx, csutil.*, hashmgr.*: warning messages now
- contain line numbers (use --with-warnings configure option for
- warning messages).
-
- * hunspell.cxx: analyze(): fix case conversion of stemming and
- morphological analysis of UTF-8 encoded input. Reported by Ferenc Godó.
-
- * tools/hunspell.cxx: fix LaTeX Unicode support in filter mode.
- Reported by Jan Seeger in SF.net Bug 2039990.
-
- * affixmgr.hxx: save 0.5 MB (1 MB in a 64-bit environment) of (virtual)
- memory by using only the requested size for sFlag and pFlag arrays.
- Bug report by Brett Wilson.
-
- * affixmgr.cxx,tools/hunspell.cxx: get_version() returns with full
- VERSION affix parameter instead of its first word. Fixes for
- Hunspell's header. Some problems with Hunspell header reported in
- SF.net Bug 2043080.
-
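The reordering of the 1-character distance suggestions described in the 2008-08-15 entry (one TRY character tried in all positions, instead of all TRY characters tried in one position) can be illustrated with a toy version of the "bad character" search. The function names, the TRY string and the word list are hypothetical, and the output is not meant to reproduce the suggestion lists of any Hunspell release.

```python
def badchar_by_position(word, try_chars, dictionary):
    """Older order: for each position, try every TRY character."""
    found = []
    for i in range(len(word)):
        for c in try_chars:
            candidate = word[:i] + c + word[i + 1:]
            if c != word[i] and candidate in dictionary and candidate not in found:
                found.append(candidate)
    return found


def badchar_by_try_char(word, try_chars, dictionary):
    """Newer order: for each TRY character, try every position."""
    found = []
    for c in try_chars:
        for i in range(len(word)):
            candidate = word[:i] + c + word[i + 1:]
            if c != word[i] and candidate in dictionary and candidate not in found:
                found.append(candidate)
    return found


WORDS = {"ooze", "doze", "maze", "more", "mote", "mole"}  # hypothetical list
TRY = "aerotld"  # hypothetical TRY string, most frequent letters first

old_order = badchar_by_position("moze", TRY, WORDS)
new_order = badchar_by_try_char("moze", TRY, WORDS)
```

Both functions find the same candidates. Only the order changes: with the TRY character in the outer loop, replacements that use frequent letters come first regardless of where in the word they occur.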
-2008-07-15 Németh László <nemeth at OOo>:
- * affentry.cxx: fixes of the affix rule matching algorithm (affected
- only the sk_SK dictionary from all OpenOffice.org dictionaries):
- - fix dot pattern + accented letters matching (in non Unicode encoding)
- - word-length conditions work again
- * tests/condition.*: extended test for the fix.
-
- * hashmgr.cxx: load multiword expressions: spaces may be parts
- of the dictionary words again (but spaces also work as morphological
- field separators: word word2 -> "word word2", word po:noun -> "word").
- * man/hunspell.4: updated manual
-
- * tools/hunspell.cxx: add iconv character conversion support to
- stemming and morphological analysis
-
- * tools/hunspell.cxx: add /usr/share/myspell/dicts search path for
- Ubuntu support
-
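The dictionary-line rule from the 2008-07-15 entry (spaces may be part of a word, but also separate morphological fields) can be sketched as below. The sketch assumes that a morphological field is recognisable as a two-character tag followed by a colon, as in `po:noun`. The regular expression and the name `split_entry` are hypothetical, and flags after a slash are not handled.

```python
import re

# Assumption: a morphological field starts with a two-character tag and ":".
MORPH_BOUNDARY = re.compile(r"\s+(?=\S{2}:)")


def split_entry(line: str):
    """Split a dictionary line into (word, list of morphological fields)."""
    parts = MORPH_BOUNDARY.split(line.strip(), maxsplit=1)
    word = parts[0]
    fields = parts[1].split() if len(parts) > 1 else []
    return word, fields
```

With this rule `word word2` stays a single multiword entry, while `word po:noun` yields the word `word` plus one morphological field.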
-2008-07-09 Németh László <nemeth at OOo>:
- * affentry.cxx: fixes of the affix rule matching algorithm:
- - right ASCII character handling in bracket expression;
- - fault-tolerant nextchar() for bad rules.
- Problem with the en_GB dictionary and nextchar() with a detailed
- code analysis reported by John Winters in SF.net Bug ID 2012753.
- * tests/condition.*: extended test for the fix.
-
- * hunspell/hunspell.*, parsers/*, tools/hunspell.cxx: fix compiler
- warnings (deprecated const-free char consts)
-
- * win_api/hunspelldll.*: add hunspell_free_list(), the problem
- reported by Laurier Mercer.
-
-2008-06-30 Török László <torok_laszlo at users dot SF dot net>:
- * tests/affixmgr.cxx: fix morphological analysis: strcat() on
- an uninitialized char array in suffix_check_morph().
-
-2008-06-18 Németh László <nemeth at OOo>:
- * src/hunspell/affixmgr.cxx: fix GCC compiler warnings
- (comparisons with string literal results in unspecified behaviour).
- The problem reported by Ladislav Michnovič.
-
-2008-06-17 Németh László <nemeth at OOo>:
- * src/hunspell/{hunspell.cxx,hunspell.h}: add free_list() to the C and
- C++ interface to deallocate suggestion lists. The problem
- reported by Laurie Mercer and Christophe Paris.
- * csutil.cxx: fix freelist() to deallocate non-NULL list, when n = 0.
- * tools/{analyze,example,chmorph,hunspell}.cxx: use free_list().
-
- * tools/hunspell.cxx: fix only --with-readline compiling problem.
- Reported by Volkov Peter in SF.net Bug 1995842.
-
- * man/hunspell.3,hunspell.hxx: fix analyze and generate examples in
- the manual and comments (using char*** parameter instead of char**).
-
- * tools/example.cxx: fix suggestion example.
-
-2008-06-17 Németh László <nemeth at OOo>:
- * affentry.cxx: fix the new affix rule matching algorithm of
- Hunspell 1.2. Arabic dictionary problem reported by Khaled Hosny
- in SF.net Bug ID 1975530. Mohamed Kebdani also sent a
- prepared test data.
- * tests/{1975530,condition*}: tests for the fix
-
-2008-06-13 Ingo H. de Boer <idb_winshell at SF.net>:
- * src/hunspell/{affixmgr.cxx,hunspell.cxx}: add missing type
- cast to strstr() calls for VC8 compatibility.
-
-2008-06-13 Németh László <nemeth at OOo>:
- * suggestmgr.cxx: add also part1-part2 suggestion with dash
- for bad part1part2 word forms, suggested by Ruud Baars.
- For example, now suggestion of "parttime": "part time"
- and "part-time".
- NOTE: this feature will work only when the TRY definition
- contains "-" or the letter "a".
-
- * hunspell.cxx: new XML API in spell() and suggest() (see hunspell(3)).
-
- * src/hunspell/*: fixes for OpenOffice.org build environment.
-
- * man/{hunspell.3,hzip.1,hunzip.1}: add new manual pages for
- Hunspell programming API and dictionary compression and
- encryption utilities.
-
- * src/hunspell/*: handle failed mystrdup() calls and other potential
- insufficient memory problems. The problem reported by Elio Voci
- in OpenOffice.org Issue 90604 and others.
-
- * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars
- without conditional code. Problem reported by Ingo H. de Boer
- in SF.net Bug 1763105.
-
- * win_api/hunspelldll.h: put_word() renamed to add() in the (old)
- Windows DLL API bug reported in SF.net Bug 1943236. Also reported
- by Bartkó Zoltán.
-
- * tools/hunspell.cxx: fix chench() for environments without
- native language support (ENABLE_NLS 0 in config.h),
- PHP system_exec() bug reported by Michel Weimerskirch in
- SF.net Bug 1951087.
-
- * hunspell.cxx, affixmgr.cxx: remove "result" from the
- (result && *result) conditions, when "result" is a static variable.
- The problem and a possible solution reported by Ladislav Michnovič.
-
- * affixmgr.cxx: parse_affix(): print line instead of NULL in
- the warning message, when affix class header is bad.
- The problem reported by Ladislav Michnovič.
-
-2008-06-01 Christian Lohmaier <cloph at OOo>
- * configure.ac: patch to fix --with-readline, --with-ui logic.
- Reported in the SF.net Bug 981395.
-
-2008-05-04: Volkov Peter <volkov_peter at users sourceforge net>
- * configure.ac: fix LibTool 2.22 incompatibility by removing
- unused LT_* macros. Report and patch in SF.net Bug 1957383.
- The problem reported and fixed by Ladislav Michnovič, too.
-
-2008-04-23: Ladislav Michnovič <lmichnovic at suse cz>
- * hunspell.pc.in: fix wrongly set directories.
-
-2008-04-12 Németh László <nemeth at OOo>:
- * src/tools/hunspell.cxx:
- - Multilingual spell checking and special dictionary support with -d.
- Multilingual spell checking suggested by Khaled Hosny (SF.net
- Bug 1834280). Example for the new syntax:
-
- -d en_US,en_geo,en_med,de_DE,de_med
-
- en_US and de_DE are base dictionaries, and en_geo, en_med, de_med
- are special dictionaries (dictionaries without affix file).
- Special dictionaries are optional extension of the base dictionaries.
- There is no explicit naming convention for special dictionaries,
- only the ".dic" extension: dictionaries without affix file will
- be an extension of the preceding base dictionary. First dictionary
- in -d parameter must have an affix file (it must be a base
- dictionary).
-
- - new options for debugging, morphological analysis and stemming:
- -m: morphological analysis or flag debug mode (without affix
- rule data it signs the flag of the affix rules)
- -s: stemming mode
- -D: show also available dictionaries and search path
- (suggested by Aaron Digulla in SF.net Bug 1902133)
-
- - add missing refresh() to print bad words before the slower suggestion
- search in UI (better user experience)
-
- - fix tabulator problems (reported by ugli-kid-joe AT sf DOT net)
-
- - fix different encoding of dic and input, and suggestions
-
- - add per mille sign to LANG hu_HU section.
-
- - rewrite program messages. Concatenating multiple printfs for
- easier translation suggested by András Tímár and Gábor Kelemen.
-
- * src/hunspell/csutil.cxx: set static encds variable. Patch by
- Rene Engelhard. SF.net Bug 1896207 and 1939988.
-
- * src/hunspell/w_char.hxx,csutil.hxx: reorganizing
- w_char typedef and HENTRY_DATA, HENTRY_FIND consts
-
- * src/hunspell/hunzip.cxx: fopen(): using rb options instead of r (fix
- for Windows)
-
- * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars
- in an #ifdef WINSHELL section. Problem reported by Ingo H. de Boer
- in SF.net Bug 1763105.
-
- * src/tools/chmorph.cxx: remove the experimental modifications
-
- * src/tools/hzip.c: fopen(): using wb options instead of w (fix
- for Windows)
-
- * src/tools/hunzip.cxx: add missing MOZILLA_CLIENT. Reported
- by Ryan VanderMeulen.
-
- * man/*, man/hu/*: updated manual
-
- * man/hunspell.4: fix formatting problem (missing header)
-
- * tools/makealias: now works with the extra data fields.
-
- * phonet.cxx: use HASHSIZE const
-
- * tests/rep.aff: fix REP count
-
- * src/win_api/Makefile.cygwin, README: native Windows compilation
- in Cygwin environment without cygwin1.dll dependency (see README
- for compiling instructions).
-
-2008-04-08 Roland Smith <rsmith AT xs4all DOT nl>:
- * src/parsers/latexparser.cxx: fix PATTERN_LEN for AMD64 and
- other platforms with different struct padding (SF.net Bug 1937995).
-
-2008-04-03 Kelemen Gábor <kelemeng AT gnome DOT hu>:
- * po/POTFILES.in: fix path of the source file
-
- * po/Makevars: add --from-code=UTF-8 gettext option
-
- * hunspell.cxx: add comments for shortkey translation
-
-2008-02-04 Flemming Frandsen <flfr AT stibo DOT com>
- * src/hunspell.h: fix Windows DLL support
- - this patch also reported by Zoltán Bartkó.
-
-2008-01-30 Mark McClain <marc_mcclain AT users DOT sf DOT net>
- * src/hunspell.cxx: stem(): fix function call side effect
- for PPC platform (SF.net Bug 1882105).
-
-2008-01-30 Németh László <nemeth at OOo>:
- * hunspell.cxx, csutil.cxx, hunspelldll.c: fix
- SF.net Bug 1851246, patch also by Ingo H. de Boer.
-
- * hunspell.h: fix SF.net Bug 1856572 (C prototype problem),
- patch by Mark de Does.
-
- * hunspell.pc.in: fix SF.net Bug 1857450 wrong prefix, reported
- by Mark de Does.
-
- * hunspell.pc.in: reset numbering scheme: libhunspell-1.2.
- Fix SF.net Bug 1857512 reported by Mark de Does,
- also by Rene Engelhard.
-
- * csutil.cxx: patches for ARM platform, signed_chars.dpatch
- by Rene Engelhard and arm_structure_alignment.dpatch by
- Steinar H. Gunderson <sesse@debian.org>
-
- * hunzip.*, hzip.c: new hzip compression format
-
- * tools/affixcompressor: affix compressor utility (similar to
- munch, but it generates affix table automatically), works
- with million-words dictionaries of agglutinative languages.
-
- * README: fix problems reported by Pham Ngoc Khanh.
-
- * csutil.cxx, suggestmgr: Warning-free in OOo builds.
-
- * hashmgr.*, csutil.*: fix protected memory problems with
- stored pointers on several not x86 platforms by
- store_pointer(), get_stored_pointer().
-
- * src/tools/hunspell.cxx: fix iconv support on Solaris platform.
-
- * tests/IJ.good: add missing test file
-
- * csutil.cxx: fix const char* related errors. Compiling bug
- with Visual C++ reported by Ryan VanderMeulen and Ingo H. de Boer.
-
-2008-01-03 Caolan McNamara <cmc at OO.o>:
- * csutil.cxx: SF.net Bug 1863239, notrailingcomma patch and
- optimization of get_current_cs().
-
-2007-11-01 Németh László <nemeth at OOo>:
- * hunspell/*: new feature: morphological generation,
- also fix experimental morphological analysis and stemming.
- - new API functions and improved API:
- - analyze(word): (instead of morph()) morphological analysis
- - stem(word): stemming
- - stem(list): stemming based on the result of an analysis
- - generate(word, word2): morphological generation
- - generate(word, list): morphological generation
- - add(word): add word to the run-time dictionary (renamed put_word())
- - add_with_affix(word, word2): (renamed put_word_pattern()):
- add word to the run-time dictionary with affix flags of the
- second parameter: all affixed forms of the user words will be
- recognised by the spell checker. Especially useful for
- agglutinative languages.
- - remove(word): remove word from the run-time dictionary (not
- implemented)
- - see manual and hunspell/hunspell.hxx header and tests/morph.*
- * tests/morph.*: test data, example for morphological analysis,
- stemming and generation
-
- * tools/analyze, tools/chmorph: extended and new demo applications:
- - analyze (originally hunmorph): analyses and stems input words,
- generates word forms from input word pairs.
- - chmorph: morphological transformation filter
-
- * configure.ac, hunspell/makefile.am: set library version number.
- Bug reported by Rene Engelhard.
-
- * affentry.cxx, affixmgr.cxx: new pattern matching algorithm in
- condition checking of affix rules instead of the Dömölki-algorithm:
- - Unlimited condition length (instead of max. 8 characters).
- - Less memory consumption, especially useful for affix rich languages:
- 5.4 MB memory savings with hu_HU dictionary.
- - Speed change depends on dictionaries and CPU caches: English spell
- checking is 4% faster on Linux words with en_US dictionary, Hungarian
- spell checking is 25% slower on most frequent words of Hungarian
- Webcorpus.
-
- * tests/sug.*, sugutf.*: updated test data (use "a" and "lot"
- dictionary items instead of "a lot".)
-
- * src/hunspell/hunspell.cxx: free(csconv) instead of delete csconv.
- Report and patch by Sylvain Paschein in Mozilla Issue 398268.
-
- * suggestmgr.cxx, tools/hunspell.cxx: bad spelling of "misspelled".
- Ubuntu Bug #134792, patch by Malcolm Parsons.
-
- * tests/base_utf.*: use Unicode apostrophe instead of 8-bit one.
-
- * hunspell.cxx, hashmgr.cxx: add(): use HashMgr::add()
-
-2007-10-25 Pavel Janík <pjanik at OOo>:
- * hunspell/csutil.cxx: Fix type cast warnings on 64bit Linux in
- printing of character positions in u8_u16(). OOo issue 82984.
-
-2007-09-05 Németh László <nemeth at OOo>:
- * win_api/Hunspell.vproj, parsers/testparser.cxx,textparser.hxx:
- warning fixes and removing unnecessary Windows project file.
- Reported by Ingo H. de Boer.
-
- * hashmgr.*, {affixmgr,suggestmgr}.cxx: optimized data structure
- for variable-count fields (only "ph" transliteration field in
- this version, see next item). Also less memory consumption:
- -13% (0.75 MB) with en_US dictionary, -6% (1 MB) with hu_HU.
-
- * suggestmgr.cxx: dictionary based phonetic suggestion for special
- or foreign pronunciation (see also rule-based PHONE in manual).
- Usage: tab separated field in dictionary lines, started with "ph:".
- The field contains a phonetic transliteration of the word:
-
-Marseille ph:maarsayl
- * tests/phone.*: test data for dictionary and rule based phonetic
- suggestion.
-
- * hunspell.cxx: fix potential bad memory access in allcap word
- capitalization in suggest() (bug of previous version).
-
- * hunspell.cxx, atypes.hxx: set correct limit for UTF-8 encoded
- input words (256 byte).
-
- * suggestmgr.cxx: improved REP suggestions with spaces: it works
- without dictionary modification.
- OOo issue 80147, reported by Davide Prina.
- * tests/rep.*: new test data: higher priority for "alot" -> "a lot",
- and Italian suggestion "un'alunno" -> "un alunno".
-
- * affixmgr.cxx: fix Unicode ngram suggestions in expand_rootword().
- (Suggestions with bad affixes.)
- Bug reported by Vitaly Piryatinksy <piv dot v dot vitaly at gmail>.
- * tests/ngram_utf_fix.*: test based on Vitaly Piryatinksy's data.
-
- * suggestmgr.cxx: fix twowords() for last UTF-8 multibyte character.
- (conditional jump or move depended on uninitialised value).
-
-2007-08-29 Ingo H. de Boer <idb_winshell at SF.net>:
- * win_api/{hunspell,libhunspell, testparser}.vcproj: new project
- files for the library and the executables.
-
- * Hunspell.rc, Hunspell.sln, config.h: updated versions.
- Version number problem also reported by András Tímár.
-
-2007-08-27 Németh László <nemeth at OOo>:
- * suggestmgr.hxx: put fixed version. Bug report by Ingo H. de Boer.
-
- * suggestmgr.cxx: remove variable-length local character array
- reported by Ingo H. de Boer.
-
-2007-08-27 Németh László <nemeth at OOo>:
- * suggestmgr.hxx: change bad time_t to clock_t in header, too.
- Bug reports or patches by Ingo H. de Boer under SF.net
- Bug ID 1781951, János Mohácsi and Gábor Zahemszky, András Tímár,
- OMax3 at SF.net under SF.net Bug ID 1781592.
-
- * phonet.*: change variable-length local character array to
- portable fixed size character array. Problem reported by
- Ingo H. de Boer under SF.net Bug ID 1781951 and
- Ryan VanderMeulen.
-
- * suggestmgr.cxx: remove debug message (also by
- Ingo H. de Boer).
-
-2007-08-26 Ingo H. de Boer <idb_winshell at SF.net>:
- * win_api/Hunspell.vcproj: updated version (with phonet.*)
-
-2007-08-23 Németh László <nemeth at OOo>:
- * phonet.{c,h}xx, suggestmgr.cxx: PHONE parameter:
- pronunciation based suggestion using Björn Jacke's original Aspell
- phonetic transcription algorithm (http://aspell.net), relicensed
- under GPL/LGPL/MPL tri-license with the permission of the author.
- Usage: see manual.
-
- * affixmgr,suggestmgr.cxx: add KEY parameter for keyboard and
- input method error related suggestions.
- Example: KEY qwertyuiop|asdfghjkl|zxcvbnm
-
- * man/hunspell.4: description about PHONE and KEY suggestion parameters.
-
- * suggestmgr.cxx: enhancements for better suggestions:
- - Set ngram suggestions for badchar-type errors
- and only two word and compound word suggestions, too.
- - Separate not compound and compound word
- suggestions for MAP suggestion, too.
- - Double swap suggestions for short words.
- For example: ahev -> have, hwihc -> which.
- - Better time limits using clock() instead of time()
- (tenths of a second resolution instead of second ones).
- - leftcommonsubstring() weight function.
-
- * htype.hxx, hashmgr.cxx: blen (byte length) and clen (character
- length) fields instead of wlen
-
- * affixmgr.cxx: fix get_syllable() for bad Unicode inputs.
-
- * tests/suggestiontest/*: test environment for suggestions
-
-2007-08-07 Martijn Wargers:
- * csutil.cxx: fix Mingw build error associated with ToUpper() call.
- Report and patch in Mozilla Issue 391447.
-
-2007-08-07 Robert Longson:
- * atypes.cxx: use empty inline function HUNSPELL_WARNING instead of
- variadic macros to switch off Hunspell warnings.
- Reported by Gavin Sharp in Mozilla Issue 391147.
-
-2007-08-05 Ginn Chen:
- * hashmgr.cxx: Hunspell failed to compile on OpenSolaris (use stdio
- instead of cstdio). Report and patch in Mozilla Issue 391040.
-
-2007-07-25 Németh László <nemeth at OOo>:
- * parsers/*.cxx: Hunspell executable recognises and accepts URLs,
- e-mail addresses, directory paths, reported by Jeppe Bundsgaard.
- * src/tools/hunspell.cxx: --check-url: new option of Hunspell program.
- Use --check-url if you want to check URLs, e-mail addresses and paths.
-
- * parsers/textparser.cxx: strip colon at end of words for Finnish
- and Swedish (colon may be in words in Finnish and Swedish).
- Problem reported by Lars Aronsson.
- * tests/colons_in_words.*: test data
-
- * tests/digits_in_words.*: example for using digits in words
- (eg. 1-jährig, 112-jährig etc. in German), reported by Lars Aronsson.
-
- * hashmgr.cxx: Hunspell accepts allcaps forms of mixed case
- words of personal dictionaries (+allcaps custom dictionary words with
- allcaps affixes).
- Sf.net Bug ID 1755272, reported by Ellis Miller.
-
- * hashmgr.cxx: fix small memory leaks with alias compressed
- dictionaries (free flag vectors of affixed personal dictionary words
- and flag vectors of hidden capitalized forms of mixed case and
- allcaps words).
-
- * affixmgr.cxx: fix COMPOUNDRULE checking with affixed compounds.
- Sf.net Bug ID 1706659, reported by Björn Jacke. Also fixing for
- OOo Issue 76067 (crash-like deceleration for hexadecimal numbers
- with long FFFFFF sequence using en_US dictionary).
-
- * tools/hunspell.cxx: add missing return to save_privdic().
-
- * man/hunspell.4: add information about affixation of personal words:
- "Personal dictionaries are simple word lists, but with optional
- word patterns for affixation, separated by a slash:
-
- foo
- Foo/Simpson
-
- In this example, "foo" and "Foo" are personal words, plus Foo
- will be recognised with affixes of Simpson (Foo's etc.)."
-
-2007-07-18 Németh László <nemeth at OOo>:
- * src/win_api/: add missing resource files, reported by Ingo H. de Boer.
-
-2007-07-16 Németh László <nemeth at OOo>:
- * hunspell.cxx: fix dot removing from UTF-8 encoded words in cleanword2()
- (Capitalised words with dots, as "Something." were not recognised
- using Unicode encoded dictionaries.)
- * tests/{base.*,base_utf.*}: extended and new test files for
- dot removing and Unicode support.
-
- * tools/hunspell.cxx: fix Cygwin, OS X compatibility using platform
- specific iconv() header by ICONV_CONST macro of Autoconf.
- Sf.net Bug ID 1746030, reported by Mike Tian-Jian Jiang.
- Sf.net Bug ID 1753939, reported by Jean-Christophe Helary.
-
- * tools/hunspell.cxx: fix missing global path setting with -d option.
-
- * tests/test.sh: fix broken Valgrind checking (missing warnings
- with VALGRIND=memcheck make check).
-
- * csutil.cxx: fix condition in u8_u16() to avoid invalid read
- of not null-terminated character arrays (detected by Valgrind
- in Hunspell executable: associated with 8-bit character table
- conversion in tools/hunspell.cxx).
-
- * csutil.cxx: free_utf_tbl(): use utf_tbl_count-- instead of utf_tbl--.
- Memory leak in Hunspell executable detected by Valgrind.
-
- * hashmgr.cxx: add missing free_utf_tbl(), memory leak in Hunspell
- executable detected by Valgrind.
-
- * hashmgr.cxx: load_tables(): fix memory error in spec. capitalization.
- Use sizeof(unsigned short) instead of bad sizeof(unsigned short*).
- Invalid memory read detected by Valgrind.
-
- * hashmgr.cxx: add_word(): fix memory error in spec. capitalization.
- Update also affix array length of capitalized homonyms. Invalid
- memory read detected by Valgrind.
-
- * hunspell.cxx: suggest(): fix invalid memory write and leak.
- Bad realloc() and missing free() detected by Valgrind associated
- with suggestions for "something.The" type spelling errors.
-
- * {dictmgr,csutil,hashmgr,suggestmgr}.cxx: check memory allocation.
- Sf.net Bug ID 1747507, based on the patch by Jose da Silva.
-
-2007-07-13 Ingo H. de Boer <idb_winshell at SF.net>:
- * atypes.cxx: fix Visual C compatibility: Using
- "HUNSPELL_WARNING(a,b,...} {}" macro instead of empty "X(a,b...)".
-
- * hunspell.cxx: changes for Windows API.
- * win_api/Hunspell.*: new resource files
- * win_api/hunspelldll.*: set optional Hunspell and Borland spec. codes
- Sf.net Bug ID 1753802, patch by Ingo H. de Boer.
- See also Sf.net Bug ID 1751406, patch by Mike Tian-Jian Jiang.
-
-2007-07-09 Caolan McNamara <cmc at OO.o>:
- * {hunspell,hashmgr,affentry}.cxx: fix warnings of Coverity program
- analyzer. Sf.net Bug ID 1750219.
-
-2007-07-06 Németh László <nemeth at OOo>:
- * atypes.cxx: warning-free swallowing of conditional warning messages
- and their parameters using empty HUNSPELL_WARNING(a,b...) macro.
- * {affixmgr,atypes,csutil}.cxx: fix unused variable warnings
- using WARNVAR macro for conditionally named variables.
- * hashmgr.cxx: fix unused variable warning in add_word() by cond. name
- * hunspell.cxx: fix shadowed declaration of captype var. in suggest()
-
-2006-06-29 Caolan McNamara <cmc at OO.o>:
- * hunspell.cxx: patch to fix possible memory leak in analyze() of
- experimental morphological analyzer code. Sf.net Bug ID 1745263.
-
-2007-06-29 Németh László <nemeth at OOo>:
-improvements:
- * src/hunspell/hunspell.cxx: check bad capitalisation of Dutch letter IJ.
- - Sf.net Feature Request ID 1640985, reported by Frank Fesevur.
- - Solution: FORBIDDENWORD for capitalised word forms (need
- an improved Dutch dictionary with forbidden words: Ijs/*, etc.).
- * tests/IJ.*: test data and example.
-
- * hashmgr.cxx, hunspell.cxx: check capitalization of special word forms
- - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG
- Sf.net Bug ID 1398550, reported by Dmitri Gabinski.
- - allcap words and suffixes: UNICEF's - UNICEF'S
- - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA
- For Catalan, French and Italian languages.
- Reported by Davide Prina in OOo Issue 68568.
- * tests/allcaps*: tests for OPENOFFICE.ORG, UNICEF'S capitalization.
- * tests/i68568*: tests for SANT'ELIA capitalization.
-
- * hunspell/hunspell.cxx: suggestion for missing sentence spacing:
- something.The -> something. The
-
- * tools/hunspell.cxx: multiple character encoding support
- - -i option: custom input encoding
- Sf.net Bug ID 1610866, reported by Thobias Schlemmer.
- Sf.net Bug ID 1633413, reported by Dan Kenigsberg.
- See also hunspell-1.1.5-encoding.patch of Fedora from Caolan McNamara.
- * tests/*.test: add input encodings
-
- * tools/hunspell.cxx: use locale data for default dictionary names.
- Sf.net Bug ID 1731630, report and patch from Bernhard Rosenkraenzer,
- See also hunspell-1.1.4-defaultdictfromlang.patch of Fedora Linux
- from Caolan McNamara.
-
- * tools/hunspell.cxx: fix 8-bit tokenization (letters without
- casing, like ß or Hebrew characters now are handled well)
-
- * tools/hunspell.cxx: dictionary search path
- - DICPATH environmental variable
- - -D option: show directory path of loaded dictionary
- - automatic detection of OpenOffice.org directories
-
-fixes:
- * affixmgr.cxx: fault-tolerant patch for REP and other affix
- table data problems. Problem with Hunspell and en_GB dictionary
- reported by Thomas Lange in OOo Issue 76098 and
- Stephan Bergmann in OOo Issue 76100.
- Sf.net Bug ID 1698240, reported by Ingo H. de Boer.
-
- * csutil.cxx: fix mkallcap_utf() for allcaps suggestion in UTF-8.
-
- * suggestmgr.cxx: fix bad movechar_utf() (missing strlen()).
-
- * hunspell.cxx: fix bad degree sign detection in Unicode
- hu_HU environment.
-
- * hunspell/hunspell.cxx: free allocated memory of csconv in
- ported Mozilla code.
- - Mozilla Bugzilla Bug 383564, report and Mozilla MySpell patch
- by Andrew Geul. Reported by Ryan VanderMeulen for Hunspell.
-
- * suggestmgr.cxx: fix minor difference in Unicode suggestion
- (ngram suggestion of allcaps words in Unicode).
-
- * hashmgr.cxx: close file handle after errors.
- Sf.net Bug ID 1736286, reported by John Nisly.
-
- * configure.ac: syntax error (shell variable with spaces).
- Sf.net Bug ID 1731625, reported by Bernhard Rosenkraenzer.
-
- * hunspell.cxx: check_word(): fix bad usage of info pointer.
-
- * hashmgr.cxx: fix de_DE related bug (accept words with leading dash).
- Sf.net Bug ID 1696134, reported by Björn Jacke.
-
- * suggestmgr.cxx, tests/1695964.*: fix NEEDAFFIX homonym suggestion.
- Sf.net Bug ID 1695964, reported by Björn Jacke.
-
- * tests/1463589*: capitalized ngram suggestion test data for
- Sf.net Bug ID 1463589, reported by Frederik Fouvry.
-
- * csutil.cxx, affixmgr.cxx: fix possible heap error with
- multiple instances of utf_tbl.
- Sf.net Bug ID 1693875, reported by Ingo H. de Boer.
-
- * affixmgr.cxx, suggestmgr.cxx, license.hunspell: convert to ASCII.
- Locale dependent compiling problems. Sf.net Bug ID 1694379, reported
- by Mike Tian-Jian Jiang. OOo Issue 78018 reported by Thomas Lange.
-
- * tests/test.sh: compatibility issues
- - fix Valgrind support (check shared library instead of shell wrapper)
- - remove deprecated "tail +2" syntax
- - set 8-bit locale for testing (LC_ALL=C)
-
- * hunspell.hxx: remove license.* and config.h dependencies.
- - hunspell-1.1.5-badheader.patch from Caolan McNamara <cmc at OO.o>
-
-2007-03-21 Németh László <nemeth at OOo>:
- * tools/Makefile.am, munch.h, unmunch.h: add missing munch.h and unmunch.h
- Reported by Björn Jacke and Khaled Hosny (sf.net Bug ID 1684144)
- * hunspell/hunspell.cxx, hunspell.hxx: fix --with-ui compiling error (add get_csconv())
- Reported by Khaled Hosny (sf.net Bug ID 1685010)
-
-2007-03-19 Németh László <nemeth at OOo>:
- * csutil.cxx, hunspell/hunspell.cxx: Unicode non BMP area (>65K character range) support
- (except conditional patterns and strip characters of affix rules)
- * tests/utf8_nonbmp*: test data
-
- * src/hunspell/*: add Mozilla patches from David Einstein
- - run-time generated 8-bit character tables
- - other Mozilla related changes (see Mozilla Bugzilla Bug 319778)
-
- * csutil.cxx, affixmgr.cxx, hashmgr.cxx: optimized version of IGNORE feature
- - IGNORE works with affixes (except strip characters and affix conditions)
- * tests/ignore*: test data with latin characters
- * tests/ignoreutf*: Unicode test data with Arabic diacritics (Harakat)
-
- * src/hunspell/suggestmgr.cxx: new edit distance suggestion methods
- - capitalization: nasa -> NASA
- - long swap: permenant -> permanent
- - long mov.: Ghandi -> Gandhi
- - double two characters: vacacation -> vacation
- * tests/sug.*: test data
-
- * src/hunspell/affixmgr.cxx: space in REP strings (alot -> a lot)
- Note: The underscore character marks the space in REP strings: REP alot a_lot, and
- put the expression with space ("a lot") into the dic file (see tests/sug).
-
- * hashmgr.cxx, affixmgr.cxx: ignore Unicode byte order mark (BOM sequence)
- * tests/utf8_bom*: test data
-
- * hunspell/*.cxx: OOo Issue 68903 - Make lingucomponent warning-free on wntmsci10
- - fix Hunspell related warning messages on Windows platform (except some assignment
- within conditional expressions). Reported and started by Stephan Bergmann.
-
- * hunspell/affixmgr.cxx: fix OOo Issue 66683 - hunspell dmake debug=x fails
- - Reported by Stephan Bergmann.
-
- * src/hunspell/hunspell.[ch]xx: thread safe API for Hunspell executable
- (removing prev*() functions, new spell(word, info, root) function)
-
- * configure.ac, src/hunspell/*: HUNSPELL_EXPERIMENTAL code
- --with-experimental configure option (conditional compiling of morphological analyser
- and stemmer tools)
-
- * configure.ac, src/hunspell/*: conditional Hunspell warning messages
- --with-warnings configure option
-
- * affixmgr.cxx: new, optimized parsing functions
-
- * affixmgr.cxx: fix homonym handling for German dictionary project,
- reported by Björn Jacke (sf.net Bug ID 1592880).
- * tests/1592880.*: test data by Björn Jacke
-
- * src/hunspell/affixmgr.cxx: fix CIRCUMFIX suggestion
- Bug reported by Erdal Ronahi.
-
- * hunspell.cxx: reverse root word output (complex prefixes)
- Bug reported by Munzir Taha.
-
- * tools/hunspell.cxx: fix Emacs compatibility, patch by marot at sf.net
- - no % command in PIPE mode (SourceForge BugTracker 1595607)
- - fix HUNSPELL_VERSION string
-
- * suggestmgr.[hc]xx: rename check() functions to checkword() (OOo Issue 68296)
- adopt MySpell patch by Bryan Petty (tierra at ooo) for Hunspell source
-
- * csutil.cxx, munch.c, unmunch.c: adopt relevant parts of the MinGW patch
- (OOo Issue 42504) by tonal at ooo
-
- * affixmgr.cxx: remove double candidate_check() call, reported by Bram Moolenaar
-
- * tests/test.sh: add LC_ALL="C" environment. Locale dependency of make check
- reported by Gentoo project.
-
- * src/tools/hunspell.cxx: UTF-8 highlighting fix for console UI
- (not solved: breaking long UTF-8 lines)
-
- * src/tools/unmunch.c: fix bad generation if strip is shorter than condition,
- reported by Davide Prina
- * src/tools/unmunch.h: increase 5000 -> 500000
-
- * src/tools/hunspell.cxx: fix memory error in suggestion (uninitialized parameter),
- Bug also reported by Björn Jacke in SourceForge Bug 1469957
-
- * csutil.cxx, affixmgr.cxx: fix Caolan McNamara's patch for non OOo environment
-
-2006-11-11 Caolan McNamara <cmc at OO.o>:
- * csutil.cxx, affixmgr.cxx: UTF-8 table patch (OOo Issue 71449)
- Description: memory optimization (OOo doesn't use the large UTF-8 table).
-
- * Makefile.am: shared library patch (Sourceforge ID 1610756)
-
- * hunspell.h, hunspell.cxx: C API patch (Sourceforge ID 1616353)
-
- * hunspell.pc: pkgconfig patch (Sourceforge ID 1639128)
-
-2006-10-17 Ryan Jones <at Mozilla Bugzilla>:
- * affixmgr.cxx: missing fclose(affixlst) calls
- Reported by <gavins at ooo> in OOo Issue 70408
-
-2007-07-11 Taha Zerrouki <taha at gawab>:
- * affixmgr.cxx, hunspell.cxx, hashmgr.cxx, csutil.cxx: IGNORE feature to remove
- optional Arabic and other characters from input and dictionary words.
- * src/hunspell/langnum.hxx: add Arabic language number, lang_ar=96
- * tests/ignore.*: test data
-
-2006-05-28 Miha Vrhovnik <mvrhov at users.sourceforge>:
- * src/win_api/*: C API for Windows DLLs
- - also Delphi text editor example (see on Hunspell Sourceforge page)
-
-2006-05-18 Kevin F. Quinn <kevquinn at gentoo>:
- * utf_info.cxx: struct -> static struct
- Shared library patch also developed by Gentoo developers (Hanno Meyer-Thurow,
- Diego Pettenò, Kevin F. Quinn)
-
-2006-02-02 Németh László <nemethl@gyorsposta.hu>:
- * src/hunspell/hunspell.cxx: suggest(): replace "fooBar" -> "foo bar" suggestions
- with "fooBar" ->"foo Bar" (missing spaces are typical OCR bugs).
- Bug reported by stowrob at OOo in Issue 58202.
- * src/hunspell/suggestmgr.cxx: twowords(): permit 1-character words.
- (restore MySpell's original behavior). Here: "aNew" -> "a New".
- * tests/i58202.*: test data
-
- * src/parsers/textparser.cxx: fix Unicode tokenization in is_wordchar()
- (extra word characters (WORDCHARS) didn't work on big-endian platforms).
-
- * src/hunspell/{csutil,affixmgr}.cxx: inline isSubset(), isRevSubset():
- small speed optimization for languages with rich morphology.
-
- * src/tools/hunspell.cxx: fix bad --with-ui and --with-readline compiling
- when (N)curses is missing. Reported by Daniel Naber.
-
-2006-01-19 Tor Lillqvist <tml@novell.com>
- * src/hunspell/csutil.cxx: mystrsep(): fix locale-dependent isspace() tokenization
-
-2006-01-06 András Tímár <timar@fsf.hu>
- * src/hunspell/{hashmgr.hxx,hunspell.cxx}: fix Visual C++ compiling errors
-
-2006-01-05 Németh László <nemethl@gyorsposta.hu>:
- * COPYING: set GPL/LGPL/MPL tri-license for Mozilla integration.
- Rationale: Mozilla source code contains an old MySpell version
- with GPL/LGPL/MPL tri-license. (MPL license is a copyleft license, similar
- to the LGPL, but it acts on file level.)
- * COPYING.LGPL: GNU Lesser General Public License 2.1 (LGPL)
- * COPYING.MPL: Mozilla Public License 1.1 (MPL)
- * license.hunspell, src/hunspell/license.hunspell: GPL/LGPL/MPL tri-license
-
- * src/hunspell/{affixmgr,hashmgr}.*: AF, AM alias definitions in affix file:
- compression of flag sets and morphological descriptions (see manual,
- and tests/alias* test files).
- Rationale: Alias compression is also good for loading time and memory
- efficiency, not only smaller resources.
- * src/tools/makealias: alias compression utility
- (usage: ./makealias file.dic file.aff)
- * tests/alias{,2,3}: AF, AM tests
- * man/hunspell.4: add AF, AM documentation
- * src/hunspell/affentry.cxx, atypes.hxx: add new opts bits (aeALIASM, aeALIASF)
-
- * tools/hunspell, src/parser/*, src/hunspell/*: Hunspell program
- tokenizes Unicode texts (only with UTF-8 encoded dictionaries).
- Missing Unicode tokenization reported by Björn Jacke, Egmont Koblinger,
- Jess Body and others.
- Note: Curses interactive interface doesn't work perfectly yet.
- * tests/*.tests: remove -1 parameters of Hunspell
- * tests/*.{good,wrong}: remove tabulators
-
- * src/hunspell/{hunspell,affixmgr}.cxx: BREAK option: break words at
- specified break points and checking word parts separately (see manual).
- Note: COMPOUNDRULE is better (or will be better) for handling dashes and
- other compound joining characters or character strings. Use BREAK, if you
- want to check words with dashes or other joining characters and there is no time
- or possibility to describe precise compound rules with COMPOUNDRULE.
- * tests/break.*: BREAK example.
-
- * src/hunspell/{affixmgr,hunspell}.cxx: add CHECKSHARPS declaration instead
- of LANG de_DE definitions to handle German sharp s in both spelling and
- suggestion.
- * src/hunspell/hunspell.cxx: With CHECKSHARPS, uppercase words are valid
- with both lower sharp s (it is optional for names in German legal texts)
- and SS (MÜßIG, MÜSSIG). Missing lower sharp s form reported by Björn Jacke.
- * src/hunspell/hunspell.cxx: KEEPCASE flag on a sharp s word has a special
- meaning with CHECKSHARPS declaration: KEEPCASE permits capitalisation and SS upper
- casing of a sharp s word (Müßig and MÜSSIG), but forbids the upper cased form
- with lower sharp s character(s): *MÜßIG.
- * tests/germancompounding*: add CHECKSHARPS, remove LANG
- * tests/checksharps*: add CHECKSHARPS and KEEPCASE, remove LANG
-
- * src/hunspell/hunspell.cxx: improved suggestions:
- - suggestions for pressed Caps Lock problems: macARONI -> macaroni
- - suggestions for long shift problems: MAcaroni -> Macaroni, macaroni
- - suggestions for KEEPCASE words: KG -> kg
- * src/hunspell/csutil.cxx: fix mystrrep() function:
- - suggestions for lower sharp s in uppercased words: MÜßIG -> MÜSSIG
- * tests/checksharps{,utf}.sug: add tests for mystrrep() fix
-
- * src/hunspell/hashmgr.cxx: Now dictionary words can contain slashes
- with the "\/" syntax. Problem reported by Frederik Fouvry.
-
- * src/hunspell/hunspell.cxx: fix bad duplicate filter in suggest().
- (Suggesting some capitalised compound words caused program crash
- with Hungarian dictionary, OOo Issue 59055).
-
- * src/hunspell/affixmgr.cxx: fix bad defcpd_check() call in compound_check().
- (Overlapping new COMPOUNDRULE and old compounding methods caused program
- crash at suggestion.)
-
- * src/hunspell/affixmgr.{cxx,hxx}: check affix flag duplication at affix classes.
- Suggested by Daniel Naber.
-
- * src/hunspell/affentry.cxx: remove unused variable declarations (OOo i58338).
- Compiler warnings reported by András Tímár and Martin Hollmichel.
-
- * src/hunspell/hunspell.cxx: morph(): not analyse bad mixed uppercased forms
- (fix Arabic morphological analysis with Buckwalter's Arabic transliteration)
-
- * src/hunspell/affentry.{cxx,hxx}, atypes.hxx: little memory optimization
- in affentry:
- - using unsigned char fields instead of short (stripl, appndl, numconds)
- - rename xpflg field to opts
- - removing utf8 field, use aeUTF8 bit of opts field
-
- * configure.ac: set tests/maputf.test to XFAILED on ARM platform.
- Failure reported by Rene Engelhard.
-
- * configure.ac: link Ncursesw library, if exists.
-
- * BUGS: add BUGS file
-
- * tests/complexprefixes2.*: test for morphological analysis with COMPLEXPREFIXES
-
- * src/hunspell/affixmgr.cxx: use "COMPOUNDRULE" instead of
- "COMPOUND". The new name suggested by Bram Moolenaar.
- * tests/compoundrule*: modified and renamed compound.* test files
-
- * man/hunspell.4: AF, AM, BREAK, CHECKSHARPS, COMPOUNDRULE, KEEPCASE.
- - also new addition to the documentation:
- Header of the dictionary file define approximate dictionary size:
- ``A dictionary file (*.dic) contains a list of words, one per line.
- The first line of the dictionaries (except personal dictionaries)
- contains the _approximate_ word count (for optimal hash memory size).''
- Asked by Frederik Fouvry.
-
- One-character replacements in REP definitions: ``It's very useful to
- define replacements for the most typical one-character mistakes, too:
- with REP you can add higher priority to a subset of the TRY suggestions
- (suggestion list begins with the REP suggestions).''
-
-2005-11-11 Németh László <nemethl@gyorsposta.hu>:
- * src/hunspell/affixmgr.*: fix Unicode MAP errors (sorted only n-1
- characters instead of n ones in UTF-16 MAP character lists).
- Bug reported by Rene Engelhard.
-
- * src/hunspell/affixmgr.*: fix infinite COMPOUND matching (default char
- type is unsigned on PowerPC, s390 and ARM platforms and it will never
- be negative). Bug reported by Rene Engelhard.
-
- * src/hunspell/{affixmgr,suggestmgr}.cxx: fix bad ONLYINCOMPOUND
- word suggestions.
- * tests/onlyincompound.sug: empty test file to check this fix.
- Bug reported by Björn Jacke.
-
- * src/hunspell/affixmgr.cxx: fix backtracking in COMPOUND pattern matching.
- * tests/compound6.*: test files to check this fix.
-
- * csutil.cxx: set bigger range types in flag_qsort() and flag_bsearch().
-
- * affixmgr.hxx: set better type for cont_classes[] Boolean data (short -> char)
-
- * configure.ac, tests/automake.am: set platform specific XFAIL test
- (flagutf8.test on ARM platform)
-
-2005-11-09 Németh László <nemethl@gyorsposta.hu>:
-improvements:
- * src/hunspell/affixmgr.*: new and improved affix file parameters:
-
- - COMPOUND definitions: compound patterns with regexp-like matching.
- See manual and test files: tests/compound*.*
- Suggested by Bram Moolenaar.
- Also useful for simple word-level lexical scanning, for example
- analysing numbers or words with numbers (OOo Issue #53643):
- http://qa.openoffice.org/issues/show_bug.cgi?id=53643
- Examples: tests/compound{4,5}.*.
-
- - NOSUGGEST flag: words signed with NOSUGGEST flag are not suggested.
- Proposed flag for vulgar and obscene words (OOo Issue #55498).
- Example: tests/nosuggest.*.
- Problem reported by bobharvey at OOo:
- http://qa.openoffice.org/issues/show_bug.cgi?id=55498
-
- - KEEPCASE flag: Forbid capitalized and uppercased forms of words
- signed with KEEPCASE flags. Useful for special orthographies
- (measurements and currency often keep their case in uppercased
- texts) and other writing systems (eg. keeping lower case of IPA
- characters).
-
- - CHECKCOMPOUNDCASE: Forbid upper case characters at word boundaries in compounds.
- Examples: tests/checkcompoundcase* and tests/germancompounding.*
-
- - FLAG UTF-8: New flag type: Unicode character encoded with UTF-8.
- Example: tests/flagutf8.*.
- Rationale: Unicode character type can be more readable
- (in a Unicode text editor) than `long' or `num' flag type.
-
-bug fixes:
- * src/hunspell/hunspell.cxx: accept numbers and numbers with separators (i53643)
- Bug reported by skelet at OOo:
- http://qa.openoffice.org/issues/show_bug.cgi?id=53643
-
- * src/hunspell/csutil.cxx: fix casing data in ISO 8859-13 character table.
-
- * src/hunspell/csutil.cxx: add ISO-8859-15 character encoding (i54980)
- Rationale: ISO-8859-15 is the default encoding of the French OpenOffice.org
- dictionary. ISO-8859-15 is a modified version of ISO-8859-1
- (latin-1) character encoding with French œ ligatures and euro
- symbol. Problem reported by cbrunet at OOo in OOo Issue 54980:
- http://qa.openoffice.org/issues/show_bug.cgi?id=54980
-
- * src/hunspell/affixmgr.cxx: fix zero-byte malloc after a bad affix header.
- Patch by Harri Pitkänen.
-
- * src/hunspell/suggestmgr.cxx: fix bad NEEDAFFIX word suggestion
- in ngram suggestions. Reported by Daniel Naber and Friedel Wolff.
-
- * src/hunspell/hashmgr.cxx: fix bad white space checking in affix files.
- src/hunspell/{csutil,affixmgr}.cxx: add other white space separators.
- Problems with tab characters reported by Frederik Fouvry.
-
- * src/hunspell/*: replace system-dependent <license.*> #include
- parameters with quoted ones. Problem reported by Dafydd Jones.
-
- * src/hunspell/hunspell.cxx: fix missing morphological analysis of dot(s)
- Reported by Trón Viktor.
-
-changes:
- * src/hunspell/affixmgr.cxx: rename PSEUDOROOT to NEEDAFFIX.
- Suggested by Bram Moolenaar.
-
- * src/hunspell/suggestmgr.hxx: Increase default maximum of
- ngram suggestions (3->5). Suggested by Kevin Hendricks.
-
- * src/hunspell/htypes.hxx: Increase MAXDELEN for long affix flags.
-
- * src/hunspell/suggestmgr.cxx: modify (perhaps fix) Unicode map suggestion.
- tests/maputf test fail on ARM platform reported by Rene Engelhard.
-
- * src/hunspell/{affentry.cxx,atypes.hxx}: remove [PREFIX] and
- MISSING_DESCRIPTION messages from morphological analysis.
- Problems reported by Trón Viktor.
-
- * tests/germancompounding.{aff,good}: Add "Computer-Arbeit" test word.
- Suggested by Daniel Naber.
-
- * doc/man/hunspell.4: Proof-reading patch by Goldman Eleonóra.
-
- * doc/man/hunspell.4: Fix bad affix example (replace `move' with `work').
- Bug reported by Frederik Fouvry.
-
- * tests/*: new test files:
- affixes.*: simple affix compression example from the hunspell(4) manual page
- checkcompoundcase.*, checkcompoundcase2.*, checkcompoundcaseutf.*
- compound.*, compound2.*, compound3.*, compound4.*, compound5.*
- compoundflag.* (former compound.*)
- flagutf8.*: test for FLAG UTF-8
- germancompounding.*: simplification with CHECKCOMPOUNDCASE.
- germancompoundingold.* (former germancompounding.*)
- i53643.*: check numbers with separators
- i54980.*: ISO8859-15 test
- keepcase.*: test for KEEPCASE
- needaffix*.* (former pseudoroot*.* tests)
- nosuggest.*: test for NOSUGGEST
-
-2005-09-19 Németh László <nemethl@gyorsposta.hu>:
- * src/hunspell/suggestmgr.cxx: improved ngram suggestion:
- - detect non-neighboring swapped characters (pernament -> permanent)
- Rationale: the ngram method has a significant error with non-neighboring
- swapped characters, especially when the swap is in the middle of the word.
- - suggest uppercase forms (unesco -> UNESCO, siggraph's -> SIGGRAPH's)
- - suggest only the ngram swap character and uppercase forms, if they exist.
- Rationale: swap character and casing equivalence give much better
- suggestions than any other (weighted) ngram suggestions.
- - add uppercase suggestion (PERMENANT -> PERMANENT)
-
- * src/hunspell/*: complete comparison with MySpell 3.2 (in OOo beta 2):
- - affixmgr.cxx: add missing numrep initialization
- - hashmgr.cxx: add_word(): don't allocate temporary records
- - hunspell.cxx: in suggest():
- - check capitalized words first (better sug. order for proper names),
- - check pSMgr->suggest() return value
- - make the pSMgr->suggest() call non-optional in HUHCAP
- - csutil.cxx: fix bad KOI8-U -> koi8r_tbl reference in enc_entry encds
- - csutil.cxx: fix casing data in ISO 8859-2, Windows 1251 and KOI8-U
- encoding tables. Bug reported by Dmitri Gabinski.
-
- * src/hunspell/affixmgr.*: improved compound word and other features
- - generalize hu_HU specific compound word features with new affix file
- parameters, suggested by Bram Moolenaar:
- - CHECKCOMPOUNDDUP: forbid word duplication in compounds (eg. foo|foo)
- - CHECKCOMPOUNDTRIPLE: forbid triple letters in compounds (eg. foo|obar)
- - CHECKCOMPOUNDPATTERN: forbid patterns at word boundaries in compounds
- - CHECKCOMPOUNDREP: using REP replacement table, forbid presumably bad
- compounds (useful for languages with unlimited number of compounds)
- - ONLYINCOMPOUND flag works also with words (see tests/onlyincompound.*)
- Suggested by Daniel Naber, Björn Jacke, Trón Viktor & Bram Moolenaar.
- - PSEUDOROOT works also with prefixes and prefix + suffix combinations
- (see tests/pseudoroot5.*). Suggested by Trón Viktor.
- - man/hunspell.4: updated man page
-
- * src/hunspell/affixmgr.*: fix incomplete prefix handling with twofold
- suffixes (delete unnecessary contclasses[] conditions in
- prefix_check_twosfx() and prefix_check_twosfx_morph()).
- Bug reported by Trón Viktor.
-
- * src/hunspell/affixmgr.*: complete also *_morph() functions with
- conditions of new Hunspell features (circumfix, pseudoroot etc.).
-
- * src/hunspell/suggestmgr.cxx:
- - fix missing suggestions for words with crossed prefix and suffix
- - fix redundant non compound word checking
- - fix losing suggestions problem. Bug reported by Dmitri Gabinski.
-
- * src/hunspell/dictmgr.*:
- - add new dictionary manager for the Hunspell UNO module
- Problems with eo_ANY Esperanto locale reported by Dmitri Gabinski.
-
- * src/hunspell/*: use precise constant sizes for 8-bit and 16-bit character
- arrays with MAXWORDUTF8LEN and MAXSWUTF8L macros.
-
- * src/hunspell/affixmgr.cxx: fix bad MAXNGRAMSUGS parameter handling
-
- * src/hunspell/affixmgr.cxx, src/tools/{un}munch.*: fix GCC 4.0 warnings
- on fgets(), reported by Dvornik László
-
- * po/hu.po: improved translation by Dvornik László
-
- * tests/test.sh: improved test environment
- - add suggestion testing (see tests/*.sug)
- - add memory debugging environment, based on the excellent Valgrind debugger.
- Usage on Linux and experimental platforms of Valgrind:
- VALGRIND=memcheck make check
- - rename test_hunmorph to test.sh
-
- * tests/*: new tests:
- - base.*: base example based on MySpell's checkme.lst.
- - map{,utf}.*, rep{,utf}: MAP and REP suggestion examples
- - tests on new CHECKCOMPOUND, ONLYINCOMPOUND and PSEUDOROOT features
- - i54633.*: capitalized suggestion test for Issue 54633 from OOo's Issuezilla
- - i35725.*: improved ngram suggestion test for Issue 35725
-
-2005-08-26 Németh László <nemethl@gyorsposta.hu>:
-improvements:
-
- * src/hunspell/suggestmgr.cxx:
- Unicode support in related character map suggestion
-
- * src/hunspell/suggestmgr.cxx: Unicode support in ngram suggestion
-
- * src/hunspell/{suggestmgr,affixmgr,hunspell}.cxx: improve ngram suggestion.
- Fix http://qa.openoffice.org/issues/show_bug.cgi?id=35725. See release
- notes for examples. This problem reported by beccablain at OOo.
- - ngram suggestions now are case insensitive (see `Permenant' bug in Issuezilla)
- - weight ngram suggestions (with the longest common subsequence algorithm,
- also considering lengths of bad word and suggestion, identical first
- letters and almost completely identical character positions)
- - set strict affix congruency in expand_rootword(). Now ngram suggestions
- are good for languages with rich morphology and also better for English.
- Rationale: affixed forms of the first ngram suggestion
- very often suppress the second and subsequent root word suggestions. But
- faults in affixes are less common, and can be fixed without suggestions.
- We must prefer the more informative second and subsequent root word
- suggestions instead of the suggestions for bad affixes.
- - a better suggestion may not be substring of a less good suggestion
- Rationale: Suggesting affixed forms of a root word is
- unnecessary, when root word has got better weighted ngram value.
- (Checking substrings is a good approximation for this refinement.)
- - fewer ngram suggestions (default maximum is 3 instead of 10)
- Rationale: Users need a big extra effort to check a lot of bad ngram
- suggestions, nine times out of ten unnecessarily. It is very
- distracting, because ngram suggestions can be very different.
- Myspell and Hunspell usually give one or two suggestions with
- the old suggestion algorithms (maximum is 15), but the ngram algorithm
- often gives the maximum number of suggestions. With strict affix congruency
- and other refinements, the good suggestion is usually among the
- first three elements.
- - new affix parameter: MAXNGRAMSUGS
-
- * src/hunspell/*: support agglutinative languages with rich prefix
- morphology or with right-to-left writing system (for example, Turkic
- and Austronesian languages with (modified) Arabic scripts).
- - new affix parameter: COMPLEXPREFIXES
- Set twofold prefix stripping (but single suffix stripping)
- * src/hunspell/affixmgr.cxx:
- - speed up prefix loading with tree sorting algorithm.
- * tests/complexprefixes.*, tests/complexprefixesutf.*:
- Coptic example posted by Moheb Mekhaiel
-
- * src/hunspell/hashmgr.cxx: check size attribute in dic file
- suggested by Daniel Naber
- Rationale: With a missing size attribute, Hunspell allocates a hash memory
- that is too small and slower, and Hunspell can lose the first dictionary word.
-
- * src/hunspell/affixmgr.cxx: check stripping characters and condition
- compatibility in affix rules (bugs detected in cs_CZ, es_ES, es_NEW,
- es_MX, lt_LT, nn_NO, pt_PT, ro_RO and sk_SK dictionaries). See release
- notes of Hunspell 1.0.9 in NEWS.
-
- * src/hunspell/affixmgr.cxx: check unnecessary fields in affix rules
- (bugs detected in ro_RO and sv_SE dictionaries). See release notes.
-
- * src/hunspell/affixmgr.cxx: remove redundant condition checking
- in affix rules with stripping characters (redundancy in OpenOffice.org
- dictionaries reported by Eleonóra Goldman)
- Rationale: this is a small optimization, but it was excellent for
- detecting bad ngram affixation with bad or weak affix conditions.
-
- * tests/germancompounding.aff: improve compound definition
- - use dash prefix instead of language specific tokenizer
- Rationale: Using a uniform approach is the right way to check and analyze
- compound words. Language specific word breaking is deprecated; word-like
- word pairs need sophisticated grammar checking
- (for example, in Hungarian there is a substandard but accepted
- syntax with a dash for word pairs: cats, dogs -> kutyák-macskák, like
- cats/dogs in English).
-
- * test Hunspell with 54 OpenOffice.org dictionaries: see release notes
-
-bug fixes:
-
- * src/hunspell/suggestmgr.*: add time limit to exponential
- algorithm of the related character map suggestion
- Rationale: a long word in agglutinative languages or a special pattern
- (for example a horizontal rule) made of map characters can `crash' the
- spell checker.
-
- * src/hunspell/affentry.cxx: add() functions: fix bad word generation
- checking stripping characters (see similar bug in unmunch)
-
- * src/hunspell/affixmgr.cxx: parse_file(): fix unconditional getNext()
- call for ~AffixMgr() when affix file is corrupt.
-
- * src/hunspell/affixmgr.*: AffixMgr(), parse_cpdsyllable(): fix missing
- string duplications for ~AffixMgr() when affix file is corrupt.
-
- * src/hunspell/affixmgr.*: parse_affix(): fix fprintf() call when affix
- file is corrupt. Bug reported by Daniel Naber.
-
- * suggestmgr.cxx: replace single usage of 'strdup' with 'mystrdup'
- patch by Chris Halls (debian.org)
-
- * src/hunspell/makefile.mk: add makefile.mk for compiling in OpenOffice.org
- See README in the Hunspell UNO module.
- Problems with separated compiling reported by Rene Engelhard
-
- * src/hunspell/hunspell.cxx: fix pseudoroot support
- - search for a non-pseudoroot homonym in check()
- * tests/pseudoroot4.*: test this fix
-
- * src/tools/unmunch.c: fix bad word generation when conditions
- are shorter or incompatible with stripping characters in affix rules
-
- * src/tools/unmunch.c: fix mychomp() for de_AT.dic and other dic files
- without a final newline character.
-
-other changes:
- * src/hunspell/suggestmgr.*: erase ACCENT suggestion
- Rationale: ACCENT suggestion was the same as Kevin Hendricks's map
- suggestion algorithm, but with a worse interface in the affix file.
-
- * src/hunspell/suggestmgr.*: combine the cycle number limit
- in badchar() and forgotchar() with a time limit.
-
- * src/hunspell/affixmgr.*: remove NOMAPSUGS affix parameter
-
- * src/hunspell/{suggestmgr,hunspell}.*: strip periods from
- suggestions (restore MySpell's original behaviour)
- Rationale: OpenOffice.org has an automatic period handling mechanism
- and suggestions look better without periods.
- - new affix file parameter: SUGSWITHDOTS
- Add period(s) to suggestions, if input word terminates in period(s).
- (No need for OpenOffice.org dictionaries.)
-
- * tests/germancompounding.aff: improve bad German affix in affix example
- (computeren->computern). Suggested by Daniel Naber.
-
- * src/tools/example.cxx: add Myspell's example
-
- * src/tools/munch.cxx: add Myspell's munch
-
- * man{,/hu}/hunspell.4: refresh manual pages
-
-2005-08-01 Németh László <nemethl@gyorsposta.hu>:
- * add missing MySpell files and features:
- - add MySpell license.readme, README and CONTRIBUTORS ({license,README,AUTHORS}.myspell)
- - add MySpell unmunch program (src/tools/unmunch.c)
- - add licenses to source (src/hunspell/license.{myspell,hunspell})
- - port MAP suggestion (with imperfect UTF-8 support)
- - add NOSPLITSUGS affix parameter
- - add NOMAPSUGS affix parameter
-
- * src/man/man.4: MAP, COMPOUNDPERMITFLAG, NOSPLITSUGS, NOMAPSUGS
-
- * src/hunspell/aff{entry,ixmgr}.cxx:
- - improve compound word support
- - new affix parameter: COMPOUNDPERMITFLAG (see manual)
- * src/tests/compoundaffix{,2}.*: examples for COMPOUNDPERMITFLAG
- * src/tests/germancompounding.*: new solution for German compounding
- Problems with German compounding reported by Daniel Naber
-
- * src/hunspell/hunspell.cxx: fix German uppercase word spelling
- with the spellsharps() recursive algorithm.
- Default recursive depth is 5 (MAXSHARPS).
- * src/tests/germansharps*: extended German sharp s tests
-
- * src/tools/hunspell.cxx: fix fatal memory bug in non-interactive
- subshells without the HOME environment variable
- Bug detected with PHP by András Izsók.
-
-2005-07-22 Németh László <nemethl@gyorsposta.hu>:
- * src/hunspell/csutil.hxx: utf16_u8()
- - fix 3-byte UTF-8 character conversion
-
-2005-07-21 Németh László <nemethl@gyorsposta.hu>:
- * src/hunspell/csutil.hxx: hunspell_version() for the OOo UNO module
-
-2005-07-19 Németh László <nemethl@gyorsposta.hu>:
- * renaming:
- - src/morphbase -> src/hunspell
- - src/hunspell, src/hunmorph -> src/tools
- - src/huntokens -> src/parsers
-
- * src/tools/hunstem.cxx: add stemmer example
-
-2005-07-18 Németh László <nemethl@gyorsposta.hu>:
- * configure.ac: --with-ui, --with-readline configure options
- * src/hunspell/hunspell.cxx: fix conditional compiling
-
- * src/hunspell/hunspell.cxx: set HunSPELL.bak temporary file
- in the same directory as the checked file.
-
- * src/morphbase/morphbase.cxx:
-
- - handling German sharp s (ß)
-
- - fix (temporarily) analyize()
-
- * tests: a lot of new tests
-
- * po/, intl/, m4/: add gettext from GNU hello
-
- * po/hu.po: add Hungarian translation
-
- * doc/, man/: rename doc to man
-
-2005-07-04 Németh László <nemethl@gyorsposta.hu>:
- * src/morphbase/hashmgr.cxx: set FLAG attribute instead of FLAG_NUM and FLAG_LONG
-
- * doc/hunspell.4: manual in English
-
-2005-06-30 Németh László <nemethl@gyorsposta.hu>:
- * src/morphbase/csutil.cxx: add character tables from csutil.cxx of OOo 1.1.4
-
- * src/morphbase/affentry.cxx: fix Unicode condition checking
-
- * tests/{,utf}compound.*: tests compounding
-
-2005-06-27 Németh László <nemethl@gyorsposta.hu>:
- * src/morphbase/*: fix Unicode compound handling
-
-2005-06-23 Halácsy Péter:
- * src/hunmorph/hunmorph.cxx: delete spelling error message and suggest_auto() call
-
-2005-06-21 Németh László <nemethl@gyorsposta.hu>:
- * src/morphbase: Unicode support
- * tests/utf8.*: SET UTF-8 test
-
- * src/morphbase: checking and fixing with Valgrind
- Memory handling error reported by Ferenc Szidarovszky
-
-2005-05-26 Németh László <nemethl@gyorsposta.hu>:
- * suggestmgr.cxx: fix stemming
- * AUTHORS, COPYING, ChangeLog: set CC-LGPL free software license
-
-2004-05-25 Varga Dániel <daniel@all.hu>
- * src/stemtool: new subproject
-
-2005-05-25 Halácsy Péter <peter@halacsy.com>
- * AUTHORS, COPYING: set CC Attribution license
-
-2004-05-23 Varga Dániel <daniel@all.hu>
- * src: - modifications for compiling with Visual C++
-
- * src/hunmorph/csutil.cxx: correcting header of flag_qsort(),
- * src/hunmorph/*: correct csutil include
-
-2005-05-19 Németh László <nemethl@gyorsposta.hu>
- * csutil.cxx: fix loop condition in lineuniq()
- bug reported by Viktor Nagy (nagyv nyelvtud hu).
-
- * morphbase.cxx: handle PSEUDOROOT with zero affixes
- bug reported by Viktor Nagy (nagyv nyelvtud hu).
- * tests/zeroaffix.*: add zeroaffix tests
-
-2005-04-09 Németh László <nemethl@gyorsposta.hu>
- * config.h.in: reset with autoheader
-
- * src/hunspell/hunspell.cxx: set version
-
-2005-04-06 Németh László <nemethl@gyorsposta.hu>
- * tests: tests
-
- * src/morphbase:
- New optional parameters in affix file:
- - PSEUDOROOT: forbids a root while still allowing its suffixed forms.
- - COMPOUNDWORDMAX: max. words in compounds (default is no limit)
- - COMPOUNDROOT: signs compounds in dictionary for handling special compound rules
- - remove COMPOUNDWORD, ONLYROOT
-
-2005-03-21 Németh László <nemethl@gyorsposta.hu>
- * src/morphbase/*:
- - 2-byte flags, FLAG_NUM, FLAG_LONG
- - CIRCUMFIX: signed suffixes and prefixes can only occur together
- - ONLYINCOMPOUND for fogemorphemes (Swedish, Danish) or Fuge-elements (German)
- - COMPOUNDBEGIN: allow signed roots, and roots with signed suffix in begin of compounds
- - COMPOUNDMIDDLE: like before, but middle of compounds
- - COMPOUNDEND: like before, but end of compounds
- - remove COMPOUNDFIRST, COMPOUNDLAST