Diffstat (limited to 'libs/hunspell/docs/ChangeLog')
| -rw-r--r-- | libs/hunspell/docs/ChangeLog | 1993 |
1 files changed, 0 insertions, 1993 deletions
diff --git a/libs/hunspell/docs/ChangeLog b/libs/hunspell/docs/ChangeLog
deleted file mode 100644
index 1f6e774a63..0000000000
--- a/libs/hunspell/docs/ChangeLog
+++ /dev/null
@@ -1,1993 +0,0 @@
-2016-04-29 Caolán McNamara <caolanm at LibO>:
-  * deprecate old api and add new one
-    old one remains implemented in terms of new one
-    and will eventually be removed
-  * shrink exposed api down to just hunspell.hxx
-  * next major release is likely to require C++11
-
-2016-04-15 Caolán McNamara <caolanm at LibO>:
-  * generally using std::string and std::vector internally
-
-2016-04-13 Caolán McNamara <caolanm at LibO>:
-  * gh#371 drop experimental code
-
-2015-09-11 Caolán McNamara <caolanm at LibO>:
-  * rhbz#1261421 crash on mashing hangul korean keyboard
-
-2014-12-03 Németh László <nemeth at numbertext dot org>:
-  * tools/hunspell.cxx: security fixes of the Hunspell executable
-    - secure file name handling, the problem (checking
-      OpenDocument files with malicious file names)
-      reported by Eric Sesterhenn
-    - using tmpnam() only with system("mkdir tempname && ...")
-
-2014-10-17 Caolán McNamara <caolanm at LibO>:
-  * sf#245 Feature from Anish Patil -S mode
-    to show suggestions for completion of
-    correctly spelled words
-  * sf#248 Fix manpage about how to include
-
-2014-10-16 Caolán McNamara <caolanm at LibO>:
-  * rhbz#915448, sf#57, sf#185 report character offset
-    and not byte offset in ispell mode
-  * sf#56 segv in experimental mode
-  * sf#228 don't translate init string
-
-2014-09-22 Németh László <nemeth at numbertext dot org>:
-  * fix crash in morphological analysis of the Hungarian
-    compound word 'művészegyéniség', reported by Gáspár Sinai
-
-2014-08-26 Németh László <nemeth at numbertext dot org>:
-  * unmunch separates flags of prefixes from the word,
-    bug reported by Daniel Naber
-
-2014-08-05 Németh László <nemeth at numbertext dot org>:
-  * moz#318040 Mozzilla accepts abbreviations without dots
-  * myfopen(): add _wfullpath to expand relative parts of
absolute paths - -2014-07-16 Caolán McNamara <caolanm at LibO>: - * moz#675553 Switch from PRBool to bool - * moz#690892 replace PR_TRUE/PR_FALSE with true/false - * Silence the warning about empty while body loop in clang - * moz#777292 Make nsresult an enum - * moz#579517 Use stdint types in gecko - * moz#784776 consistently use FLAG_NULL - * moz#927728 Convert PRUnichar to char16_t - * moz#943268 Remove nsCharsetAlias and nsCharsetConverterManager - * Don't include config.h in license.hunspell if MOZILLA_CLIENT is set - -2014-06-26 Caolán McNamara <caolanm at LibO>: - * clang scan-build: Allocator sizeof operand mismatch - * clang scan-build: other low hanging warnings - * clang scan-build: significant warnings - -2014-06-02 Németh László <nemeth at numbertext dot org>: - * escape spaces in paths of ODF files - -2014-05-28 Németh László <nemeth at numbertext dot org>: - * add long path/Unicode path support in WIN32 environment: - - hunspell#233 (reported by mahak gark) and LibreOffice fdo#48017 - * flat ODF support, eg.: - hunspell doc.fodt - cat doc.fodt | hunspell -l -O - * new options: - - -X (XML) input format - - -O (ODF or flat ODF) input format - - --check-apostrophe: check and force Unicode apostrophe usage - (ASCII or Unicode apostrophe has to be in the - WORDCHARS section of the affix file) - * fix ODF support: - - break 1-line XML of ODT documents at </style:style>, too, - not only at </text:p> (limiting tokenization problems, when - fgets stops within an XML tag) - - show ODF file path on the UI instead of the temporary file - * fix XML support: - - ', ", &, < and > in replacements converted to XML entities - - recognize &apos at tokenization, depending from WORDCHARS - - ' in tokens converted to ' before spell checking and - in the output of the pipe interface - * better apostrophe usage: - - WORDCHARS only with one of the Unicode or ASCII apostrophe - results extended word tokenization: both of them will be part of - the words (if they are inside: 
eg. word's, but not words'). - - convert Unicode apostrophes to ASCII ones for 8-bit dictionaries - (eg. English dictionaries), or for UTF-8 dictionaries only - with ASCII apostrophe supports (eg. French dictionaries). - * updated manual: - - hunspell.4 renamed to hunspell.5, see - hunspell#241 reported by Cristopher Yeleighton - - updated translations - - note about long/Unicode paths in WIN32 (hunspell.3) - -2014-04-25 Németh László <nemeth at numbertext dot org>: - * OpenDocument support, eg. - hunspell *.odt - hunspell -l *.odt - * always load default personal dictionary (fix - filtering bad words - reduce this word list - using - it as a personal dictionary workflow) - * fix parsing/URL recognition problem (bad tokens - with aposthrophes) - -2013-07-25 pchang9@cs.wisc.edu - * moz#897255 Wasted work in line_uniq - * moz#897780 Wasted work in SuggestMgr::twowords - -2013-07-25 Caolán McNamara <caolanm at LibO>: - * hunspell#167 layout problems with long lines - - based on the original fix by xorho - adapted to HEAD - * rhbz#925562 upgrade config.guess for aarch64 - -2013-07-24 pchang9@cs.wisc.edu - * moz#896301 Wasted work in SfxEntry::checkword - * moz#896844 Wasted work in AffixMgr::defcpd_check - -2013-06-13 Konstantin Khlebniko - * #49 HashMgr::add_word computes wrong size for struct hentry - -2013-06-13 Ville Skyttä - * #53 Man page syntax fixes - -2013-04-19 John Thomson <john thomson at SIL> - * win_api: add remove() of Hunspell API (hun#3606435) - -2013-04-19 Rouslan Solomokhin <at sf.net> - * fix crash in suggestions for 99-character long words - by extending arrays of SuggestMgr::forgotchar_* - (hun#3595024, also http://crbug.com/130128), - thanks to also Paweł Hajdan to report the patch - -2013-04-01 Caolán McNamara <caolanm at LibO>: - * hunspell: -Werror=undef - -2013-03-13 Caolán McNamara <caolanm at LibO>: - * rhbz#918938 crash in interaction with danish thesaurus - -2012-09-18 Németh László <nemeth at numbertext dot org>: - * 
src/hunspell/affixmgr.*: - fix morphological analysis of - compound words (hun#3544994, reported by Dávid Nemeskey, fdo#55045) - -2012-06-29 Caolán McNamara <caolanm at LibO>: - * fix various coverity warnings - -2012-01-10 Ehsan Akhgari <ehsan at mozilla dot com> - * moz#710940 Firefox Crash [@ AffixMgr::parse_file(char const*, char - const*) ] - -2011-12-16 Jared Wein <jwein at mozilla dot com> - * moz#710967 Incorrect argument passed to strncmp in - AffixMgr::parse_convtable - -2011-12-06 Caolán McNamara <caolanm at LibO>: - * rhbz#759647 fixed tempname of hunSPELL.bak collides with other users - when multiple edits in one dir - -2011-10-13 Caolán McNamara <caolanm at LibO>: - * moz#694002 crash in hunspell affixmgr on exit with bad .aff - * leak in hunspell affixmgr with bad .aff - -2011-09-19 Caolán McNamara <caolanm at LibO>: - * make libparsers.a not installed thanks to Tomáš Chvátal - -2011-06-23 Caolán McNamara <caolanm at LibO>: - * fix some windows compiler warnings - -2011-05-24 Németh László <nemeth at numbertext dot org>: - * src/hunspell/affixmgr.*: allow twofold suffixes in compounds - by extended version of Arno Teigseth's patch, see hun#3288562. - - new option for this feature: COMPOUNDMORESUFFIXES - -2011-02-16 Németh László <nemeth at numbertext dot org>: - * src/*/Makefile.am: fix library versioning, the probem reported by - Rene Engerhald and Simon Brouwer. 
- - * man/hunspell.4: new version based on the revised version of Ruud Baars - -2011-02-02 Németh László <nemeth at OOo>: - * suggestngr.cxx: fix ngram PHONE suggestion for input words with - diacritics using UTF-8 encoded dictionaries (add byte length to the - 8-bit phonet() argument instead of character length) - - * suggestmgr.cxx: fix missing csconv problem with UTF-8 encoding - dictionares, when the input contains non-BMP characters - - tests/utf8_nonbmp.sug: test file - - * suggestmgr.cxx: mixed and keyboard based character suggestions - don't forbid ngram suggestion search (optimized tests/suggestiontest) - - * affixmgr.cxx: fix hun#2999225: interfering compounding mechanisms, - tested on Dutch word list and reported by Ruud Baars - - * affixmgr.cxx: allomorph fix for hun#2970240 (Hungarian - compound "vadász+gép" was analyzed as vad+ász+gép, and rejected - by the ss->s rep rule (verb "vadássz"), but the analysis - didn't continue for the longer word parts (vadász+gép). - - * csutil.cxx: add lang code "az_AZ", "hu_HU", "tr_TR" for back - compatibility (fixing Azeri and Turkish casing conversion, also - Hungarian compound handling) - - * affixmgr.cxx: fix morphological analysis - -2011-01-26 Németh László <nemeth at OOo>: - * affixmgr.cxx: fix for moz#626195 (memcheck problem with FULLSTRIP). - - * affixmgr.*, suggestmgr.cxx: FORBIDWARN parameter (see manual) - -2011-01-24 Németh László <nemeth at OOo>: - * suffixmgr.cxx: fix bad suggestion of forbidden compound words, eg. - "termijndoel" with the Dutch dictionary. Reported by Ruud Baars. - - * latexparser.cxx: fix double apostrophe TeX quoation mark tokenization - (hun#3119776), reported by Wybodekker at SF.net. 
- - * tests/suggestiontest/*: multilanguage and single Hunspell version, see README - * tests/suggestiontest/prepare2: for make -f Makefile.orig single - -2011-01-22 Németh László <nemeth at OOo>: - * affixmgr.*, suggestmgr.*: new features - ONLYMAXDIFF: remove all bad ngram suggestions (default mode keeps one) - NONGRAMSUGGEST: similar to NOSUGGEST, but it forbids to use the word - in ngram based (more, than 1-character distance) suggestions. - -2011-01-21 Németh László <nemeth at OOo>: - * suggestmgr.*: limit wild suggestions (hun#2970237 by Ruud Baars) - - limited compound word suggestions - - improved and limited ngram based suggestions - * tests/*.sug: modified test files - - feature MAXCPDSUGS: - MAXCPDSUGS 0 : no compound suggestion, suggested by - Finn Gruwier Larsen in hunfeat#2836033 - MAXCPDSUGS n : max. ~n compound suggestions - - feature MAXDIFF: differency limit for ngram suggestions: 0-10 - eg. MAXDIFF 5: normal (default) limit - MAXDIFF 0: only one ngram suggestion - MAXDIFF 10: ~maxngramsugs ngram suggestions - - * affixmgr.*, hunspell.*: add flag FORCEUCASE (hun#2999228), force - capitalization of compound words, see Hunspell 4 manual), - suggested by Ruud Baars - test/forceucase.*: test files - - * affixmgr.*, hunspell.*: add flag WARN (hun#1808861), optional warning feature - for rare words, suggested by Ruud Baars - tests/warn: test files - * tools/hunspell.cxx: add option -r for optional filtering of rare words - - * affixmgr.cxx: fix hun#3161359 (gcc warnings) reported by Ryan VanderMeulen. - -2011-01-17 Németh László <nemeth at OOo>: - * suggestmgr.cxx: fix hun#3158994 and hun#3159027 (missing csconv table - using awkward 8bit capitalization of UTF-8 encoded dictionary words with PHONE - suggestion, reported by benjarobin and dicollecte at SF.net). - -2011-01-13 Németh László <nemeth at OOo>: - * affixmgr.cxx: ONLYINCOMPOUND fix for hun#2999224 (fogemorphene - was allowed in end position of compoundings). Reported by Ruud Baars. 
- * tests/onlyincompound2.*: test files - -2011-01-10 Ingo H. de Boer <idb_winshell at SF.net>: - * win_api/{hunspell,libhunspell, testparser}.vcproj: updated project - files for the library and the executables. Compiling problem - also reported by Don Walker. - -2011-01-06 Németh László <nemeth at OOo>: - * affixmgr.cxx: fix freedesktop#32850 (program halt during Hungarian - spell checking of the word "6csillagocska6", reported by András Tímár) - - * tools/hunspell.cxx: add Mac OS X Hunspell dictionary paths, asked by - Vidar Gundersen in hunfeat#3142010 - -2011-01-05 Caolán McNamara <cmc at OOo>: - * moz#620626 NS_UNICHARUTIL_CID doesn't support - case conversion - -2011-01-03 Németh László <nemeth at OOo>: - * NEWS and THANKS: update for release 1.2.13 - -2010-12-20 Németh László <nemeth at OOo>: - * affixmgr.cxx: hun#3140784 - -2010-12-16 Németh László <nemeth at OOo>: - * affixmgr.cxx: - - improved fix of hun#2970242 (supporting - zero affixes, reported by Ruud Baars - - tests/opentaal_cpdpat{,2}: test files - - - switching off default BREAK parameters by BREAK 0, - reported by Ruud Baars - - - hun#2999225: interfering compounding mechanisms, reported by Ruud Baars - -2010-12-11 Németh László <nemeth at OOo>: - * affixmgr.cxx: fix hun#2970242 (CHECKCOMPOUNDPATTERN only with flags), - the bug reported by Ruud Baars - * tests/2970242.*: test files - - * tests/2970240.*: test files for CHECKCOMPOUNDPATTERN fix (check all - boundaries in compound words, fixed by the previous CHECKCOMPOUNDREP - fix), the bug reported by Ruud Baars - - * win_api/Makefile.cygwin: update - -2010-12-09 Caolán McNamara <cmc at OOo>: - * moz#617953 fix leak - -2010-11-08 Caolán McNamara <cmc at OOo>: - * rhbz#650503 crash in arabic dictionary - -2010-11-05 Caolán McNamara <cmc at OOo>: - * rhbz#648740 don't warn on empty flagvector - -2010-11-03 Caolán McNamara <cmc at OOo>: - * logically we shouldn't need a csconv table in utf-8 mode - -2010-10-27 Németh László <nemeth at OOo>: - * 
hun#3000055 (requested by Ruud Baars) add REP boundary specifiation:
-      REP ^word$ xxxx
-      REP ^wordstarting xxxx
-      REP wordending$ xxxx
-
-  * hun#3008434 (requested by Adrián Chaves Fernández) and
-    hun#3018929 (requested by Ruud Baars): REP with more than 2 words:
-      REP morethantwo more_than_two
-
-  * suggestmgr.cxx: fix incomplete suggestion list for capitalized words,
-    eg. missing Machtstrijd->Machtsstrijd in the Dutch dictionary
-    (reported by Ruud Bars)
-
-  * tests, man: related updates
-
-2010-10-12 Caolán McNamara <cmc at OOo>:
-  * moz#603311 HashMgr::load_tables leaks dict when decode_flags fails
-  * fix mem leak found with new tests
-  * hun#3084340 allow underscores in html entity names
-
-2010-10-07 Németh László <nemeth at OOo>:
-  * affixmgr.cxx:
-    - hun#2970239 fix bad suggestion of forbidden compound words
-    - hun#2999224 fix keepcase feature on compound words (only partial
-      fix for COMPOUNDRULE based compounding)
-    - fix checkcompoundrep feature in compound words (check all boundaries,
-      not only the last one)
-    Problems reported by Ruud Baars.
-
-  * tests/opentaal_forbiddenword[12]*, tests/opentaal_keepcase*:
-    new test files for the previous fixes
-  * tests/checkcompoundrep: extended test file.
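Note on the 2010-10-27 entry above: the REP boundary markers (`^` for word start, `$` for word end) and the multi-word replacement can be illustrated with a small Python sketch. This is not Hunspell's implementation; the helper names `parse_rep` and `rep_candidates` and the sample rules are hypothetical, and reading `_` in the replacement as a space is an assumption based on the `more_than_two` example.

```python
def parse_rep(line):
    """Split a 'REP pattern replacement' line into (pattern, replacement, at_start, at_end)."""
    keyword, pattern, replacement = line.split()
    if keyword != "REP":
        raise ValueError("not a REP line: " + line)
    at_start = pattern.startswith("^")   # '^' anchors the pattern to the word start
    at_end = pattern.endswith("$")       # '$' anchors the pattern to the word end
    if at_start:
        pattern = pattern[1:]
    if at_end:
        pattern = pattern[:-1]
    if not pattern:
        raise ValueError("empty REP pattern: " + line)
    # Assumption: an underscore in the replacement stands for a space.
    return pattern, replacement.replace("_", " "), at_start, at_end


def rep_candidates(word, rules):
    """Return candidate corrections of `word`, one replacement per candidate."""
    out = []
    for pattern, replacement, at_start, at_end in rules:
        start = word.find(pattern)
        while start != -1:
            end = start + len(pattern)
            # Anchored rules only fire at the matching word boundary.
            if (not at_start or start == 0) and (not at_end or end == len(word)):
                candidate = word[:start] + replacement + word[end:]
                if candidate not in out:
                    out.append(candidate)
            start = word.find(pattern, start + 1)
    return out


# Hypothetical rules, written in the syntax shown in the entry.
rules = [parse_rep(l) for l in ("REP ^alot$ a_lot", "REP ^f ph", "REP f$ gh")]
print(rep_candidates("fof", rules))
```

In a real checker each candidate would still be looked up in the dictionary before being offered as a suggestion; the sketch stops at candidate generation.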
- -2010-09-05 Caolán McNamara <cmc at OOo>: - * moz#583582 fix double buffer gcc fortify issue - -2010-08-13 Caolán McNamara <cmc at OOo>: - * moz#586671 AffixMgr::parse_convtable leaks pattern/pattern2 if it - can't create both - * moz#586686 tidy up get_xml_list and friends - -2010-08-10 Caolán McNamara <cmc at OOo>: - * hun#3022860 fix remove duplicate code - -2010-07-17 Caolán McNamara <cmc at OOo>: - * remove ununsed get_default_enc and avoid potential misrecognition of - three letter language ids - * normalize encoding names before lookup - -2010-07-05 Caolán McNamara <cmc at OOo>: - * hun#2286060 add Hangul syllables to unicode tables - -2010-06-26 Caolán McNamara <cmc at OOo>: - * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz - case - -2010-06-13 Caolán McNamara <cmc at OOo>: - * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz - case - -2010-06-02 Caolán McNamara <cmc at OOo>: - * moz#569611 compile cleanly under win64 - -2010-05-22 Caolán McNamara <cmc at OOo>: - * moz#525581 apply mozilla's current preferred get_current_cs impl - -2010-05-17 Németh László <nemeth at OOo>: - * affixmgr.cxx: fix bad limitation of parenthesized flags at - COMPOUNDRULEs. Windows crash reported by Ruud Baars and Simon Brouwer. 
- -2010-05-05 Caolán McNamara <cmc at OOo>: - * rhbz#589326 malloc of int that should have been of char** - * hun#2997388 fix ironic misspellings - -2010-04-28 Caolán McNamara <cmc at OOo>: - * moz#550942 get_xml_list doesn't handle failure from get_xml_par - -2010-04-27 Caolán McNamara <cmc at OOo>: - * moz#465612 mozilla-specific code leaks - * moz#430900 phone is dereferenced before oom check - * moz#418348 ckey_utf alloc is used unchecked in SuggestMgr::badcharkey_utf - * CID#1487 pointer "rl" dereferenced before NULL check - * CID#1464 Returned without freeing storage "ptr" - * CID#1459 Avoid duplicate strchr - * CID#1443 Avoid any chance of dereferencing *slst - * CID#1442 Unsafe to have a null morph - * CID#1440 Avoid null filenames - * CID#1302 Dereferencing NULL value "apostrophe" - * CID#1441 Avoid deferencing null ppfx - -2010-04-16 Caolán McNamara <cmc at OOo>: - * hun#2344123 fix U)ncap in utf-8 locale - * fix up hunspell text UI and lines wider than terminal - -2010-04-15 Caolán McNamara <cmc at OOo>: - * hun#2613701 fix small leak in FileMgr::FileMgr - * fix small leak in tools/hunspell - * hun#2871300 avoid crash if def and words are NULL - * hun#2904479 fix length of hzip file - * hun#2986756 mingw build fix - * hun#2986756 fix double-free - * hun#2059896 fix crash in interactive mode without nls - * hun#2917914 add some extra words to the latexparser - * make some structs static - * C-api has duped symbol names - * regenerate gettext/intl with recent version - * hun#2796772 build a .dll under MinGW - * rhbz#502387 allow cross-compiling for MinGW target - * hun#2467643 update .vcproj files to include replist.?xx - * unify visiblity/dll_export support across platforms - * hun#2831289 sizeof(short) typo - * hun#2986756 add -u3 gcc style output - -2010-04-14 Caolán McNamara <cmc at OOo>: - * hun#2813804 fix segfault on hu_HU stemming - -2010-04-13 Caolán McNamara <cmc at OOo>: - * hun#2806689 fix ironic misspellings - * hun#2836240 add Italian 
translations - -2010-04-09 Caolán McNamara <cmc at OOo>: - * fix titchy possible leak in command-line spellchecker - -2010-04-07 Caolán McNamara <cmc at OOo>: - * hun#2973827 apply win64 patch - * hun#2005643 fix broken mystrdup - -2010-03-04 Caolán McNamara <cmc at OOo>: - * ooo#107768 fix crash in long strings in spellml mode - * hun#1999737 add some malloc checks - * hun#1999769 drop old buffer on realloc failure - * hun#2005643 tidy string functions - * hun#2005643 micro-opt - * hun#2006077 free strings on failed dict parse - * hun#2110783 ispell-alike verbose mode implementation - -2010-03-03 Németh László <nemeth at OOo>: - * hunspell/(affixmgr, suggestmgr).cxx: add character sequence - support for MAP suggestion, using parenthesized character groups - in the syntax, eg. MAP ß(ss). - * man/hunspell.4, tests/map*: documentation and test files - -2010-02-25 Németh László <nemeth at OOo>: - * hunspell/hunspell.cxx: add recursion limit for BREAK (fix OOo Issue 106267) - - * hunspell/hunspell.cxx: fix crash in morphological analysis of - capitalized words with ending dashes - - * affixmgr.cxx: fix morphological analysis of long numbers combined with dash, - eg. 45-00000045 (reported by a@freeblog.hu). 
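Note on the 2010-03-03 entry above: the parenthesized MAP syntax (`MAP ß(ss)`) treats a character sequence as one unit of a group of related characters. A minimal Python sketch of the idea follows. The helper names `parse_map` and `map_candidates` are hypothetical, the sketch swaps only one unit per candidate, and it ignores the count line (`MAP n`) that an affix file is assumed to carry before the definitions. It is an illustration, not the code in affixmgr.cxx or suggestmgr.cxx.

```python
import re


def parse_map(line):
    """Split 'MAP ß(ss)' into units: single characters or parenthesized sequences."""
    keyword, spec = line.split(None, 1)
    if keyword != "MAP":
        raise ValueError("not a MAP line: " + line)
    # Group 1 catches "(seq)", group 2 catches any other single character.
    return [m.group(1) or m.group(2)
            for m in re.finditer(r"\(([^)]+)\)|(\S)", spec.strip())]


def map_candidates(word, groups):
    """Return variants of `word` with one unit swapped for a related unit."""
    out = []
    for units in groups:
        for src in units:
            start = word.find(src)
            while start != -1:
                for dst in units:
                    if dst != src:
                        candidate = word[:start] + dst + word[start + len(src):]
                        if candidate not in out:
                            out.append(candidate)
                start = word.find(src, start + 1)
    return out


groups = [parse_map("MAP ß(ss)")]
print(map_candidates("Strasse", groups))
```

As with other suggestion mechanisms, a candidate would only be shown to the user if the dictionary accepts it.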
-
-2010-02-23 Caolán McNamara <cmc at OOo>:
-  * hun#2314461 improve ispell-alike mode
-  * hun#2784983 improve default language detection
-  * hun#2812045 fix some compiler warnings
-  * hun#2910695 survive missing HOME dir
-  * hun#2934195 fix suggestmgr crash
-  * hun#2921129 remove unused variables
-  * hun#2826164 make sure make check uses the in-tree libhunspell
-  * bump toolchain to support --disable-rpath
-  * hun#2843984 fix coverity warning
-  * hun#2843986 fix coverity warning
-  * hun#2077630 add iconv lib
-  * make gcc strict-aliasing warning free
-  * make cppcheck warning free
-
-2008-11-01 Németh László <nemeth at OOo>:
-  * replist.*, hunspell.cxx, affixmgr.cxx: new input and output
-    conversion support, see ICONV and OCONV keywords in the Hunspell(4)
-    manual page and the test examples. The input/output conversion
-    problem of syllabic languages reported by Daniel Yacob and
-    Shewangizaw Gulilat.
-    - tests/{iconv,oconv}.*: test examples
-
-  * tools/wordforms: word generation script for dictionary developers
-    (Hunspell version of the unmunch program)
-
-  * hunspell/hunspell.cxx: extended BREAK feature: ^ and $ mean in break
-    patterns the beginning and end of the word.
-    - tests/BREAK.*: modified examples.
-
-  * hunspell/hunspell.cxx: set default break at hyphen characters.
-    The associated problem reported by S Page in Hunspell Bug 2174061.
-    See Mozilla Bug ID 355178 and OOo Issue 64400, too.
-    - tests/breakdefault.*: test data
-    The following definition is equivalent of the default word break:
-
-    BREAK 3
-    BREAK -
-    BREAK ^-
-    BREAK -$
-
-  * affixmgr.cxx: SIMPLIFIEDTRIPLE is a new affix file keyword to allow
-    simplified forms of the compound words with triple repeating letters.
-    It is useful for Swedish and Norwegian languages.
-
-  * affixmgr.cxx: extend CHECKCOMPOUNDPATTERN to support
-    alternations of compound words for example by sandhi
-    feature of Indian and other languages.
The problem reported - by Kiran Chittella associated with Telugu writing system - (see Telugu example in tests/checkcompoundpattern4.test). - The new optional field of CHECKCOMPOUNDPATTERN definition is the - replacement of the compound boundary defined by the previous fields: - CHECKCOMPOUNDPATTERN ff f ff - means ff|f compound boundary has been replaced by "ff", like in - the (prereform) German Schiffahrt (Schiff+fahrt). - - CHECKCOMPOUNDPATTERN supports also optional flag conditions now: - CHECKCOMPOUNDPATTERN ff/A f/B ff - means that the first word of the compound needs flag "A" and - the second word of the compound needs flag "B" to the operation. - - * tools/hunspell.cxx: add empty lines as separators to the output of - the stemming and morphological analysis. - - * affixmgr.cxx: fix condition checking algorithm. Bad suggestion - generation reported by Mehmet Akin in SF.net Bug 2124186 with help of - Eleonora Goldman. - - * affixmgr,cxx: fix COMPOUNDWORDMAX feature. The problem and its - code details reported by Göran Andersson under SF.net Bug ID 2138001. - - * csutil.cxx: fix bad conditional code for Mozilla compilation. - Patch by Serge Gautherie. The problem reported by Ryan VanderMeulen. - - * hunspell/hunspell.cxx: add missing ngram suggestion for HUHINITCAP - (capitalized mixed case) words. - - * w_char.hxx: use GCC conditions for GCC related code. Patch by - Ryan VanderMeulen. - - * affixmgr.cxx: check morphological description in morphgen() - (fix potential program fault by incomplete morphological - description of affix rules) - - * src/win_api: config.h: switch on warning messages on Windows - - * tools/affixcompress: extended help for -h (use LC_ALL=C sort - for input word list) - - * man/hunspell.4: updated manual: - - new and modified features (SIMPLIFIEDTRIPLE, ICONV, OCONV, - BREAK, CHECKCOMPOUNDPATTERN). - - note about costs of zero affixes, suggested by Olivier Ronez. - - * hunspell/hunspell.cxx: remove deprecated word breaking codes. 
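Note on the 2008-11-01 entry above: the default BREAK definition (`-`, `^-`, `-$`) can be read as "a hyphen may separate two acceptable words, or may be stripped from the start or end of a word". The Python sketch below illustrates that reading only. The function name `accepts`, the `known` word set, and the `max_depth` recursion cap are hypothetical, and the real logic in hunspell/hunspell.cxx may differ in detail.

```python
DEFAULT_BREAKS = ("-", "^-", "-$")


def accepts(word, known, patterns=DEFAULT_BREAKS, depth=0, max_depth=10):
    """Accept `word` if it is known, or if breaking it yields only known parts."""
    if word in known:
        return True
    if not word or depth >= max_depth:
        return False
    for pat in patterns:
        if pat.startswith("^"):
            # '^' pattern: strip the break string from the start of the word.
            core = pat[1:]
            if core and word.startswith(core) and accepts(
                    word[len(core):], known, patterns, depth + 1, max_depth):
                return True
        elif pat.endswith("$"):
            # '$' pattern: strip the break string from the end of the word.
            core = pat[:-1]
            if core and word.endswith(core) and accepts(
                    word[:-len(core)], known, patterns, depth + 1, max_depth):
                return True
        else:
            # Plain pattern: split inside the word and check both sides.
            pos = word.find(pat, 1)
            while pos != -1 and pos + len(pat) < len(word):
                left, right = word[:pos], word[pos + len(pat):]
                if (accepts(left, known, patterns, depth + 1, max_depth)
                        and accepts(right, known, patterns, depth + 1, max_depth)):
                    return True
                pos = word.find(pat, pos + 1)
    return False


known = {"foo", "bar"}  # stand-in for a dictionary lookup
print(accepts("foo-bar", known), accepts("foo-baz", known))
```

The depth cap is there because recursive breaking can otherwise blow up on inputs with many break characters.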
- -2008-08-15 Németh László <nemeth at OOo>: - * affentry.cxx: add FULLSTRIP option. With FULLSTRIP, affix rules can - strip full words, not only one less characters. Suggested by - Davide Prina and other developers in OOo Issue 80145. - * tests/fullstrip.*: Test data based on Davide Prina's example. - * tools/unmunch.cxx: modified for FULLSTRIP. - - * affixmgr.cxx: COMPOUNDRULE now works with long and numerical flag - types by parenthesized flags. Syntax: (flag)*, (flag)(flag)?(flag)*. - * tests/compoundrule[78].*: tests with parenthesized COMPOUNDRULE - definitions. - - * suggestmgr.cxx: modified badchar*(), forgotchar*() and extrachar*() - 1-character distance suggestion algorithms: search a TRY character - in all position instead of all TRY characters in a character position - (it can give more readable suggestion order, also better suggestions - in the first positions, when TRY characters are sorted by frequency.) - For example, suggestions for "moze": - ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6), - maze, more, mote, ooze, mole etc. (Hunspell 1.2.7). - - * suggestmgr.cxx: extended compound word checking for better COMPOUNDRULE - related suggestions, for example English ordinal numbers: 121323th -> - 121323rd (it needs also a th->rd REP definition). - - * phonet.cxx: cast unsigned char parameter of isdigit() and fix - isalpha by myisalpha() (potential problems in Windows environment). - Reported by Thomas Lange in OOo Issue 92736. - - * hunspell/csutil.*,hunspell/{affentry,affixmgr,hunspell,suggestmgr}.cxx: - fix potential buffer overloading under morphological analysis by the - new mystrcat() function. Reported by Molnár Andor (dolhpy at true - dot hu) in SF.net Bug 2026203. - - * affixmgr.cxx: add recursion limit to defcpd(). Fix OOo Issue 76067: - crash-like deceleration by checking hexadecimal numbers with long FFF - sequence (combinatory explosion by the en_US words "f" and "ff"). - Missing fix reported by Mathias Bauer. 
- - * affixmgr.cxx: fix the difference in the Unicode and non-Unicode - parts of cpdcase_check(). Bug report by Brett Wilson. - - * filemgr.*, affixmgr.cxx, csutil.*, hashmgr.*: warning messages now - contain line numbers (use --with-warnings configure option for - warning messages). - - * hunspell.cxx: analyze(): fix case conversion of stemming and - morphological analysis of UTF-8 encoded input. Reported by Ferenc Godó. - - * tools/hunspell.cxx: fix LaTeX Unicode support in filter mode. - Reported by Jan Seeger in SF.net Bug 2039990. - - * affixmgr.hxx: 0.5 or in 64 bit environment, 1 MB (virtual) memory - saving using only the requested size for sFlag and pFlag arrays. - Bug report by Brett Wilson. - - * affixmgr.cxx,tools/hunspell.cxx: get_version() returns with full - VERSION affix parameter instead of its first word. Fixes for - Hunspell's header. Some problems with Hunspell header reported in - SF.net Bug 2043080. - -2008-07-15 Németh László <nemeth at OOo>: - * affentry.cxx: fixes of the affix rule matching algorithm (affected - only the sk_SK dictionary from all OpenOffice.org dictionaries): - - fix dot pattern + accented letters matching (in non Unicode encoding) - - word-length conditions work again - * tests/condition.*: extended test for the fix. - - * hashmgr.cxx: load multiword expressions: spaces may be parts - of the dictionary words again (but spaces also work as morphological - field separators: word word2 -> "word word2", word po:noun -> "word"). - * man/hunspell.4: updated manual - - * tools/hunspell.cxx: add iconv character conversion support to - stemming and morphological analysis - - * tools/hunspell.cxx: add /usr/share/myspell/dicts search path for - Ubuntu support - -2008-07-09 Németh László <nemeth at OOo>: - * affentry.cxx: fixes of the affix rule matching algorithm: - - right ASCII character handling in bracket expression; - - fault-tolerant nextchar() for bad rules. 
- Problem with the en_GB dictionary and nextchar() with a detailed - code analysis reported by John Winters in SF.net Bug ID 2012753. - * tests/condition.*: extended test for the fix. - - * hunspell/hunspell.*, parsers/*, tools/hunspell.cxx: fix compiler - warnings (deprecated const-free char consts) - - * win_api/hunspelldll.*: add hunspell_free_list(), the problem - reported by Laurier Mercer. - -2008-06-30 Török László <torok_laszlo at users dot SF dot net>: - * tests/affixmgr.cxx: fix morphological analysis: strcat() on - an uninitialized char array in suffix_check_morph(). - -2008-06-18 Németh László <nemeth at OOo>: - * src/hunspell/affixmgr.cxx: fix GCC compiler warnings - (comparisons with string literal results in unspecified behaviour). - The problem reported by Ladislav Michnovič. - -2008-06-17 Németh László <nemeth at OOo>: - * src/hunspell/{hunspell.cxx,hunspell.h}: add free_list() to the C and - C++ interface to deallocate suggestion lists. The problem - reported by Laurie Mercer and Christophe Paris. - * csutil.cxx: fix freelist() to deallocate non-NULL list, when n = 0. - * tools/{analyze,example,chmorph,hunspell}.cxx: use free_list(). - - * tools/hunspell.cxx: fix only --with-readline compiling problem. - Reported by Volkov Peter in SF.net Bug 1995842. - - * man/hunspell.3,hunspell.hxx: fix analyze and generate examples in - the manual and comments (using char*** parameter instead of char**). - - * tools/example.cxx: fix suggestion example. - -2008-06-17 Németh László <nemeth at OOo>: - * affentry.cxx: fix the new affix rule matching algorithm of - Hunspell 1.2. Arabic dictionary problem reported by Khaled Hosny - in SF.net Bug ID 1975530. Mohamed Kebdani also sent a - prepared test data. - * tests/{1975530,condition*}: tests for the fix - -2008-06-13 Ingo H. de Boer <idb_winshell at SF.net>: - * src/hunspell/{affixmgr.cxx,hunspell.cxx}: add missing type - cast to strstr() calls for VC8 compatibility. 
- -2008-06-13 Németh László <nemeth at OOo>: - * suggestmgr.cxx: add also part1-part2 suggestion with dash - for bad part1part2 word forms, suggested by Ruud Baars. - For example, now suggestion of "parttime": "part time" - and "part-time". - NOTE: this feature will work only when the TRY definition - contains "-" or the letter "a". - - * hunspell.cxx: new XML API in spell() and suggest() (see hunspell(3)). - - * src/hunspell/*: fixes for OpenOffice.org build environment. - - * man/{hunspell.3,hzip.1,hunzip.1}: add new manual pages for - Hunspell programming API and dictionary compression and - encryption utilities. - - * src/hunspell/*: handle failed mystrdup() calls and other potential - insufficient memory problems. The problem reported by Elio Voci - in OpenOffice.org Issue 90604 and others. - - * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars - without conditional code. Problem reported by Ingo H. de Boer - in SF.net Bug 1763105. - - * win_api/hunspelldll.h: put_word() renamed to add() in the (old) - Windows DLL API bug reported in SF.net Bug 1943236. Also reported - by Bartkó Zoltán. - - * tools/hunspell.cxx: fix chench() for environments without - native language support (ENABLE_NLS 0 in config.h), - PHP system_exec() bug reported by Michel Weimerskirch in - SF.net Bug 1951087. - - * hunspell.cxx, affixmgr.cxx: remove "result" from the - (result && *result) conditions, when "result" is a static variable. - The problem and a possible solution reported by Ladislav Michnovič. - - * affixmgr.cxx: parse_affix(): print line instead of NULL in - the warning message, when affix class header is bad. - The problem reported by Ladislav Michnovič. - -2008-06-01 Christian Lohmaier <cloph at OOo> - * configure.ac: patch to fix --with-readline, --with-ui logic. - Reported in the SF.net Bug 981395. - -2008-05-04: Volkov Peter <volkov_peter at users sourceforge net> - * configure.ac: fix LibTool 2.22 incompatibility by removing - unused LT_* macros. 
Report and patch in SF.net Bug 1957383. - The problem reported and fixed by Ladislav Michnovič, too. - -2008-04-23: Ladislav Michnovič <lmichnovic at suse cz> - * hunspell.pc.in: fix wrongly set directories. - -2008-04-12 Németh László <nemeth at OOo>: - * src/tools/hunspell.cxx: - - Multilingual spell checking and special dictionary support with -d. - Multilingual spell checking suggested by Khaled Hosny (SF.net - Bug 1834280). Example for the new syntax: - - -d en_US,en_geo,en_med,de_DE,de_med - - en_US and de_DE are base dictionaries, and en_geo, en_med, de_med - are special dictionaries (dictionaries without affix file). - Special dictionaries are optional extension of the base dictionaries. - There is no explicit naming convention for special dictionaries, - only the ".dic" extension: dictionaries without affix file will - be an extension of the preceding base dictionary. First dictionary - in -d parameter must have an affix file (it must be a base - dictionary). - - - new options for debugging, morphological analysis and stemming: - -m: morphological analysis or flag debug mode (without affix - rule data it signs the flag of the affix rules) - -s: stemming mode - -D: show also available dictionaries and search path - (suggested by Aaron Digulla in SF.net Bug 1902133) - - - add missing refresh() to print bad words before the slower suggestion - search in UI (better user experience) - - - fix tabulator problems (reported by ugli-kid-joe AT sf DOT net) - - - fix different encoding of dic and input, and suggestions - - - add per mille sign to LANG hu_HU section. - - - rewrite program messages. Concatenating multiple printfs for - easier translation suggested by András Tímár and Gábor Kelemen. - - * src/hunspell/csutil.cxx: set static encds variable. Patch by - Rene Engerhald. SF.net Bug 1896207 and 1939988. 
- - * src/hunspell/w_char.hxx,csutil.hxx: reorganizing - w_char typedef and HENTRY_DATA, HENTRY_FIND consts - - * src/hunspell/hunzip.cxx: fopen(): using rb options instead of r (fix - for Windows) - - * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars - in an #ifdef WINSHELL section. Problem reported by Ingo H. de Boer - in SF.net Bug 1763105. - - * src/tools/chmorph.cxx: remove the experimental modifications - - * src/tools/hzip.c: fopen(): using wb options instead of w (fix - for Windows) - - * src/tools/hunzip.cxx: add missing MOZILLA_CLIENT. Reported - by Ryan VanderMeulen. - - * man/*, man/hu/*: updated manual - - * man/hunspell.4: fix formatting problem (missing header) - - * tools/makealias: now works with the extra data fields. - - * phonet.cxx: use HASHSIZE const - - * tests/rep.aff: fix REP count - - * src/win_api/Makefile.cygwin, README: native Windows compilation - in Cygwin environment without cygwin1.dll dependency (see README - for compiling instructions). - -2008-04-08 Roland Smith <rsmith AT xs4all DOT nl>: - * src/parsers/latexparser.cxx: fix PATTERN_LEN for AMD64 and - other platforms with different struct padding (SF.net Bug 1937995). - -2008-04-03 Kelemen Gábor <kelemeng AT gnome DOT hu>: - * po/POTFILES.in: fix path of the source file - - * po/Makevars: add --from-code=UTF-8 gettext option - - * hunspell.cxx: add comments for shortkey translation - -2008-02-04 Flemming Frandsen <flfr AT stibo DOT com> - * src/hunspell.h: fix Windows DLL support - - this patch also reported by Zoltán Bartkó. - -2008-01-30 Mark McClain <marc_mcclain AT users DOT sf DOT net> - * src/hunspell.cxx: stem(): fix function call side effect - for PPC platform (SF.net Bug 1882105). - -2008-01-30 Németh László <nemeth at OOo>: - * hunspell.cxx, csutil.cxx, hunspelldll.c: fix - SF.net Bug 1851246, patch also by Ingo H. de Boer. - - * hunspell.h: fix SF.net Bug 1856572 (C prototype problem), - patch by Mark de Does.
- - * hunspell.pc.in: fix SF.net Bug 1857450 wrong prefix, reported - by Mark de Does. - - * hunspell.pc.in: reset numbering scheme: libhunspell-1.2. - Fix SF.net Bug 1857512 reported by Mark de Does, - also by Rene Engelhard. - - * csutil.cxx: patches for ARM platform, signed_chars.dpatch - by Rene Engelhard and arm_structure_alignment.dpatch by - Steinar H. Gunderson <sesse@debian.org> - - * hunzip.*, hzip.c: new hzip compression format - - * tools/affixcompressor: affix compressor utility (similar to - munch, but it generates affix table automatically), works - with million-words dictionaries of agglutinative languages. - - * README: fix problems reported by Pham Ngoc Khanh. - - * csutil.cxx, suggestmgr: Warning-free in OOo builds. - - * hashmgr.*, csutil.*: fix protected memory problems with - stored pointers on several not x86 platforms by - store_pointer(), get_stored_pointer(). - - * src/tools/hunspell.cxx: fix iconv support on Solaris platform. - - * tests/IJ.good: add missing test file - - * csutil.cxx: fix const char* related errors. Compiling bug - with Visual C++ reported by Ryan VanderMeulen and Ingo H. de Boer. - -2008-01-03 Caolan McNamara <cmc at OO.o>: - * csutil.cxx: SF.net Bug 1863239, notrailingcomma patch and - optimization of get_currect_cs(). - -2007-11-01 Németh László <nemeth at OOo>: - * hunspell/*: new feature: morphological generation, - also fix experimental morphological analysis and stemming. 
- - new API functions and improved API: - - analyze(word): (instead of morph()) morphological analysis - - stem(word): stemming - - stem(list): stemming based on the result of an analysis - - generate(word, word2): morphological generation - - generate(word, list): morphological generation - - add(word): add word to the run-time dictionary (renamed put_word()) - - add_with_affix(word, word2): (renamed put_word_pattern()): - add word to the run-time dictionary with affix flags of the - second parameter: all affixed forms of the user words will be - recognised by the spell checker. Especially useful for - agglutinative languages. - - remove(word): remove word from the run-time dictionary (not - implemented) - - see manual and hunspell/hunspell.hxx header and tests/morph.* - * tests/morph.*: test data, example for morphological analysis, - stemming and generation - - * tools/analyze, tools/chmorph: extended and new demo applications: - - analyze (originally hunmorph): analyses and stems input words, - generates word forms from input word pairs. - - chmorph: morphological transformation filter - - * configure.ac, hunspell/makefile.am: set library version number. - Bug reported by Rene Engelhard. - - * affentry.cxx, affixmgr.cxx: new pattern matching algorithm in - condition checking of affix rules instead of the Dömölki-algorithm: - - Unlimited condition length (instead of max. 8 characters). - - Less memory consumption, especially useful for affix rich languages: - 5,4 MB memory savings with hu_HU dictionary. - - Speed change depends from dictionaries and CPU caches: English spell - checking is 4% faster on Linux words with en_US dictionary, Hungarian - spell checking is 25% slower on most frequent words of Hungarian - Webcorpus. - - * tests/sug.*, sugutf.*: updated test data (use "a" and "lot" - dictionary items instead of "a lot".) - - * src/hunspell/hunspell.cxx: free(csconv) instead of delete csconv. - Report and patch by Sylvain Paschein in Mozilla Issue 398268. 
- - * suggestmgr.cxx, tools/hunspell.cxx: bad spelling of "misspelled". - Ubuntu Bug #134792, patch by Malcolm Parsons. - - * tests/base_utf.*: use Unicode apostrophe instead of 8-bit one. - - * hunspell.cxx, hashmgr.cxx: add(): use HashMgr::add() - -2007-10-25 Pavel Janík <pjanik at OOo>: - * hunspell/csutil.cxx: Fix type cast warnings on 64bit Linux in - printing of character positions in u8_u16(). OOo issue 82984. - -2007-09-05 Németh László <nemeth at OOo>: - * win_api/Hunspell.vproj, parsers/testparser.cxx,textparser.hxx: - warning fixes and removing unnecessary Windows project file. - Reported by Ingo H. de Boer. - - * hashmgr.*, {affixmgr,suggestmgr}.cxx: optimized data structure - for variable-count fields (only "ph" transliteration field in - this version, see next item). Also less memory consumption: - -13% (0.75 MB) with en_US dictionary, -6% (1 MB) with hu_HU. - - * suggestmgr.cxx: dictionary-based phonetic suggestion for special - or foreign pronunciation (see also rule-based PHONE in manual). - Usage: tab-separated field in dictionary lines, starting with "ph:". - The field contains a phonetic transliteration of the word: - -Marseille ph:maarsayl - * tests/phone.*: test data for dictionary-based and rule-based phonetic - suggestion. - - * hunspell.cxx: fix potential bad memory access in allcap word - capitalization in suggest() (bug of previous version). - - * hunspell.cxx, atypes.hxx: set correct limit for UTF-8 encoded - input words (256 bytes). - - * suggestmgr.cxx: improved REP suggestions with spaces: it works - without dictionary modification. - OOo issue 80147, reported by Davide Prina. - * tests/rep.*: new test data: higher priority for "alot" -> "a lot", - and Italian suggestion "un'alunno" -> "un alunno". - - * affixmgr.cxx: fix Unicode ngram suggestions in expand_rootword(). - (Suggestions with bad affixes.) - Bug reported by Vitaly Piryatinksy <piv dot v dot vitaly at gmail>. - * tests/ngram_utf_fix.*: test based on Vitaly Piryatinksy's data.
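The 2007-09-05 entry above says a dictionary line may carry a tab-separated field starting with "ph:" that holds a phonetic transliteration. The sketch below shows one way such a line could be split. It is an illustration only: `DicEntry` and `parse_dic_line` are hypothetical names, affix flags after a slash are left inside the word field, and this is not the parser used in hashmgr.cxx.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Result of splitting one dictionary line on tab characters.
struct DicEntry {
    std::string word;                   // first field, kept verbatim
    std::vector<std::string> phonetic;  // values of all non-empty "ph:" fields
};

// Split the line on tabs, keep the first field as the word and collect
// the text after "ph:" from every later field that starts with it.
// Other fields are ignored in this sketch.
DicEntry parse_dic_line(const std::string& line) {
    DicEntry entry;
    std::istringstream fields(line);
    std::string field;
    bool first = true;
    while (std::getline(fields, field, '\t')) {
        if (first) {
            entry.word = field;
            first = false;
        } else if (field.size() > 3 && field.compare(0, 3, "ph:") == 0) {
            entry.phonetic.push_back(field.substr(3));
        }
    }
    return entry;
}
```

For the line from the entry, `Marseille<TAB>ph:maarsayl`, the sketch yields the word `Marseille` with one transliteration, `maarsayl`.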
- - * suggestmgr.cxx: fix twowords() for last UTF-8 multibyte character. - (conditional jump or move depended on uninitialised value). - -2007-08-29 Ingo H. de Boer <idb_winshell at SF.net>: - * win_api/{hunspell,libhunspell,testparser}.vcproj: new project - files for the library and the executables. - - * Hunspell.rc, Hunspell.sln, config.h: updated versions. - Version number problem also reported by András Tímár. - -2007-08-27 Németh László <nemeth at OOo>: - * suggestmgr.hxx: put fixed version. Bug report by Ingo H. de Boer. - - * suggestmgr.cxx: remove variable-length local character array - reported by Ingo H. de Boer. - -2007-08-27 Németh László <nemeth at OOo>: - * suggestmgr.hxx: change bad time_t to clock_t in header, too. - Bug reports or patches by Ingo H. de Boer under SF.net - Bug ID 1781951, János Mohácsi and Gábor Zahemszky, András Tímár, - OMax3 at SF.net under SF.net Bug ID 1781592. - - * phonet.*: change variable-length local character array to - portable fixed size character array. Problem reported by - Ingo H. de Boer under SF.net Bug ID 1781951 and - Ryan VanderMeulen. - - * suggestmgr.cxx: remove debug message (also by - Ingo H. de Boer). - -2007-08-26 Ingo H. de Boer <idb_winshell at SF.net>: - * win_api/Hunspell.vcproj: updated version (with phonet.*) - -2007-08-23 Németh László <nemeth at OOo>: - * phonet.{c,h}xx, suggestmgr.cxx: PHONE parameter: - pronunciation-based suggestion using Björn Jacke's original Aspell - phonetic transcription algorithm (http://aspell.net), relicensed - under GPL/LGPL/MPL tri-license with the permission of the author. - Usage: see manual. - - * affixmgr,suggestmgr.cxx: add KEY parameter for keyboard and - input method error related suggestions. - Example: KEY qwertyuiop|asdfghjkl|zxcvbnm - - * man/hunspell.4: description of PHONE and KEY suggestion parameters.
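The KEY parameter in the 2007-08-23 entry above lists keyboard rows separated by "|". One plausible use of such a layout string is to replace each typed letter with its left and right neighbour on the same row. The sketch below illustrates that idea under stated assumptions: `keyboard_candidates` is a hypothetical name, the code handles single-byte characters only, and it is not copied from suggestmgr.cxx.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Generate candidate words by replacing one character at a time with a
// horizontally adjacent key. key_rows is the value of a KEY definition,
// for example "qwertyuiop|asdfghjkl|zxcvbnm"; '|' separates rows, so a
// key next to '|' has no neighbour on that side.
std::vector<std::string> keyboard_candidates(const std::string& word,
                                             const std::string& key_rows) {
    std::vector<std::string> candidates;
    for (std::size_t i = 0; i < word.size(); ++i) {
        const char typed = word[i];
        if (typed == '|') continue;  // the row separator is not a key
        // Visit every occurrence of the typed character in the layout.
        for (std::size_t pos = key_rows.find(typed);
             pos != std::string::npos;
             pos = key_rows.find(typed, pos + 1)) {
            if (pos > 0 && key_rows[pos - 1] != '|') {
                std::string candidate = word;
                candidate[i] = key_rows[pos - 1];  // left neighbour
                candidates.push_back(candidate);
            }
            if (pos + 1 < key_rows.size() && key_rows[pos + 1] != '|') {
                std::string candidate = word;
                candidate[i] = key_rows[pos + 1];  // right neighbour
                candidates.push_back(candidate);
            }
        }
    }
    return candidates;
}
```

A caller would then keep only the candidates that the spell checker accepts. With the example layout, the input `wat` produces `qat`, `eat`, `wst`, `war` and `way`.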
- - * suggestmgr.cxx: enhancements for better suggestions: - - Set ngram suggestions for badchar-type errors - and only two word and compound word suggestions, too. - - Separate non-compound and compound word - suggestions for MAP suggestion, too. - - Double swap suggestions for short words. - For example: ahev -> have, hwihc -> which. - - Better time limits using clock() instead of time() - (tenths of a second resolution instead of second ones). - - leftcommonsubstring() weight function. - - * htype.hxx, hashmgr.cxx: blen (byte length) and clen (character - length) fields instead of wlen - - * affixmgr.cxx: fix get_syllable() for bad Unicode inputs. - - * tests/suggestiontest/*: test environment for suggestions - -2007-08-07 Martijn Wargers: - * csutil.cxx: fix Mingw build error associated with ToUpper() call. - Report and patch in Mozilla Issue 391447. - -2007-08-07 Robert Longson: - * atypes.cxx: use empty inline function HUNSPELL_WARNING instead of - variadic macros to switch off Hunspell warnings. - Reported by Gavin Sharp in Mozilla Issue 391147. - -2007-08-05 Ginn Chen: - * hashmgr.cxx: Hunspell failed to compile on OpenSolaris (use stdio - instead of cstdio). Report and patch in Mozilla Issue 391040. - -2007-07-25 Németh László <nemeth at OOo>: - * parsers/*.cxx: Hunspell executable recognises and accepts URLs, - e-mail addresses, directory paths, reported by Jeppe Bundsgaard. - * src/tools/hunspell.cxx: --check-url: new option of Hunspell program. - Use --check-url if you want to check URLs, e-mail addresses and paths. - - * parsers/textparser.cxx: strip colon at end of words for Finnish - and Swedish (colon may be in words in Finnish and Swedish). - Problem reported by Lars Aronsson. - * tests/colons_in_words.*: test data - - * tests/digits_in_words.*: example for using digits in words - (e.g. 1-jährig, 112-jährig etc. in German), reported by Lars Aronsson.
- - * hashmgr.cxx: Hunspell accepts allcaps forms of mixed case - words of personal dictionaries (+allcaps custom dictionary words with - allcaps affixes). - Sf.net Bug ID 1755272, reported by Ellis Miller. - - * hashmgr.cxx: fix small memory leaks with alias compressed - dictionaries (free flag vectors of affixed personal dictionary words - and flag vectors of hidden capitalized forms of mixed case and - allcaps words). - - * affixmgr.cxx: fix COMPOUNDRULE checking with affixed compounds. - Sf.net Bug ID 1706659, reported by Björn Jacke. Also fixing for - OOo Issue 76067 (crash-like deceleration for hexadecimal numbers - with long FFFFFF sequence using en_US dictionary). - - * tools/hunspell.cxx: add missing return to save_privdic(). - - * man/hunspell.4: add information about affixation of personal words: - "Personal dictionaries are simple word lists, but with optional - word patterns for affixation, separated by a slash: - - foo - Foo/Simpson - - In this example, "foo" and "Foo" are personal words, plus Foo - will be recognised with affixes of Simpson (Foo's etc.)." - -2007-07-18 Németh László <nemeth at OOo>: - * src/win_api/: add missing resource files, reported by Ingo H. de Boer. - -2007-07-16 Németh László <nemeth at OOo>: - * hunspell.cxx: fix dot removing from UTF-8 encoded words in cleanword2() - (Capitalised words with dots, as "Something." were not recognised - using Unicode encoded dictionaries.) - * tests/{base.*,base_utf.*}: extended and new test files for - dot removing and Unicode support. - - * tools/hunspell.cxx: fix Cygwin, OS X compatibility using platform - specifics iconv() header by ICONV_CONST macro of Autoconf. - Sf.net Bug ID 1746030, reported by Mike Tian-Jian Jiang. - Sf.net Bug ID 1753939, reported by Jean-Christophe Helary. - - * tools/hunspell.cxx: fix missing global path setting with -d option. - - * tests/test.sh: fix broken Valgrind checking (missing warnings - with VALGRIND=memcheck make check). 
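The man/hunspell.4 note in the 2007-07-25 entry above describes personal dictionary lines of the form `foo` or `Foo/Simpson`, where the part after the slash names a dictionary word whose affixes the personal word borrows. A minimal sketch of splitting such a line follows. `PersonalEntry` and `parse_personal_line` are hypothetical names, and the sketch does not handle escaped slashes.

```cpp
#include <string>

// One line of a personal dictionary.
struct PersonalEntry {
    std::string word;   // the personal word itself
    std::string model;  // optional dictionary word whose affixes are borrowed
};

// Split "Foo/Simpson" into word "Foo" and model "Simpson".
// A line without a slash is a plain word with an empty model.
PersonalEntry parse_personal_line(const std::string& line) {
    PersonalEntry entry;
    const std::string::size_type slash = line.find('/');
    if (slash == std::string::npos) {
        entry.word = line;
    } else {
        entry.word = line.substr(0, slash);
        entry.model = line.substr(slash + 1);
    }
    return entry;
}
```

An entry with an empty model would correspond to the add(word) call listed in the 2007-11-01 entry, and an entry with a model to add_with_affix(word, word2). That mapping is an assumption drawn from the descriptions in this ChangeLog, not from the source code.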
- - * csutil.cxx: fix condition in u8_u16() to avoid invalid read - of not null-terminated character arrays (detected by Valgrind - in Hunspell executable: associated with 8-bit character table - conversion in tools/hunspell.cxx). - - * csutil.cxx: free_utf_tbl(): use utf_tbl_count-- instead of utf_tbl--. - Memory leak in Hunspell executable detected by Valgrind. - - * hashmgr.cxx: add missing free_utf_tbl(), memory leak in Hunspell - executable detected by Valgrind. - - * hashmgr.cxx: load_tables(): fix memory error in spec. capitalization. - Use sizeof(unsigned short) instead of bad sizeof(unsigned short*). - Invalid memory read detected by Valgrind. - - * hashmgr.cxx: add_word(): fix memory error in spec. capitalization. - Update also affix array length of capitalized homonyms. Invalid - memory read detected by Valgrind. - - * hunspell.cxx: suggest(): fix invalid memory write and leak. - Bad realloc() and missing free() detected by Valgrind associated - with suggestions for "something.The" type spelling errors. - - * {dictmgr,csutil,hashmgr,suggestmgr}.cxx: check memory allocation. - Sf.net Bug ID 1747507, based on the patch by Jose da Silva. - -2007-07-13 Ingo H. de Boer <idb_winshell at SF.net>: - * atypes.cxx: fix Visual C compatibility: Using - "HUNSPELL_WARNING(a,b,...} {}" macro instead of empty "X(a,b...)". - - * hunspell.cxx: changes for Windows API. - * win_api/Hunspell.*: new resource files - * win_api/hunspelldll.*: set optional Hunspell and Borland spec. codes - Sf.net Bug ID 1753802, patch by Ingo H. de Boer. - See also Sf.net Bug ID 1751406, patch by Mike Tian-Jian Jiang. - -2007-07-09 Caolan McNamara <cmc at OO.o>: - * {hunspell,hashmgr,affentry}.cxx: fix warnings of Coverity program - analyzer. Sf.net Bug ID, 1750219. - -2007-07-06 Németh László <nemeth at OOo>: - * atypes.cxx: warning-free swallowing of conditional warning messages - and their parameters using empty HUNSPELL_WARNING(a,b...) macro. 
- * {affixmgr,atypes,csutil}.cxx: fix unused variable warnings - using WARNVAR macro for conditionally named variables. - * hashmgr.cxx: fix unused variable warning in add_word() by cond. name - * hunspell.cxx: fix shadowed declaration of captype var. in suggest() - -2006-06-29 Caolan McNamara <cmc at OO.o>: - * hunspell.cxx: patch to fix possible memory leak in analyze() of - experimental morphological analyzer code. Sf.net Bug ID 1745263. - -2007-06-29 Németh László <nemeth at OOo>: -improvements: - * src/hunspell/hunspell.cxx: check bad capitalisation of Dutch letter IJ. - - Sf.net Feature Request ID 1640985, reported by Frank Fesevur. - - Solution: FORBIDDENWORD for capitalised word forms (need - an improved Dutch dictionary with forbidden words: Ijs/*, etc.). - * tests/IJ.*: test data and example. - - * hashmgr.cxx, hunspell.cxx: check capitalization of special word forms - - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG - Sf.net Bug ID 1398550, reported by Dmitri Gabinski. - - allcap words and suffixes: UNICEF's - UNICEF'S - - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA - For Catalan, French and Italian languages. - Reported by Davide Prina in OOo Issue 68568. - * tests/allcaps*: tests for OPENOFFICE.ORG, UNICEF'S capitalization. - * tests/i68568*: tests for SANT'ELIA capitalization. - - * hunspell/hunspell.cxx: suggestion for missing sentence spacing: - something.The -> something. The - - * tools/hunspell.cxx: multiple character encoding support - - -i option: custom input encoding - Sf.net Bug ID 1610866, reported by Thobias Schlemmer. - Sf.net Bug ID 1633413, reported by Dan Kenigsberg. - See also hunspell-1.1.5-encoding.patch of Fedora from Caolan Mc'Namara. - * tests/*.test: add input encodings - - * tools/hunspell.cxx: use locale data for default dictionary names. 
- Sf.net Bug ID 1731630, report and patch from Bernhard Rosenkraenzer, - See also hunspell-1.1.4-defaultdictfromlang.patch of Fedora Linux - from Caolan McNamara. - - * tools/hunspell.cxx: fix 8-bit tokenization (letters without - casing, like ß or Hebrew characters now are handled well) - - * tools/hunspell.cxx: dictionary search path - - DICPATH environmental variable - - -D option: show directory path of loaded dictionary - - automatic detection of OpenOffice.org directories - -fixes: - * affixmgr.cxx: fault-tolerant patch for REP and other affix - table data problems. Problem with Hunspell and en_GB dictionary - reported by Thomas Lange in OOo Issue 76098 and - Stephan Bergmann in OOo Issue 76100. - Sf.net Bug ID 1698240, reported by Ingo H. de Boer. - - * csutil.cxx: fix mkallcap_utf() for allcaps suggestion in UTF-8. - - * suggestmgr.cxx: fix bad movechar_utf() (missing strlen()). - - * hunspell.cxx: fix bad degree sign detection in Unicode - hu_HU environment. - - * hunspell/hunspell.cxx: free allocated memory of csconv in - ported Mozilla code. - - Mozilla Bugzilla Bug 383564, report and Mozilla MySpell patch - by Andrew Geul. Reported by Ryan VanderMeulen for Hunspell. - - * suggestmgr.cxx: fix minor difference in Unicode suggestion - (ngram suggestion of allcaps words in Unicode). - - * hashmgr.cxx: close file handle after errors. - Sf.net Bug ID 1736286, reported by John Nisly. - - * configure.ac: syntax error (shell variable with spaces). - Sf.net Bug ID 1731625, reported by Bernhard Rosenkraenzer. - - * hunspell.cxx: check_word(): fix bad usage of info pointer. - - * hashmgr.cxx: fix de_DE related bug (accept words with leading dash). - Sf.net Bug ID 1696134, reported by Björn Jacke. - - * suggestmgr.cxx, tests/1695964.*: fix NEEDAFFIX homonym suggestion. - Sf.net Bug ID 1695964, reported by Björn Jacke. - - * tests/1463589*: capitalized ngram suggestion test data for - Sf.net Bug ID 1463589, reported by Frederik Fouvry. 
- - * csutil.cxx, affixmgr.cxx: fix possible heap error with - multiple instances of utf_tbl. - Sf.net Bug ID 1693875, reported by Ingo H. de Boer. - - * affixmgr.cxx, suggestmgr.cxx, license.hunspell: convert to ASCII. - Locale dependent compiling problems. Sf.net Bug ID 1694379, reported - by Mike Tian-Jian Jiang. OOo Issue 78018 reported by Thomas Lange. - - * tests/test.sh: compatibility issues - - fix Valgrind support (check shared library instead of shell wrapper) - - remove deprecated "tail +2" syntax - - set 8-bit locale for testing (LC_ALL=C) - - * hunspell.hxx: remove license.* and config.h dependencies. - - hunspell-1.1.5-badheader.patch from Caolan McNamara <cmc at OO.o> - -2007-03-21 Németh László <nemeth at OOo>: - * tools/Makefile.am, munch.h, unmunch.h: add missing munch.h and unmunch.h - Reported by Björn Jacke and Khaled Hosny (sf.net Bug ID 1684144) - * hunspell/hunspell.cxx, hunspell.hxx: fix --with-ui compiling error (add get_csconv()) - Reported by Khaled Hosny (sf.net Bug ID 1685010) - -2007-03-19 Németh László <nemeth at OOo>: - * csutil.cxx, hunspell/hunspell.cxx: Unicode non BMP area (>65K character range) support - (except conditional patterns and strip characters of affix rules) - * tests/utf8_nonbmp*: test data - - * src/hunspell/*: add Mozilla patches from David Einstein - - run-time generated 8-bit character tables - - other Mozilla related changes (see Mozilla Bugzilla Bug 319778) - - * csutil.cxx, affixmgr.cxx, hashmgr.cxx: optimized version of IGNORE feature - - IGNORE works with affixes (except strip characters and affix conditions) - * tests/ignore*: test data with Latin characters - * tests/ignoreutf*: Unicode test data with Arabic diacritics (Harakat) - - * src/hunspell/suggestmgr.cxx: new edit distance suggestion methods - - capitalization: nasa -> NASA - - long swap: permenant -> permanent - - long move: Ghandi -> Gandhi - - double two characters: vacacation -> vacation - * tests/sug.*: test data - - * 
src/hunspell/affixmgr.cxx: space in REP strings (alot -> a lot) - Note: Underline character signs the space in REP strings: REP alot a_lot, and - put the expression with space ("a lot") into the dic file (see tests/sug). - - * hashmgr.cxx, affixmgr.cxx: ignore Unicode byte order mark (BOM sequence) - * tests/utf8_bom*: test data - - * hunspell/*.cxx: OOo Issue 68903 - Make lingucomponent warning-free on wntmsci10 - - fix Hunspell related warning messages on Windows platform (except some assignment - within conditional expressions). Reported and started by Stephan Bergmann. - - * hunspell/affixmgr.cxx: fix OOo Issue 66683 - hunspell dmake debug=x fails - - Reported by Stephan Bergmann. - - * src/hunspell/hunspell.[ch]xx: thread safe API for Hunspell executable - (removing prev*() functions, new spell(word, info, root) function) - - * configure.ac, src/hunspell/*: HUNSPELL_EXPERIMENTAL code - --with-experimental configure option (conditional compiling of morphological analyser - and stemmer tools) - - * configure.ac, src/hunspell/*: conditional Hunspell warning messages - --with-warnings configure option - - * affixmgr.cxx: new, optimized parsing functions - - * affixmgr.cxx: fix homonym handling for German dictionary project, - reported by Björn Jacke (sf.net Bug ID 1592880). - * tests/1592880.*: test data by Björn Jacke - - * src/hunspell/affixmgr.cxx: fix CIRCUMFIX suggestion - Bug reported by Erdal Ronahi. - - * hunspell.cxx: reverse root word output (complex prefixes) - Bug reported by Munzir Taha. 
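The REP note in the 2007-03-19 entry above says an underline character stands for a space in REP strings, as in `REP alot a_lot`. The sketch below shows how such a definition could be read and applied to a misspelled word. `RepRule`, `parse_rep_line` and `apply_rep` are hypothetical names, and the code is not taken from affixmgr.cxx.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// One replacement rule from an affix file.
struct RepRule {
    std::string from;
    std::string to;
};

// Parse a line such as "REP alot a_lot". Underline characters in both
// strings are turned into spaces, following the ChangeLog note. Returns
// false for anything else, including the "REP <count>" header line.
bool parse_rep_line(const std::string& line, RepRule& rule) {
    std::istringstream in(line);
    std::string keyword, from, to;
    if (!(in >> keyword >> from >> to) || keyword != "REP") return false;
    std::replace(from.begin(), from.end(), '_', ' ');
    std::replace(to.begin(), to.end(), '_', ' ');
    rule.from = from;
    rule.to = to;
    return true;
}

// Replace the first occurrence of rule.from in word with rule.to.
// The word is returned unchanged if the pattern does not occur.
std::string apply_rep(const std::string& word, const RepRule& rule) {
    if (rule.from.empty()) return word;
    const std::string::size_type pos = word.find(rule.from);
    if (pos == std::string::npos) return word;
    std::string result = word;
    result.replace(pos, rule.from.size(), rule.to);
    return result;
}
```

With the rule from the entry, the input `alot` becomes `a lot`, which a checker could then validate against the dictionary.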
- - * tools/hunspell.cxx: fix Emacs compatibility, patch by marot at sf.net - - no % command in PIPE mode (SourceForge BugTracker 1595607) - - fix HUNSPELL_VERSION string - - * suggestmgr.[hc]xx: rename check() functions to checkword() (OOo Issue 68296) - adopt MySpell patch by Bryan Petty (tierra at ooo) for Hunspell source - - * csutil.cxx, munch.c, unmunch.c: adopt relevant parts of the MinGW patch - (OOo Issue 42504) by tonal at ooo - - * affixmgr.cxx: remove double candidate_check() call, reported by Bram Moolenaar - - * tests/test.sh: add LC_ALL="C" environment. Locale dependency of make check - reported by Gentoo project. - - * src/tools/hunspell.cxx: UTF-8 highlighting fix for console UI - (not solved: breaking long UTF-8 lines) - - * src/tools/unmunch.c: fix bad generation if strip is shorter than condition, - reported by Davide Prina - * src/tools/unmunch.h: increase 5000 -> 500000 - - * src/tools/hunspell.cxx: fix memory error in suggestion (uninitialized parameter), - Bug also reported by Björn Jacke in SourceForge Bug 1469957 - - * csutil.cxx, affixmgr.cxx: fix Caolan McNamara's patch for non OOo environment - -2006-11-11 Caolan McNamara <cmc at OO.o>: - * csutil.cxx, affixmgr.cxx: UTF-8 table patch (OOo Issue 71449) - Description: memory optimization (OOo doesn't use the large UTF-8 table). - - * Makefile.am: shared library patch (Sourceforge ID 1610756) - - * hunspell.h, hunspell.cxx: C API patch (Sourceforge ID 1616353) - - * hunspell.pc: pkgconfig patch (Sourceforge ID 1639128) - -2006-10-17 Ryan Jones <at Mozilla Bugzilla>: - * affixmgr.cxx: missing fclose(affixlst) calls - Reported by <gavins at ooo> in OOo Issue 70408 - -2007-07-11 Taha Zerrouki <taha at gawab>: - * affixmgr.cxx, hunspell.cxx, hashmgr.cxx, csutil.cxx: IGNORE feature to remove - optional Arabic and other characters from input and dictionary words.
- * src/hunspell/langnum.hxx: add Arabic language number, lang_ar=96 - * tests/ignore.*: test data - -2006-05-28 Miha Vrhovnik <mvrhov at users.sourceforge>: - * src/win_api/*: C API for Windows DLLs - - also Delphi text editor example (see on Hunspell Sourceforge page) - -2006-05-18 Kevin F. Quinn <kevquinn at gentoo>: - * utf_info.cxx: struct -> static struct - Shared library patch also developed by Gentoo developers (Hanno Meyer-Thurow, - Diego Pettenò, Kevin F. Quinn) - -2006-02-02 Németh László <nemethl@gyorsposta.hu>: - * src/hunspell/hunspell.cxx: suggest(): replace "fooBar" -> "foo bar" suggestions - with "fooBar" -> "foo Bar" (missing spaces are typical OCR bugs). - Bug reported by stowrob at OOo in Issue 58202. - * src/hunspell/suggestmgr.cxx: twowords(): permit 1-character words. - (restore MySpell's original behavior). Here: "aNew" -> "a New". - * tests/i58202.*: test data - - * src/parsers/textparser.cxx: fix Unicode tokenization in is_wordchar() - (extra word characters (WORDCHARS) didn't work on big-endian platforms). - - * src/hunspell/{csutil,affixmgr}.cxx: inline isSubset(), isRevSubset(): - small speed optimization for languages with rich morphology. - - * src/tools/hunspell.cxx: fix bad --with-ui and --with-readline compiling - when (N)curses is missing. Reported by Daniel Naber. - -2006-01-19 Tor Lillqvist <tml@novell.com> - * src/hunspell/csutil.cxx: mystrsep(): fix locale-dependent isspace() tokenization - -2006-01-06 András Tímár <timar@fsf.hu> - * src/hunspell/{hashmgr.hxx,hunspell.cxx}: fix Visual C++ compiling errors - -2006-01-05 Németh László <nemethl@gyorsposta.hu>: - * COPYING: set GPL/LGPL/MPL tri-license for Mozilla integration. - Rationale: Mozilla source code contains an old MySpell version - with GPL/LGPL/MPL tri-license. (MPL license is a copyleft license, similar - to the LGPL, but it acts on file level.)
- * COPYING.LGPL: GNU Lesser General Public License 2.1 (LGPL) - * COPYING.MPL: Mozilla Public License 1.1 (MPL) - * license.hunspell, src/hunspell/license.hunspell: GPL/LGPL/MPL tri-license - - * src/hunspell/{affixmgr,hashmgr}.*: AF, AM alias definitions in affix file: - compression of flag sets and morphological descriptions (see manual, - and tests/alias* test files). - Rationale: Alias compression is also good for loading time and memory - efficiency, not only smaller resources. - * src/tools/makealias: alias compression utility - (usage: ./makealias file.dic file.aff) - * tests/alias{,2,3}: AF, AM tests - * man/hunspell.4: add AF, AM documentation - * src/hunspell/affentry.cxx, atypes.hxx: add new opts bits (aeALIASM, aeALIASF) - - * tools/hunspell, src/parser/*, src/hunspell/*: Hunspell program - tokenizes Unicode texts (only with UTF-8 encoded dictionaries). - Missing Unicode tokenization reported by Björn Jacke, Egmont Koblinger, - Jess Body and others. - Note: Curses interactive interface doesn't work perfectly yet. - * tests/*.tests: remove -1 parameters of Hunspell - * tests/*.{good,wrong}: remove tabulators - - * src/hunspell/{hunspell,affixmgr}.cxx: BREAK option: break words at - specified break points and check word parts separately (see manual). - Note: COMPOUNDRULE is better (or will be better) for handling dashes and - other compound joining characters or character strings. Use BREAK if you - want to check words with dashes or other joining characters and there is no time - or possibility to describe precise compound rules with COMPOUNDRULE. - * tests/break.*: BREAK example. - - * src/hunspell/{affixmgr,hunspell}.cxx: add CHECKSHARPS declaration instead - of LANG de_DE definitions to handle German sharp s in both spelling and - suggestion. - * src/hunspell/hunspell.cxx: With CHECKSHARPS, uppercase words are valid - with both lower sharp s (it is optional for names in German legal texts) - and SS (MÜßIG, MÜSSIG).
Missing lower sharp s form reported by Björn Jacke. - * src/hunspell/hunspell.cxx: KEEPCASE flag on a sharp s word has a special - meaning with CHECKSHARPS declaration: KEEPCASE permits capitalisation and SS upper - casing of a sharp s word (Müßig and MÜSSIG), but forbids the upper cased form - with lower sharp s character(s): *MÜßIG. - * tests/germancompounding*: add CHECKSHARPS, remove LANG - * tests/checksharps*: add CHECKSHARPS and KEEPCASE, remove LANG - - * src/hunspell/hunspell.cxx: improved suggestions: - - suggestions for pressed Caps Lock problems: macARONI -> macaroni - - suggestions for long shift problems: MAcaroni -> Macaroni, macaroni - - suggestions for KEEPCASE words: KG -> kg - * src/hunspell/csutil.cxx: fix mystrrep() function: - - suggestions for lower sharp s in uppercased words: MÜßIG -> MÜSSIG - * tests/checksharps{,utf}.sug: add tests for mystrrep() fix - - * src/hunspell/hashmgr.cxx: Now dictionary words can contain slashes - with the "\/" syntax. Problem reported by Frederik Fouvry. - - * src/hunspell/hunspell.cxx: fix bad duplicate filter in suggest(). - (Suggesting some capitalised compound words caused program crash - with Hungarian dictionary, OOo Issue 59055). - - * src/hunspell/affixmgr.cxx: fix bad defcpd_check() call in compound_check(). - (Overlapping new COMPOUNDRULE and old compounding methods caused program - crash at suggestion.) - - * src/hunspell/affixmgr.{cxx,hxx}: check affix flag duplication at affix classes. - Suggested by Daniel Naber. - - * src/hunspell/affentry.cxx: remove unused variable declarations (OOo i58338). - Compiler warnings reported by András Tímár and Martin Hollmichel. 
- - * src/hunspell/hunspell.cxx: morph(): not analyse bad mixed uppercased forms - (fix Arabic morphological analysis with Buckwalter's Arabic transliteration) - - * src/hunspell/affentry.{cxx,hxx}, atypes.hxx: little memory optimization - in affentry: - - using unsigned char fields instead of short (stripl, appndl, numconds) - - rename xpflg field to opts - - removing utf8 field, use aeUTF8 bit of opts field - - * configure.ac: set tests/maputf.test to XFAILED on ARM platform. - Fail reported by Rene Engelhard. - - * configure.ac: link Ncursesw library, if exists. - - * BUGS: add BUGS file - - * tests/complexprefixes2.*: test for morphological analysis with COMPLEXPREFIXES - - * src/hunspell/affixmgr.cxx: use "COMPOUNDRULE" instead of - "COMPOUND". The new name suggested by Bram Moolenaar. - * tests/compoundrule*: modified and renamed compound.* test files - - * man/hunspell.4: AF, AM, BREAK, CHECKSHARPS, COMPOUNDRULE, KEEPCASE. - - also new addition to the documentation: - Header of the dictionary file define approximate dictionary size: - ``A dictionary file (*.dic) contains a list of words, one per line. - The first line of the dictionaries (except personal dictionaries) - contains the _approximate_ word count (for optimal hash memory size).'' - Asked by Frederik Foudry. - - One-character replacements in REP definitions: ``It's very useful to - define replacements for the most typical one-character mistakes, too: - with REP you can add higher priority to a subset of the TRY suggestions - (suggestion list begins with the REP suggestions).'' - -2005-11-11 Németh László <nemethl@gyorsposta.hu>: - * src/hunspell/affixmgr.*: fix Unicode MAP errors (sorted only n-1 - characters instead of n ones in UTF-16 MAP character lists). - Bug reported by Rene Engelhard. - - * src/hunspell/affixmgr.*: fix infinite COMPOUND matching (default char - type is unsigned on PowerPC, s390 and ARM platforms and it will never - be negative). Bug reported by Rene Engelhard. 
- - * src/hunspell/{affixmgr,suggestmgr}.cxx: fix bad ONLYINCOMPOUND - word suggestions. - * tests/onlyincompound.sug: empty test file to check this fix. - Bug reported by Björn Jacke. - - * src/hunspell/affixmgr.cxx: fix backtracking in COMPOUND pattern matching. - * tests/compound6.*: test files to check this fix. - - * csutil.cxx: set bigger range types in flag_qsort() and flag_bsearch(). - - * affixmgr.hxx: set better type for cont_classes[] Boolean data (short -> char) - - * configure.ac, tests/automake.am: set platform specific XFAIL test - (flagutf8.test on ARM platform) - -2005-11-09 Németh László <nemethl@gyorsposta.hu>: -improvements: - * src/hunspell/affixmgr.*: new and improved affix file parameters: - - - COMPOUND definitions: compound patterns with regexp-like matching. - See manual and test files: tests/compound*.* - Suggested by Bram Moolenaar. - Also useful for simple word-level lexical scanning, for example - analysing numbers or words with numbers (OOo Issue #53643): - http://qa.openoffice.org/issues/show_bug.cgi?id=53643 - Examples: tests/compound{4,5}.*. - - - NOSUGGEST flag: words marked with NOSUGGEST flag are not suggested. - Proposed flag for vulgar and obscene words (OOo Issue #55498). - Example: tests/nosuggest.*. - Problem reported by bobharvey at OOo: - http://qa.openoffice.org/issues/show_bug.cgi?id=55498 - - - KEEPCASE flag: Forbid capitalized and uppercased forms of words - marked with KEEPCASE flags. Useful for special orthographies - (measurements and currency often keep their case in uppercased - texts) and other writing systems (e.g. keeping lower case of IPA - characters). - - - CHECKCOMPOUNDCASE: Forbid upper case characters at word boundaries in compounds. - Examples: tests/checkcompoundcase* and tests/germancompounding.* - - - FLAG UTF-8: New flag type: Unicode character encoded with UTF-8. - Example: tests/flagutf8.*.
- Rationale: Unicode character type can be more readable - (in a Unicode text editor) than `long' or `num' flag type. - -bug fixes: - * src/hunspell/hunspell.cxx: accept numbers and numbers with separators (i53643) - Bug reported by skelet at OOo: - http://qa.openoffice.org/issues/show_bug.cgi?id=53643 - - * src/hunspell/csutil.cxx: fix casing data in ISO 8859-13 character table. - - * src/hunspell/csutil.cxx: add ISO-8859-15 character encoding (i54980) - Rationale: ISO-8859-15 is the default encoding of the French OpenOffice.org - dictionary. ISO-8859-15 is a modified version of ISO-8859-1 - (latin-1) character encoding with French œ ligatures and euro - symbol. Problem reported by cbrunet at OOo in OOo Issue 54980: - http://qa.openoffice.org/issues/show_bug.cgi?id=54980 - - * src/hunspell/affixmgr.cxx: fix zero-byte malloc after a bad affix header. - Patch by Harri Pitkänen. - - * src/hunspell/suggestmgr.cxx: fix bad NEEDAFFIX word suggestion - in ngram suggestions. Reported by Daniel Naber and Friedel Wolff. - - * src/hunspell/hashmgr.cxx: fix bad white space checking in affix files. - src/hunspell/{csutil,affixmgr}.cxx: add other white space separators. - Problems with tabulators reported by Frederik Fouvry. - - * src/hunspell/*: replace system-dependent <license.*> #include - parameters with quoted ones. Problem reported by Dafydd Jones. - - * src/hunspell/hunspell.cxx: fix missing morphological analysis of dot(s) - Reported by Trón Viktor. - -changes: - * src/hunspell/affixmgr.cxx: rename PSEUDOROOT to NEEDAFFIX. - Suggested by Bram Moolenaar. - - * src/hunspell/suggestmgr.hxx: Increase default maximum of - ngram suggestions (3->5). Suggested by Kevin Hendricks. - - * src/hunspell/htypes.hxx: Increase MAXDELEN for long affix flags. - - * src/hunspell/suggestmgr.cxx: modify (perhaps fix) Unicode map suggestion. - tests/maputf test fail on ARM platform reported by Rene Engelhard. 
- - * src/hunspell/{affentry.cxx,atypes.hxx}: remove [PREFIX] and - MISSING_DESCRIPTION messages from morphological analysis. - Problems reported by Trón Viktor. - - * tests/germancompounding.{aff,good}: Add "Computer-Arbeit" test word. - Suggested by Daniel Naber. - - * doc/man/hunspell.4: Proof-reading patch by Goldman Eleonóra. - - * doc/man/hunspell.4: Fix bad affix example (replace `move' with `work'). - Bug reported by Frederik Fouvry. - - * tests/*: new test files: - affixes.*: simple affix compression example from Hunspell 4 manual page - checkcompoundcase.*, checkcompoundcase2.*, checkcompoundcaseutf.* - compound.*, compound2.*, compound3.*, compound4.*, compound5.* - compoundflag.* (former compound.*) - flagutf8.*: test for FLAG UTF-8 - germancompounding.*: simplification with CHECKCOMPOUNDCASE. - germancompoundingold.* (former germancompounding.*) - i53643.*: check numbers with separators - i54980.*: ISO8859-15 test - keepcase.*: test for KEEPCASE - needaffix*.* (former pseudoroot*.* tests) - nosuggest.*: test for NOSUGGEST - -2005-09-19 Németh László <nemethl@gyorsposta.hu>: - * src/hunspell/suggestmgr.cxx: improved ngram suggestion: - - detect non-adjacent swapped characters (pernament -> permanent) - Rationale: ngram method has a significant error with non-adjacent - swapped characters, especially when the swap is in the middle of the word. - - suggest uppercase forms (unesco -> UNESCO, siggraph's -> SIGGRAPH's) - - suggest only ngram swap character and uppercase form, if they exist. - Rationale: swap character and casing equivalence give much better - suggestions than any other (weighted) ngram suggestions. - - add uppercase suggestion (PERMENANT -> PERMANENT) - - * src/hunspell/*: complete comparison with MySpell 3.2 (in OOo beta 2): - - affixmgr.cxx: add missing numrep initialization - - hashmgr.cxx: add_word(): don't allocate temporary records - - hunspell.cxx: in suggest(): - - check capitalized words first (better sug.
order for proper names), - - check pSMgr->suggest() return value - - make the pSMgr->suggest() call non-optional in HUHCAP - - csutil.cxx: fix bad KOI8-U -> koi8r_tbl reference in enc_entry encds - - csutil.cxx: fix casing data in ISO 8859-2, Windows 1251 and KOI8-U - encoding tables. Bug reported by Dmitri Gabinski. - - * src/hunspell/affixmgr.*: improved compound word and other features - - generalize hu_HU specific compound word features with new affix file - parameters, suggested by Bram Moolenaar: - - CHECKCOMPOUNDDUP: forbid word duplication in compounds (e.g. foo|foo) - - CHECKCOMPOUNDTRIPLE: forbid triple letters in compounds (e.g. foo|obar) - - CHECKCOMPOUNDPATTERN: forbid patterns at word boundaries in compounds - - CHECKCOMPOUNDREP: using REP replacement table, forbid presumably bad - compounds (useful for languages with unlimited number of compounds) - - ONLYINCOMPOUND flag works also with words (see tests/onlyincompound.*) - Suggested by Daniel Naber, Björn Jacke, Trón Viktor & Bram Moolenaar. - - PSEUDOROOT works also with prefixes and prefix + suffix combinations - (see tests/pseudoroot5.*). Suggested by Trón Viktor. - - man/hunspell.4: updated man page - - * src/hunspell/affixmgr.*: fix incomplete prefix handling with twofold - suffixes (delete unnecessary contclasses[] conditions in - prefix_check_twosfx() and prefix_check_twosfx_morph()). - Bug reported by Trón Viktor. - - * src/hunspell/affixmgr.*: complete also *_morph() functions with - conditions of new Hunspell features (circumfix, pseudoroot etc.). - - * src/hunspell/suggestmgr.cxx: - - fix missing suggestions for words with crossed prefix and suffix - - fix redundant non-compound word checking - - fix lost suggestions problem. Bug reported by Dmitri Gabinski. - - * src/hunspell/dictmgr.*: - - add new dictionary manager for Hunspell UNO module - Problems with eo_ANY Esperanto locale reported by Dmitri Gabinski.
- - * src/hunspell/*: use precise constant sizes for 8-bit and 16-bit character - arrays with MAXWORDUTF8LEN and MAXSWUTF8L macros. - - * src/hunspell/affixmgr.cxx: fix bad MAXNGRAMSUGS parameter handling - - * src/hunspell/affixmgr.cxx, src/tools/{un}munch.*: fix GCC 4.0 warnings - on fgets(), reported by Dvornik László - - * po/hu.po: improved translation by Dvornik László - - * tests/test.sh: improved test environment - - add suggestion testing (see tests/*.sug) - - add memory debugging environment, based on the excellent Valgrind debugger. - Usage on Linux and experimental platforms of Valgrind: - VALGRIND=memcheck make check - - rename test_hunmorph to test.sh - - * tests/*: new tests: - - base.*: base example based on MySpell's checkme.lst. - - map{,utf}.*, rep{,utf}: MAP and REP suggestion examples - - tests on new CHECKCOMPOUND, ONLYINCOMPOUND and PSEUDOROOT features - - i54633.*: capitalized suggestion test for Issue 54633 from OOo's Issuezilla - - i35725.*: improved ngram suggestion test for Issue 35725 - -2005-08-26 Németh László <nemethl@gyorsposta.hu>: -improvements: - - * src/hunspell/suggestmgr.cxx: - Unicode support in related character map suggestion - - * src/hunspell/suggestmgr.cxx: Unicode support in ngram suggestion - - * src/hunspell/{suggestmgr,affixmgr,hunspell}.cxx: improve ngram suggestion. - Fix http://qa.openoffice.org/issues/show_bug.cgi?id=35725. See release - notes for examples. Problem reported by beccablain at OOo. - - ngram suggestions are now case insensitive (see `Permenant' bug in Issuezilla) - - weight ngram suggestions (with the longest common subsequence algorithm, - also considering lengths of bad word and suggestion, identical first - letters and almost completely identical character positions) - - set strict affix congruency in expand_rootword(). Now ngram suggestions - are good for languages with rich morphology and also better for English.
- Rationale: affixed forms of the first ngram suggestion - very often suppress the second and subsequent root word suggestions. But - faults in affixes are less common, and can be fixed without suggestions. - We must prefer the more informative second and subsequent root word - suggestions instead of the suggestions for bad affixes. - - a better suggestion may not be a substring of a less good suggestion - Rationale: Suggesting affixed forms of a root word is - unnecessary when the root word has a better weighted ngram value. - (Checking substrings is a good approximation for this refinement.) - - fewer ngram suggestions (default 3 maximum instead of 10) - Rationale: Users need a lot of extra effort to check many bad ngram - suggestions, nine times out of ten unnecessarily. It is very - distracting, because ngram suggestions could be very different. - Usually MySpell and Hunspell give one or two suggestions with - the old suggestion algorithms (maximum is 15), but the ngram algorithm - often gives the maximum number of suggestions. With strict affix congruency - and other refinements, the good suggestion is usually among the - first three elements. - - new affix parameter: MAXNGRAMSUG - - * src/hunspell/*: support agglutinative languages with rich prefix - morphology or with right-to-left writing system (for example, Turkic - and Austronesian languages with (modified) Arabic scripts). - - new affix parameter: COMPLEXPREFIXES - Set twofold prefix stripping (but single suffix stripping) - * src/hunspell/affixmgr.cxx: - - speed up prefix loading with tree sorting algorithm. - * tests/complexprefixes.*, tests/complexprefixesutf.*: - Coptic example posted by Moheb Mekhaiel - - * src/hunspell/hashmgr.cxx: check size attribute in dic file - suggested by Daniel Naber - Rationale: With a missing size attribute Hunspell allocates a too small and - slower hash memory, and Hunspell can lose the first dictionary word.
- - * src/hunspell/affixmgr.cxx: check stripping characters and condition - compatibility in affix rules (bugs detected in cs_CZ, es_ES, es_NEW, - es_MX, lt_LT, nn_NO, pt_PT, ro_RO and sk_SK dictionaries). See release - notes of Hunspell 1.0.9 in NEWS. - - * src/hunspell/affixmgr.cxx: check unnecessary fields in affix rules - (bugs detected in ro_RO and sv_SE dictionaries). See release notes. - - * src/hunspell/affixmgr.cxx: remove redundant condition checking - in affix rules with stripping characters (redundancy in OpenOffice.org - dictionaries reported by Eleonóra Goldman) - Rationale: this is a little optimization, but it was excellent for - detecting bad ngram affixation with bad or weak affix conditions. - - * tests/germancompounding.aff: improve compound definition - - use dash prefix instead of language specific tokenizer - Rationale: Using a uniform approach is the right way to check and analyze - compound words. Language-specific word breaking is deprecated; it needs - sophisticated grammar checking for word-like word pairs - (for example in Hungarian there is a substandard, but accepted - syntax with dash for word pairs: cats, dogs -> kutyák-macskák (like - cats/dogs in English)). - - * test Hunspell with 54 OpenOffice.org dictionaries: see release notes - -bug fixes: - - * src/hunspell/suggestmgr.*: add time limit to exponential - algorithm of the related character map suggestion - Rationale: a long word in agglutinative languages or a special pattern - (for example a horizontal rule) made of map characters can `crash' the - spell checker. - - * src/hunspell/affentry.cxx: add() functions: fix bad word generation - checking stripping characters (see similar bug in unmunch) - - * src/hunspell/affixmgr.cxx: parse_file(): fix unconditional getNext() - call for ~AffixMgr() when affix file is corrupt. - - * src/hunspell/affixmgr.*: AffixMgr(), parse_cpdsyllable(): fix missing - string duplications for ~AffixMgr() when affix file is corrupt.
- - * src/hunspell/affixmgr.*: parse_affix(): fix fprintf() call when affix - file is corrupt. Bug reported by Daniel Naber. - - * suggestmgr.cxx: replace single usage of 'strdup' with 'mystrdup' - patch by Chris Halls (debian.org) - - * src/hunspell/makefile.mk: add makefile.mk for compiling in OpenOffice.org - See README in Hunspell UNO module. - Problems with separated compiling reported by Rene Engelhard - - * src/hunspell/hunspell.cxx: fix pseudoroot support - - search for a non-pseudoroot homonym in check() - * tests/pseudoroot4.*: test this fix - - * src/tools/unmunch.c: fix bad word generation when conditions - are shorter or incompatible with stripping characters in affix rules - - * src/tools/unmunch.c: fix mychomp() for de_AT.dic and other dic files - without a final newline character. - -other changes: - * src/hunspell/suggestmgr.*: erase ACCENT suggestion - Rationale: ACCENT suggestion was the same as Kevin Hendricks's map - suggestion algorithm, but with a worse interface in affix file. - - * src/hunspell/suggestmgr.*: combine cycle number limit - in badchar() and forgotchar() with a time limit. - - * src/hunspell/affixmgr.*: remove NOMAPSUGS affix parameter - - * src/hunspell/{suggestmgr,hunspell}.*: strip periods from - suggestions (restore MySpell's original behaviour) - Rationale: OpenOffice.org has an automatic period handling mechanism - and suggestions look better without periods. - - new affix file parameter: SUGSWITHDOTS - Add period(s) to suggestions, if input word terminates in period(s). - (Not needed for OpenOffice.org dictionaries.) - - * tests/germancompounding.aff: improve bad German affix in affix example - (computeren->computern). Suggested by Daniel Naber.
- - * src/tools/example.cxx: add Myspell's example - - * src/tools/munch.cxx: add Myspell's munch - - * man{,/hu}/hunspell.4: refresh manual pages - -2005-08-01 Németh László <nemethl@gyorsposta.hu>: - * add missing MySpell files and features: - - add MySpell license.readme, README and CONTRIBUTORS ({license,README,AUTHORS}.myspell) - - add MySpell unmunch program (src/tools/unmunch.c) - - add licenses to source (src/hunspell/license.{myspell,hunspell}) - - port MAP suggestion (with imperfect UTF-8 support) - - add NOSPLITSUGS affix parameter - - add NOMAPSUGS affix parameter - - * src/man/man.4: MAP, COMPOUNDPERMITFLAG, NOSPLITSUGS, NOMAPSUGS - - * src/hunspell/aff{entry,ixmgr}.cxx: - - improve compound word support - - new affix parameter: COMPOUNDPERMITFLAG (see manual) - * src/tests/compoundaffix{,2}.*: examples for COMPOUNDPERMITFLAG - * src/tests/germancompounding.*: new solution for German compounding - Problems with German compounding reported by Daniel Naber - - * src/hunspell/hunspell.cxx: fix German uppercase word spelling - with the spellsharps() recursive algorithm. - Default recursive depth is 5 (MAXSHARPS). - * src/tests/germansharps*: extended German sharp s tests - - * src/tools/hunspell.cxx: fix fatal memory bug in non-interactive - subshells without HOME environmental variable - Bug detected with PHP by András Izsók. 
- -2005-07-22 Németh László <nemethl@gyorsposta.hu>: - * src/hunspell/csutil.hxx: utf16_u8() - - fix 3-byte UTF-8 character conversion - -2005-07-21 Németh László <nemethl@gyorsposta.hu>: - * src/hunspell/csutil.hxx: hunspell_version() for OOo UNO module - -2005-07-19 Németh László <nemethl@gyorsposta.hu>: - * renaming: - - src/morphbase -> src/hunspell - - src/hunspell, src/hunmorph -> src/tools - - src/huntokens -> src/parsers - - * src/tools/hunstem.cxx: add stemmer example - -2005-07-18 Németh László <nemethl@gyorsposta.hu>: - * configure.ac: --with-ui, --with-readline configure options - * src/hunspell/hunspell.cxx: fix conditional compiling - - * src/hunspell/hunspell.cxx: set HunSPELL.bak temporary file - in the same directory as the checked file. - - * src/morphbase/morphbase.cxx: - - - handling German sharp s (ß) - - - fix (temporarily) analyize() - - * tests: a lot of new tests - - * po/, intl/, m4/: add gettext from GNU hello - - * po/hu.po: add Hungarian translation - - * doc/, man/: rename doc to man - -2005-07-04 Németh László <nemethl@gyorsposta.hu>: - * src/morphbase/hashmgr.cxx: set FLAG attribute instead of FLAG_NUM and FLAG_LONG - - * doc/hunspell.4: manual in English - -2005-06-30 Németh László <nemethl@gyorsposta.hu>: - * src/morphbase/csutil.cxx: add character tables from csutil.cxx of OOo 1.1.4 - - * src/morphbase/affentry.cxx: fix Unicode condition checking - - * tests/{,utf}compound.*: tests compounding - -2005-06-27 Németh László <nemethl@gyorsposta.hu>: - * src/morphbase/*: fix Unicode compound handling - -2005-06-23 Halácsy Péter: - * src/hunmorph/hunmorph.cxx: delete spelling error message and suggest_auto() call - -2005-06-21 Németh László <nemethl@gyorsposta.hu>: - * src/morphbase: Unicode support - * tests/utf8.*: SET UTF-8 test - - * src/morphbase: checking and fixing with Valgrind - Memory handling error reported by Ferenc Szidarovszky - -2005-05-26 Németh László <nemethl@gyorsposta.hu>: - * suggestmgr.cxx: fix stemming - * 
AUTHORS, COPYING, ChangeLog: set CC-LGPL free software license - -2004-05-25 Varga Dániel <daniel@all.hu> - * src/stemtool: new subproject - -2005-05-25 Halácsy Péter <peter@halacsy.com> - * AUTHORS, COPYING: set CC Attribution license - -2004-05-23 Varga Dániel <daniel@all.hu> - * src: - modifications for compiling with Visual C++ - - * src/hunmorph/csutil.cxx: correcting header of flag_qsort(), - * src/hunmorph/*: correct csutil include - -2005-05-19 Németh László <nemethl@gyorsposta.hu> - * csutil.cxx: fix loop condition in lineuniq() - bug reported by Viktor Nagy (nagyv nyelvtud hu). - - * morphbase.cxx: handle PSEUDOROOT with zero affixes - bug reported by Viktor Nagy (nagyv nyelvtud hu). - * tests/zeroaffix.*: add zeroaffix tests - -2005-04-09 Németh László <nemethl@gyorsposta.hu> - * config.h.in: reset with autoheader - - * src/hunspell/hunspell.cxx: set version - -2005-04-06 Németh László <nemethl@gyorsposta.hu> - * tests: tests - - * src/morphbase: - New optional parameters in affix file: - - PSEUDOROOT: for forbidding root with not forbidden suffixed forms. - - COMPOUNDWORDMAX: max. words in compounds (default is no limit) - - COMPOUNDROOT: signs compounds in dictionary for handling special compound rules - - remove COMPOUNDWORD, ONLYROOT - -2005-03-21 Németh László <nemethl@gyorsposta.hu> - * src/morphbase/*: - - 2-byte flags, FLAG_NUM, FLAG_LONG - - CIRCUMFIX: signed suffixes and prefixes can only occur together - - ONLYINCOMPOUND for fogemorpheme (Swedish, Danish) or Flute-elements (German) - - COMPOUNDBEGIN: allow signed roots, and roots with signed suffix in begin of compounds - - COMPOUNDMIDDLE: like before, but middle of compounds - - COMPOUNDEND: like before, but end of compounds - - remove COMPOUNDFIRST, COMPOUNDLAST
