Trinity API Reference
tdespell2

correct.cpp
1 /* enchant
2  * Copyright (C) 2003 Dom Lachowicz
3  *
4  * This library is free software; you can redistribute it and/or
5  * modify it under the terms of the GNU Lesser General Public
6  * License as published by the Free Software Foundation; either
7  * version 2.1 of the License, or (at your option) any later version.
8  *
9  * This library is distributed in the hope that it will be useful,
10  * but WITHOUT ANY WARRANTY; without even the implied warranty of
11  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
12  * Lesser General Public License for more details.
13  *
14  * You should have received a copy of the GNU Lesser General Public
15  * License along with this library; if not, write to the
16  * Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
17  * Boston, MA 02110-1301, USA.
18  *
19  * In addition, as a special exception, Dom Lachowicz
20  * gives permission to link the code of this program with
21  * non-LGPL Spelling Provider libraries (eg: a MSFT Office
22  * spell checker backend) and distribute linked combinations including
23  * the two. You must obey the GNU Lesser General Public License in all
24  * respects for all of the code used other than said providers. If you modify
25  * this file, you may extend this exception to your version of the
26  * file, but you are not obligated to do so. If you do not wish to
27  * do so, delete this exception statement from your version.
28  */
29 
30 /*
31  * correct.c - Routines to manage the higher-level aspects of spell-checking
32  *
33  * This code originally resided in ispell.c, but was moved here to keep
34  * file sizes smaller.
35  *
36  * Copyright (c), 1983, by Pace Willisson
37  *
38  * Copyright 1992, 1993, Geoff Kuenning, Granada Hills, CA
39  * All rights reserved.
40  *
41  * Redistribution and use in source and binary forms, with or without
42  * modification, are permitted provided that the following conditions
43  * are met:
44  *
45  * 1. Redistributions of source code must retain the above copyright
46  * notice, this list of conditions and the following disclaimer.
47  * 2. Redistributions in binary form must reproduce the above copyright
48  * notice, this list of conditions and the following disclaimer in the
49  * documentation and/or other materials provided with the distribution.
50  * 3. All modifications to the source code must be clearly marked as
51  * such. Binary redistributions based on modified source code
52  * must be clearly marked as modified versions in the documentation
53  * and/or other materials provided with the distribution.
54  * 4. All advertising materials mentioning features or use of this software
55  * must display the following acknowledgment:
56  * This product includes software developed by Geoff Kuenning and
57  * other unpaid contributors.
58  * 5. The name of Geoff Kuenning may not be used to endorse or promote
59  * products derived from this software without specific prior
60  * written permission.
61  *
62  * THIS SOFTWARE IS PROVIDED BY GEOFF KUENNING AND CONTRIBUTORS ``AS IS'' AND
63  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
64  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
65  * ARE DISCLAIMED. IN NO EVENT SHALL GEOFF KUENNING OR CONTRIBUTORS BE LIABLE
66  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
67  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
68  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
69  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
70  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
71  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
72  * SUCH DAMAGE.
73  */
74 
75 /*
76  * $Log$
77  * Revision 1.1 2004/01/31 16:44:12 zrusin
78  * ISpell plugin.
79  *
80  * Revision 1.4 2003/08/14 17:51:26 dom
81  * update license - exception clause should be Lesser GPL
82  *
83  * Revision 1.3 2003/07/28 20:40:25 dom
84  * fix up the license clause, further win32-registry proof some directory getting functions
85  *
86  * Revision 1.2 2003/07/16 22:52:35 dom
87  * LGPL + exception license
88  *
89  * Revision 1.1 2003/07/15 01:15:04 dom
90  * ispell enchant backend
91  *
92  * Revision 1.2 2003/01/29 05:50:11 hippietrail
93  *
94  * Fixed my mess in EncodingManager.
95  * Changed many C casts to C++ casts.
96  *
97  * Revision 1.1 2003/01/24 05:52:31 hippietrail
98  *
99  * Refactored ispell code. Old ispell global variables had been put into
100  * an allocated structure, a pointer to which was passed to many functions.
101  * I have now made all such functions and variables private members of the
102  * ISpellChecker class. It was C OO, now it's C++ OO.
103  *
104  * I've fixed the makefiles and tested compilation but am unable to test
105  * operation. Please back out my changes if they cause problems which
106  * are not obvious or easy to fix.
107  *
108  * Revision 1.7 2002/09/19 05:31:15 hippietrail
109  *
110  * More Ispell cleanup. Conditional globals and DEREF macros are removed.
111  * K&R function declarations removed, converted to Doxygen style comments
112  * where possible. No code has been changed (I hope). Compiles for me but
113  * unable to test.
114  *
115  * Revision 1.6 2002/09/17 03:03:28 hippietrail
116  *
117  * After seeking permission on the developer list I've reformatted all the
118  * spelling source which seemed to have parts which used 2, 3, 4, and 8
119  * spaces for tabs. It should all look good with our standard 4-space
120  * tabs now.
121  * I've concentrated just on indentation in the actual code. More prettying
122  * could be done.
123  * * NO code changes were made *
124  *
125  * Revision 1.5 2002/09/13 17:20:12 mpritchett
126  * Fix more warnings for Linux build
127  *
128  * Revision 1.4 2002/03/06 08:27:16 fjfranklin
129  * o Only activate compound handling when the hash file says so (Per Larsson)
130  *
131  * Revision 1.3 2001/05/14 09:52:50 hub
132  * Removed newMain.c from GNUmakefile.am
133  *
134  * C++ comments are not C comments. Changed to C comments.
135  *
136  * Revision 1.2 2001/05/12 16:05:42 thomasf
137  * Big pseudo changes to ispell to make it pass around a structure rather
138  * than rely on all sorts of globals willy nilly here and there. Also
139  * fixed our spelling class to work with accepting suggestions once more.
140  * This code is dirty, gross and ugly (not to mention still not supporting
141  * multiple hash sizes just yet) but it works on my machine and will no
142  * doubt break other machines.
143  *
144  * Revision 1.1 2001/04/15 16:01:24 tomas_f
145  * moving to spell/xp
146  *
147  * Revision 1.2 1999/10/05 16:17:28 paul
148  * Fixed build, and other tidiness.
149  * Spell dialog enabled by default, with keyboard binding of F7.
150  *
151  * Revision 1.1 1999/09/29 23:33:32 justin
152  * Updates to the underlying ispell-based code to support suggested corrections.
153  *
154  * Revision 1.59 1995/08/05 23:19:43 geoff
155  * Fix a bug that caused offsets for long lines to be confused if the
156  * line started with a quoting uparrow.
157  *
158  * Revision 1.58 1994/11/02 06:56:00 geoff
159  * Remove the anyword feature, which I've decided is a bad idea.
160  *
161  * Revision 1.57 1994/10/26 05:12:39 geoff
162  * Try boundary characters when inserting or substituting letters, except
163  * (naturally) at word boundaries.
164  *
165  * Revision 1.56 1994/10/25 05:46:30 geoff
166  * Fix an assignment inside a conditional that could generate spurious
167  * warnings (as well as being bad style). Add support for the FF_ANYWORD
168  * option.
169  *
170  * Revision 1.55 1994/09/16 04:48:24 geoff
171  * Don't pass newlines from the input to various other routines, and
172  * don't assume that those routines leave the input unchanged.
173  *
174  * Revision 1.54 1994/09/01 06:06:41 geoff
175  * Change erasechar/killchar to uerasechar/ukillchar to avoid
176  * shared-library problems on HP systems.
177  *
178  * Revision 1.53 1994/08/31 05:58:38 geoff
179  * Add code to handle extremely long lines in -a mode without splitting
180  * words or reporting incorrect offsets.
181  *
182  * Revision 1.52 1994/05/25 04:29:24 geoff
183  * Fix a bug that caused line widths to be calculated incorrectly when
184  * displaying lines containing tabs. Fix a couple of places where
185  * characters were sign-extended incorrectly, which could cause 8-bit
186  * characters to be displayed wrong.
187  *
188  * Revision 1.51 1994/05/17 06:44:05 geoff
189  * Add support for controlled compound formation and the COMPOUNDONLY
190  * option to affix flags.
191  *
192  * Revision 1.50 1994/04/27 05:20:14 geoff
193  * Allow compound words to be formed from more than two components
194  *
195  * Revision 1.49 1994/04/27 01:50:31 geoff
196  * Add support to correctly capitalize words generated as a result of a
197  * missing-space suggestion.
198  *
199  * Revision 1.48 1994/04/03 23:23:02 geoff
200  * Clean up the code in missingspace() to be a bit simpler and more
201  * efficient.
202  *
203  * Revision 1.47 1994/03/15 06:24:23 geoff
204  * Fix the +/-/~ commands to be independent. Allow the + command to
205  * receive a suffix which is a deformatter type (currently hardwired to
206  * be either tex or nroff/troff).
207  *
208  * Revision 1.46 1994/02/21 00:20:03 geoff
209  * Fix some bugs that could cause bad displays in the interaction between
210  * TeX parsing and string characters. Show_char now will not overrun
211  * the inverse-video display area by accident.
212  *
213  * Revision 1.45 1994/02/14 00:34:51 geoff
214  * Fix correct to accept length parameters for ctok and itok, so that it
215  * can pass them to the to/from ichar routines.
216  *
217  * Revision 1.44 1994/01/25 07:11:22 geoff
218  * Get rid of all old RCS log lines in preparation for the 3.1 release.
219  *
220  */
221 
222 #include <stdlib.h>
223 #include <string.h>
224 #include <ctype.h>
225 #include "ispell_checker.h"
226 #include "msgs.h"
227 
228 /*
229 extern void upcase P ((ichar_t * string));
230 extern void lowcase P ((ichar_t * string));
231 extern ichar_t * strtosichar P ((char * in, int canonical));
232 
233 int compoundflag = COMPOUND_CONTROLLED;
234 */
235 
236 /*
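237  * \param a First string to compare
238  * \param b Second string to compare
239  * \param canonical Nonzero for canonical string chars
240  *
241  * \return 0 if equal; otherwise nonzero, following the hash file's sort order (case only breaks ties)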
242  */
243 int
244 ISpellChecker::casecmp (char *a, char *b, int canonical)
245 {
246  ichar_t * ap;
247  ichar_t * bp;
248  ichar_t inta[INPUTWORDLEN + 4 * MAXAFFIXLEN + 4];
249  ichar_t intb[INPUTWORDLEN + 4 * MAXAFFIXLEN + 4];
250 
251  strtoichar (inta, a, sizeof inta, canonical);
252  strtoichar (intb, b, sizeof intb, canonical);
253  for (ap = inta, bp = intb; *ap != 0; ap++, bp++)
254  {
255  if (*ap != *bp)
256  {
257  if (*bp == '\0')
258  return m_hashheader.sortorder[*ap];
259  else if (mylower (*ap))
260  {
261  if (mylower (*bp) || mytoupper (*ap) != *bp)
262  return static_cast<int>(m_hashheader.sortorder[*ap])
263  - static_cast<int>(m_hashheader.sortorder[*bp]);
264  }
265  else
266  {
267  if (myupper (*bp) || mytolower (*ap) != *bp)
268  return static_cast<int>(m_hashheader.sortorder[*ap])
269  - static_cast<int>(m_hashheader.sortorder[*bp]);
270  }
271  }
272  }
273  if (*bp != '\0')
274  return -static_cast<int>(m_hashheader.sortorder[*bp]);
275  for (ap = inta, bp = intb; *ap; ap++, bp++)
276  {
277  if (*ap != *bp)
278  {
279  return static_cast<int>(m_hashheader.sortorder[*ap])
280  - static_cast<int>(m_hashheader.sortorder[*bp]);
281  }
282  }
283  return 0;
284 }
285 
286 /*
287  * \param word
288  */
289 void
290 ISpellChecker::makepossibilities (ichar_t *word)
291 {
292  int i;
293 
294  for (i = 0; i < MAXPOSSIBLE; i++)
295  m_possibilities[i][0] = 0;
296  m_pcount = 0;
297  m_maxposslen = 0;
298  m_easypossibilities = 0;
299 
300 #ifndef NO_CAPITALIZATION_SUPPORT
301  wrongcapital (word);
302 #endif
303 
304 /*
305  * according to Pollock and Zamora, CACM April 1984 (V. 27, No. 4),
306  * page 363, the correct order for this is:
307  * OMISSION = TRANSPOSITION > INSERTION > SUBSTITUTION
308  * thus, it was exactly backwards in the old version. -- PWP
309  */
310 
311  if (m_pcount < MAXPOSSIBLE)
312  missingletter (word); /* omission */
313  if (m_pcount < MAXPOSSIBLE)
314  transposedletter (word); /* transposition */
315  if (m_pcount < MAXPOSSIBLE)
316  extraletter (word); /* insertion */
317  if (m_pcount < MAXPOSSIBLE)
318  wrongletter (word); /* substitution */
319 
320  if ((m_hashheader.compoundflag != COMPOUND_ANYTIME) &&
321  m_pcount < MAXPOSSIBLE)
322  missingspace (word); /* two words */
323 
324 }
325 
326 /*
327  * \param word
328  *
329  * \return
330  */
331 int
332 ISpellChecker::insert (ichar_t *word)
333 {
334  int i;
335  char * realword;
336 
337  realword = ichartosstr (word, 0);
338  for (i = 0; i < m_pcount; i++)
339  {
340  if (strcmp (m_possibilities[i], realword) == 0)
341  return (0);
342  }
343 
344  strcpy (m_possibilities[m_pcount++], realword);
345  i = strlen (realword);
346  if (i > m_maxposslen)
347  m_maxposslen = i;
348  if (m_pcount >= MAXPOSSIBLE)
349  return (-1);
350  else
351  return (0);
352 }
353 
354 #ifndef NO_CAPITALIZATION_SUPPORT
355 /*
356  * \param word
357  */
358 void
359 ISpellChecker::wrongcapital (ichar_t *word)
360 {
361  ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
362 
363  /*
364  ** When the third parameter to "good" is nonzero, it ignores
365  ** case. If the word matches this way, "ins_cap" will recapitalize
366  ** it correctly.
367  */
368  if (good (word, 0, 1, 0, 0))
369  {
370  icharcpy (newword, word);
371  upcase (newword);
372  ins_cap (newword, word);
373  }
374 }
375 #endif
376 
377 /*
378  * \param word
379  */
380 void
381 ISpellChecker::wrongletter (ichar_t *word)
382 {
383  int i;
384  int j;
385  int n;
386  ichar_t savechar;
387  ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
388 
389  n = icharlen (word);
390  icharcpy (newword, word);
391 #ifndef NO_CAPITALIZATION_SUPPORT
392  upcase (newword);
393 #endif
394 
395  for (i = 0; i < n; i++)
396  {
397  savechar = newword[i];
398  for (j=0; j < m_Trynum; ++j)
399  {
400  if (m_Try[j] == savechar)
401  continue;
402  else if (isboundarych (m_Try[j]) && (i == 0 || i == n - 1))
403  continue;
404  newword[i] = m_Try[j];
405  if (good (newword, 0, 1, 0, 0))
406  {
407  if (ins_cap (newword, word) < 0)
408  return;
409  }
410  }
411  newword[i] = savechar;
412  }
413 }
414 
415 /*
416  * \param word
417  */
418 void
419 ISpellChecker::extraletter (ichar_t *word)
420 {
421  ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
422  ichar_t * p;
423  ichar_t * r;
424 
425  if (icharlen (word) < 2)
426  return;
427 
428  icharcpy (newword, word + 1);
429  for (p = word, r = newword; *p != 0; )
430  {
431  if (good (newword, 0, 1, 0, 0))
432  {
433  if (ins_cap (newword, word) < 0)
434  return;
435  }
436  *r++ = *p++;
437  }
438 }
439 
440 /*
441  * \param word
442  */
443 void
444 ISpellChecker::missingletter (ichar_t *word)
445 {
446  ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN + 1];
447  ichar_t * p;
448  ichar_t * r;
449  int i;
450 
451  icharcpy (newword + 1, word);
452  for (p = word, r = newword; *p != 0; )
453  {
454  for (i = 0; i < m_Trynum; i++)
455  {
456  if (isboundarych (m_Try[i]) && r == newword)
457  continue;
458  *r = m_Try[i];
459  if (good (newword, 0, 1, 0, 0))
460  {
461  if (ins_cap (newword, word) < 0)
462  return;
463  }
464  }
465  *r++ = *p++;
466  }
467  for (i = 0; i < m_Trynum; i++)
468  {
469  if (isboundarych (m_Try[i]))
470  continue;
471  *r = m_Try[i];
472  if (good (newword, 0, 1, 0, 0))
473  {
474  if (ins_cap (newword, word) < 0)
475  return;
476  }
477  }
478 }
479 
480 /*
481  * \param word
482  */
483 void ISpellChecker::missingspace (ichar_t *word)
484 {
485  ichar_t firsthalf[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];
486  int firstno; /* Index into first */
487  ichar_t * firstp; /* Ptr into current firsthalf word */
488  ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN + 1];
489  int nfirsthalf; /* No. words saved in 1st half */
490  int nsecondhalf; /* No. words saved in 2nd half */
491  ichar_t * p;
492  ichar_t secondhalf[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];
493  int secondno; /* Index into second */
494 
495  /*
496  ** We don't do words of length less than 3; this keeps us from
497  ** splitting all two-letter words into two single letters. We
498  ** also don't do maximum-length words, since adding the space
499  ** would exceed the size of the "possibilities" array.
500  */
501  nfirsthalf = icharlen (word);
502  if (nfirsthalf < 3 || nfirsthalf >= INPUTWORDLEN + MAXAFFIXLEN - 1)
503  return;
504  icharcpy (newword + 1, word);
505  for (p = newword + 1; p[1] != '\0'; p++)
506  {
507  p[-1] = *p;
508  *p = '\0';
509  if (good (newword, 0, 1, 0, 0))
510  {
511  /*
512  * Save_cap must be called before good() is called on the
513  * second half, because it uses state left around by
514  * good(). This is unfortunate because it wastes a bit of
515  * time, but I don't think it's a significant performance
516  * problem.
517  */
518  nfirsthalf = save_cap (newword, word, firsthalf);
519  if (good (p + 1, 0, 1, 0, 0))
520  {
521  nsecondhalf = save_cap (p + 1, p + 1, secondhalf);
522  for (firstno = 0; firstno < nfirsthalf; firstno++)
523  {
524  firstp = &firsthalf[firstno][p - newword];
525  for (secondno = 0; secondno < nsecondhalf; secondno++)
526  {
527  *firstp = ' ';
528  icharcpy (firstp + 1, secondhalf[secondno]);
529  if (insert (firsthalf[firstno]) < 0)
530  return;
531  *firstp = '-';
532  if (insert (firsthalf[firstno]) < 0)
533  return;
534  }
535  }
536  }
537  }
538  }
539 }
540 
541 /*
542  * \param word
543  * \param pfxopts Options to apply to prefixes
544  */
545 int
546 ISpellChecker::compoundgood (ichar_t *word, int pfxopts)
547 {
548  ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
549  ichar_t * p;
550  ichar_t savech;
551  long secondcap; /* Capitalization of 2nd half */
552 
553  /*
554  ** If compoundflag is COMPOUND_NEVER, compound words are never ok.
555  */
556  if (m_hashheader.compoundflag == COMPOUND_NEVER)
557  return 0;
558  /*
559  ** Test for a possible compound word (for languages like German that
560  ** form lots of compounds).
561  **
562  ** This is similar to missingspace, except we quit on the first hit,
563  ** and we won't allow either member of the compound to be a single
564  ** letter.
565  **
566  ** We don't do words of length less than 2 * compoundmin, since
567  ** both halves must be at least compoundmin letters long.
568  */
569  if (icharlen (word) < 2 * m_hashheader.compoundmin)
570  return 0;
571  icharcpy (newword, word);
572  p = newword + m_hashheader.compoundmin;
573  for ( ; p[m_hashheader.compoundmin - 1] != 0; p++)
574  {
575  savech = *p;
576  *p = 0;
577  if (good (newword, 0, 0, pfxopts, FF_COMPOUNDONLY))
578  {
579  *p = savech;
580  if (good (p, 0, 1, FF_COMPOUNDONLY, 0)
581  || compoundgood (p, FF_COMPOUNDONLY))
582  {
583  secondcap = whatcap (p);
584  switch (whatcap (newword))
585  {
586  case ANYCASE:
587  case CAPITALIZED:
588  case FOLLOWCASE: /* Followcase can have l.c. suffix */
589  return secondcap == ANYCASE;
590  case ALLCAPS:
591  return secondcap == ALLCAPS;
592  }
593  }
594  }
595  else
596  *p = savech;
597  }
598  return 0;
599 }
600 
601 /*
602  * \param word
603  */
604 void
605 ISpellChecker::transposedletter (ichar_t *word)
606 {
607  ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
608  ichar_t * p;
609  ichar_t temp;
610 
611  icharcpy (newword, word);
612  for (p = newword; p[1] != 0; p++)
613  {
614  temp = *p;
615  *p = p[1];
616  p[1] = temp;
617  if (good (newword, 0, 1, 0, 0))
618  {
619  if (ins_cap (newword, word) < 0)
620  return;
621  }
622  temp = *p;
623  *p = p[1];
624  p[1] = temp;
625  }
626 }
627 
636 int
637 ISpellChecker::ins_cap (ichar_t *word, ichar_t *pattern)
638 {
639  int i; /* Index into savearea */
640  int nsaved; /* No. of words saved */
641  ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];
642 
643  nsaved = save_cap (word, pattern, savearea);
644  for (i = 0; i < nsaved; i++)
645  {
646  if (insert (savearea[i]) < 0)
647  return -1;
648  }
649  return 0;
650 }
651 
661 int
662 ISpellChecker::save_cap (ichar_t *word, ichar_t *pattern,
663  ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN])
664 {
665  int hitno; /* Index into hits array */
666  int nsaved; /* Number of words saved */
667  int preadd; /* No. chars added to front of root */
668  int prestrip; /* No. chars stripped from front */
669  int sufadd; /* No. chars added to back of root */
670  int sufstrip; /* No. chars stripped from back */
671 
672  if (*word == 0)
673  return 0;
674 
675  for (hitno = m_numhits, nsaved = 0; --hitno >= 0 && nsaved < MAX_CAPS; )
676  {
677  if (m_hits[hitno].prefix)
678  {
679  prestrip = m_hits[hitno].prefix->stripl;
680  preadd = m_hits[hitno].prefix->affl;
681  }
682  else
683  prestrip = preadd = 0;
684  if (m_hits[hitno].suffix)
685  {
686  sufstrip = m_hits[hitno].suffix->stripl;
687  sufadd = m_hits[hitno].suffix->affl;
688  }
689  else
690  sufadd = sufstrip = 0;
691  save_root_cap (word, pattern, prestrip, preadd,
692  sufstrip, sufadd,
693  m_hits[hitno].dictent, m_hits[hitno].prefix, m_hits[hitno].suffix,
694  savearea, &nsaved);
695  }
696  return nsaved;
697 }
698 
699 /*
700  * \param word
701  * \param pattern
702  * \param prestrip
703  * \param preadd
704  * \param sufstrip
705  * \param sufadd
706  * \param firstdent
707  * \param pfxent
708  * \param sufent
709  *
710  * \return
711  */
712 int
713 ISpellChecker::ins_root_cap (ichar_t *word, ichar_t *pattern,
714  int prestrip, int preadd, int sufstrip, int sufadd,
715  struct dent *firstdent, struct flagent *pfxent, struct flagent *sufent)
716 {
717  int i; /* Index into savearea */
718  ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];
719  int nsaved; /* Number of words saved */
720 
721  nsaved = 0;
722  save_root_cap (word, pattern, prestrip, preadd, sufstrip, sufadd,
723  firstdent, pfxent, sufent, savearea, &nsaved);
724  for (i = 0; i < nsaved; i++)
725  {
726  if (insert (savearea[i]) < 0)
727  return -1;
728  }
729  return 0;
730 }
731 
732 /* ARGSUSED */
746 void
747 ISpellChecker::save_root_cap (ichar_t *word, ichar_t *pattern,
748  int prestrip, int preadd, int sufstrip, int sufadd,
749  struct dent *firstdent, struct flagent *pfxent, struct flagent *sufent,
750  ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN],
751  int * nsaved)
752 {
753 #ifndef NO_CAPITALIZATION_SUPPORT
754  struct dent * dent;
755 #endif /* NO_CAPITALIZATION_SUPPORT */
756  int firstisupper;
757  ichar_t newword[INPUTWORDLEN + 4 * MAXAFFIXLEN + 4];
758 #ifndef NO_CAPITALIZATION_SUPPORT
759  ichar_t * p;
760  int len;
761  int i;
762  int limit;
763 #endif /* NO_CAPITALIZATION_SUPPORT */
764 
765  if (*nsaved >= MAX_CAPS)
766  return;
767  icharcpy (newword, word);
768  firstisupper = myupper (pattern[0]);
769 #ifdef NO_CAPITALIZATION_SUPPORT
770  /*
771  ** Apply the old, simple-minded capitalization rules.
772  */
773  if (firstisupper)
774  {
775  if (myupper (pattern[1]))
776  upcase (newword);
777  else
778  {
779  lowcase (newword);
780  newword[0] = mytoupper (newword[0]);
781  }
782  }
783  else
784  lowcase (newword);
785  icharcpy (savearea[*nsaved], newword);
786  (*nsaved)++;
787  return;
788 #else /* NO_CAPITALIZATION_SUPPORT */
789 #define flagsareok(dent) \
790  ((pfxent == NULL \
791  || TSTMASKBIT (dent->mask, pfxent->flagbit)) \
792  && (sufent == NULL \
793  || TSTMASKBIT (dent->mask, sufent->flagbit)))
794 
795  dent = firstdent;
796  if ((dent->flagfield & (CAPTYPEMASK | MOREVARIANTS)) == ALLCAPS)
797  {
798  upcase (newword); /* Uppercase required */
799  icharcpy (savearea[*nsaved], newword);
800  (*nsaved)++;
801  return;
802  }
803  for (p = pattern; *p; p++)
804  {
805  if (mylower (*p))
806  break;
807  }
808  if (*p == 0)
809  {
810  upcase (newword); /* Pattern was all caps */
811  icharcpy (savearea[*nsaved], newword);
812  (*nsaved)++;
813  return;
814  }
815  for (p = pattern + 1; *p; p++)
816  {
817  if (myupper (*p))
818  break;
819  }
820  if (*p == 0)
821  {
822  /*
823  ** The pattern was all-lower or capitalized. If that's
824  ** legal, insert only that version.
825  */
826  if (firstisupper)
827  {
828  if (captype (dent->flagfield) == CAPITALIZED
829  || captype (dent->flagfield) == ANYCASE)
830  {
831  lowcase (newword);
832  newword[0] = mytoupper (newword[0]);
833  icharcpy (savearea[*nsaved], newword);
834  (*nsaved)++;
835  return;
836  }
837  }
838  else
839  {
840  if (captype (dent->flagfield) == ANYCASE)
841  {
842  lowcase (newword);
843  icharcpy (savearea[*nsaved], newword);
844  (*nsaved)++;
845  return;
846  }
847  }
848  while (dent->flagfield & MOREVARIANTS)
849  {
850  dent = dent->next;
851  if (captype (dent->flagfield) == FOLLOWCASE
852  || !flagsareok (dent))
853  continue;
854  if (firstisupper)
855  {
856  if (captype (dent->flagfield) == CAPITALIZED)
857  {
858  lowcase (newword);
859  newword[0] = mytoupper (newword[0]);
860  icharcpy (savearea[*nsaved], newword);
861  (*nsaved)++;
862  return;
863  }
864  }
865  else
866  {
867  if (captype (dent->flagfield) == ANYCASE)
868  {
869  lowcase (newword);
870  icharcpy (savearea[*nsaved], newword);
871  (*nsaved)++;
872  return;
873  }
874  }
875  }
876  }
877  /*
878  ** Either the sample had complex capitalization, or the simple
879  ** capitalizations (all-lower or capitalized) are illegal.
880  ** Insert all legal capitalizations, including those that are
881  ** all-lower or capitalized. If the prototype is capitalized,
882  ** capitalize all-lower samples. Watch out for affixes.
883  */
884  dent = firstdent;
885  p = strtosichar (dent->word, 1);
886  len = icharlen (p);
887  if (dent->flagfield & MOREVARIANTS)
888  dent = dent->next; /* Skip place-holder entry */
889  for ( ; ; )
890  {
891  if (flagsareok (dent))
892  {
893  if (captype (dent->flagfield) != FOLLOWCASE)
894  {
895  lowcase (newword);
896  if (firstisupper || captype (dent->flagfield) == CAPITALIZED)
897  newword[0] = mytoupper (newword[0]);
898  icharcpy (savearea[*nsaved], newword);
899  (*nsaved)++;
900  if (*nsaved >= MAX_CAPS)
901  return;
902  }
903  else
904  {
905  /* Followcase is the tough one. */
906  p = strtosichar (dent->word, 1);
907  memmove (
908  reinterpret_cast<char *>(newword + preadd),
909  reinterpret_cast<char *>(p + prestrip),
910  (len - prestrip - sufstrip) * sizeof (ichar_t));
911  if (myupper (p[prestrip]))
912  {
913  for (i = 0; i < preadd; i++)
914  newword[i] = mytoupper (newword[i]);
915  }
916  else
917  {
918  for (i = 0; i < preadd; i++)
919  newword[i] = mytolower (newword[i]);
920  }
921  limit = len + preadd + sufadd - prestrip - sufstrip;
922  i = len + preadd - prestrip - sufstrip;
923  p += len - sufstrip - 1;
924  if (myupper (*p))
925  {
926  for (p = newword + i; i < limit; i++, p++)
927  *p = mytoupper (*p);
928  }
929  else
930  {
931  for (p = newword + i; i < limit; i++, p++)
932  *p = mytolower (*p);
933  }
934  icharcpy (savearea[*nsaved], newword);
935  (*nsaved)++;
936  if (*nsaved >= MAX_CAPS)
937  return;
938  }
939  }
940  if ((dent->flagfield & MOREVARIANTS) == 0)
941  break; /* End of the line */
942  dent = dent->next;
943  }
944  return;
945 #endif /* NO_CAPITALIZATION_SUPPORT */
946 }
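
The correction order that makepossibilities() applies (omission, transposition, insertion, substitution), together with the duplicate check and cap enforced by insert(), can be sketched in isolation. This is a minimal illustration and not the plugin's code: the `Suggester` type, its `std::set` dictionary, `tryChars`, and `maxPossible` are hypothetical stand-ins for the hash file, `m_Try`, and `MAXPOSSIBLE`, and the capitalization and boundary-character handling is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ISpellChecker; the real code works on ichar_t
// buffers and calls good() against the hashed dictionary.
struct Suggester {
    std::set<std::string> dict;     // stand-in for the dictionary lookup
    std::string tryChars;           // stand-in for m_Try
    std::size_t maxPossible = 100;  // stand-in for MAXPOSSIBLE
    std::vector<std::string> out;   // stand-in for m_possibilities

    bool good(const std::string &w) const { return dict.count(w) != 0; }
    bool full() const { return out.size() >= maxPossible; }

    // Like insert(): ignore duplicates, return false once the list is full.
    bool add(const std::string &w) {
        if (std::find(out.begin(), out.end(), w) == out.end())
            out.push_back(w);
        return !full();
    }

    // Omission: the typed word lacks a letter, so try adding one everywhere.
    void missingLetter(const std::string &w) {
        for (std::size_t i = 0; i <= w.size(); ++i)
            for (char c : tryChars) {
                std::string cand = w;
                cand.insert(i, 1, c);
                if (good(cand) && !add(cand)) return;
            }
    }
    // Transposition: swap each adjacent pair.
    void transposedLetter(const std::string &w) {
        for (std::size_t i = 0; i + 1 < w.size(); ++i) {
            std::string cand = w;
            std::swap(cand[i], cand[i + 1]);
            if (good(cand) && !add(cand)) return;
        }
    }
    // Insertion: the typed word has an extra letter, so try dropping each one.
    void extraLetter(const std::string &w) {
        if (w.size() < 2) return;
        for (std::size_t i = 0; i < w.size(); ++i) {
            std::string cand = w;
            cand.erase(i, 1);
            if (good(cand) && !add(cand)) return;
        }
    }
    // Substitution: replace each letter with every other try character.
    void wrongLetter(const std::string &w) {
        for (std::size_t i = 0; i < w.size(); ++i)
            for (char c : tryChars) {
                if (c == w[i]) continue;
                std::string cand = w;
                cand[i] = c;
                if (good(cand) && !add(cand)) return;
            }
    }

    std::vector<std::string> suggest(const std::string &w) {
        out.clear();
        if (!full()) missingLetter(w);
        if (!full()) transposedLetter(w);
        if (!full()) extraLetter(w);
        if (!full()) wrongLetter(w);
        return out;
    }
};
```

With a dictionary of "the", "hate", "he", and "hue" and try characters "aeu", calling `suggest("hte")` should list the omission fix first and the substitution fix last, which is the point of running the strategies in this order before the cap is reached.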
947 
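
The split-point scans in missingspace() and compoundgood() follow the same pattern, which the sketch below isolates. It is an approximation under stated assumptions: `splitSuggestions` and `compoundGood` are hypothetical free functions, a `std::set` stands in for good(), and the capitalization checks (save_cap, whatcap) and affix flags such as FF_COMPOUNDONLY are omitted.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Like missingspace(): try every interior split; when both halves are words,
// offer the pair joined by a space and by a hyphen.
std::vector<std::string> splitSuggestions(const std::set<std::string> &dict,
                                          const std::string &word) {
    std::vector<std::string> out;
    if (word.size() < 3)  // do not split very short words into single letters
        return out;
    for (std::size_t cut = 1; cut < word.size(); ++cut) {
        const std::string first = word.substr(0, cut);
        const std::string second = word.substr(cut);
        if (dict.count(first) && dict.count(second)) {
            out.push_back(first + " " + second);
            out.push_back(first + "-" + second);
        }
    }
    return out;
}

// Like compoundgood(): accept a word made of two or more dictionary words,
// each at least minLen letters long (the role played by compoundmin).
bool compoundGood(const std::set<std::string> &dict, const std::string &word,
                  std::size_t minLen) {
    if (minLen == 0 || word.size() < 2 * minLen)
        return false;
    for (std::size_t cut = minLen; cut + minLen <= word.size(); ++cut) {
        if (!dict.count(word.substr(0, cut)))
            continue;
        const std::string rest = word.substr(cut);
        // The remainder may be a word itself or a further compound.
        if (dict.count(rest) || compoundGood(dict, rest, minLen))
            return true;
    }
    return false;
}
```

The recursion on the remainder is what lets compounds have more than two components, as the 1994/04/27 log entry describes. The original additionally returns on the first hit with a capitalization compatibility result, which this sketch does not model.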
948 
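
casecmp() compares in two passes: first ignoring case, then case-sensitively to break ties. A rough sketch of that idea follows, assuming plain ASCII ordering in place of the hash header's `sortorder` table and `std::string` in place of `ichar_t` buffers; `caseCmp` is a hypothetical name, and the exact return values differ from the original.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Returns <0, 0, or >0. Case differences matter only when the strings are
// otherwise equal.
int caseCmp(const std::string &a, const std::string &b) {
    auto lower = [](char c) {
        return std::tolower(static_cast<unsigned char>(c));
    };
    const std::size_t n = std::min(a.size(), b.size());

    // Pass 1: case-insensitive comparison.
    for (std::size_t i = 0; i < n; ++i) {
        const int d = lower(a[i]) - lower(b[i]);
        if (d != 0)
            return d;
    }
    if (a.size() != b.size())
        return a.size() < b.size() ? -1 : 1;  // the shorter string sorts first

    // Pass 2: same letters, so let the first case difference decide.
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i])
            return static_cast<unsigned char>(a[i]) -
                   static_cast<unsigned char>(b[i]);
    }
    return 0;
}
```

Sorting with a comparison like this keeps capitalization variants of the same word adjacent while still giving them a stable relative order.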

Generated for tdespell2 by doxygen 1.9.1
This website is maintained by Timothy Pearson.