SmartWordBreakPlugin

Inserts word-breaks and soft hyphens into long words

Long words hurt readability

Long words make web pages harder to read. First prize is to use short words instead, but sometimes you cannot. WikiWords can be alarmingly long, e.g. SmartWordBreakPlugin, TextFormattingRules and HierarchicalNavigationChildExample. Technical content sometimes requires long words. For example, long names tend to crop up in software source code and thus also in the corresponding documentation. Finally, some authors simply use (too many) long words, and sometimes there is not much you can do about it. (When the Vice President uses long words, it might be more productive to celebrate that senior management is using the wiki and just live with the long words.)

Long words can make a mess of an otherwise-clean table layout. Browsers tend not to split words automatically and table columns must be wide enough for the longest word in each column. Thus, long words make content interfere with presentation.

You can tell the browser that it may split a word, by using the <wbr> tag, but this tag is not supported by all browsers. In fact, there is *no* completely portable way to tell the browser that it may split a word.

Besides, you would rather concentrate on content and let the wiki take care of presentation and portability issues, such as choosing the correct word break for the browser and deciding when and where to split long words.

Split long words portably

SmartWordBreakPlugin lets you split long words portably. The plugin inserts the correct form of word break for the reader's browser.

How SmartWordBreakPlugin splits words

SmartWordBreakPlugin uses a combination of approaches to split words.
  • WikiWords are split at the start of each word
  • Words_With_Underscores are split after underscores
  • Word breaks are inserted after punctuation
  • Hyphenation (see below)

SmartWordBreakPlugin inserts word breaks at these points. They inform the browser that it may split the words at these points.
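The first three heuristics can be illustrated with a minimal Python sketch. This is not the plugin's own code: the function name `insert_breaks`, the regular expression, and the use of `<wbr>` as the default marker are all assumptions made for illustration (the plugin itself chooses the marker to suit the reader's browser), and hyphenation is not covered here.

```python
import re

# Hypothetical break-point pattern (an assumption, not taken from the plugin):
#  - before an uppercase letter that follows a lowercase letter or digit
#    (the start of each word inside a WikiWord)
#  - after an underscore or other punctuation that is followed by a letter or digit
BREAK_POINT = re.compile(
    r"(?<=[a-z0-9])(?=[A-Z])"
    r"|(?<=[^A-Za-z0-9])(?=[A-Za-z0-9])"
)


def insert_breaks(word: str, marker: str = "<wbr>") -> str:
    """Return word with marker inserted at every heuristic break point."""
    # Both alternatives are zero-width, so no characters are consumed;
    # the marker is simply inserted at each matching position.
    return BREAK_POINT.sub(marker, word)


if __name__ == "__main__":
    for sample in ("SmartWordBreakPlugin", "Navigate_With_Euclidean_Geometry"):
        print(insert_breaks(sample))
```

Doing the work in a single pass over the original word avoids re-scanning markers that were already inserted, which would otherwise be mistaken for punctuation.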

If hyphenation is enabled, SmartWordBreakPlugin tries to hyphenate words that are longer than the longest-unbroken-word-segment setting, which is controlled with the SMARTWORDBREAKPLUGIN_LONGEST preference and the longest parameter to %SMARTWORDBREAK{...}%.

The plugin will apply hyphenation both to ordinary words and to the segments of words split using other heuristics. The plugin inserts soft hyphens (&shy;) into ordinary words. Soft hyphens can look odd when inserted into WikiWords and words_with_underscores, so the plugin inserts word breaks there instead.

Some search engines do not handle word breaks and soft hyphens well; they may register the parts of the word but not the whole word. This could render search facilities useless. To mitigate this, SmartWordBreakPlugin inserts the unsplit (original) version of split words as an HTML comment.
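As a rough illustration of the idea, the following Python sketch places the original word in an HTML comment next to its split form. The function name `with_search_hint`, the position of the comment (before the split word) and its exact spacing are assumptions; the plugin's actual output may differ.

```python
SOFT_HYPHEN = "\u00ad"  # the character that &shy; stands for


def with_search_hint(original: str, split: str) -> str:
    """Prefix the split word with the unsplit original in an HTML comment.

    Hypothetical helper: the real plugin's comment layout is not documented here.
    """
    # Nothing to preserve if the word was not split at all.
    if split == original:
        return split
    # "--" is not allowed inside an HTML comment, so skip the hint in that case.
    if "--" in original:
        return split
    return f"<!-- {original} -->{split}"


if __name__ == "__main__":
    hyphenated = SOFT_HYPHEN.join(["hy", "phen", "ation"])
    print(repr(with_search_hint("hyphenation", hyphenated)))
```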

Control which words are split

The plugin is configurable. You can adjust how aggressively and intelligently the plugin splits words, and you can control which parts of each topic are processed. This is a tradeoff between convenience and performance. At the one extreme, SmartWordBreakPlugin will split all long words on a wiki page using a combination of wiki-specific heuristics and the TeX hyphenation algorithm, preserving the original text in HTML comments to assist search engines. At the other extreme, you can insert word breaks only where you want them.

Automatic word breaks in tables

Setting the SMARTWORDBREAKPLUGIN_TABLES preference makes the SmartWordBreakPlugin insert word breaks automatically in all tables on a page. This preference probably provides the best balance between convenience and performance for most applications.

The SMARTWORDBREAKPLUGIN_LONGEST and SMARTWORDBREAKPLUGIN_HYPHENATE preferences affect table-based insertion of word breaks.

The table-based processing does not work well with nested tables, so it is not useful with skins that use tables to lay out the page.

Focussed automatic word breaks: %SMARTWORDBREAK{...}%

The %SMARTWORDBREAK{...}% macro inserts word-breaks automatically. This macro lets you apply SmartWordBreakPlugin's automatic processing to a portion of a page. Using this macro, it is possible to automatically insert word breaks in a single table. This macro lets you enable automatic processing only where you need it.

| *Argument* | *Comment* | *Default value* | *Example* |
| hyphenate | Enables the hyphenation algorithm. | Value of the SMARTWORDBREAKPLUGIN_HYPHENATE preference, which defaults to "on" | hyphenate="off" |
| longest | Specifies how long a sequence of letters may be; word-breaks or soft hyphens are inserted into words longer than this setting. | Value of the SMARTWORDBREAKPLUGIN_LONGEST preference, which defaults to 8 | longest="5" |
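The defaulting described in the table (macro argument first, then the matching preference, then the built-in default) might be sketched in Python as follows. The function `resolve_setting` and the dictionaries standing in for macro arguments and wiki preferences are hypothetical; only the preference names and default values come from this topic.

```python
# Built-in defaults as documented in the table above.
DEFAULTS = {"hyphenate": "on", "longest": "8"}


def resolve_setting(name: str, params: dict, prefs: dict) -> str:
    """Pick the effective value for one %SMARTWORDBREAK{...}% argument.

    params: arguments given to the macro (hypothetical representation)
    prefs:  preference settings in effect for the topic (hypothetical)
    """
    # An explicit macro argument always wins.
    if name in params:
        return params[name]
    # Otherwise fall back to the preference, then to the documented default.
    pref_key = "SMARTWORDBREAKPLUGIN_" + name.upper()
    return prefs.get(pref_key, DEFAULTS[name])


if __name__ == "__main__":
    prefs = {"SMARTWORDBREAKPLUGIN_LONGEST": "12"}
    print(resolve_setting("longest", {"longest": "5"}, prefs))
    print(resolve_setting("longest", {}, prefs))
    print(resolve_setting("hyphenate", {}, prefs))
```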

Calling all pockets: Automatic word breaks for whole pages

The SMARTWORDBREAKPLUGIN_WHOLEPAGE preference enables processing for the whole page, including the header, footer and side-bar. This can hurt performance, so it should probably not be enabled in SitePreferences. This preference may be useful on specific pages. It may also be useful to wiki users who use narrow browser windows - they could set this preference in their user topic.

The SMARTWORDBREAKPLUGIN_LONGEST and SMARTWORDBREAKPLUGIN_HYPHENATE preferences also affect whole-page processing.

Maximum performance: Insert word breaks manually

The %WBR% macro inserts the correct form of word-break for your browser. You choose where to put the break; SmartWordBreakPlugin inserts the correct one. The %WBR% macro has the lowest overhead of all options provided by the SmartWordBreakPlugin. This macro may be tedious to use, and topics that use it may be more difficult to read when editing. This macro may also interfere with the wiki search facility.

Demonstration

This table demonstrates what SmartWordBreakPlugin does (change the window width to see the effect):
%TABLE{columnwidths="30%,70%"}%
| *Selection* | *Manifestation* |
| %SMARTWORDBREAK{Navigate_With_Euclidean_Geometry}% | Works for short distances, based on the approximation that the world is flat. |
| %SMARTWORDBREAK{Navigate_With_Elliptical_Geometry}% | Works for long distances. |

Simulation (should work on Firefox 2 & 3 and IE 6 & 7; other browsers may vary):
| *Selection* | *Manifestation* |
| Navigate_With_Euclidean_Geometry | Works for short distances, based on the approximation that the world is flat. |
| Navigate_With_Elliptical_Geometry | Works for long distances. |

If you have the plugin installed and enabled:
| *Selection* | *Manifestation* |
| %SMARTWORDBREAK{Navigate_With_Euclidean_Geometry}% | Works for short distances, based on the approximation that the world is flat. |
| %SMARTWORDBREAK{Navigate_With_Elliptical_Geometry}% | Works for long distances. |

Without word-breaks:
| *Selection* | *Manifestation* |
| Navigate_With_Euclidean_Geometry | Works for short distances, based on the approximation that the world is flat. |
| Navigate_With_Elliptical_Geometry | Works for long distances. |

Examples

Insert a single word break:
| *Name* | *Description* |
| Very_Long_%WBR%Function_Name | Passes the foo to the crumblicator, which turns it into biscuits. This function is not re-entrant because foo cannot be articulated. |

Make SmartWordBreakPlugin process a single table:
%SMARTWORDBREAK{"
%TABLE{columnwidths="30%,70%"}%
| *Selection* | *Manifestation* |
| Navigate_With_Euclidean_Geometry | Works for short distances, based on the approximation that the world is flat. |
| Navigate_With_Elliptical_Geometry | Works for long distances. |
"}%

Add this to specific pages to make SmartWordBreakPlugin process all tables on those pages (probably the best way to use this plugin):
<!--
   * Set SMARTWORDBREAKPLUGIN_TABLES=on
-->

Add this to a page to make SmartWordBreakPlugin process the whole of the page:
<!--
   * Set SMARTWORDBREAKPLUGIN_WHOLEPAGE=on
-->

Caveats

Browsers do not all wrap text in the same way. Some browsers only respect word-breaks for text in tables. Others respect word-breaks for all text.

The SMARTWORDBREAKPLUGIN_TABLES preference does not work well with nested tables. In consequence, it does not work well with skins that use tables for controlling the page layout.

After attempting to split words using hyphenation rules, the plugin simply chops up any remaining long word-segments into shorter fixed-length segments. The resulting breaks are unfortunately not grammatically correct.
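The fixed-length fallback can be pictured with a short Python sketch. The function name `chop_segment`, the `<wbr>` marker and the choice to reuse the longest setting as the piece length are assumptions for illustration, not a description of the plugin's internals.

```python
def chop_segment(segment: str, longest: int = 8, marker: str = "<wbr>") -> str:
    """Cut segment into pieces of at most `longest` characters, joined by marker.

    Hypothetical helper: break positions ignore syllables entirely, which is
    why the resulting breaks are not grammatically correct.
    """
    if longest < 1:
        raise ValueError("longest must be at least 1")
    # Slice at every multiple of `longest`; the last piece may be shorter.
    pieces = [segment[i:i + longest] for i in range(0, len(segment), longest)]
    return marker.join(pieces)


if __name__ == "__main__":
    print(chop_segment("crumblicator"))
    print(chop_segment("crumblicator", longest=5))
```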

Performance considerations

SmartWordBreakPlugin can hurt performance. It is better to use it only where it is needed, and to avoid widespread hyphenation.

Only use the plugin where it is needed:
  • Use %SMARTWORDBREAK{}% in preference to setting SMARTWORDBREAKPLUGIN_TABLES to on.
  • Set SMARTWORDBREAKPLUGIN_TABLES to on for a single page in preference to setting it on for a whole web.
  • Set SMARTWORDBREAKPLUGIN_TABLES to on for a single web in preference to setting it on for a whole site.
  • Avoid using SMARTWORDBREAKPLUGIN_WHOLEPAGE, unless you really need the whole page (including header and footer) processed, or you really do not mind the performance hit.

Hyphenation hurts performance too:
  • Only enable hyphenation where it is needed.
  • Reduce the number of words to be hyphenated by increasing the length of the longest unsplit word-segment (use the longest parameter or the SMARTWORDBREAKPLUGIN_LONGEST preference).

Installation Instructions

You do not need to install anything in the browser to use this extension. The following instructions are for the administrator who installs the extension on the server.

Open configure, and open the "Extensions" section. Use "Find More Extensions" to get a list of available extensions. Select "Install".

If you have any problems, or if the extension isn't available in configure, then you can still install manually from the command-line. See http://foswiki.org/Support/ManuallyInstallingExtensions for more help.

Preferences

No preferences are stored in this topic. The example settings here have no effect. To learn more about setting preference variables, see the PreferenceSettings topic.

| *Variable* | *Default* | *Description* |
| SMARTWORDBREAKPLUGIN_HYPHENATE | on | Enables hyphenation, i.e. splitting words at hyphenation points. This preference affects the whole page when using the SMARTWORDBREAKPLUGIN_WHOLEPAGE setting, or all tables when using the SMARTWORDBREAKPLUGIN_TABLES setting. It also sets the default value for the hyphenate parameter to %SMARTWORDBREAK{}%. |
| SMARTWORDBREAKPLUGIN_TABLES | off | Enables processing of all text in tables. This should be faster than SMARTWORDBREAKPLUGIN_WHOLEPAGE. This works better than %SMARTWORDBREAK{}%, but it is slower, so this preference should only be set on the pages where it is needed. If SMARTWORDBREAKPLUGIN_WHOLEPAGE is on, then SMARTWORDBREAKPLUGIN_TABLES is ignored. |
| SMARTWORDBREAKPLUGIN_WHOLEPAGE | off | Enables processing of the whole web page. |
| SMARTWORDBREAKPLUGIN_LONGEST | 8 | Sets the length of the longest unbroken sequence of letters. This preference affects the whole page when using the SMARTWORDBREAKPLUGIN_WHOLEPAGE setting, or all tables when using the SMARTWORDBREAKPLUGIN_TABLES setting. It also sets the default value for the longest parameter to %SMARTWORDBREAK{}%. |

Configuration

Some SmartWordBreakPlugin settings affect the whole site, and are not intended to have different values for different topics and/or webs. These settings are adjustable via configure. The default settings are suitable for English.

Hyphenation configuration files for additional languages are available from http://www.ctan.org/tex-archive/language/. Load the configuration file for your language onto your server and set the path to that configuration file via configure. For more information, see TeX::Hyphen on CPAN.

Supported browsers

This plugin should support the following browsers, but they have not all been tested:
  • Internet Explorer 6, 7 and 8
  • Firefox 3.x
  • Opera 9.62 and 10.x
  • Chrome 1 & 2
  • Konqueror 3.5.7

The list is not exhaustive.

Info

Author(s): Foswiki:Main.MichaelTempest
Copyright: © Michael Tempest 2009
License: GPL (GNU General Public License)
Release: 07 Nov 2009
Version: 5454 (2009-11-07)
Change History:  
07 Nov 2009 Many bugfixes, hyphenation support, performance improvements
31 Oct 2009 Draft release
Dependencies: None
Home page: http://foswiki.org/bin/view/Extensions/SmartWordBreakPlugin
Support: http://foswiki.org/bin/view/Support/SmartWordBreakPlugin

PackageForm

ExtensionClassification Interface and Visualisation
ExtensionType PluginPackage
Compatibility Browsers: IE6, IE7, IE8, FF3, Opera 10
Foswiki: Should work with all versions from 1.0.4 onwards. Tested with 1.0.7 and trunk (1.1)
DemoUrl http://
SupportUrl SmartWordBreakPlugin
DevelopedInSVN Yes
ModificationPolicy CoordinateWithAuthor
Topic revision: r4 - 09 Dec 2009, CrawfordCurrie
 