Variant glyphs in Gaelic fonts
for lowercase r, lowercase s, lowercase s-dot, and ampersand

 

Just as there is a choice of glyph for some characters in Latin styles, with each font having its own preference, so it is in Gaelic styles for ampersand and for lowercase r, s and s-dot: for each of these characters there is a choice of two glyphs.

As in the Latin examples, this choice is not predictable from context or any other circumstance, but is made freely by stylistic preference, and a stylistically-homogeneous piece of text will be consistent in its choices.

Thus, the lowercase r, lowercase s and lowercase s-dot characters are realised in some styles of Gaelic script as short glyphs — the first one of each pair shown above — and in others as long glyphs.  And for the ampersand character, we find either an ampersand or et-ligature glyph, or a Tironian-et glyph, looking like a lowered figure seven.

The style of the font has principally determined the choice between the variant glyphs for these characters.  Basically, among manuscript hands, minuscule styles prefer long forms and Tironian-et; uncial styles prefer short forms and ampersand.  Gaelic metal fonts, even those with uncial features such as Petrie and Colmcille, prefer long forms and Tironian-et; but, beginning around 1913, minuscule fonts of the Newman–Figgins style have had "modernised" versions, produced by introducing short forms in place of the long ones.

There is of course nothing to prevent a font being designed containing another combination.

Variants in digital Gaelic fonts

Apart, therefore, from Newman–Figgins styles, the style of a Gaelic font generally determines the appropriate choices of variant glyphs, or at least it determines which choices are historically authentic for the font.  This is exactly as in the case of the Latin glyph choices shown above, where a font (eg. Times Roman) makes its choices and rarely offers alternatives.  But Gaelic digital fonts often seek to provide both glyphs of a pair (even when one of them has no prior historical validity to be in the font at all), and leave the choice, whether it is seen more as a benefit or a burden, to the user.

Thus with some Gaelic fonts the user who has already selected a font may still have to decide whether he should key long or short r and s; or whether he should key ampersand or Tironian-et; and he may be given little guidance in deciding.  It is like asking a user who has already decided on a particular Latin font "Now, which kind of lower-case a would you like?"  The user's reaction may be "I don't know, but I hoped I had implicitly made that choice for the best when I chose the font."  A mature Gaelic font technology would have optimal decisions in-built, not leave the user to experiment with glyph choice.  In fact, a genuine historical choice arises only for the Newman–Figgins style, and there it should not take the form of a decision between variant glyphs within one font, but between two styles of font, "modern (short) Newman–Figgins" and "traditional (long) Newman–Figgins" (represented, for example, by Bunchló and Bunchló Ársa respectively).

The hidden consequences of variant choice

On the surface, a choice between variant glyphs looks like a harmless inconvenience to the user.  But it may have hidden and potentially harmful consequences, depending on how it is implemented.  It is harmful if the variant glyphs are offered as separate characters.  This will result in the variant glyphs being treated as distinct from each other in text processing, for example, in alphabetization.  Yet this situation has developed with some Gaelic fonts, which have been designed without adequate consideration of the effect on the processing of encoded text.

My advice to the user is this: key only the standard code positions for lowercase r, lowercase s, lowercase s-dot, and ampersand, irrespective of which variants you wish to display, and choose a font which will display the variants you desire.  In Unicode and Latin-8 the standard code positions are:

standard codes     Unicode   Latin-8
lowercase r        $0072     $72
lowercase s        $0073     $73
lowercase s-dot    $1E61     $BF
ampersand          $0026     $26

These code positions will be produced by the normal keyboardings of the characters, which are straightforward in the case of r, s and ampersand, and likely to involve a deadkey with s in the case of s-dot.
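
As a quick check of the table above, the following Python sketch encodes the four characters and prints their code positions.  It assumes that Python's iso8859_14 codec corresponds to Latin-8 (ISO 8859-14); the names STANDARD and code_positions are ours, chosen for illustration only.

```python
# The four characters at their standard Unicode code points.
STANDARD = {
    "lowercase r": "\u0072",
    "lowercase s": "\u0073",
    "lowercase s-dot": "\u1e61",
    "ampersand": "\u0026",
}


def code_positions(ch):
    """Return (Unicode code point, Latin-8 byte value) for one character."""
    # Assumption: Python's "iso8859_14" codec is Latin-8 (ISO 8859-14).
    return ord(ch), ch.encode("iso8859_14")[0]


for name, ch in STANDARD.items():
    uni, l8 = code_positions(ch)
    print(f"{name:16} ${uni:04X}  ${l8:02X}")
```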

Variant glyphs in different code positions will require different keyboardings, so it is an easy matter to avoid keying them.  Frequently encountered non-standard code positions for variants of these four characters are:

deprecated codes   Unicode   Latin-8
lowercase r        $027C     $89
lowercase s        $017F     $8A
lowercase s-dot    $1E9B     $9A
ampersand          $204A     $84
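
Where a text has already been keyed, it may be useful to check whether any of these deprecated Unicode code positions occur in it.  The following Python sketch does so; the function name find_deprecated and the sample word are ours, for illustration only.

```python
# Deprecated Unicode code points from the table above, each mapped to the
# standard code point that should have been keyed instead.
DEPRECATED_TO_STANDARD = {
    "\u027c": "\u0072",  # long r       -> r
    "\u017f": "\u0073",  # long s       -> s
    "\u1e9b": "\u1e61",  # long s-dot   -> s-dot
    "\u204a": "\u0026",  # Tironian-et  -> ampersand
}


def find_deprecated(text):
    """Return (index, character) pairs for each deprecated code point found."""
    return [(i, ch) for i, ch in enumerate(text) if ch in DEPRECATED_TO_STANDARD]


sample = "agu\u017f"  # the word "agus" keyed with long s at $017F
for index, ch in find_deprecated(sample):
    print(f"position {index}: ${ord(ch):04X}")
```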

By avoiding the deprecated codepoints and using only the standard codepoints, we voluntarily limit ourselves to a single variant glyph per character, even where the font contains two.  Which variant will it display?  And what if this is not the variant you want to display?

The variant glyphs assigned to the standard code points are clearly at the discretion of the font, but there is fair unanimity in practice among the badly-behaved fonts: they tend to place the short glyphs and the ampersand at the standard code points; and to place the long glyphs and the Tironian-et at the deprecated code points.  Thus, adherence to the standard code points will produce (with these fonts) only the short glyphs and the ampersand.  Unfortunately, the combination of short glyphs and ampersand is untraditional for the great majority of Gaelic font styles.

Glyph combination              Styles in which these glyphs are traditional   Digital fonts with these glyphs in standard codepoints
Long glyphs and Tironian-et    Traditional minuscule                          Gaelchló Ársa Latin-8; Gaelach; Gael A
Short glyphs and Tironian-et   Modern minuscule                               Most Gaelchló Latin-8; Tuamach; Gaeilge 2; Gael B
Long glyphs and ampersand      Rare                                           Gaelchló Ársa Unicode
Short glyphs and ampersand     Uncial                                         Most Gaelchló Unicode; Gaeilge 1; Rudhraigheacht; Evertype; American Uncial; Kelt; Celtic Gaelige

So how do you get a different combination of glyphs, without using the deprecated code points?  The answer is to use a different version of the font, one where the glyphs you require are placed at the standard codepoints.  If there is no such version of the font you are using, you should ask the font supplier to make you one.  In fact, Newman–Figgins fonts ought to be supplied in pairs, one with the short glyphs at the standard codepoints, and one with the long glyphs at the standard codepoints (as with the pair Bunchló/Bunchló Ársa).  If a modified font is not forthcoming, you can easily make the necessary encoding changes to it yourself, as described below.

The deprecated code points

We may take a brief look at the deprecated Unicode and Latin-8 code positions, to see what they are intended to be used for.  When encoded texts are processed, these codepoints will be treated according to those intended interpretations, and not as variants of r, s, s-dot and ampersand.

The deprecated Unicode codepoints:

The deprecated Latin-8 codepoints all fall in the $80..$9F region, which is undefined in Latin-8.  However, these codepoints have other uses in Microsoft codepages, which extend ISO into this region:
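
These interpretations can also be looked up programmatically.  The Python sketch below prints the Unicode character name of each deprecated Unicode codepoint, and the name of the character which Microsoft codepage 1252 assigns to each of the deprecated Latin-8 byte values.  Taking codepage 1252 as the representative Microsoft codepage is our assumption, and the helper names are ours.

```python
import unicodedata

DEPRECATED_UNICODE = [0x027C, 0x017F, 0x1E9B, 0x204A]
DEPRECATED_LATIN8 = [0x89, 0x8A, 0x9A, 0x84]


def unicode_name(codepoint):
    """Unicode character name of a code point."""
    return unicodedata.name(chr(codepoint))


def cp1252_name(byte):
    """Name of the character that codepage 1252 assigns to this byte value."""
    # Assumption: codepage 1252 stands in for "Microsoft codepages" here.
    return unicodedata.name(bytes([byte]).decode("cp1252"))


for cp in DEPRECATED_UNICODE:
    print(f"${cp:04X}  {unicode_name(cp)}")
for b in DEPRECATED_LATIN8:
    print(f"${b:02X}  {cp1252_name(b)}")
```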

These four deprecated codepoints are used to display the Gaelic glyphs in Michael Everson's Extended Latin-8 v2.0, which is actually not an extension of standard Latin-8 since it requires text to be differently encoded (eg. a Tironian-et glyph is encoded as $26 in Latin-8 but as $84 in "extended" Latin-8; and similarly for the other three glyphs above).  "Extended" Latin-8 also places dotless-i at $9F (used for Y-diaeresis in Microsoft codepages).  Other characters introduced by "extended" Latin-8 into the $80..$9F region generally coincide with Microsoft usage of this region.

For further discussion, in the context of Unicode, see here; or in the context of Latin-8, see here.

Advice for font developers

The first thing is to think about whether you really want to provide alternative glyphs within your font for any of these four characters.  Historically, I can think of very few Gaelic metal types in which both a Tironian-et and an ampersand have been used.  For r and s, the short forms were historically confined to the Newman–Figgins style, where they co-existed with the long forms.

But, if the font is in the Newman–Figgins style, or you wish nevertheless to provide alternative glyphs, there are several right ways and one wrong way of doing so.  The wrong way is to accommodate both sets of glyphs simultaneously in the encoding, by assigning separate codepoints to the two sets.  For the user, the practical consequences of this approach, when the encoded text is processed, are that the several glyphs of a single character are not treated as identical.  For example, they will not all be found when a search is made for one of them; they will not all move to the same position in an alphabetical sort, but each to its own position; and, perhaps easiest to appreciate, if it is desired to change one set of glyphs to another for display purposes (eg. long forms to short forms, or vice versa), this cannot be achieved by a simple switch of font, but must be accompanied by global character replacements.  In any case, a user who avails of the deprecated codepoints will have to replace them globally by the standard codepoints, as a prelude to text processing; why make him take this step when the desired appearance of the text may be secured without it?
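
These consequences are easy to reproduce.  The Python sketch below uses a small made-up word list in which one word has been keyed with long s at $017F, and plain code-point ordering as a stand-in for a real alphabetical sort; the function name normalise is ours.  It shows the failed search, the misplaced sort, and the global replacement which the user is eventually forced to make.

```python
# Deprecated -> standard Unicode code points for the four characters.
TO_STANDARD = str.maketrans({
    "\u027c": "r",       # long r
    "\u017f": "s",       # long s
    "\u1e9b": "\u1e61",  # long s-dot
    "\u204a": "&",       # Tironian-et
})


def normalise(text):
    """Replace deprecated variant code points by the standard ones."""
    return text.translate(TO_STANDARD)


words = ["rud", "\u017fa", "sin", "tae"]  # "sa" keyed with long s

# Search: looking for "s" misses the word keyed with long s.
print([w for w in words if "s" in w])                      # ['sin']

# Sort: the long s ($017F) lands after every unaccented letter.
print(sorted(words) == ["rud", "sin", "tae", "\u017fa"])   # True

# After global replacement, search and sort behave as expected.
clean = [normalise(w) for w in words]
print(sorted(clean))                                       # ['rud', 'sa', 'sin', 'tae']
```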

Of the valid ways to provide multiple glyphs for a set of characters, the simplest is to issue a separate version of the font for each set of glyphs.  For example, with a Newman–Figgins font, a pair of fonts may be created, one with long glyphs and one with short glyphs, but otherwise identical.  The essential feature is that only the standard set of codepoints is used.  This represents some duplication of effort for the font developer, but it is worth stressing that the Gaelic font styles in which it is historically precedented are quite limited.

Some labour may be saved by distributing such a pair of fonts as a TrueType collection file (.ttc).  This format is for groups of fonts which draw on a common set of glyphs but map different selections of those glyphs to the character-set, and it allows the duplicated tables — including, importantly, glyf and kern — to be included in the file only once.  The real saving would come if the .ttc file could be edited with a font editor, eg. to change a glyph shape or a kern value — these are things which are common to all the fonts in the collection and are thus stored only once in the .ttc file.  Unfortunately, it does not appear that such an editor exists; it seems rather that the .ttc has to be broken up into a set of .ttf files for editing, which negates the advantage.

For the future, it seems that OpenType features may provide a better solution to the inclusion of variant glyphs in a font.  For example, a font might have the short glyphs of r, s and s-dot mapped to the corresponding characters; but additionally a registered OpenType feature such as "hist" (historical glyph forms) could be attached to the long glyphs of r, s and s-dot in the font, nominating them to be substituted for the short glyphs whenever the user selects the font with the feature "hist".  This arrangement might suit a Newman–Figgins font, where the short glyphs might be the default, but in other styles we might wish the long glyphs to be the default, in which case we require to mark the short glyphs with a feature. At present, not much software implements OpenType features, but this seems likely to improve.  Even pending fuller implementation, however, OpenType features provide us with an authoritative and principled method of constructing fonts containing variant glyphs.
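
To make the idea concrete, the Python sketch below builds the text of such a substitution rule in the Adobe feature-file (.fea) syntax, which a number of font tools accept as input.  The glyph names (r, longr and so on) are assumptions borrowed from the afm example later on this page, and the function name hist_feature is ours; a real font would use its own glyph names.

```python
def hist_feature(pairs, tag="hist"):
    """Build feature-file text substituting each default glyph by its variant."""
    lines = [f"feature {tag} {{"]
    for default, variant in pairs:
        lines.append(f"    sub {default} by {variant};")
    lines.append(f"}} {tag};")
    return "\n".join(lines)


# Hypothetical glyph names: short forms are the default, long forms the variant.
print(hist_feature([("r", "longr"), ("s", "longs"), ("sdot", "longsdot")]))
```

For a style in which the long glyphs are to be the default, the pairs would simply be reversed, with the short glyphs as the variants.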

Changing the encoding of the variant glyphs

We encourage users who require a version of a font which places particular variant glyphs in the standard codepoints to seek such a version from the font designer.  And it really would be preferable for designers to provide these versions in a controlled and uniform manner.  Nevertheless, if there is no other way for a user to obtain such a version, he may easily make the necessary changes to the encoding of the font by using a font editor, such as Font Creator or FontLab.  The method described here uses instead the free program ttf_edit by Richard J Kinch, which may be obtained by following the instructions here.

The ttf_edit program is run from the MS-DOS command line, to extract the encoding from the font, in the form of an afm file.  This file is edited by hand to change the encoding as required, and ttf_edit is run a second time, to add the amended afm file to the font, to change the font name, and to save the font under a new filename.  In the instructions which follow, filename should be substituted by the name of the existing ttf file, and newfilename by a name to be given to the amended ttf file.

1. Extract the Windows encoding information from the ttf file, into a text file called filename.afm.

The command line is as follows:

ttf_edit  filename.ttf  font  3  1  afm  >  filename.afm

2. Use a text editor to edit the information in filename.afm, then save it as newfilename.afm

To begin with, the file filename.afm may, for example, contain lines such as the following:

CH <0026> ; WX 917 ; N ampersand ; B 29 -18 899 743 ;
CH <0072> ; WX 673 ; N r ; B 23 -27 703 531 ;
CH <0073> ; WX 575 ; N s ; B 19 -6 544 509 ;
CH <017f> ; WX 565 ; N longs ; B 18 -20 583 724 ;
CH <027c> ; WX 719 ; N longr ; B 24 -236 707 531 ;
CH <1e61> ; WX 575 ; N sdot ; B 19 -6 544 705 ;
CH <1e9b> ; WX 565 ; N longsdot ; B 18 -20 583 900 ;
CH <204a> ; WX 604 ; N et ; B -1 -242 591 538 ;

These lines indicate that — in this example case — the ampersand glyph and the short glyphs of r, s, sdot are associated with the standard Unicode codepoints <0026>, <0072>, <0073> and <1e61> respectively, while the Tironian-et glyph and the long glyphs of r, s, sdot are associated with the deprecated codepoints <204a>, <027c>, <017f> and <1e9b> respectively.  Your font may differ with regard to which glyphs are assigned to the standard codepoints, what names are given to the glyphs (but hopefully they will be recognizable), and what other numbers are present on the lines.

Having decided whether you wish to retain the short or the long glyphs, and the ampersand or the Tironian-et glyph, you should now remove the four lines referring to the unwanted glyphs.  For each of the four lines which remain, the four-character string in angle brackets after CH should be changed in the following cases:

<204a> should be changed to <0026>
<027c> should be changed to <0072>
<017f> should be changed to <0073>
<1e9b> should be changed to <1e61>

Nothing else in the afm file should be changed.

Suppose, in this example, that we wish to retain the Tironian-et glyph and the long glyphs.  The above lines would be amended to read:

CH <0026> ; WX 604 ; N et ; B -1 -242 591 538 ;
CH <0072> ; WX 719 ; N longr ; B 24 -236 707 531 ;
CH <0073> ; WX 565 ; N longs ; B 18 -20 583 724 ;
CH <1e61> ; WX 565 ; N longsdot ; B 18 -20 583 900 ;

It is not necessary for the present purpose to place these lines in order, though we have done so.

The amended file should be saved as newfilename.afm.
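
For anyone who would rather not edit the file by hand, the same amendment can be sketched in Python.  This is not part of ttf_edit; it assumes that the afm lines have exactly the form shown in the example above, and the names remap_afm, promote and EXAMPLE are ours.  The promote argument lists the deprecated codepoints whose glyphs are to be retained.

```python
import re

# Deprecated code point -> standard code point (hex, as written in the afm file).
DEPRECATED_TO_STANDARD = {
    "204a": "0026",  # Tironian-et -> ampersand position
    "027c": "0072",  # long r      -> r position
    "017f": "0073",  # long s      -> s position
    "1e9b": "1e61",  # long s-dot  -> s-dot position
}

CH_LINE = re.compile(r"^CH <([0-9A-Fa-f]{4})>")


def remap_afm(lines, promote):
    """Keep one glyph per character, placed at the standard code point."""
    promote = {code.lower() for code in promote} & set(DEPRECATED_TO_STANDARD)
    # Standard positions whose current glyph is displaced by a promoted one.
    displaced = {DEPRECATED_TO_STANDARD[code] for code in promote}
    out = []
    for line in lines:
        match = CH_LINE.match(line)
        code = match.group(1).lower() if match else None
        if code in DEPRECATED_TO_STANDARD:
            if code in promote:
                # Wanted variant: move it to the standard code point.
                out.append(f"CH <{DEPRECATED_TO_STANDARD[code]}>" + line[match.end():])
            # Otherwise it is an unwanted variant, and its line is dropped.
        elif code in displaced:
            continue  # glyph at the standard position is being replaced
        else:
            out.append(line)  # every other line is left untouched
    return out


EXAMPLE = [
    "CH <0026> ; WX 917 ; N ampersand ; B 29 -18 899 743 ;",
    "CH <0072> ; WX 673 ; N r ; B 23 -27 703 531 ;",
    "CH <0073> ; WX 575 ; N s ; B 19 -6 544 509 ;",
    "CH <017f> ; WX 565 ; N longs ; B 18 -20 583 724 ;",
    "CH <027c> ; WX 719 ; N longr ; B 24 -236 707 531 ;",
    "CH <1e61> ; WX 575 ; N sdot ; B 19 -6 544 705 ;",
    "CH <1e9b> ; WX 565 ; N longsdot ; B 18 -20 583 900 ;",
    "CH <204a> ; WX 604 ; N et ; B -1 -242 591 538 ;",
]

# Retain the Tironian-et glyph and the long glyphs, as in the example above.
for line in remap_afm(EXAMPLE, {"204a", "027c", "017f", "1e9b"}):
    print(line)
```

The lines are printed in the order in which they occur in the input, which, as noted, does not matter for the present purpose.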

3. Change the font name, to reflect the difference between the amended font and the original one, and to allow the amended font to be installed alongside the original one; reinsert the amended encoding information from newfilename.afm into the font; and save the amended font as newfilename.ttf.

ttf_edit  filename.ttf  font  newfilename.cmd  run

where the file newfilename.cmd is a text file containing the following (copy and paste it from here, and supply the required values of newfontname, newfontPSname, and newfilename):

dup  3  1  0x0409  1  (newfontname)  rename
dup  3  1  0x0409  4  (newfontname) rename
dup  3  1  0x0409  6  (newfontPSname) rename
dup  3  1  0x0409  16  (newfontname) rename
dup  3  1  0x0409  18  (newfontname) rename
newfilename.afm  3  1  encode  newfilename.ttf  gen

newfontname is the name by which the amended font will be known to Windows.  You should make it similar to the name of the unamended font, but with some characteristic addition (such as Ársa or Nua).  newfontPSname may be the same as newfontname but with any spaces or accents removed.
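
The command file can also be filled in mechanically.  The Python sketch below copies the template given above and substitutes the three values, deriving newfontPSname from newfontname by dropping spaces and accents.  The function names, and the example filename bunchlo_arsa, are ours; whether ttf_edit accepts accented characters in the command file, and in what encoding, is not something this sketch addresses.

```python
import unicodedata

# Template copied from the command file shown above.
CMD_TEMPLATE = """\
dup  3  1  0x0409  1  ({name})  rename
dup  3  1  0x0409  4  ({name}) rename
dup  3  1  0x0409  6  ({psname}) rename
dup  3  1  0x0409  16  ({name}) rename
dup  3  1  0x0409  18  ({name}) rename
{filename}.afm  3  1  encode  {filename}.ttf  gen
"""


def ps_name(font_name):
    """Drop spaces and accents from a font name, for use as newfontPSname."""
    decomposed = unicodedata.normalize("NFD", font_name)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.combining(ch) and not ch.isspace()
    )


def make_cmd(font_name, filename):
    """Fill in the command-file template with the new names."""
    return CMD_TEMPLATE.format(
        name=font_name, psname=ps_name(font_name), filename=filename
    )


# Hypothetical values for newfontname and newfilename.
print(make_cmd("Bunchló Ársa", "bunchlo_arsa"))
```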


Ciarán Ó Duibhín
2006/05/26