The apostrophe challenge

A performance of Rossini's 'William Tell', sung in N'Ko, took place in the Côte d'Ivoire. 'Sure 'twas great,' said Paddy. 'Amn't I always sayin' that, nex' to Gwich'in, N'Ko is the musicalest language aroun'.'  Next in the series will be Brahms' 'German Requiem'.

(Please don't take the trouble to tell me that N'Ko is an alphabet, not a language.  Take up the challenge instead!)

The challenge is:

1. For the above piece of text, manually produce an alphabetically ordered word-frequency list. Define "word" in whatever way seems most linguistically useful to you.

2. Now, encode the text in Unicode, in such a way that your manual results could be produced from it, given a suitable computer application.
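As an illustration of what a "suitable computer application" might look like, here is a minimal Python sketch. It is one possible approach under stated assumptions, not a solution to the challenge: it assumes that apostrophes judged to belong to a word are encoded as U+02BC MODIFIER LETTER APOSTROPHE (a letter), that quotation marks are encoded as U+2018 and U+2019 (punctuation), and that a word is any maximal run of letters and combining marks. The function names `words` and `frequency_list`, the case folding, and the choice of fragment are all assumptions of the sketch.

```python
import unicodedata
from collections import Counter

def words(text):
    """Yield maximal runs of letters (L*) and combining marks (M*)."""
    current = []
    for ch in text:
        # General category starts with "L" for letters, "M" for marks.
        if unicodedata.category(ch)[0] in "LM":
            current.append(ch)
        elif current:
            yield "".join(current)
            current = []
    if current:
        yield "".join(current)

def frequency_list(text):
    """Return (word, count) pairs, case-folded, sorted by code point."""
    counts = Counter(w.casefold() for w in words(text))
    return sorted(counts.items())

# Fragment of the text above, under the assumed encoding:
# U+2018/U+2019 as quotation marks, U+02BC as the elision in 'twas.
sample = "\u2018Sure \u02BCtwas great,\u2019 said Paddy."

for word, count in frequency_list(sample):
    print(count, word)
```

Because U+02BC has the general category of a letter, the elided form stays a single word, whereas the same fragment with U+2019 in that position would yield plain "twas". Note that `sorted` orders by code point, so words beginning with U+02BC fall after all ASCII letters; a real alphabetical listing would need a proper collation step.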

For encoding the various quotes and apostrophes in text, Unicode provides characters including, but not necessarily limited to, the following:

These characters are all distinct in Unicode, and none has any decomposition. You may assume whatever processing you wish for each character, as long as it is deterministic.
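Properties such as these can be checked against the Unicode Character Database, for example with Python's standard `unicodedata` module. The sketch below is only illustrative: the five characters in `CANDIDATES` are a selection assumed here, not necessarily the list intended above, and `describe` is a hypothetical helper name.

```python
import unicodedata

# Illustrative selection of quote-like and apostrophe-like characters
# (an assumption of this sketch, not necessarily the author's list).
CANDIDATES = ["\u0027", "\u2018", "\u2019", "\u02BB", "\u02BC"]

def describe(ch):
    """Return (code point, name, general category, decomposition)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),  # empty string if none
    )

for ch in CANDIDATES:
    print(*describe(ch), sep="\t")
```

The general category is what a deterministic tokenizer could key on: U+02BB and U+02BC are modifier letters (Lm), while U+0027, U+2018 and U+2019 are punctuation (Po, Pi and Pf respectively).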

Ciarán Ó Duibhín
Modified 2006/06/29