Work-log of software changes and thoughts in the TOOLS project
(followed by the POOLS-3 project)
by Caoimhín Ó Donnaíle

2012-01-01

Start date of the project
But I had already done some work prior to this in setting up an initial demonstration Clilstore facility with login and editing capabilities

2012-01-08

Corrected a problem which had been preventing Wordlink from identifying words properly in Hindi (and no doubt other languages which use a lot of diacritic marks). Wordlink had been using \p{L} in PCRE regular expressions to determine whether a Unicode character was a letter and hence where a word began and ended. This is ok for languages with simple scripts. However, Hindi is written using not just “letters” but also “combining marks” which combine with the preceding letter to form a single display character. After reading up on http://www.regular-expressions.info/unicode.html, I decided I needed to change \p{L} to \p{L}\p{M}* - and this seems to have done the trick.
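A minimal sketch in Python (not the PHP/PCRE code of Wordlink itself) of why the change matters. Python's standard "re" module has no \p{L} or \p{M}, so the sketch uses Unicode general categories from "unicodedata" instead; the function names and the allow_marks switch are hypothetical and only there for illustration.

```python
import unicodedata


def is_letter(ch: str) -> bool:
    # General categories Lu, Ll, Lt, Lm, Lo correspond to \p{L}
    return unicodedata.category(ch).startswith("L")


def is_mark(ch: str) -> bool:
    # General categories Mn, Mc, Me correspond to \p{M}
    return unicodedata.category(ch).startswith("M")


def find_words(text: str, allow_marks: bool = True) -> list[str]:
    """Return runs matching (\\p{L}\\p{M}*)+ , or \\p{L}+ if allow_marks is False."""
    words, current = [], ""
    for ch in text:
        # A mark only counts when it follows a letter inside a word
        if is_letter(ch) or (allow_marks and current and is_mark(ch)):
            current += ch
        else:
            if current:
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return words
```

With allow_marks=False the Hindi word हिन्दी is broken into three one-letter fragments, because its vowel signs and virama are marks rather than letters; with allow_marks=True it comes back as one word.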

2012-01-09

Added the bab.la dictionaries to Multidict - 56 different sl-tl pairs.

2012-01-10

Added the student-online dictionary to Multidict. (en↔de and es↔de)

2012-01-25

Made a start on a filter/search facility for Clilstore. Provided a filter for language.

In the "Add a new page" facility in Clilstore, changed the label on the submit button from "Create page" to "Publish page", and put it above the form as well as below. Both suggestions came from the TOOLS kickoff workshop in Brussels.

Made Arabic have style "direction:rtl" for the HTML body, which means it is automatically aligned right instead of left. Also increased the default text size for Arabic. Both suggestions from the workshop.

2012-01-26

Corrected a bug in the Clilstore "Add page" facility - failing to save the float/scroll choice for the text

Added the Zodynas Lithuanian⇔English dictionary to Multidict.

2012-02-09

Tidying up my to-do list over the last week or so, and adding lots of things to it.

2012-02-10

Martin has installed three new DNS servers for smo.uhi.ac.uk, with new software, and this has cured the slow DNS problem and the timeouts on the dictionaries. Hurray!

2012-02-10

The Favereau Breton-French dictionary seemed not to be working. Since this dictionary works by html GET methodology, changed the handling to “redirect” and this seems to have cured it. On my to-do list is to make “redirect” the default for all GET methodology dictionaries which use utf-8.

2012-02-12

Got the Swedish Lexin family of dictionaries working again - Swedish to/from 18 mostly immigrant languages, including Arabic which is useful to the project. These had changed address, and had big changes too to their parameters. Disabled the Lexin Swedish-English dictionary for the time being though, as this has also stopped working with Multidict and has a fiendishly difficult parameter system.

2012-02-14

Added to the Multidict “handling” methods an entirely new method of linking to dictionaries which take POST parameters. The existing method, since the very beginning of Multidict, has been for Multidict to issue the POST request, obtain the dictionary results and relay them straight out to the user. The new method is for Multidict to construct a temporary form, with all the required parameters filled in, and to send this to the user’s browser along with some Javascript to make the browser submit the form straight away. This seems to work well. The new method has the advantage, if it is an advantage, that the request comes from the user’s browser rather than from Multidict. (Perhaps useful for some paying dictionaries restricted by IP address or cookie? And perhaps useful if the dictionary allows the user to select preferences, remembering them via a cookie.) It has the slight disadvantage that it will not work if the user has Javascript disabled in their browser, but nearly everyone has Javascript enabled these days.
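A rough sketch in Python of the “form” idea described above, not the actual Multidict code: build a page containing a pre-filled POST form plus a line of Javascript which submits it at once. The function name, the form id and the example URL are assumptions, and the noscript fallback button is my own addition to soften the Javascript-disabled case.

```python
from html import escape


def build_autosubmit_form(action_url: str, params: dict[str, str]) -> str:
    """Return an HTML page which POSTs params to action_url from the user's browser."""
    # One hidden field per dictionary parameter, with values HTML-escaped
    fields = "\n".join(
        f'<input type="hidden" name="{escape(name)}" value="{escape(value)}">'
        for name, value in params.items()
    )
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        f'<form id="lookup" method="post" action="{escape(action_url)}">\n'
        f"{fields}\n"
        # Fallback button, shown only when Javascript is disabled
        '<noscript><input type="submit" value="Look up"></noscript>\n'
        "</form>\n"
        # Submit straight away so the user never sees the temporary form
        '<script>document.getElementById("lookup").submit();</script>\n'
        "</body></html>"
    )
```

For example, build_autosubmit_form("http://dictionary.example/search", {"word": "namas", "sl": "lt"}) gives a page which the browser submits immediately, so the dictionary sees the request coming from the user rather than from the server.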

2012-02-14

Disabled the Dict.cc dictionary for the time being, as it has started jumping out of frames.

2012-02-17

Added the Rumanian monolingual dictionary Dexonline to Multidict.

Did a bit of tidying up of the Multidict/Wordlink pages and programs - moving towards HTML5.

2012-02-20

Added the EC Leonardo logo and disclaimer to the main Clilstore page

2012-02-27

Implemented a few decisions from the Brussels workshop - The links bar now appears at the top of each user created page as well as the bottom, and a link to Clilstore itself now appears automatically as the first link. In those cases where an author had already added a link to Clilstore, I removed this to avoid duplication.

2012-02-28

Disabled the Basque-Spanish Hiztegia3000 dictionary in Multidict, following information from Kent that it was no longer working, and having failed to find any new address for it. Added the Gerenika Basque-Russian dictionary, which seems to be of good quality.

Did away with the former add.php program for adding new pages to Clilstore. Instead, the add facility is now a special case of the edit facility with the page number set to 0. (When the form is received by the server, if the page number is 0, it inserts a new page with the next available page number.) This avoids a lot of duplicate programming, and it means that links buttons, for example, are now available when creating a page as well as when editing a page - something which people have been requesting.
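The “page number 0 means new page” convention can be sketched like this in Python, with an in-memory dictionary standing in for the Clilstore database. The names save_page, pages and fields are hypothetical; the real program is PHP working against a database table.

```python
def save_page(pages: dict[int, dict], page_id: int, fields: dict) -> int:
    """Create the page if page_id is 0, otherwise update the existing page."""
    if page_id == 0:
        # "Add" is just "edit" with no page yet: take the next available number
        page_id = max(pages, default=0) + 1
        pages[page_id] = dict(fields)
    else:
        pages[page_id].update(fields)
    return page_id
```

One handler then serves both cases, so anything added to the edit form is available when creating a page too.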

2012-02-29

Added the Lithuanian-English dictionary m.zodynas24.lt to Multidict. Had quite a fight to get this working because of what turned out to be a long-standing bug in the Multidict program to do with dictionaries which use non-UTF-8 character sets. Now corrected.

2012-03-01

Added the English-Lithuanian dictionary m.zodynas24.lt/anglu-lietuviu-zodynas/, and the Lithuanian<>English dictionaries www.lietuviu-anglu.com and www.anglu-lietuviu.com to Multidict.

Technical stuff: Corrected another bug to do with dictionary character sets in the Multidict program. Tidied up the handling of character sets in the Multidict program (e.g. put ‘UTF-8’ and ‘ISO-8859-1’ in upper case to comply with the IANA standard). Changed the dictionaries database to make most of the ‘GET’ methodology dictionaries use ‘redirect’ handling, when there is no need for any special processing of the dictionary results.

Improved the Wordlink examples page a bit. Added in the TOOLS project languages and highlighted them.

2012-03-02

Added the English<>Lithuanian dictionary www.anglu-lietuviu.lt to Multidict.

Set about adding the lietuviuzodynas.lt family of dictionaries to Multidict, Lithuanian to and from 17 other languages. This turned out to have a rather complicated way of working. Once I saw through it, though, I was much more “successful” than intended and found that I could easily link to their raw dictionary results with no advertising. This would be ideal for Wordlink, of course, but it would be unethical. The trouble is that it is much more difficult to link to their dictionary pages - dictionary results with advertising, etc. - and I do not yet know how to do it.

2012-03-03

Looked at a long standing Gaelic dictionary problem in Wordlink - the Faclair na Pàrlamaid which stopped working last year. It seems that it is not going to come back, and in fact it seems to have been replaced by the Faclair airson Riaghaltas Ionadail, which unfortunately has a jazzy Flash interface - fine as a standalone, but not much good for linking to. I have already been in touch with the original project programmer, but so far no other interface has appeared.

The individual Flash pages of the Faclair airson Riaghaltas Ionadail are addressable, so I set about indexing them all, recording the last word on every single page, so that they could be used with dictpage.php, the page-image addition to Multidict. This worked fine - until it was put in a frame, where it didn’t work at all. The individual Flash page addressing does not work in a frame (nor an iframe).

So I tried another tack. I took the big pdf files which are available for Faclair airson Riaghaltas Ionadail, split them with pdfsam into individual pages and stored these on the SMO webserver where dictpage.php can address them individually (I believe the dictionary developers are happy with this). Faclair airson Riaghaltas Ionadail is now working fairly well with Multidict, which helps to make it accessible. (There is the usual proviso, of course, that big page images are not great with Wordlink.) In fact, this technique of splitting up pdfs into individual pages might possibly be very useful with other page-image dictionaries - those in the WebArchive. The advantage would be that it could give much quicker results than the WebArchive, which can be very slow to respond. It would require the creation, though, of some mechanism to page forwards and backwards.

In the process of adding Faclair airson Riaghaltas Ionadail, I added a new facility to dictpage.php, one I have been meaning to add for some time now, namely the ability to use indexing based on the last word of each page instead of the first. Indexing based on the last word is more appropriate because if a dictionary has several entries for the same word with different meanings, some falling on one page and some falling on the following page, if a user happens to be looking for that particular word, it is the first page, where the word appears last, that they normally want to be presented with first.
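A small Python sketch of lookup by last-word index, assuming the index is simply a sorted list holding the last headword of each page. The function name is hypothetical, and the comparison here is plain case-folded code-point order, which is an assumption and would not suit every dictionary's alphabetical order.

```python
from bisect import bisect_left


def page_for_word(last_words: list[str], word: str) -> int:
    """Return the 1-based number of the first page which can contain word."""
    keys = [w.casefold() for w in last_words]
    # bisect_left finds the first page whose last word is >= the search word,
    # so a word which ends one page and continues on the next gets the earlier page
    index = bisect_left(keys, word.casefold())
    # Words sorting after the final entry are sent to the last page
    return min(index, len(keys) - 1) + 1
```

With last_words = ["bata", "cu", "each"], a lookup for "cu" gives page 2, the page where "cu" appears last, rather than page 3 where its entries continue.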

2012-03-03

Added the zodynas.lt family of dictionaries to Multidict - Lithuanian to and from 30 different languages. This required the new “form” handling methodology to make it work.

2012-03-12

Over the weekend and last several days working on a search/filter system for Clilstore. It is now partly in place, partly still under development.

2012-03-13

Noticed that the Gaeilge-Gaeilge dictionary, An Foclóir Beag, had stopped working. Got it going again. The old address, http://www.csis.ul.ie/focloir/ still gets you to the dictionary via a redirection, but the business end has all moved to http://193.1.97.44/focloir/ - ugly!

Changed Clilstore userids to be case-independent (and accent-independent), since I noticed that the case-dependent userids had caused some confusion in the past. So now if your userid is “caoimhin”, you can type your userid as “Caoimhin” or “CAOIMHIN” or “Caoimhín” (with an accent) and you will still get logged in. The system will treat all of these as if you had typed “caoimhin”, the userid as you first registered it. (Passwords remain case-dependent.)
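One way to get this behaviour, sketched in Python: fold case and strip accents before comparing userids. This is only an illustration under my own assumptions; the log does not say how Clilstore does it (a case- and accent-insensitive database collation would do the same job), and normalise_userid is a hypothetical name.

```python
import unicodedata


def normalise_userid(userid: str) -> str:
    """Fold case and remove accents so that equivalent spellings compare equal."""
    # NFD splits an accented letter into base letter + combining mark
    decomposed = unicodedata.normalize("NFD", userid)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()
```

A login check would then compare normalise_userid(typed) with normalise_userid(registered), while still comparing passwords exactly.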

2012-03-13

Came across the Apertium translation system and added its 42 language pairs (emphasis on Hispanic) to Multidict. Had to create language table entries for two dialects, Valencian and Aranese. Decided to use the LinguistList codes for these. In fact, I have pretty well decided to change Multidict, once I get time, to use LinguistList codes internally for everything, and probably only use ISO 639-1 two-letter codes (en, fr, de, es, etc.) for display.

2012-03-19

Working over the weekend to turn all the “composite trees” from the wonderful LinguistList multitree.org site into a languages table with 22,000 records in the relational database. This should make it a lot quicker to add new languages to Multidict when we come across dictionaries for them, and to provide a facility for easily finding dictionaries for related languages. Some problems encountered, though. In five places, most notably in the Balkans where there is a real tangle, the “composite trees” are not trees because nodes have more than one parent. Overcame this in an ad-hoc manner. Another problem needing overcome is that 165 nodes instead of having a single code have multiple codes.
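The “not really a tree” problem can be spotted with a simple check for nodes which have more than one parent. A sketch in Python, assuming the hierarchy is available as (parent, child) pairs; the codes in the example are made up, not real LinguistList codes.

```python
from collections import defaultdict


def nodes_with_multiple_parents(edges: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each child which has more than one distinct parent to its parents."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}
```

For edges [("root", "a"), ("root", "b"), ("a", "c"), ("b", "c")] this reports node "c" with parents "a" and "b", which is the kind of case that needed ad-hoc treatment.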

2012-03-23

Corrected a long-standing problem when using Wordlink with urls containing ampersands. (Reported to me by an SMO student, the creator and owner of Fòram na Gàidhlig - Tapadh leat a Chatrìona!) The part of the url following the first ampersand was being lost, because Wordlink was interpreting it as additional parameters to Wordlink rather than as part of the url. Cured it by adding some Javascript to convert ‘&’ to ‘{and}’ on submission of the form.
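The substitution itself is very simple. A Python sketch of the round trip (the real conversion is done by Javascript in the browser, and I am assuming here that the receiving program turns ‘{and}’ back into ‘&’ before fetching the page); the function names are hypothetical.

```python
def protect_ampersands(url: str) -> str:
    """Replace '&' so the url survives as a single form parameter."""
    return url.replace("&", "{and}")


def restore_ampersands(value: str) -> str:
    """Undo protect_ampersands on the receiving side."""
    return value.replace("{and}", "&")
```

So "http://forum.example/view.php?f=3&t=12" travels as "http://forum.example/view.php?f=3{and}t=12" and is restored intact at the other end.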

Found to my horror that all the WebArchive page-image dictionaries had stopped working from Multidict. Instead of taking you to the right page, it always took you to the title page for the dictionary. Found after tests that this only happened when the WebArchive url was given as http://www.archive.org/..., whereas when the url is given as http://archive.org/... things still work fine. Changed the urls in the Multidict database and all is well again. Whew!

Deleted code “pt-PT” (Portuguese Portuguese) from the list of languages in Multidict and amalgamated the few references to it in the database with code “pt” (Portuguese), as recommended by the Portuguese team at the Brussels workshop. There are now only two codes for Portuguese, “pt” and “pt-BR”.

2012-03-23

More work on the table of 22,000 languages and dialects from LinguistList, standardising it for use with Multidict.

Technical maintenance: Updated all the tables and fields in the Multidict database to use the utf8_unicode_ci collation, which is now standard, instead of an untidy mixture of utf8_general_ci and utf8_unicode_ci as previously.

2012-03-24

Got Lexer’s Middle High German dictionary working with Multidict again. It had moved and stopped working.

2012-03-26

Some dissemination activities over the past week. Spent some time showing Multidict and Wordlink and Clilstore to Morgan Sleeper, a young visiting linguist from the US, who was impressed and keen to try them out when he gets home. And gave some advice to Catrìona Colsman, the owner of the main Scottish Gaelic bulletin board, Fòram na Gàidhlig, who has now put an active link to Wordlink at the top of each bulletin board page.

2012-04-06

Heard that a new version of Am Faclair Beag was available with concise display for mobile phones, etc: www.faclair.com/m/. Good news (especially for use with Wordlink), because Am Faclair Beag is probably the most comprehensive online Gaelic-English dictionary these days, and moreover does lemmatisation. Added the mini version to Multidict.

2012-04-18

Updated the Clilstore page create/edit program to always convert language codes to lower case, since this had been giving bother when authors wrongly gave language codes in upper case, such as “EN” instead of “en”. Changed any wrong codes for pages already in Clilstore.

Corrected a bug whereby the owner of a new page was not being set as it should be to the logged in user.

2012-04-20

Made a major improvement to the filtering abilities of Clilstore. Still needs a sort facility.

2012-04-25

Updated the Dicts.info entry in the Multidict database to get it working again, since its parameters had changed. This is a sizable collection of dictionaries.

2012-05-02

The Clilstore page edit facility is supposed to:
1) Stop users (other than admin) from trying to edit other users’ pages.
2) Ignore any such attempts which do arrive despite this.

When trying to edit other users’ pages as admin, I found that the attempt was being correctly allowed through (1), but wrongly ignored at (2). So when admin was editing other users’ pages, it looked like it was working but it wasn’t. Corrected this bug.
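A sketch in Python of one way to keep the two checks from drifting apart: put the rule in a single function and call it both when deciding whether to offer the edit form and when the form comes back. The names and the "admin" userid are assumptions for illustration, not the Clilstore code.

```python
ADMIN = "admin"  # hypothetical userid of the administrator


def may_edit(user: str, owner: str) -> bool:
    """The one rule, used at both check (1) and check (2)."""
    return user == ADMIN or user == owner


def apply_edit(page: dict, user: str, new_text: str) -> bool:
    """Check (2): silently ignore edits from users who are not allowed."""
    if not may_edit(user, page["owner"]):
        return False
    page["text"] = new_text
    return True
```

With the rule in one place, an admin edit which gets past check (1) cannot then be dropped at check (2).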

2012-05-04

Altered the Wordlink and Multidict programs so that all frames now have unique names (based on the “sid”, session id). This means that if a user has several Wordlink or Multidict sessions going in different browser tabs, they are less likely to get confused, with the results going to the wrong tab. It might also cure a problem reported by Caoimhín Ó Dónaill, whereby the results of clicking a dictionary favicon incorrectly get sent to a new browser tab (probably due to Javascript bugs in some browsers).

Added a new information panel, (1) When a user first starts Wordlink; and (2) When a user first runs Wordlink on a webpage, before clicking to look up a word. These make use of otherwise empty space, and give basic helpful information for new users. This benefit also extends to Clilstore, since panel (2) is automatically seen in Clilstore units.

2012-05-06

Made a major upgrade to how Multidict uses language codes and gets information on languages. Behind the scenes it is now getting all information from a large (22000 record!) table derived from the LinguistList Multitree Composite Trees. This should speed the process of adding new languages and dictionaries to Multidict. Since it has hierarchic information, it should enable a future tool which allows a knowledgeable user to choose a language by language family.

Defined an ordering among “sibs” in the language family hierarchy for all languages currently in Multidict. This means in particular that this table of Multidict languages is now in a much more logical order, instead of an alphabetic/hierarchic hybrid as before.

2012-05-13

Multidict includes links to Wortschatz in various languages. (Multidict treats it as if it were a monolingual dictionary. It isn’t actually a dictionary at all, but an automatic corpus-generated wordnet and concordancer developed by the University of Leipzig, but it is a useful linguistic facility nevertheless.) I noticed that these links had stopped working because the parameters required by Wortschatz had changed. In fact, Wortschatz has got better and has had many more languages added to it. Got it working again, and got it linked to 90 of the languages currently in Multidict. Added Samogitian, a dialect of Lithuanian, to Multidict since Wortschatz includes it, but didn’t add any of the other 50 or so new languages which I could have from Wortschatz.

Noticed that the Albanian-English Argjiro dictionary had stopped working from Multidict because its parameters had changed. Got it working again.

Did quite a bit of work over the last few days pulling in the complete list of Wikipedias in 275 languages, got them all linked to the precise language code which they refer to (which was not always straightforward), and made the list into a table in the Multidict database. This should be useful for various purposes. It is a source of endonyms for languages when adding new languages to Multidict. And it should make it easier to keep the Wikipedias right in the Wordlink examples page when adding new languages to Multidict.

2012-05-25

A lot of programming recently to improve the filter facility on the Clilstore index. The filter choice is now remembered via a persistent cookie, so the user does not have to keep reselecting it, and the choices persist from session to session on the same computer. Got fields where filter options had been set to display in yellow, to warn the user that they are in effect. Reversed the ordering of units to put the most recent at the top (Helle’s suggestion).

2012-06-17

Added the excellent (as far as I can tell) Sardinian dictionary DitzionàriuOnline to Multidict.

2012-06-17

Corrected a couple of bugs in the filter mechanism in the Clilstore index:
1. The language filter value was not being displayed in the new form shown in the results.
2. The “Reset” button had only been working for Opera. Added some extra Javascript to make it work with other browsers.

2012-06-18

Added a new facility to the Clilstore index to filter for words in the text of the unit, not just the title. (Requested in one of the feedback forms from the Turkish college.)

2012-06-19

Added a new option to the new page form in Clilstore to preserve line breaks with <br> tags. This should be useful when inputting the words of songs.

2012-06-23

Added a new field “summary” to Clilstore, as suggested by Kent. Currently it:
- Is shown when hovering over the unit title in the Clilstore index;
- Is searchable in the Clilstore index along with the text;
- Is shown when the new “Summary” button in the unit itself is hovered over or clicked;
- Can be entered or changed when creating or editing a unit.

2012-06-24

Added “datalists” for Owner and Language to the Clilstore index filter form. Datalists are a new HTML5 feature and are currently only implemented in Opera, Firefox and Internet Explorer 9. The datalist shows the possible values and progressively reduces the list of possible values as the user types, making it easy to select a value.

2012-06-25

The login system for Clilstore is a quickly cannibalised version of the SMO login system and still has a few relics from that system in it. I spotted that if a user login had timed out and the user tried to create a new page, they would be directed to the SMO login instead of the Clilstore login. Corrected this.

2012-06-26

At Kent’s suggestion, added a new “summary” field to Clilstore, and a new entry field for it in the unit edit form. Arranged that the summary is also included when filtering for units with certain words in the text. The summary is currently displayed when hovering over the unit title in the Clilstore index. Following e-mail discussion with TOOLS partners, produced a new “Unit info” page linked to each unit, with useful information including the summary.

Altered the button/links bar mechanism slightly, and managed to arrange that the buttons for Arabic now go from right to left.

2012-06-27

Added a new “test unit” flag to Clilstore, a new checkbox for it on the unit edit form, and a new filter option (off by default) in the index to include test units. Following feedback from partners, arranged that authors when logged in will always be shown their own test units (to encourage them to complete them or delete them!).

2012-08-02

Made “Include test units” on the Clilstore index page into a button taking immediate effect (rather than having to click “Filter”) - Kent’s suggestion.

2012-08-14

Got the Clilstore index to display a warning message if a filter is in force which returns no units. (Need revealed in testing by Marialuce in Italy.)

2012-08-14

Added checks to the Clilstore edit/create unit form to generate an error if the language is not specified, or if the text is less than 100 characters long.

2012-08-14

For the past three months, Multidict had been annoyingly re-opening itself in a new tab, each time you started it by just going to the Multidict URL with no parameters, and then used it. The trouble stemmed from the change I made on 2012-05-04, whereby each instance of Multidict (and there could easily be quite a few in different tabs in a browser session) now runs in a window with a distinct name “MD<sid>”, where <sid> is the session id. This change ensures that dictionary lookup results get directed to the right place, especially when several Wordlink and Multidict sessions are on the go at the same time, and the change was on the whole successful. The problem was that although Multidict windows created by Wordlink had the right name, a newly created Multidict tab created from a no-parameter URL did not have a name. I have now cured this by adding a tiny bit of Javascript (this.name='MD$sid') to give the new tab the appropriate name.

2012-08-17

Made some of the columns (created date, changed date) in the Clilstore index "optional" to save space on small screens. They are now only revealed by ticking an "Include optional columns" checkbox.

2012-08-22

Added to the "optional" columns the favicon (a 'W' with a slash through it) which links to the unwordlinked unit.

2012-08-22

Added the Bing (Microsoft) Chinese dictionary to Multidict - a strong recommendation from a Chinese participant at our EuroCALL workshop.

2012-09-09

Cooperation with Kent over the last few days to provide a “Report abuse” link in unitinfo.php. This has a captcha, thanks to Kent. And the unit number is filled in automatically in the form, as are the username and e-mail address if the user is logged in to Clilstore.

2012-09-24

Today (Monday) and over the weekend, added the new (since 2011) Glosbe multidictionary to Multidict. Glosbe claims to do thousands(!) of languages in any pairing, and it seems to be quite good for many of them, although no doubt for most of them it is very poor. Registered it with Multidict for 115 languages - i.e. most of those on its main page. This required quite a lot of work sometimes to work out what precise language it really meant in cases where it used a macrolanguage code.

2012-09-26

Created “Help” and “About” links and pages for Clilstore.

Added a new dictionary to Multidict: Linguee, which was recommended by one of our students. Its method of character handling is rather awkward (%hexadecimal encoded ISO-8859-15), and I had to change multidict.php a bit to cope with it.
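For illustration, percent-encoding a headword as ISO-8859-15 bytes rather than UTF-8 bytes looks like this in Python. This is a sketch only; the function name is hypothetical, the change to multidict.php itself is not shown in this log, and replacing unencodable characters with "?" is my own assumption.

```python
from urllib.parse import quote


def encode_headword(word: str) -> str:
    """Percent-encode word using ISO-8859-15 bytes instead of UTF-8 bytes."""
    # errors="replace" turns characters outside ISO-8859-15 into "?" (%3F)
    return quote(word, safe="", encoding="iso-8859-15", errors="replace")
```

The word "café" comes out as "caf%E9" this way, whereas the usual UTF-8 encoding would give "caf%C3%A9", which a dictionary expecting ISO-8859-15 would misread.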

2012-10-08

Corrected a bug in the Clilstore index page whereby the state of the “Include test units” and “Include optional columns” checkboxes was not being remembered in the Clilstore cookie.

2012-10-09

Hid the Nodine Welsh⇔English dictionary in Multidict. It seems to be totally defunct, even though the search form is still on the University of Cardiff website.

Corrected the Multidict parameters for the BBC’s Welsh⇔English dictionary, which had changed location and methodology and is now much better - really good. Added English⇒Welsh, which it did not have before.

2012-10-11

Kent pointed out to me that Wordlink was not coping properly with &lt; and &gt; in the html of a page. Had a look and found out that this was because Wordlink converts any character “entities” such as &eacute; early on to proper characters (so that the “eacute” does not get treated by Wordlink as a “word” to be Wordlinked). But that meant that &lt; and &gt; were being converted to < and > and were being treated by Wordlink as parts of tags. Got round the problem by getting Wordlink to convert &lt; and &gt; to something unlikely, «⁰ and ⁰», and then convert back again after the processing is complete.
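The placeholder trick can be sketched in Python as follows. This is not the Wordlink code: the function names are hypothetical, html.unescape stands in for whatever entity conversion Wordlink uses, and only the named forms &lt; and &gt; are handled here.

```python
import html

LT_MARK, GT_MARK = "«⁰", "⁰»"  # unlikely sequences used as stand-ins


def decode_entities_safely(source: str) -> str:
    """Convert entities to characters, but keep &lt; and &gt; out of harm's way."""
    protected = source.replace("&lt;", LT_MARK).replace("&gt;", GT_MARK)
    # Now &eacute; etc. become real characters without creating stray < or >
    return html.unescape(protected)


def restore_angle_entities(processed: str) -> str:
    """After processing, put the original &lt; and &gt; entities back."""
    return processed.replace(LT_MARK, "&lt;").replace(GT_MARK, "&gt;")
```

Between the two calls the text can be scanned for tags and words without any risk that an escaped "<" in the page text is mistaken for the start of a tag.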

2012-10-23

Added the Welsh⇔English terminology database termau.org to Multidict.

2012-10-24

Lots of work on the languages tables (codes and language families) to ensure that all the newly added languages (from Glosbe, etc.) appear correctly in the languages table.

2012-10-29

Bug report from Kent of a “Yellow menu” appearing at the top of Clilstore units when viewed with an old browser, Internet Explorer 7. I investigated and eventually found that this was information which should only appear in the raw unit, directing users to the proper Wordlinked unit if they happened to find the raw unit via Google (something requested a few months ago by Kent). It should be suppressed in the Wordlinked unit, but the suppression was not working in IE7, due to this old browser’s deficiencies. Developed a workaround.

2012-11-01

Lots of work over the previous week changing the Clilstore index page and the edit/create unit page to accept language names as well as language codes, to prompt the user with the language name, and to optionally display language names instead of codes in the index table. Presented partners with two options to see which they prefer, one using the new HTML5 datalist feature for form fields, the other using the traditional select box.

2012-11-07

Together with the college IT staff, sorted out an access problem reported by Kent and also Rasa. This turned out to be due to the SMO firewall using an outdated bogon list and blocking certain reserved IP ranges which have now been reallocated to public use. Now cured.

2012-11-08

Bug report from Kent, discovered by Valentine Litsiou in the POOLS-CX project. The Clilstore index was looping when no filters were in force and the user was not logged in. Corrected this. It was a programming bug introduced a week or two ago when I introduced language names instead of codes to the Clilstore index.

2012-11-08

Added a new checkbox facility in edit.php to allow the author to specify for each link button in a unit whether the link should open automatically in a new tab/window (with target=_blank). This is something requested by Kent, and also several times in student feedback.

2012-11-11

At the suggestion of Burak at the POOLS-CX meeting in Pitesti, added two new good Turkish-English dictionaries to Multidict: Tureng and SesliSözlük. (SesliSözlük is particularly good on English etymology)

2012-11-12

At the suggestion of Valentini at the POOLS-CX meeting in Pitesti, disabled the Greek-English in.gr dictionary, since it has been a pay dictionary for several years now.

Noticed that the character encoding was going wrong in Multidict for the dictionare.com Romanian-English dictionary, because the dictionary uses iso-8859-2 and not utf-8. Put this right.

2012-11-24

Noticed that the Hindi-English Shabdkosh dictionary now does nine other Devanagari script languages as well as Hindi. Added these to Multidict.

Added An Seotal, a Gaelic terminology database for schools, to Multidict.

2012-11-30

Added a new field to the dictParam table in the Multidict database to record whether or not the dictionary does lemmatisation. Although not used at the moment, it could be very useful in the future to supplement the current simple “quality” rating and distinguish between dictionaries which do not do lemmatisation but might still be great when using Multidict as a standalone, and dictionaries which do and therefore are far better for use with Wordlink.

Tried hard to add the Clave Spanish monolingual dictionary to Multidict, following a suggestion from Ana Gimeno, but failed - sadly, because this looks like a very useful dictionary. Unfortunately, it works internally in a very complicated way. Amongst other things, it would need Multidict to store and handle cookies for HTTP requests. Something for the future.

2012-12-10

Busy over the weekend starting to set up the new server (computer) which will host Multidict, Wordlink and Clilstore.

2012-12-12

Trond Trosterud, from the Sámi language project at the University of Tromsø, who attended the TOOLS workshop at EuroCALL in Gothenburg, contacted me and asked me if I could add their North Sámi⇒Norwegian dictionary to Multidict, and I was very happy to oblige. They had prepared the computer interface to the dictionary to make this easy to do, and the online dictionary is designed to accept wordforms and do lemmatisation on them, so it is ideal for use with Wordlink.

2012-12-22

Working almost non-stop night and day over the last ten days or more to get the new server (hardware, purchased in the TOOLS budget) set up with SuSE linux, all systems working on it, virtual hosts moved over, and all the Multidict/Wordlink/Clilstore databases and programs moved over. Successfully completed - everything working as far as I can see, and nothing lost. The new machine is far faster than the old one, with more memory, more filespace, a second disk for easy backups, and now the latest version of the operating system and PHP and mySQL.

2012-12-22

Added the LearnGaelic.net dictionary to Multidict. I could not get this working when the dictionary first appeared, but they have now given it HTTP GET parameters. It includes lemmatisation, which is good for use with Wordlink, and it includes a huge number of pronunciation sound files, which should be useful for learners. It still takes up more width than it needs to on the screen.

2012-12-23

Got all the Wordlink/Multidict/Clilstore programs moved over to the multidict.net domain and working there. This was quite a big job, involving lots of small changes, but they are now more self-contained and would be easier to move again in the future if that ever became necessary. Corrected a few small bugs on the way.

2013-01-07

Corrected a bug in the Clilstore filter by unit number (which was causing the search to crash).

2013-01-11

Added the Ojibwe People’s Dictionary to Multidict.

Added a facility to link Trìar Manach translations to Wordlink automatically in those cases (indicated by the yellow [W] symbol) where a dictionary exists in Multidict. Currently over half of the 153 translations are now Wordlinked. Trìar Manach is a separate project to translate a short Old Irish joke - suitable for language learners - into as many as possible of the world’s languages. Currently 65 of the 153 translations are languages, dialects or historic languages of the EC region.

2013-01-12

Corrected a bug in Wordlink (shown up by the above Trìar Manach exercise) which was sometimes causing it to fail due to an endless cycle of redirections. (This turned out to originate from some experimental code I had added to try and clean out unnecessary default parameters from urls.)

2013-01-22

Added Foras na Gaeilge’s New English Irish Dictionary, focloir.ie, to Multidict. Tweaked this following comments from Caoimhín Ó Dónaill and Neil Comer, University of Ulster.

2013-01-25

Set up a “nascent student homepage”, with a bit more explanation at the top, and fields removed which are of little or no relevance to students.

2013-01-29

Noticed that the naive sort method which I was using to order language names in language selection boxes in Multidict, Wordlink and Clilstore was not sorting accented characters properly. For example, it was placing Sámi at the bottom of 12 other language names beginning with S, which made it hard to find. Cured this by setlocale and a more sophisticated sort method. Noticed following this that the sophisticated sort was placing four south-east Asian language names (Khmer, Korean, Lao and Thai) among the Latin names. A very minor matter to us in the project, but no doubt due to a bug somewhere. Chasing this up on the relevant language standards lists.
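
The sorting problem can be illustrated with a small Python sketch. Note that this does not use setlocale (which depends on the locales installed on the machine); it swaps in a simpler technique, building a sort key with the accents folded away. The list of language names is made up for the illustration.

```python
import unicodedata

def accent_folded_key(name):
    """Sort key that ignores accents: decompose, drop combining marks, casefold."""
    decomposed = unicodedata.normalize("NFD", name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original name is kept as a tie-breaker so the order is deterministic.
    return (base.casefold(), name)

# Illustrative list only, not the real Multidict language table.
names = ["Swedish", "Sámi", "Serbian", "Spanish", "Scottish Gaelic"]

naive = sorted(names)                          # plain code-point order
better = sorted(names, key=accent_folded_key)  # accent-insensitive order

print(naive)
print(better)
```

In code-point order “á” sorts after every unaccented letter, which is why Sámi ends up at the bottom of the S names; with the folded key it sorts as if written “Sami”.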

2013-01-30

Investigated a report from Caoimhín Ó Dónaill, Ulster, that trainees in a workshop in Southampton, if they forgot to specify the language when writing a new unit, found when they pressed the back button that the data they had laboriously entered into the form had disappeared. Found that this only occurred with Internet Explorer. Found from this rather technical blog entry by a Microsoft engineer that it was due to Microsoft’s (possibly correct) interpretation of the standards, and that it could be cured by specifying “Cache-control:max-age=0” in the HTTP header for the page instead of “Cache-control:no-cache”. In addition to this, I added the new HTML5 “required” attribute to the Language select box, which nips the problem in the bud for the browsers which have already implemented the “required” attribute: Opera, Firefox and Chrome.

2013-02-01

Added the excellent new online German-German dictionary resource DWDS to Multidict. It is very good for etymology.

2013-02-04

Noticed that some “www2.smo.uhi.ac.uk”s and “www.smo.uhi.ac.uk”s were still to be seen in Multidict/Wordlink/Clilstore links. Systematically searched for them and rooted them out from the programs, replacing them with multidict.net. This doesn’t affect functionality, but it removes the need for redirection and so improves speed (by a tiny amount), clarity, and reliability.

2013-02-10

Experiments over the past week with the layout and functioning of the Clilstore filter page. Following feedback from TOOLS team members, settled on a new version which re-submits the form automatically (by Javascript function) immediately following each change in any criterion (in particular language).

2013-02-16

Added the six talking dictionaries from the Living Tongues project to Multidict.

2013-02-17

Updated the Giellatekno Sámi dictionaries (as advised in a message from Ryan Johnson, from Trond Trosterud’s team whom we met at Eurocall in Gothenburg). Updated the North Sámi⇔Bokmål dictionary to its new stable home. Added the new pairs, North Sámi⇔Finnish, and South Sámi⇔Bokmål.

2013-02-18

Added a word count facility to Clilstore. When a new unit is created, or when a unit is edited, the word count is updated and stored. (To count the words in the text, the html tags need to be stripped, and care must be taken over accented characters and non Latin scripts.) Currently, the word count is only displayed in the unit info, but it will be added as a new column to the Clilstore filter page.
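
A rough sketch of this kind of word count, in Python rather than the PHP actually used in Clilstore. The function names are hypothetical. It strips the tags with the standard-library HTML parser and then treats any run of Unicode letters, together with the combining marks that follow them, as one word, so that accented characters and scripts such as Devanagari are not split.

```python
import unicodedata
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, discarding the tags."""
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps adjacent block elements apart; the price is
    # that a word split by an inline tag would be counted twice.
    return " ".join(parser.parts)

def count_words(html):
    """Count runs of letters (category L*) plus following marks (category M*)."""
    count = 0
    in_word = False
    for ch in strip_tags(html):
        category = unicodedata.category(ch)
        if category.startswith("L"):
            if not in_word:
                count += 1
                in_word = True
        elif category.startswith("M") and in_word:
            continue  # combining mark: still inside the current word
        else:
            in_word = False
    return count

print(count_words("<p>Tha <b>Sámi</b> agus हिन्दी an seo.</p>"))
```

This is only an approximation: numbers are not counted, an apostrophe splits a word in two, and the contents of script or style elements would be counted as text.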

2013-02-22

Corrected the parameters in Multidict for the Michaelis family of Portuguese dictionaries. Because of bad parameters the Portuguese-Portuguese dictionary had previously just been looking up the word “sobre” every time! And the other dictionaries in the family had probably not been dealing with accented characters properly.

2013-02-23

Wordlink has for a very long time been failing for no apparent reason on certain pages, saying “Page appears to have no <body> tag”, even though a look at the html source for the page shows that it does have a body tag. Found by persistence that in certain cases, and very likely in most cases, this was due to illegal character codes in a page which should be in UTF-8. Found code at http://magp.ie/2011/01/06/remove-non-utf8-characters-from-string-with-php/ to clear out illegal character codes and added this to wordlink.php. This seems to have cured the problem, so hopefully Wordlink will now fail on fewer pages.
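
The same clean-up can be sketched in Python. This is not the PHP code from the page linked above; it simply relies on Python’s built-in decoder, which can be told to drop any byte sequence that is not valid UTF-8. The sample page content is invented.

```python
def clean_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, silently discarding any illegal byte sequences."""
    return raw.decode("utf-8", errors="ignore")

# Invented example: valid UTF-8 for "café", followed by two bytes (0xFF, 0xFE)
# which can never occur in well-formed UTF-8.
page = b"<body>caf\xc3\xa9 \xff\xfe ok</body>"
cleaned = clean_utf8(page)
print(cleaned)
```

Using errors="replace" instead of errors="ignore" would keep a visible U+FFFD marker where each bad sequence was, which can be useful when debugging.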

2013-02-23

Found that the Svenska Akademiens Ordbok had stopped working with Multidict. Got it working again.

2013-02-23

Added all Pons dictionaries, 30 language pairs to Multidict. Previously only 8 of them had been included. Also added the Pons German-German learners dictionary (“Deutsch als Fremdsprache”).

2013-02-24

Found and cured a rather serious bug in Clilstore. The search for words in “Text or Summary” was not working. In fact, it was often going into a bit of a loop and not responding. This applied to the “Nascent student homepage” as well as the main filter page. It was due to bad bracketing of ANDs and ORs in the SQL query.
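
The log does not record the actual query, so the following is only an illustration of the general pitfall, using an invented table in an in-memory SQLite database. In SQL, AND binds more tightly than OR, so a missing pair of brackets changes which rows match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, for illustration only.
conn.execute("CREATE TABLE unit (id INTEGER, lang TEXT, summary TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO unit VALUES (?, ?, ?, ?)",
    [
        (1, "gd", "fishing", "boats"),
        (2, "ga", "music", "fishing songs"),
        (3, "gd", "music", "dancing"),
    ],
)
term = "%fishing%"

# Without brackets this is read as:
#   (lang = 'gd' AND summary LIKE ?) OR body LIKE ?
unbracketed = ("SELECT id FROM unit "
               "WHERE lang = 'gd' AND summary LIKE ? OR body LIKE ? ORDER BY id")
# With brackets the language filter applies to both alternatives.
bracketed = ("SELECT id FROM unit "
             "WHERE lang = 'gd' AND (summary LIKE ? OR body LIKE ?) ORDER BY id")

wrong = [row[0] for row in conn.execute(unbracketed, (term, term))]
right = [row[0] for row in conn.execute(bracketed, (term, term))]
print(wrong, right)
```

In the unbracketed version unit 2 slips through even though it is in the wrong language. In a query involving joins, the same mistake can also disable a join condition and make the query extremely slow.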

2013-02-25

Added all the Gyldendal pay dictionaries (8 language pairs, to and from Danish) to Multidict at Kent’s request.

Added a new field to the “dict” table in the Multidict database, an information field for the dictionary which is automatically displayed in Multidict.

2013-02-26

The multidict.net website already had a favicon [M], suitable for Multidict. But this was also appearing in pages belonging to Wordlink and Clilstore. Altered the Wordlink pages so that the Wordlink favicon [W] appears as a favicon for them. Created a new favicon for Clilstore [C] and did the same for the Clilstore pages.

2013-02-28

Removed the “Babylon English Arabic Dictionary Online” from Multidict because I have found by bitter experience that the Babylon dictionaries are hijackware and are dangerous to use. The Babylon toolbar will hijack your browser’s websearch and it can take hours of difficult work to fully remove all the roots it lays down in your computer. Beware of it, and beware of some of the tools on the Internet claiming to remove it, because some of them are malware too. AdwCleaner and Spybot Search & Destroy are good, but make sure that you are really downloading them and not some fake program.

2013-03-01

Added the Traperko dictionaries to Multidict - 9 language pairs, to and from Esperanto.

Added the Priberam Portuguese-Portuguese dictionary to Multidict, and also the Infopédia Portuguese⇔Italian pair. Both look pretty good.

Updated the parameters for the Språkrådet Norwegian monolingual dictionaries at the University of Oslo, because they had changed address.

Added the Heinzelnisse Norwegian⇔German dictionary to Multidict

2013-03-02

The set of Hungarian Sztaki dictionaries has improved and they have sound. Updated six pairs (Hungarian to and from other languages) to the new link parameters to take advantage of the new features. Added Hungarian⇔Bulgarian. And Sztaki now has concise mini dictionaries (Hungarian to and from six languages) designed for mobile phones and therefore good for use with Wordlink. Added these too to Multidict.

Added the FinnHun six-way Finnish-Hungarian-English dictionary. It has IPA and sound and at least some kind of lemmatisation and looks very good.

Added the Pathfinder English>Greek dictionary.

2013-03-04

Major clearout of Clilstore userids over the last two days (in preparation for starting to program more facilities for users, such as the ability to change passwords or reset lost passwords, or change the userid). Cleared out some old userids which had never even been logged in to. Found about 22 e-mail addresses which had more than one userid registered against them. Went through them one by one, either removing old userids which had not been used to create units, or merging userids together. In a few cases wrote to the authors, but it was nearly always obvious how to proceed in a manner which would make no difference to the authors. It will now be possible to enforce a system whereby at most one Clilstore userid can be registered against any given e-mail address. This will simplify the programming, will agree better with the way other systems such as Facebook and Google behave, and will allow users to login using their e-mail address if they have forgotten their Clilstore userid.

One instance of duplication was caused by two userids which differed only by a leading space character. Added “trim” to new user registration to eliminate leading and trailing spaces in userids, names and e-mail addresses.

Cleared out a few old units of Gordon’s which are now in the series of Guthan nan Eilean units, so as to eliminate duplication, having first moved across any links (e.g. to exercises).

Altered the clilstore/page.php program to improve (cure a bug in!) the handling of numeric links (to other Clilstore units) in the buttons.

Altered the clilstore/page.php and clilstore/edit.php programs to consistently refer to Clilstore units rather than Clilstore pages, since we seem to have settled on that terminology.

2013-03-07

Continued, over the last three days, clearing out some “duplicate” and rubbishy userids and rubbishy units. This allowed me to enforce a new rule in the Clilstore user table in the database of only one userid per e-mail address. Improved the user registration page to enforce this rule, and to enforce better checking all round. A full name and an e-mail address are now insisted on. And the e-mail address is checked immediately to ensure that the Internet domain exists and the syntax is correct for an e-mail address. Wrote an initial privacy policy and linked the registration page to it. Improved the login page to allow users to login with their e-mail address as an alternative to their userid.
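
A minimal sketch of how such rules can be enforced, using Python and an in-memory SQLite database rather than the real Clilstore PHP and database. The table layout, function name and messages are all hypothetical, and lower-casing the e-mail address is an assumption of the sketch. It trims leading and trailing spaces from every field and leaves the one-userid-per-e-mail rule to a UNIQUE constraint in the database. The domain and syntax checks on the e-mail address are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical user table: the UNIQUE constraint enforces one userid per e-mail.
conn.execute(
    "CREATE TABLE users ("
    " userid TEXT PRIMARY KEY,"
    " fullname TEXT NOT NULL,"
    " email TEXT NOT NULL UNIQUE)"
)

def register(userid, fullname, email):
    """Try to register a user; returns a short status message."""
    # Trim so that ' anna' and 'anna' cannot become two different userids.
    userid, fullname, email = (v.strip() for v in (userid, fullname, email))
    if not userid or not fullname or not email:
        return "missing field"
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users VALUES (?, ?, ?)",
                (userid, fullname, email.lower()),  # lower-casing is an assumption
            )
    except sqlite3.IntegrityError:
        return "userid or e-mail already registered"
    return "ok"

first = register("anna", "Anna Example", "anna@example.com")
dup_space = register(" anna", "Anna Example", "other@example.com")
dup_email = register("anna2", "Anna Example", " Anna@Example.com ")
print(first, dup_space, dup_email, sep=" | ")
```

Putting the rule in the database as well as in the registration page means that it still holds if some other program ever writes to the table.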

2013-03-08

I see regarding language codes that there is a bit of a conflict between the ISO 639-3 Registration Authority on the one hand, and CLDR on the other as to what “macrolanguage” codes, and in particular their corresponding two-character ISO 639-1 codes can be used for. As shown by this CLDR alias table, CLDR believes that these codes can be used as the standard code for the predominant language (or dialect, if you like) comprised by the macrolanguage. This view is generally shared by Wikipedia. Whereas it seems pretty clear to me that that is not what the ISO 639-3 Registration Authority intended or would approve of. Multidict (and Clilstore) currently adheres to the ISO 639-3 interpretation. By comparing the Multidict language codes automatically with the CLDR alias table, I produced the following table showing conflicting codes:

Language         Multidict  ISO 639-3  CLDR  Wikipedia
Albanian (Tosk)  als        als        sq    sq
Azari            azj        azj        az    az
Estonian         ekk        ekk        et    et
Guaraní          gug        gug        gn    gn
Mongolian        khk        khk        mn    mn
Kurmanji         kmr        kmr        ku    ku
Latvian          lvs        lvs        lt    lt
Persian          pes        pes        fa    fa
Swahili          swh        swh        sw    sw
Tagalog          tl         tgl        fil   tl
Uzbek            uzn        uzn        uz    uz
Malay            zsm        zsm        ms    ms

I don’t propose to change the codes we are using in Multidict, but it is interesting to note the differences.

2013-03-08

Spotted that the ABBYY family of Russian dictionaries had changed address and stopped working. Corrected this in the Multidict database, and removed some of the dictionaries which had gone pay-only.

Added to Multidict the Welsh-English and Welsh-Catalan dictionaries on the Gwefan Cymru-Catalonia site. These have very good material, especially etymology, but diabolic organisation as regards web interface. Added them using Multidict’s mechanism for page-image dictionaries, even though they are divided into files rather than page images. Added a warning using Multidict’s new warning message mechanism.

2013-03-10

Lots and lots of work on Clilstore over the weekend. Changed the login page, http://multidict.net/clilstore/login.php so that users can now login using their e-mail address as an alternative to their userid. This will hopefully solve some of the problem of users forgetting the userid they set up.

In the new user registration page, I am now enforcing much stricter rules: only one userid is now allowed per e-mail address; the e-mail address is checked immediately to ensure that the domain exists and that the rest of the address is syntactically correct; and new users must give a fullname, although they can still lie about it. Wrote a draft privacy policy, http://multidict.net/clilstore/privacyPolicy.html, and linked to it from the new user registration page.

Did a big rewrite on the page for creating and editing units, http://multidict.net/clilstore/edit.php. This is now much more sophisticated, and when it spots an error in the data or any required fields which are missing, it doesn’t just report the error and stop, leaving the author to use the back button to get back (hopefully!) to their data and correct it. It now reports the error and repeats the form including all the data filled in by the author. Also wrote a draft “copyleft” policy for Clilstore, http://multidict.net/clilstore/copyleftPolicy.html, and linked to this from the place where the author has to tick to certify that they have authority to release the unit.

2013-03-11

Working nearly all day on a change password facility for Clilstore, http://wordlink.net/clilstore/changePassword.php, which is now working well. Designed it too to be able to accept as authentication input from password reset links (including a code), which users who forget their password will (in a facility still to be written) be able to request be sent to their registered e-mail address.

2013-03-12

Added program code to wordlink.php to convert urls such as “http://example.com” to “http://example.com/” (i.e. with a trailing slash) before Wordlink starts working on the page, because I found that without this, the ->resolve() function in the PEAR package Net_URL2 which Wordlink uses goes very badly wrong in resolving relative links without a leading slash. This was causing Wordlink to fail on pages such as Lone’s holiday house page “http://franskferiehus.dk”, producing a rubbishy layout. Hopefully this will have got Wordlink working now on lots of other pages where it previously failed.
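
The normalisation step might look something like this in Python (the real code is PHP in wordlink.php and is not reproduced here; the function name is hypothetical). It adds the root slash only when the url has a host but an empty path, and leaves everything else alone.

```python
from urllib.parse import urlsplit, urlunsplit

def ensure_root_slash(url):
    """Turn 'http://example.com' into 'http://example.com/'; leave other urls alone."""
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)

print(ensure_root_slash("http://example.com"))
print(ensure_root_slash("http://example.com/a/b.html"))
print(ensure_root_slash("http://example.com?x=1"))
```

Doing this once, before any relative links are resolved, means that every later step can assume the base url has a path to resolve against.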

Working with Lone Olstrup who is here from SDE college in Denmark. Modified her three icons (multidict/wordlink/clilstore) to give them all 45 pixel height, and changed the http://multidict.net/ page to use them.

2013-04-14

Major changes over the past month. And especially big changes in the lead up to the Valencia meeting and during it to the way in which Clilstore works behind the scenes. Things are now much more flexible, with a session number remembered in a session cookie, and session information remembered in the database, including the current “mode” from among four modes: “student”, “student - more options”, “author” and “author - more options”, and the user’s choices of fields to be shown and their sort order. It is now possible to hide and add fields and restore the default fields for the mode, and it is now possible to sort the units in various ways, on both primary and secondary columns. The new “mode” mechanism gives a huge simplification compared to having separate author and student index pages with different addresses. It means that a new simple and spartan student page can be produced, which students return to after viewing a unit, without this having a detrimental effect on authors, because authors will return to the author index with its richer facilities. The author and student pages are now just different aspects of the same Clilstore index, but they can still be as different as we want to make them.

2013-04-15

Added the English-Thai, LEXiTRON dictionary to Multidict, following a strong recommendation from Thai visitors in Skye.

Added the dict.cc German⇔English dictionary to Multidict, following a strong recommendation from my daughter, who is studying German. In fact, what I did was to restore this dictionary, because I had hidden it nine months ago when it started misbehaving and jumping out of frames. This problem has now been overcome, and in fact it has a new mode, which I am now making use of, specially designed for use in frames. I saw that dict.cc is now a family of dictionaries covering German to and from 25 other languages, and English to and from 25 other languages. Added all these to Multidict.

2013-06-16

Lots of work and major changes to the Clilstore interface over the past two months. Changes too numerous and coming too fast and under too much time pressure to document individually. There are now facilities for sorting, adding, and deleting columns. The basic student page now has “buttons” for choosing language level.

2013-06-17

Noticed strange behaviour when Wordlinking articles from the Scotsman newspaper. Found that Wordlink was breaking a few of the words into “syllables” - and of course dictionary look up did not work on the syllables. On further investigation, found that this was because the Scotsman was inserting “soft-hyphen” characters into words as hyphenation hints for web browsers. Got Wordlink to strip out soft hyphens and the strange behaviour has gone. (A more sophisticated treatment would retain the soft hyphens in the words but omit them from links, but this is hardly worth it. It is possible that the Unicode zero-width-space character should also be stripped, but I did not do that.)
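
Stripping soft hyphens is a one-line operation. A Python sketch, for illustration only (Wordlink itself is written in PHP, and the function name is hypothetical):

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, an invisible hyphenation hint

def strip_soft_hyphens(text):
    """Remove soft hyphens so that each word is looked up as a whole."""
    return text.replace(SOFT_HYPHEN, "")

sample = "parlia\u00admen\u00adtary"
print(strip_soft_hyphens(sample))
```

The zero-width space mentioned above is U+200B, and could be removed in the same way if that ever proves necessary.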

2013-08-16

Delivered lots of Clilstore workshops over the preceding couple of weeks to adult learners attending week long short courses at SMO.

2013-08-17

Upgraded the operating system on the server (funded by the TOOLS project) at SMO which hosts multidict.net, from OpenSUSE 12.2 to OpenSUSE 12.3. This would normally be a small change, but this particular upgrade involved a change from the MySQL database management on which we have relied since the beginning of the POOLS projects, to MariaDB. In fact, MariaDB was as it claimed to be, a “drop in” replacement for MySQL, and the change of database management system caused no difficulties whatsoever. MariaDB has some new features (e.g. virtual columns) which might be useful to us in the future. One thing in the upgrade which did cause a bit of difficulty was that OpenSUSE 12.3 had reverted from PHP 5.4 to PHP 5.3. This required me to do a bit of backtracking in the programs because I had started to use some new features of PHP 5.4 in Clilstore.

Prior to the upgrade, made backups of all the databases and programs using the backup disk on the server

2013-08-18

Noticed that multidict.net itself, and Multidict, Wordlink and Clilstore and Clilstore units in particular, were not appearing very well in Google search results. Read up on the robots.txt protocol, and produced what I hope is a more sophisticated, more specific robots.txt file for multidict.net, which hopefully will encourage Google to do better indexing. We should see within a few weeks.

2013-08-19

Added two Nyanja-English dictionaries from the Web Archive to Multidict - to impress five guests from Zambia whom I am going to have staying over the next week ;-)

2013-08-20

Added an HTML5 “placeholder” to the url input field in Wordlink, saying “Type or copy the webpage address (url) to here” - because it has become clear from the workshops I have been giving to SMO short courses that for many people it is not obvious that this is what they should do.

2013-08-24

Investigated a report from Daniel MacDonald, attending the Clilstore school currently running in Malta, of problems with editing units due to “?:??” appearing (and being rejected with an error) in the media length field in units where the video length is unspecified. Found that this was a problem in the old edithtml.php program, but not in the new edit.php program. Corrected the old program, and made the “?:??” appear as a harmless placeholder in both programs.

It would probably be good, though, to add a facility to the edit programs to generate warning messages instead of just errors - so that even if the unit was successfully edited and saved, the author would be warned about the media length being missing, or titles which were unreasonably short or long, or if the total length of link text was too long in the links to external pages.

2013-09-01

Over the last few days, as a preparation for a visit to SMO on 2013-08-28 by Mereanna Selby, head of the Māori college Te Wānanga o Raukawa in New Zealand, and as a followup thereafter, added three new Māori dictionaries to Multidict, including two Web Archive dictionaries.

This involved some special character handling due to the unusual alphabetisation of ‘ng’ and ‘wh’ in the Web Archive dictionaries. Māori would also benefit from some “lemmatisation” to remove the causative prefix “whaka” from words such as “whakapae” prior to lookup, and also to remove reduplication from words such as “paepae”, which is conventionally alphabetised under “pae”. Something for a future project covering lemmatisation?
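
Purely as an illustration of what such lemmatisation might involve (nothing like this exists in Multidict, and the function name is hypothetical), here is a Python sketch which proposes lookup candidates by removing the “whaka” prefix and by undoing complete reduplication. Partial reduplication and any other Māori morphology are ignored.

```python
def maori_candidates(word):
    """Return the word itself followed by possible headwords to try."""
    w = word.lower()
    candidates = [w]
    prefix = "whaka"
    # Causative prefix: whakapae -> pae
    if w.startswith(prefix) and len(w) > len(prefix):
        candidates.append(w[len(prefix):])
    # Complete reduplication: paepae -> pae
    half = len(w) // 2
    if half > 0 and len(w) % 2 == 0 and w[:half] == w[half:]:
        candidates.append(w[:half])
    return candidates

print(maori_candidates("whakapae"))
print(maori_candidates("paepae"))
```

The original wordform is always tried first, so a dictionary which does list “whakapae” or “paepae” as a headword would still be found.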

2013-09-20

Added two English-Igbo dictionaries to Multidict: Mkpuruokwu Igbo and igbo911 - inspired by a couple of Nigerian teachers of English who I met at EuroCALL.

2013-09-21

Spotted a good new Cornish-English dictionary on the Internet, the maga dictionary. Added it to Multidict, created a Cornish unit in Clilstore, and publicised it in the Celtic Linguistics group on Facebook.

2013-10-02 - 2013-10-03

Two days at An t-Alltan, the annual Scottish Gaelic teachers conference in Aviemore. Delivered a plenary lecture on Clilstore, and set up and manned a stall for two days with computers where teachers could try out Clilstore

2013-10-08

Added the huge Old Irish dictionary, Dictionary of the Irish Language, eDIL, to Multidict. I had added this before, but it had never worked properly and had stopped working completely. Added eDIL in its new incarnation on the qub.ac.uk website, which is much more straightforward to link to than before. Moreover, I constructed an alphabetic index to all 5000 columns of eDIL - a major effort! - which results in much more straightforward searches than before, in cases where basic results suffice, and generally much more successful searches too, considering that with all its word inflexions and vagaries of spelling, an exact search on an Old Irish word was never likely to yield good results.

2013-10-09

Disabled the w3dictionary Arabic dictionary in Multidict, following a report from Helle that it was jumping out of frames and hijacking the screen and severely disrupting the work of Clilstore. Tried but failed to get round this jumping out of frames behaviour, so disabled the w3dictionary in Multidict for all its 65 language pairs.

2013-10-10

Added the lt-en.lt Lithuanian dictionaries to Multidict - English, Russian and German, but Asta reported back that they were not great quality, so left them with a very low quality rating.

Added the TV5MONDE French monolingual dictionary to Multidict.

2013-10-22

Added the Brazilian Portuguese monolingual dictionary Aulete to Multidict. It looks very good.

Disabled the Arabic Almanakh Arabic-English dictionary in Multidict, following a report from Helle that it was taking over the screen on an iPad. It didn’t seem to be working properly anyway.

2013-10-29

Changed the “opaque” CSS class in Wordlink to “display:none”, so that the “Wordlink” link button which is seen in the links bar in raw units does not take up space in the links bar in ordinary Wordlinked Clilstore units - something which had been causing the “Unit info” link button to sometimes mysteriously drop to the following line, even though there should have been room for it.

2013-10-31

The two main Irish-English and English-Irish paper dictionaries were launched today on the Internet, namely Ó Dónaill 1977, and de Bhaldraithe, 1959. This is a day which people have long been waiting for, and it will make a big difference to Irish Gaelic speakers and learners. The Internet versions have a very nice interface, including lemmatization, and they were very easy to link in to Multidict. This should make a big difference to learners using the Irish Gaelic units in Clilstore.

I took the opportunity to remove the extremely primitive “lemmatization” which Wordlink had done since it was first written, namely throwing out any letter ‘h’ found as the second letter of an Irish or Scottish Gaelic word (which often was inappropriate). Instead, Multidict now makes a slightly better attempt at lemmatization (by removing lenition, eclipsis, etc), and confines its efforts to the case of dictionaries which do not do their own lemmatization. This is still nowhere near as good as the kind of lemmatization which I would like Multidict to do, but is a big improvement. The inappropriate removal of ‘h’ had shown up as a problem in the workshops which I did with adult learners on Sabhal Mòr Ostaig short courses over the summer - especially inappropriate since the best Scottish Gaelic to English dictionary, Am Faclair Beag, does its own lemmatization.

2013-11-06

A good new Old-English (Anglo-Saxon) online dictionary has appeared, Bosworth-Toller. Added this to Multidict.

Noticed that the www.online-dictionary.nl dictionary had stopped working with Multidict because it had changed parameters. Updated its record in Multidict to get it working again. However, noticed that it is not displaying accented characters (e.g. in French) correctly, even outside of Multidict, and is failing to find words with accented characters. Sent a note to “webmaster” on the site to let them know the technical reason for this: the website is now sending out HTTP headers saying that it uses UTF-8 character encoding (like most websites nowadays), while the actual pages are still in ISO-8859-1 encoding.
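
The effect of such a mismatch is easy to reproduce. In this Python sketch the word is invented, but the mechanism is the one described: the bytes are ISO-8859-1 while the declared encoding is UTF-8.

```python
word = "café"
body = word.encode("iso-8859-1")   # what the server actually sends: b'caf\xe9'
declared = "utf-8"                 # what the HTTP header claims

# A strict decoder rejects the bytes outright...
try:
    body.decode(declared)
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...while a forgiving one (like a browser) shows a replacement character.
shown = body.decode(declared, errors="replace")

# Decoding with the encoding the page is really in recovers the word.
recovered = body.decode("iso-8859-1")

print(strict_ok, shown, recovered)
```

The same thing happens in the other direction: a search word sent as UTF-8 bytes will not match entries which the site compares as ISO-8859-1, which would explain the failure to find words with accented characters.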

2013-11-07

Added Javascript to Clilstore pages to resize the scrolling text area to fit the available screen size - which makes a big big improvement to how it looks in tall screens, especially in portrait mode on tablets and smartphones. And moreover, it nearly always eliminates the double-scrollbar, scroll within a scroll problem on desktop computers.

(Spent a lot of time on this over the past week, finding out how to do things in Javascript and testing, especially on the iPad. Got automatic update following a window resize working, but removed it again - it could cause infinite looping on the iPad and isn’t really needed on the desktop. Tried unsuccessfully to get Javascript “onorientationchange” working on the iPad. Tried unsuccessfully to get momentum scrolling (as opposed to ordinary scrolling) working in the scrolling area on the iPad - but Kent tested and says things are fine.)

2013-11-07

Got the Oxford Advanced Learners dictionary working again in Multidict. Its search parameters had changed (for the better) and it had stopped working.

Noticed that the www.online-dictionary.nl dictionary had changed parameters and stopped working with Multidict. Changed its record in Multidict to get it working again. However, noticed that it is not displaying accented characters correctly, in French for example, even outside of Multidict, and is not finding words with accented characters. Would have liked to send them a note to let them know the technical reason for this - The site is now advertising in the HTTP headers that it uses the UTF-8 character encoding, like most websites nowadays, but the pages are in fact still in the ISO-8859-1 encoding - but could not find any contact address. Gave the dictionary a much lower ranking in Multidict meantime.

2013-12-23

Lots of tests in the week or so prior to Christmas on using Hunspell to improve Multidict, in particular when it is used with Clilstore, by giving it the ability to find the dictionary headword corresponding to any particular inflected wordform which the student might click on in a unit. This follows a tipoff from Mìcheal Bauer of Gaelic computing fame. We had been discussing Basque, which Mìcheal speaks, and how Wordlink did not work very well for Basque because Basque is a highly inflected language and the Basque online dictionaries do not do lemmatization. The same holds true, to a greater or lesser extent, for all languages, and this has been one of the points which students have raised during Clilstore workshops. Mìcheal pointed me to the excellent Hunspell implementation for Basque, which has inflexion rules incorporated into its .aff file. Hunspell is a clever spellchecker, used in opensource programs such as Firefox and LibreOffice, which in addition to the old-fashioned .dic wordlist file for a language, also uses a .aff file specifying the production of wordforms according to regular rules. In fact, I found to my surprise that I didn’t have to understand and reverse-engineer the rules in the Hunspell .aff file, because Hunspell already incorporates a function to make use of its .dic and .aff files to lemmatize any particular wordform.

The tests showed that Hunspell has tremendous potential, often leading to successful dictionary lookups where otherwise they would have failed. However, very often its attempts to find headwords made things worse instead of better. So what I ended up doing was leaving Hunspell lemmatization in place in Multidict for Basque, where it undoubtedly improved matters, but did not implement it yet for other languages. What is required is a more sophisticated system in Multidict, whereby Hunspell lemmatization is only called up if the user reclicks on the same word. I hope to implement this at a later date.

2014-01-12

Added an [Esc] button to the Multidict navigation frame, to make it clearer how to escape from Multidict to the dictionary’s own home page or search page. This is both for the convenience of users, and so that the project makes sure to be fair to dictionary owners.

Added a Yoruk dictionary to Multidict.

2014-02-14

A lot of work over the last few weeks implementing the Hunspell “lemmatization” or “dictionary headword suggestions” scheme which the tests just before Christmas showed is required in Multidict, and I am now very happy with it. The dictionary headword suggestions are shown in brown following the searchword entry box in Multidict. Reclicking “Go” in Multidict or reclicking the word in Clilstore causes Multidict to cycle on to the next suggestion, so the process is very ergonomically efficient - so much so that it is often quicker and easier to reclick and let Multidict do the work even when the dictionary has its own headword suggestions mechanism.

The mechanism I have programmed to support the system is super flexible and tailorable to suit the needs and circumstances of individual languages. Mostly it uses Hunspell, and I have pulled in the Hunspell .dic and .aff files for over 40 languages, including all the project languages. However, it can also use its own lemmatization table - a new table now included in the Multidict database. I have loaded into this (1) a huge lemmatization table for Scottish Gaelic with over a quarter million wordforms, very generously given to me by Mìcheal Bauer; (2) a huge lemmatization table for Irish Gaelic generously given to me by Kevin Scannell; (3) a public domain lemmatization table for Italian which I obtained from the Internet. These lemmatization tables for Scottish and Irish Gaelic are very important because the Hunspell implementations for Scottish and Irish Gaelic are not yet very clever, making little use of the rule-based .aff file, so they do us little good.

Hunspell can be very good at lemmatizing the wordforms of regular verbs, nouns, etc, because that is what its rule-based .aff file is designed for - the aim being to save having to include hundreds of thousands of regular wordforms in the .dic file. However, Hunspell is not normally much good at lemmatizing irregular verbs or nouns. These take up relatively little space, so the Hunspell implementations for most languages just chuck them into the .dic file. For this reason, I input all the Scottish Gaelic and Irish Gaelic irregular verbforms and prepositional pronouns into a special category in the Multidict lemmatization table (and also threw in lots of Old Irish wordforms from In Dúil Bélraí, another project I have been involved in). The new dictionary headword suggestions mechanism in Multidict is super-flexible and can pull in the various categories from the lemmatization table, suggestions generated by Hunspell, and algorithmically generated suggestions, pulling them in in an order suited to the language in question. Regarding algorithmically generated suggestions, I used the programming I had previously written for removing initial mutations (lenition, eclipsis, etc) in Irish and Scottish Gaelic. But whereas the programming was previously always used, even when it was sometimes inappropriate, it now only generates suggestions, and the user is in control by reclicking. For English, the only algorithmic suggestion currently generated is to throw away any final letter ‘s’ in a last-gasp effort to lemmatize “new” words even if Hunspell doesn’t recognise them.
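
The general shape of such a mechanism can be sketched in Python. This is not the Multidict code (which is PHP), and every name in it is hypothetical: the lemmatization table is a plain dictionary, the Hunspell step is a placeholder function passed in by the caller rather than a real Hunspell call, the order of the sources is fixed rather than tailored per language, and the mutation rules are a simplified version of Gaelic lenition and Irish eclipsis.

```python
LENITABLE = set("bcdfgmpst")
# Irish eclipsis: prefix as written -> underlying initial consonant
ECLIPSIS = {"mb": "b", "gc": "c", "nd": "d", "bhf": "f",
            "ng": "g", "bp": "p", "dt": "t"}

def remove_mutation(word):
    """Suggest forms with eclipsis or lenition removed (may include non-words)."""
    w = word.lower()
    out = []
    for prefix, base in ECLIPSIS.items():
        if w.startswith(prefix) and len(w) > len(prefix):
            out.append(base + w[len(prefix):])
    if len(w) > 2 and w[0] in LENITABLE and w[1] == "h":
        out.append(w[0] + w[2:])   # mhòr -> mòr
    return out

def suggestions(word, lang, lemma_table, speller_stems=None):
    """Ordered, de-duplicated headword candidates, starting with the word itself."""
    candidates = [word]
    candidates += lemma_table.get((lang, word.lower()), [])  # table lookups
    if speller_stems is not None:
        candidates += speller_stems(word)  # placeholder for a Hunspell stemmer
    if lang in ("ga", "gd"):
        candidates += remove_mutation(word)
    if lang == "en" and len(word) > 1 and word.lower().endswith("s"):
        candidates.append(word[:-1])       # last-gasp: drop a final 's'
    seen, ordered = set(), []
    for c in candidates:
        if c not in seen:
            seen.add(c)
            ordered.append(c)
    return ordered

def next_suggestion(candidates, click_count):
    """Each reclick moves on to the next candidate, wrapping round at the end."""
    return candidates[click_count % len(candidates)]

table = {("gd", "chaidh"): ["rach"]}   # invented sample entry
print(suggestions("chaidh", "gd", table))
print(suggestions("mhòr", "gd", {}))
print(suggestions("blogs", "en", {}))
```

Because these are only suggestions which the user steps through by reclicking, it does no great harm if a rule sometimes produces a non-word such as “caidh”; it simply fails to find anything and the user clicks on.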

There is great scope, now that the framework is in place, for improving it further for many individual languages, by adding all their irregular verbforms and nounforms to the lemmatization table, and by adding common lemmatization rules for them to the lemmatization algorithm. However, none of this was envisaged in the current project, so it is something for another project.

2014-03-01

Added four Catalan dictionaries to Multidict, suggested by a colleague of Anna Gimeno.

2014-03-28

A lot of work over the past week or two completely rewriting and improving the green link button system for Clilstore units. The information pertaining to buttons is now stored in a separate related table rather than in the main Clilstore table. This means that the main table is a lot tidier, and there is now no restriction on the number of buttons.

2014-04-04

A lot of work over the previous few weeks setting up a system whereby Clilstore can store (in a separate database table) files attached to units, which can then be linked to the green link buttons on units, for example, obviating the need to store exercises in Dropbox. This is because Dropbox is a big and fairly difficult extra thing for teachers to learn, because Dropbox has been getting more difficult to use over the last year, with more restrictions, and because certain workplaces such as some school regions and even universities have been starting to block the use of Dropbox. Set up a system whereby attached files can be uploaded, renamed and deleted.

2014-04-15

Report from Kent that “Firefox is now behaving badly when displaying a Clilstore unit”. Turned out to be due to Firefox in particular caching an old version of the stylesheet for far too long. Put it right by the expedient of adding a spurious “?version=...” parameter to the stylesheet reference to force reload.

2014-04-18

Greatly improved the program for deleting Clilstore units, delete.php. This now lists any attached files, warning the author that these will be permanently deleted if the unit is deleted. It then goes on to delete any buttons and attached files first (rather than failing due to foreign key constraint violations as it had been doing recently!) and finally the unit itself. It wraps all this in a database transaction so that if anything fails it rolls back and leaves the unit intact.
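
A minimal sketch of this delete order, in Python with an in-memory SQLite database so that it is self-contained. The table and column names here are hypothetical and are not those of the Clilstore database; delete.php itself is PHP.

```python
import sqlite3


def delete_unit(conn, unit_id):
    """Delete buttons and attached files first, then the unit, in one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on any exception
            conn.execute("DELETE FROM buttons WHERE unit_id = ?", (unit_id,))
            conn.execute("DELETE FROM files WHERE unit_id = ?", (unit_id,))
            conn.execute("DELETE FROM units WHERE id = ?", (unit_id,))
        return True
    except sqlite3.Error:
        return False  # nothing was deleted; the unit is left intact


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the constraints
conn.executescript("""
    CREATE TABLE units   (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE buttons (id INTEGER PRIMARY KEY,
                          unit_id INTEGER NOT NULL REFERENCES units(id),
                          label TEXT);
    CREATE TABLE files   (id INTEGER PRIMARY KEY,
                          unit_id INTEGER NOT NULL REFERENCES units(id),
                          name TEXT);
    INSERT INTO units   VALUES (1, 'Test unit');
    INSERT INTO buttons VALUES (1, 1, 'Exercise');
    INSERT INTO files   VALUES (1, 1, 'exercise.html');
""")

print(delete_unit(conn, 1))
```

Deleting the child rows before the parent row is what avoids the foreign key violations, and the surrounding transaction is what guarantees all-or-nothing behaviour.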

Made the login.php program take you automatically to the Clilstore index page after it displays a successful login message. This was a suggestion which Kent made at the Belfast meeting. Tidied and improved the login program at the same time.

2014-04-26

Added to Multidict the Linguee dictionary which was recommended by my daughter, with 470 language pairs. (Actually I already had it in Multidict, but with only 6 of the 470 language pairs.)

2014-04-29

Wordlink (and hence Clilstore) is now working for Japanese. The big problem had been that Japanese text, like Chinese and Korean and Thai, is normally written without any breaks between words, so Wordlink did not know how to split it up. Japanese word segmentation is not a simple problem, but I pulled in the Mecab segmenter from the Internet and it seems to do the job quite well. It has the spinoff benefit that it gives the pronunciation in kana, so Wordlink and Clilstore now show this if you hover over the word. Created a test Japanese unit in Clilstore.

2014-05-11

Noticed that Wordlink was going wrong with Breton. In Breton spelling, most varieties at any rate, “c’h” is counted as a single letter, but Wordlink was wrongly splitting words such as “c’hallo” into two ‘words’, “c” and “hallo”. I managed to put this right, and the technique might be useful for other languages too. Created two Breton units in Clilstore to celebrate.

Realised that the situation for Catalan “l·l” is exactly the same as for Breton “c’h”. Put this right too. The Hunspell lemmatization of Catalan and Breton was also being messed up by the non-alphabetic character in the middle of words. Put everything right in a neat way by defining in the wlSession class constructor a preg pattern for word recognition in the source language, “wordpreg”. This should be useful for any similar situations in other languages.
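
The idea of a per-language word pattern can be sketched as follows. This is Python, not the PCRE pattern actually held in wlSession; the dictionary WORD_PATTERNS and the function split_words are hypothetical names, and for simplicity the sketch ignores combining marks.

```python
import re

LETTER = r"[^\W\d_]"  # any Unicode letter (word character that is not a digit or underscore)

# Hypothetical per-language patterns: the multi-character "letter" is tried first.
WORD_PATTERNS = {
    "br": rf"(?:c[’']h|{LETTER})+",  # Breton: c’h counts as one letter
    "ca": rf"(?:l·l|{LETTER})+",     # Catalan: l·l stays inside the word
    "default": rf"{LETTER}+",
}


def split_words(text, lang):
    """Return the words found in text, using the pattern for the given language."""
    pattern = WORD_PATTERNS.get(lang, WORD_PATTERNS["default"])
    return re.findall(pattern, text, flags=re.IGNORECASE)


print(split_words("ur c’hallo", "br"))       # treated as Breton
print(split_words("ur c’hallo", "default"))  # the old behaviour, for comparison
print(split_words("un col·legi", "ca"))
```

Keeping the pattern in one place means that both the word splitting and the later lemmatization step can agree on what counts as a word.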

2014-05-12

The P-celtic languages, Breton, Cornish and Welsh, have a complicated system of word-initial mutations which can make the word unrecognisable to learners. Added to the lemalg algorithm, which is used for headword suggestions in Multidict, the initial mutation tables for these three languages and a demutation procedure.
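
To show the kind of thing a demutation procedure does, here is a minimal sketch for Welsh only, in Python. The table is a simplified version of the standard soft, nasal and aspirate mutations written out for this example; it is not the lemalg table itself, the names are hypothetical, and it ignores h-prothesis before vowels.

```python
# Mutated initial -> possible radical initials (simplified Welsh table).
# Unmutated digraphs (ll, rh, ff) map to nothing, so that "llyfr" is not
# mistaken for a word beginning with a mutated "l".
WELSH_RADICALS = {
    "ngh": ["c"], "mh": ["p"], "nh": ["t"], "ng": ["g"],    # nasal
    "ph": ["p"], "th": ["t"], "ch": ["c"],                   # aspirate
    "dd": ["d"], "b": ["p"], "d": ["t"], "g": ["c"],         # soft
    "f": ["b", "m"], "l": ["ll", "gl"], "r": ["rh", "gr"],   # soft
    "m": ["b"], "n": ["d"],                                  # nasal
    "ll": [], "rh": [], "ff": [],
}
WELSH_VOWELS = "aeiouwy"


def demutate_welsh(word):
    """Return candidate unmutated forms; these are suggestions, not certainties."""
    lower = word.lower()
    for length in (3, 2, 1):  # longest initial first
        initial = lower[:length]
        if len(initial) == length and initial in WELSH_RADICALS:
            return [radical + lower[length:] for radical in WELSH_RADICALS[initial]]
    if lower[:1] in WELSH_VOWELS and lower:
        return ["g" + lower]  # soft mutation deletes an initial g
    return []


for form in ("gath", "nghath", "chath", "fam", "ardd"):
    print(form, demutate_welsh(form))
```

The candidates over-generate on purpose: an unmutated word such as “dant” will also produce “tant”, which is why the results are only offered as suggestions for the user to try.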

2014-05-13

Corrected my translation of Katakana to Hiragana in the Japanese pronunciation which Wordlink now shows on hovering over a Japanese word (thanks to Mecab). My translation table was missing the small form characters, occasionally leading to a weird mixture of Katakana and Hiragana, as my daughter pointed out to me.
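
A minimal Python sketch of a Katakana to Hiragana conversion which does cover the small forms. It relies on the fact that in Unicode the Katakana letters U+30A1 to U+30F6 sit exactly 0x60 above their Hiragana counterparts; the function name is hypothetical and this is not the translation table used in Wordlink.

```python
KATAKANA_FIRST = 0x30A1  # ァ, small a
KATAKANA_LAST = 0x30F6   # ヶ, small ke
OFFSET = 0x60            # distance between the Katakana and Hiragana blocks


def katakana_to_hiragana(text):
    """Convert Katakana letters (small forms included) to Hiragana."""
    converted = []
    for ch in text:
        code = ord(ch)
        if KATAKANA_FIRST <= code <= KATAKANA_LAST:
            converted.append(chr(code - OFFSET))
        else:
            converted.append(ch)  # anything else, e.g. the length mark ー, is left as is
    return "".join(converted)


print(katakana_to_hiragana("キョウ"))
```

Using the code point range rather than a hand-typed table avoids the risk of leaving individual characters out.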

Disabled the Apertium “dictionary” (actually family of machine translators) from Multidict, as it had stopped working. Its interface has changed, and I couldn’t find any way of getting it working with the new interface.

2014-05-20

Previously edit.php would not create or save a Clilstore unit with less than 100 characters. I found, particularly at the demonstrations and workshop Gordon and I gave at Glasgow Gaelic School yesterday, that this was a nuisance when giving demonstrations, especially when the unit (at least at present) has to be saved first before you can attach files to it. The restriction often forced you to add some rubbish text to test units. Changed this to remove the text length restriction entirely for test units, but to increase it to 200 characters for published units.

2014-05-23

I had previously tried a minor modification to the Clilstore index whereby if you hovered over a row (unit) with the mouse, the whole of that row was highlighted. I found this feature very useful myself, especially in the widescreen “Author page - more options” mode where you had lots and lots of columns and it could be difficult to follow them across the screen, but Kent did not like it - perhaps partly because there is no mouse hover control on tablets, perhaps partly because it changed things from the look of the screen in the training videos. I found it so useful in the demonstrations at the Glasgow Gaelic School (where I switched it on temporarily) that I have now programmed a compromise system where it is an option. There are now three optional behaviours, and registered users can change what they see when they are logged in: (1) never highlight the row; (2) highlight the row only in “Author page - more options”; (3) always highlight the row. I have made option (2) the default for users, and for anyone who is not logged in.

2014-05-24

Tried again, and unfortunately failed again, to get the Lëtzebuerger Online Dictionnaire working with Multidict. However, dico.lu, the dictionnaire luxembourgeois français, does provide fairly good translation to French. Added Lëtzebuergesch lemmatization with Hunspell to Multidict, and this seems to be quite good. Created a Clilstore unit, 2085, in Lëtzebuergesch using material supplied by Jean-Marie Nau.

2014-05-28

As proposed by Kent and agreed in e-mail consultation with the teams, I have completely removed the minimum text length requirement for Clilstore units. This is because Clilstore can be a useful place to store videos, possibly with links and exercises, even without any transcript or text. Instead, I have made edit.php generate a warning (which can be ignored) if you are saving a non-test unit with no text or hardly any text.

2014-05-30

Made a further correction to the Japanese pronunciation which Wordlink now shows when you hover over words in Japanese texts. This time thanks to Steaphan MacRisnidh, a Japanese-speaking ex-colleague at Sabhal Mòr Ostaig, who noticed problems. Mecab outputs pronunciation in Katakana, including length marks ‘ー’ (which I had been translating wrongly to Hiragana, but that is another story). However, length marks are inappropriate in Hiragana. I found that a pronunciation using vowels instead of length marks, and hence suitable for translation to Hiragana, was to be found in another of Mecab’s output fields and am now using that. Also tidied up the programming a bit.

2014-05-30

In the process of analysing carefully what Wordlink was doing with Japanese, I noticed that it is also trying to add wordlinks to the contents of Javascript embedded in the body of the webpage. This is completely inappropriate, and might be the cause of Wordlink working imperfectly on some webpages. The best way to tackle this, I think, is to change Wordlink to make use of one of the utilities now available which will parse and supply a parse tree for the entire webpage. That would be instead of the sequential processing of tags and text which Wordlink has been doing itself since I first wrote it six years ago. However, this is a big change - no doubt for sometime after the TOOLS project has ended.

2014-06-18

Succeeded, after a bit of a struggle, in getting Chinese word segmentation working in Wordlink (and hence Clilstore) using the Urheen lexical analysis toolkit, in the same kind of way as I got Japanese word segmentation working using Mecab. The results look good. And as a side benefit, hovering over the word in Wordlink displays the part of speech (as provided by Urheen), something which might be useful to learners. However, this is currently working far, far too slowly to be of any use - ten minutes to process a page with about 50 lines of text - so I have switched the facility off for now. The main trouble seems to be that Urheen spends a long time “initializing” (presumably reading in its dictionaries of Chinese characters) every time it is called. And the way Wordlink works at the moment, processing the page progressively bit by bit, it has to call Urheen repeatedly. It might be possible to remove the overhead of repeated initializations by rewriting Wordlink so that it parses and processes the page as a whole. This is something I would like to try anyway. However, it is a big change and a big job.

Urheen also seems to work in the GBK character encoding, requiring conversion from UTF-8 to GBK and then back again, which is a bit of a nuisance.

2014-06-23

Looked around at possibilities for getting Thai word segmentation working in Wordlink. It looks as if all sorts of word segmentation might be possible with the PHP IntlBreakIterator class. Installed the intl extension in PHP5, but it seems that BreakIterator doesn’t come with this yet. It looks like it might with PHP 5.5, but OpenSuse, the operating system on the multidict.net server, just has PHP 5.4. The next release of OpenSuse, currently expected about November, does have PHP 5.5.

2014-07-01

End of TOOLS project, but work continues in support of the POOLS-3 project.

2014-07-30

Made the HTML and PHP source code for all programs and classes available online. Also a dump of the SQL database (minus email addresses, and minus passwords even though they are of course encrypted). Tidied things beforehand to remove DbMultidict.class.php in favour of DbMultidictPDO.class.php in wordlink.php, the last remaining place where it had been used, hence allowing DbMultidict.class.php to be completely discarded.

2014-08-01

Kent notified me that lookups on the EC terminological dictionary IATE using Multidict were failing with “gzinflate() call failed”. Got round this by a hack: getting Multidict to send requests to IATE using the old HTTP 1.0 protocol instead of HTTP 1.1, so that the responses are sent uncompressed.

2014-08-01

The main Welsh-Welsh dictionary, the wonderful Geiriadur Prifysgol Cymru has just become available online. As well as Welsh-Welsh, it also does Welsh-English to a large extent. Following a message from me and a very helpful reply from Andrew Hawke, the managing editor, I managed to get it working with Multidict.

2014-08-17

Wrote an Arabic root extraction utility, and added this among the Arabic “dictionaries” in Multidict. Although the Arabic root extractor is not in itself a dictionary, many of the best Arabic dictionaries, such as Hans-Wehr, are organised not alphabetically but based on the “root” of the Arabic word, a skeleton framework usually consisting of three consonants. To get this working, I first got the Python3 language installed on the server hosting multidict.net. Python3, unlike Python2, has native support for UTF-8 character encoding. Then I installed the Python3 version of nltk, the “natural language toolkit”. This includes a utility for extracting roots of Arabic words, and I wrote a Python program to interface this with Multidict, which is written in PHP.

2014-08-18

Although the Arabic root extraction was working well with most of the Arabic units in Clilstore, it was not working properly at all for others. After investigation I found that the culprit was the tatweel character which is sometimes liberally inserted for typographic purposes into the middle of Arabic words in texts. This was breaking the root extraction utility. When I tested, I found it was also breaking the lookup of Arabic words in dictionaries. I changed Multidict to remove tatweel characters, which should cure these problems.
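
Removing tatweel comes down to deleting one Unicode character, U+0640 (ARABIC TATWEEL). A minimal Python sketch, with a hypothetical function name (Multidict itself does this in PHP):

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, the purely typographic elongation character


def remove_tatweel(word):
    """Strip tatweel characters so that lookups see the plain spelling."""
    return word.replace(TATWEEL, "")


stretched = "\u0643\u0640\u062a\u0640\u0627\u0628"  # كتاب with two tatweels inserted
print(remove_tatweel(stretched) == "\u0643\u062a\u0627\u0628")
```

Since tatweel carries no meaning, stripping it before both root extraction and dictionary lookup should be safe for any Arabic-script text.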

2014-08-23

Delivered the paper Tools facilitating better use of online dictionaries: Technical aspects of Multidict, Wordlink and Clilstore at the Celtic Languages Technology Workshop at COLING 2014.

2014-08-30

Added Majdi Sawalha’s Arabic root-meaning search as an “Arabic→Arabic” dictionary in Multidict - preceded of course by Arabic root extraction using the ISRIStemmer in nltk as described above on 2014-08-17.

2014-08-31

Added a 350,000 word lemmatization table for Breton to Multidict. This has greatly improved the headword suggestions for Breton. The table was kindly sent to me by Francis Tyers, whom I met at the COLING Celtic Languages Technology Workshop in Dublin.

2014-09-02

Added the Czech grammar tool, Internetová jazyková příručka as a “dictionary” to Multidict.

Added the Seznam family of dictionaries to Multidict. These translate Czech to and from seven other languages: en, de, fr, es, it, ru, sk.

2014-09-03

Thanks to a tip from Francis Tyers, who was at the CLTW in Dublin, I installed Apertium lttoolbox, which includes the utility lt-expand, which enables lemmatization tables to be generated from the .dix files in the monolingual section of the Apertium project. As a start off, I generated lemmatization tables for Welsh, Czech, German and Arabic, and added these to the lemmas table in the Multidict database. This will give a significant improvement to the dictionary headword suggestions which Multidict provides, at least for Welsh, Czech and German, in particular for irregular verbs and nouns which Hunspell usually fails to lemmatize since its priorities are spellchecking and not lemmatization.

The Apertium project contains monolingual .dix files for nearly 40 languages, so it looks hopeful that even if only half of them can be made to produce good lemmatization tables, this will lead to a big improvement to Multidict for many languages.

2014-09-08

Corrected a bug discovered by Kent. Before uploading a new file to attach to a Clilstore unit, the file upload facility, manageFiles.php, had been checking whether a file by that name already existed, instead of doing what it should have been doing and checking whether a file by that name already existed *for that particular unit*.

2014-09-18

Implemented the > operator in the algebra behind the programming for the headword suggestions feature in Multidict. Although > is described at the end of section 3.4 in my Dublin COLING paper, it was never actually fully implemented. It was, however, required for the following...

2014-09-18

I have improved the way Wordlink handles the middle-dot (interpunct) when it processes (normalized/modernized) Old Irish texts. To see this in action, have a look at Clilstore unit 1412.

Wordlink now treats words containing a middle-dot (as·bert, at·taam and mani·léicthe in this particular text) as single words and passes them on to Multidict to try and find them in the dictionaries.

Multidict now has to deal with words containing middle-dot. It removes this middle-dot before searching for the word in eDIL (whether eDIL’s own search, or the “eDIL colbh” column-based search).

Multidict’s “headword suggestions” feature (the words shown in brown), however, will suggest splitting the word as a possible way of finding it in the dictionary. So when Multidict is asked to find “mani·léicthe”, in addition to the favoured suggestion “mani·léicthe” itself, it suggests trying “mani”, “léicthe” and “léicid”.
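
The two treatments of the middle-dot described above can be sketched in Python as follows. The function names are hypothetical, and the sketch covers only the splitting and the dot removal; a suggestion such as “léicid” comes from lemmatization, which is not shown here.

```python
MIDDLE_DOT = "\u00b7"  # · as used in normalized Old Irish texts


def middle_dot_suggestions(word):
    """Favour the word as written, then suggest each part on either side of a dot."""
    suggestions = [word]
    if MIDDLE_DOT in word:
        suggestions.extend(part for part in word.split(MIDDLE_DOT) if part)
    return suggestions


def edil_search_form(word):
    """Form of the word sent to the dictionary search: the dot is simply removed."""
    return word.replace(MIDDLE_DOT, "")


print(middle_dot_suggestions("mani·léicthe"))
print(edil_search_form("as·bert"))
```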

2014-09-24

Indexed the pages of Volume 2 of the Dictionarium Scoto-Celticum, 1828, otherwise known as the Highland Society Dictionary. This makes available via Multidict: (1) a Latin to Gàidhlig dictionary; and (2) another good (albeit old-fashioned) English to Gàidhlig dictionary.

Customs were different in 1828, and this dictionary treats, for alphabetic sorting purposes, the letters i and j as if they were merely two variants of the same letter. And similarly for u and v. It does this for both Latin and English. So when searching for words in the dictionary it is necessary to convert j’s to i’s, and v’s to u’s. Implemented this via the charextra field in the dictParam table in the Multidict database. Implemented it via a new general mechanism which I have written for specifying character translation via a string such as “tr:j:i|tr:v:u” in the charextra field. This should be useful for other dictionaries too.
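
A minimal Python sketch of how a charextra string of this form might be interpreted. Only the “tr:x:y” rule shown above is handled; the function names are hypothetical, and ignoring anything that is not a single-character “tr” rule is an assumption of the sketch rather than documented Multidict behaviour.

```python
def parse_charextra(spec):
    """Turn a string such as 'tr:j:i|tr:v:u' into a character translation table."""
    table = {}
    for rule in spec.split("|"):
        parts = rule.split(":")
        # Assumption: only three-part 'tr' rules with a single source character count.
        if len(parts) == 3 and parts[0] == "tr" and len(parts[1]) == 1:
            table[ord(parts[1])] = parts[2]
    return table


def apply_charextra(word, spec):
    """Apply the character translations to a word before it is looked up."""
    return word.translate(parse_charextra(spec))


print(apply_charextra("jovial", "tr:j:i|tr:v:u"))
```

Because the rules live in a database field rather than in the program, a new dictionary with its own quirks only needs a new string, not new code.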

2014-09-27

Changed to allow ‘s·h’ and ‘n·h’ as “letters” in Occitan, in a similar way to the change to allow ‘l·l’ in Catalan, as described above on 2014-05-11. See https://en.wikipedia.org/wiki/Interpunct#Occitan.

2014-10-08

Noticed that the Concise Dictionary of Middle English, by Mayhew and Skeat, 1888, available from Project Gutenberg, was refusing to appear in frames and so was not working properly with Multidict. Got round this by taking my own copy of the files from http://www.gutenberg.org/files/10625/10625-h/ and getting Multidict to access these instead. This is ok since the dictionary is well out of copyright and Gutenberg allows it.

2014-10-10

Moved over to the main multidict.net site some important new features which I have been programming over the last few weeks on test.multidict.net

The author of a unit now has choice of which Creative Commons licence to give it. The default is still BY-SA as before, but the author now has the option of disallowing commercial use, or of disallowing the production of derivative works. This is a feature which was particularly requested at the workshops which we gave at Leeds and Sheffield Universities, since many large repositories of open language teaching materials, videos and transcripts which would be very suitable for Clilstore, do not allow commercial use.

The Clilstore index now has three additional optional columns, showing: the Creative Commons licence for the unit; the number of link buttons; and the number of attached files.

There is a new mechanism for two Clilstore users to cooperate to transfer ownership of a unit from one to the other. If the current owner goes to the Unit Info, he or she will see an option to offer the unit to another Clilstore user (or to withdraw an existing offer). If the other user goes to his or her Clilstore Options, he or she will see that the unit is on offer and can accept the offer, upon which ownership is immediately transferred.

2014-10-21

Added the Middle English Dictionary at the University of Michigan to Multidict.

2014-10-29

Added DASG, the newly released Gàidhlig corpus, to Multidict as a Gàidhlig-Gàidhlig resource.

Noticed that Craine’s Manx-English dictionary is not finding properly words with a ç character in them, but that it finds them ok if the cedilla is removed. Added “tr:ç:c” as a charextra parameter to the dictionary’s entry in Multidict so as to bypass the problem.

2014-10-31

Feedback on the Arabic entry in another project I am involved in indicated that Arabic requires not just a larger font-size to be readable, but is also much better if given a bit more space between lines. Changed the style for Arabic in Clilstore to have not just font-size:150%, but now also line-height:1.25em.

2014-12-02

A major new Manx↔English dictionary, Fockleyreen has just appeared online. Added it to Multidict.

2014-12-03

Added to Multidict the Chinese↔English and Esperanto↔English TLex dictionaries. Also added to Multidict the MDBG Chinese↔English dictionary, and also the mobile version.

2015-02-19

A couple of fairly important changes over the past few weeks, following the (probably purely erroneous) block on wordlink.php by Google Safe Browsing (now cleared). The first change was to get Wordlink to check each url with the Google Safe Browsing database before going ahead and wordlinking it. (It would be nice to cache the results of this for a short while to reduce latency.) The second change was to get Wordlink to disable any html <input> fields it finds by adding the html5 attribute “readonly” and the placeholder attribute “Disabled in Wordlink”. This second change turned out to be too severe - it was blocking input fields in HotPotatoes exercises in cases where authors had wordlinked the exercise, which very occasionally they did. So made this second change not apply to files stored at multidict.net, nor to files stored at dropbox.com. This might not catch every case, but will catch the vast majority, especially for new units where the exercises will normally be stored attached to the unit at multidict.net.

2015-02-23

Got a report from Kent that “Clilstore cannot count”, because it had rejected a Greek unit title which clearly had less than 120 characters and claimed that it was over the 120 character limit. Cured it by changing edit.php to use the PHP function iconv_strlen, which counts actual characters rather than relying on strlen which actually counts bytes, not characters.
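
The difference between counting bytes and counting characters is easy to show. The following Python sketch uses hypothetical function names and only illustrates the principle; the actual fix in edit.php was to call iconv_strlen instead of strlen.

```python
TITLE_LIMIT = 120  # the character limit on unit titles mentioned above


def length_in_bytes(text):
    """What a byte-counting function such as PHP strlen sees for UTF-8 text."""
    return len(text.encode("utf-8"))


def length_in_characters(text):
    """What the author and the user mean by the length of a title."""
    return len(text)


def title_ok(title):
    return length_in_characters(title) <= TITLE_LIMIT


greek_title = "\u03b1" * 100  # one hundred Greek letters alpha
print(length_in_characters(greek_title), length_in_bytes(greek_title))
print(title_ok(greek_title))
```

Each Greek letter takes two bytes in UTF-8, so a Greek title reaches a byte-based limit at about half the number of characters that an English title would.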

2015-03-29

Added a new facility to Clilstore (only on the test.multidict.net site for now). In the two author modes, “Author page” and “Author page (more options)”, there is a new grey area at the foot of the index which shows statistics (averages and totals) whenever these could be relevant to the column. This includes the average number of views and clicks per unit, “average” date of creation and last change, average level, number of words, video length, number of user buttons, number of attached files. These statistics just apply to the units displayed in the index, so by filtering by language or by author, for example, statistics can be found for each language or author. Hovering over the statistics for the number of views or clicks (dictionary lookups) displays additional statistics for the average number of views or clicks per day, month or year (taking into account that we only started collecting click counts on 2014-03-18).

2015-04-02

Got the AnSeotal, the Gaelic terminology database for schools working again with Multidict. It had stopped working due to a reconfiguration on its website.

2015-04-04

Moved the multidict.net domainname registration from GoDaddy to NameCheap.

2015-04-06

Moved the new statistics features (reported above under 2015-03-29) out from test.multidict.net to the main multidict.net site.

At the same time cured a problem which had been occurring with the language dropdown field in certain browsers on certain operating systems (Firefox on Linux at least), namely that the language name was being displayed too low, so that it could not be read in the field. It turned out that this was being caused by the browser making allowance for larger characters in certain outlandish native names for languages, and allowing too much height for the language name in all languages. Cured by adding option { max-height:1.5em } to the stylesheet.

Also got the Scots Online Dictionary working again with Multidict, following a report from a user that it had stopped working. (They had changed the interface.)

2015-04-26

A lot of maintenance work on the Multidict database:

  • Changed the name of the Irish terminology dictionary focal.ie to tearma.ie.
  • Added a new French etymological dictionary.
  • Got the Welsh DECHE corpus working for cy-cy - it had been broken.
  • Noticed that the Welsh/Cornish Eurfa dictionary is not working (and has not been for some time). Sent an email to its author.
  • Got the dict.info Universal Dictionary working - it had changed parameters and stopped working.

2015-05-30

Got the Generalitat de Catalunya Optimot dictionary site working with Multidict again. It had stopped working due to a change of parameters.

2015-06-03

Added the new InterGaelic Scottish Gaelic - Irish Gaelic dictionary and translation service to Multidict.

2015-06-28

Created a new Help page in Clilstore, with examples of urls linking directly to units in a particular language, or at a particular level, or by various other filter and sort criteria.

Saw that urls including sortCol= were not working as they should. Put this right, and simplified the programming slightly in the process.

2015-07-02

Received a message from a user in Ireland, Gearóid Ó Casaide, telling me that the main Irish language terminology database tearma.ie had stopped working with Multidict. Put that right. (It had changed parameters.) Also got the mini/mobile version, m.tearma.ie, working properly with Multidict.

2015-07-29

Changed the endonym (native name) for Old English from Englisc to Ǣrenglisc, and changed the endonym for Middle English from Englisse to Middelenglisch - after seeking advice from an expert, Clive Tolley. The main reason for this is to avoid confusion in the dropdown list of language names in Clilstore, where some teachers had been mistakenly choosing “Englisc” for their units when they should have chosen “English”. As well as that, it establishes a new standard policy on endonyms for historical versions of languages, that they should be what the people would have called their language if they were alive today - rather than what they actually called their language at the time, which is often too similar to the name of the modern language, and unclear in meaning to people today. We have already been doing this for many years by calling Old Irish Sengoídelc rather than Goídelc, so the move to Ǣrenglisc and Middelenglisch is merely extending this system.

2015-07-30

Noticed that the Oxford Advanced Learners Dictionary (English-English) had changed parameters and stopped working. Got it working again, and changed its name to the current “Oxford Learner’s Dictionaries”.

2015-09-02

Tidied up the text formatting in some Italian CLILstore units

2015-09-03

Got EUdict working again in Multidict. It had been linking to m.eudict.com, which had disappeared.

2015-09-17

Updated the links in the “Parentage” column at http://multidict.net/multidict/languages.php, since multitree.linguistlist.org has now moved to multitree.org.

2015-09-20

Added the Dizionario Latino-Italiano Latin-Italian dictionary to Multidict.

Added the Forvo.com crowdsourced pronouncing dictionary to Multidict. Added it as a monolingual “dictionary” for 40 of the languages for which it currently has most pronunciations.

2015-12-18

Added TheFreeDictionary.com to Multidict, both English-English monolingual, and 15 monolingual dictionaries in other languages.

2015-12-19

Got the Glosbe dictionary working again with Multidict, following a report from Kent that it had stopped working. What seems to have done the trick is enabling the SSL module in Apache, because Glosbe now seems to require initial contact to be made via https. (Not totally sure of this, though)

2015-12-21

Noticed that Glosbe and the Oxford Learner’s Dictionaries have both started returning no result whatsoever, but instead issuing an HTTP 404 (“Not found”) error response whenever they are asked for a word which they do not have. Turned this problem to advantage by picking this up in Multidict to give a nice error message to the user, and if Multidict has other headword suggestions giving the user advice to reclick to try them.

2016-01-13

Added the following Irish Gaelic monolingual resources to Multidict, all of them from teanglann.ie: An Foclóir Beag (better on the whole than the old version at the University of Limerick); Gramadach (grammar); and Foghraíocht (pronunciation).

2016-01-15

Added some English-English dictionaries to Multidict: Macmillan Open Dictionary and Longman Dictionary of Contemporary English.

The Reverso and Collins Cobuild dictionaries had stopped working. Got them working again (by removing “redirect” handling - not sure why this made any difference).

2016-01-18

Added the gaois.ie Irish↔English translation corpus to Multidict.

2016-01-20

Added isiZulu, Northern Sotho, and Hawaiian dictionaries to Multidict.

2016-02-08

Found a very serious webscreen layout problem in the Clilstore index page when used with Internet Explorer (which I don’t normally use much myself), at least on Windows 7. The blue stripe from the bottom of the page was completely overwriting the Student page/Author page choice, making it unusable. Presumably this first arose with some reinterpretation (or misinterpretation?) of standards in a very recent upgrade to Internet Explorer, or else someone would have reported the problem to me or I would have noticed it myself. Managed to put it right (by changing height:1px to min-height:1px in a div).

2016-03-02

Added the Terminologue English-French-Arabic terminology databank to Multidict.

2016-03-04

Found that Javascript was not working in Internet Explorer on computers at SMO (and probably the rest of UHI), on the multidict.net and other websites. Found that this was caused by “compatibility mode” being set on the “intranet” by Internet Explorer. Added the Apache2 directive
   Header set X-UA-Compatible "IE=edge"
to the multidict.net virtual server (as well as test.multidict.net), etc. This seems to have cured the problem and should prevent it from cropping up in any other circumstances.

2016-03-30

Updated and improved the link from Multidict to eDIL, the Old-Irish dictionary.

2016-03-31

Sorted a bug reported by Kent - the link to published units from userinfo.php does not work in Student mode. “Cured” this by removing the link in Student mode.

2016-04-02

Removed http://www.convertaal.nl/ from Multidict, since the site seems now to have been taken over by Cybersquatters.

Now that the CEFR range slider in the edit facility for Clilstore units is working again in Internet Explorer (It had been zapped by the previous problem), noticed that in Internet Explorer it was not updating the A1..C2 buttons and other things as it should when the slider is moved. After research, found that this is due to a bug in Internet Explorer, which fires the onchange event instead of the oninput event when the slider is moved. Cured the problem by getting the program to listen for onchange as well as oninput.

2016-04-17

Changed the Clilstore Edit program so that WL is unset on user buttons for link names with any extension other than .html, .htm and .xml. In the Clilstore database, unset WL on more than 100 buttons where it had been set on .pdf, .docx, and other filetypes which cannot be wordlinked.
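
The extension test can be sketched in Python as follows. The function name and the handling of links with no extension at all (left alone) are assumptions made for the sketch; the Clilstore Edit program itself is PHP.

```python
import posixpath
from urllib.parse import urlsplit

WORDLINKABLE_EXTENSIONS = {".html", ".htm", ".xml"}


def may_keep_wl(link):
    """False if the link has an extension other than .html, .htm or .xml."""
    path = urlsplit(link).path  # ignore any query string or fragment
    extension = posixpath.splitext(path)[1].lower()
    # Assumption: a link with no extension is left as the author set it.
    return extension == "" or extension in WORDLINKABLE_EXTENSIONS


for link in ("exercise.html", "notes.PDF", "http://example.org/unit/quiz.htm?lang=gd"):
    print(link, may_keep_wl(link))
```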

2016-09-03

Added Irish Gaelic resources to Multidict, namely: Briathra (FBG), Teasárus, and Gluais Beo, all to be found at Pota Focal.

2016-09-04

Added to Multidict the Intergaelic Manx-Irish dictionary developed by Kevin Scannell.

2016-09-05

Added to Multidict a table of 75,000 Manx wordform lemmatizations researched by Kevin Scannell, bringing immediate improvement to Manx dictionary lookups in Wordlink and Clilstore.

2016-10-11

Noticed that the online copy of the Favereau Breton-French dictionary at agencebretagnepress.com had disappeared recently. Found after searching that it had migrated to the new top-level domain, at abp.bzh. But also found a much better (on the whole, except for mobiles) online implementation at arkaevraz.net and linked to that as well from Multidict.

2016-10-29

Added Victor Henry’s Lexique Étymologique du Breton Moderne (1900), a WebArchive dictionary, to Multidict, by the usual method of typing in the last word on each page to construct a page-index. Some special measures were required to cope with the unusual alphabetization of k, ch, c’h.
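
The page-index method can be sketched roughly as follows, in Python for illustration. The idea is to store the last word on each page and then find the first page whose last word sorts at or after the word being looked up. The collation used here is the modern Breton alphabet, with ch and c'h treated as single letters placed after b; that is an assumption, since the order actually used in the printed dictionary is not recorded in this log. The sample index entries and function names are likewise invented for the example.

```python
import bisect

# Assumed collation order (modern Breton alphabet); the dictionary's own
# ordering of k, ch and c'h may well differ from this.
ALPHABET = ["a", "b", "ch", "c'h", "d", "e", "f", "g", "h", "i", "j", "k", "l",
            "m", "n", "o", "p", "r", "s", "t", "u", "v", "w", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def sort_key(word):
    """Turn a word into a tuple of letter ranks, reading ch and c'h as one letter."""
    w = word.lower().replace("\u2019", "'")  # normalise the curly apostrophe
    key = []
    i = 0
    while i < len(w):
        for letter in ("c'h", "ch"):
            if w.startswith(letter, i):
                key.append(RANK[letter])
                i += len(letter)
                break
        else:
            # Characters outside the assumed alphabet sort after everything else.
            key.append(RANK.get(w[i], len(RANK)))
            i += 1
    return tuple(key)


def find_page(word, page_index):
    """page_index is a list of (last_word_on_page, page_number) in page order."""
    keys = [sort_key(last) for last, _ in page_index]
    i = bisect.bisect_left(keys, sort_key(word))
    i = min(i, len(page_index) - 1)  # words past the final entry go to the last page
    return page_index[i][1]


if __name__ == "__main__":
    # Invented sample index, not the real page-index.
    sample_index = [("bara", 1), ("chas", 2), ("c'hoant", 3), ("dour", 4)]
    print(find_page("c'hoari", sample_index))
```

A plain string comparison would place c'hoari before chas, because the apostrophe sorts before the letter h, which is why some form of custom key is needed.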

2016-11-04

Got the Eurfa Welsh-English dictionary working again with Multidict. It had changed parameters and stopped working.

2016-11-23

Added to Multidict some Ladin language dictionaries, including that by Instituto Ladin de la Dolomites.

2016-11-23

Found an add-on for Firefox, HttpFox, which can capture the POST parameters of outgoing HTTP requests. This should be very useful for deducing the POST parameters required by dictionaries and adding them to Multidict.

2017-01-29

Investigated a report from Gordon Wells that Glosbe was no longer working with Multidict. Looks like Glosbe now requires https instead of http. Got it working again by changing the methodology used to "redirect" (which is simpler and better anyway).

2017-02-05

Major upgrade, first in two years, to the OpenSUSE operating system on tarbh, the computer hosting Clilstore/Wordlink/Multidict, from OpenSUSE 13.2 to OpenSUSE Leap 42.2.

2017-02-19

Major upgrade to the PHP language on tarbh, the language in which Clilstore, Wordlink and Multidict are written, from PHP 5.5.14 to PHP 7.1.2. This seems to have cured a problem which emerged after upgrading the OpenSUSE operating system, namely that Wordlink was not working at all with Wikipedia. [I suspect, but am not sure, that this was due to a combination of three things: a downgrade(!) in the version of PHP parcelled with the newer OpenSUSE, from 5.6 to 5.5; Wikipedia trying to make an https connection; and PHP 5.5 not having the same SSL connection capabilities as PHP 5.6.]

PHP 7 is reputedly a lot faster than PHP 5, so this may result in some performance improvements.

2017-02-23

Received a notification from the administrator of Fòram na Gàidhlig (the main Gàidhlig online forum, which uses phpBB) that Wordlink had stopped working with Fòram na Gàidhlig. Since Fòram na Gàidhlig was automatically adding a link to Wordlink on every page, this was a serious problem. Investigated and found that the Fòram na Gàidhlig host was returning no content and a 204 HTTP status code to Wordlink. Since Fòram na Gàidhlig was working ok in a WWW browser on the computer hosting Wordlink, I suspected that the Fòram na Gàidhlig host did not like the User-agent string which the PEAR module Net/HTTP_Request2 was sending to it. A ticket opened with the hosting company confirmed this. Added a line to the Wordlink program to set the User-agent string to null when used with Fòram na Gàidhlig, and things are working again.
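
The per-host User-agent override can be sketched as follows. This is a Python analogue using the standard urllib module, not the PHP and HTTP_Request2 code which Wordlink actually uses. The host name forum.example.org, the default User-agent string and the function names are placeholders invented for the example, and sending an empty string stands in for "null".

```python
from urllib.parse import urlsplit
from urllib.request import Request

# Placeholder host name standing in for the forum's real host.
HOSTS_NEEDING_EMPTY_UA = {"forum.example.org"}
# Placeholder default; the real default string is whatever the HTTP library sends.
DEFAULT_USER_AGENT = "ExampleFetcher/1.0"


def user_agent_for(url):
    """Return the User-agent string to send when fetching this URL."""
    host = (urlsplit(url).hostname or "").lower()
    # Hosts known to reject the default string get an empty User-agent instead.
    return "" if host in HOSTS_NEEDING_EMPTY_UA else DEFAULT_USER_AGENT


def build_request(url):
    """Build (but do not send) a request carrying the chosen User-agent header."""
    return Request(url, headers={"User-Agent": user_agent_for(url)})


if __name__ == "__main__":
    req = build_request("http://forum.example.org/viewtopic.php")
    # urllib stores header names capitalised as "User-agent".
    print(repr(req.get_header("User-agent")))
```

Keeping the exceptions in one set means that further hosts with the same quirk can be added without touching the fetching code.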