Using the TIGERSearch treebank query program in Windows

Ciarán Ó Duibhín

TIGERSearch is a treebank query program developed by Wolfgang Lezius at the University of Stuttgart in the period 1999–2003. It was created for use with the TIGER German Treebank, but it can be used with treebanks (i.e. syntactically-analysed corpora) in several formats.

This document gives some information about setting up TIGERSearch in Windows.

The program is written in Java and requires a Java Runtime Environment to be installed on your computer. (I am using Java 6 Update 23 in Windows Vista Home Premium SP1.)
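
Before going further, it can be worth confirming that a Java runtime is actually reachable from the command line. The following is only a minimal sketch in Python, which is not part of TIGERSearch and is assumed to be installed separately. The function names are my own, and the sketch assumes the launcher is called java and prints a banner containing a quoted version string, such as java version "1.6.0_23" for Java 6 Update 23.

```python
import re
import subprocess


def parse_java_version(banner):
    """Extract the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', banner)
    return match.group(1) if match else None


def installed_java_version(java_cmd="java"):
    """Return the version reported by the Java launcher, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [java_cmd, "-version"], capture_output=True, text=True
        )
    except OSError:
        # The launcher was not found on PATH, or could not be started.
        return None
    # The launcher traditionally writes its banner to stderr, not stdout,
    # so both streams are searched.
    return parse_java_version(result.stderr + result.stdout)


if __name__ == "__main__":
    print(installed_java_version() or "No Java runtime found on PATH")
```

If this reports no runtime, install a Java Runtime Environment before trying to start TIGERSearch.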

Downloading

TIGERSearch is best downloaded from the copy held by Ciprian Gerstenberger at the University of Tromsø. Unless you want the program source, use the link there, under Tools, labelled TIGERSearch and TIGERRegistry [tar.gz].

Unpacking

The download is an archive file called TIGERSearchTools.tar.tar. It cannot be unpacked using WinZip (at least, not using version 9.0 SR-1), but it can be unpacked by WinRAR (version 3.80). The contents consist of a folder, TIGERSearchTools, which I suggest placing in your Program Files folder. The manual, which describes an earlier version, hints that Java programs may have problems with folder names containing a space, so you may wish to create a ProgramFiles folder for this purpose; I have had no trouble with Program Files.
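
If neither WinZip nor WinRAR is to hand, the archive can probably also be unpacked with a few lines of Python. This is a sketch under assumptions, not something I have tested on the actual download: it assumes the file is an ordinary tar archive, gzip-compressed or not, as the [tar.gz] label suggests. The function name unpack_archive and the destination path in the usage note are illustrative only.

```python
import tarfile
from pathlib import Path


def unpack_archive(archive, dest):
    """Unpack a (possibly compressed) tar archive into dest, keeping its folder structure.

    Returns the list of member names found in the archive.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile detect the compression from the file contents,
    # so a misleading extension such as .tar.tar does not matter.
    with tarfile.open(archive, mode="r:*") as tar:
        names = tar.getnames()
        # Only do this with an archive from a source you trust.
        tar.extractall(dest_path)
    return names
```

A call such as unpack_archive("TIGERSearchTools.tar.tar", r"C:\ProgramFiles") would then place the TIGERSearchTools folder, with its subfolders intact, under the chosen destination.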

The unpacked folder TIGERSearchTools contains two .jar files (Java programs, for search and registry) and two .sh files (a shell script to run each program), as well as subfolders at several levels. Beyond preserving this folder structure during unpacking, no further installation is necessary. Uninstallation involves only removal of the unpacked folders and files.

Running

The two .sh files should have their extensions changed to .bat for Windows. They can then be run, e.g. by a double-click on an icon, or from the command line. Both, however, refused to run at first for me. runTSearch.bat reported:
The TIGERSearch configuration could not be loaded. Error in building: no protocol: tigersearch.dtd. A default configuraion can be used instead. Continue? (I chose not to continue.)
while runTRegistry.bat reported:
Could not start TIGERRegistry due to error(s) reading the configuration file. Error in building: no protocol: tigerregistry.dtd.
Both the named files, tigersearch.dtd and tigerregistry.dtd, were present in subfolder config.
When I tried again some days later, both .bat files worked and ran their respective programs. I have no explanation for these errors, which have not recurred.
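
The renaming step described at the start of this section can also be scripted. The sketch below, again in Python and with a function name of my own choosing, makes .bat copies of the .sh files rather than renaming them, so that the originals survive; that is a design choice of the sketch, not a requirement of TIGERSearch.

```python
import shutil
from pathlib import Path


def make_bat_copies(folder):
    """Copy every .sh file directly inside folder to a sibling file with a .bat extension.

    Returns the names of the .bat files that were created.
    """
    created = []
    for script in sorted(Path(folder).glob("*.sh")):
        target = script.with_suffix(".bat")
        if not target.exists():  # never overwrite an existing .bat file
            shutil.copyfile(script, target)
            created.append(target.name)
    return created
```

Run against the unpacked TIGERSearchTools folder, this should leave one .bat file beside each .sh file, ready for a double-click or a Start Menu shortcut.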

The search program

The download already contains a considerable number of sample treebanks, in various languages, and the search program may be run immediately on any of these. The program contains a Help menu option; a User Manual can be downloaded from Stuttgart, and User Tutorials are available at Stuttgart. The information in all of these sources concerning TIGERSearch installation does NOT apply to the version recommended here for download.

It is suggested that the User Manual be placed in the TIGERSearchTools folder, and that Windows Start Menu shortcuts be created to the batch file for the Search Program, to the batch file for the Registry Program, and to the User Manual.

The registry program

Any treebank not supplied along with TIGERSearch must be downloaded separately and registered. The treebank must be in a format for which a conversion exists, and conversion will take place in the course of registration.

The full TIGER Corpus may be obtained and registered. The non-commercial licence should be accepted, and this will lead to the download page. The downloaded file, tigercorpus2.1.zip, contains the corpus in two formats, Negra export format (tiger_release_aug07.export) and TIGER-XML format (tiger_release_aug07.xml), as well as some documentation. Either format is acceptable for registration: I have used the Negra export format.
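
To see which corpus formats a downloaded zip file contains before registering anything, its contents can be listed without unpacking. This is a small illustrative sketch in Python. The function name is hypothetical, and the grouping relies only on the .export and .xml extensions mentioned above.

```python
import zipfile
from pathlib import PurePosixPath


def corpus_files_by_format(zip_path):
    """Group the members of a zip archive into Negra export files, XML files and others."""
    groups = {"export": [], "xml": [], "other": []}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            suffix = PurePosixPath(name).suffix.lower()
            key = {".export": "export", ".xml": "xml"}.get(suffix, "other")
            groups[key].append(name)
    return groups
```

Calling corpus_files_by_format("tigercorpus2.1.zip") would be expected to show one file under "export" and one under "xml", with the documentation under "other".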

The NEGRA Corpus can also be registered.

Problems

Nothing too serious so far! When the Registry program menu Options/Software Preferences is used, e.g. with the intention of switching off Corpus autoload, it reports: Error reading the configuration files: The configuration file TIGERSearch.lax does not exist.

The MORPHY German morphological processor

Completely separate from TIGERSearch, Wolfgang Lezius has also written the Morphy morphological processor for German (1999). Morphy will try to assign a part-of-speech tag and lemma to each word of a German text. Installation in Windows is unproblematic — just run the installer. A manual is available.

Disclaimer

This page is offered as a facility for corpus analysis on Windows. By using it, you are deemed to accept that the author bears no responsibility for any adverse consequences. Needless to say, he hopes that there will be no such consequences. He will be pleased to receive comments, but cannot promise to act upon them.


Ciarán Ó Duibhín
2012/02/01