Thursday, June 14, 2007

How to construct a thesaurus

This is the first CILIP course that I've attended, though it isn't the first time I've been here (while studying, I carried out a feasibility study for what was then the Library Association).

There were a couple of courses on yesterday, and coffee and registration for both were in one room. This is a great idea from a networking perspective, but the room was full of tables and chairs, so delegates weren't encouraged to mingle as much as they might have been with only a few tables. The result was that people who were talking spoke only to the one or two people in their immediate vicinity. (I was there first and sat at a table; the room began to fill up but no one sat at my table :( Could it be the shaved head, goatee and swastika tattooed on my forehead? Kidding.)

So, at 9:30, we were called up to our meeting room to begin the training session. The presenter, Keith Trickey, was very knowledgeable and a good speaker, and he dove straight into things. I have a bit of an academic hidden inside me who really enjoys talking/thinking about things like the ways in which nouns/verbs define adjectives/adverbs as much as the adjectives/adverbs describe the nouns/verbs. For example: heavy suitcase (a suitcase that is heavy) vs. heavy smoker (a person who smokes a lot). This relationship is important because in a thesaurus, words are usually separated from each other, meaning that the information carried by the context is lost. The same is true when verbs are turned into nouns – the verb carries some additional information in its conjugation and context (e.g. manages --> management) that it loses when it becomes a noun.

Anyway, having talked about these problems with dissecting words and compiling them into a thesaurus, it was straight into some of the rules and methods. (It is not my intention to record all of my notes here – I have assembled a mind map to supplement the course handouts for my future reference.) Then it was on to some exercises. The exercises were fun and helped to illustrate some of the challenges and to demonstrate some of the rules in practice.

Finally, we spent a bit of time talking about the process of creating a thesaurus. This included ‘top tips’ and a look at some of the software that can be used for creating thesauri. My only criticism of the course, which I thought was otherwise really good, was that the only piece of software that we talked about in any detail and had a look at was Multites. Quite a lot of time was given over to playing around with it – I started to get bored here... I don’t need to see each of the ways you can add a term in this particular program.

On the whole, my feedback form contained positive comments/scores (with the exception of this one point about the time spent on Multites).

So, what did I learn? Well, the rules for assembling a list (like whether terms should be singular or plural) are good practical things that I can take away and put to use when it comes to creating a subject thesaurus. I think that I also have a better understanding of the strengths and weaknesses of the different options (subject headings, taxonomies, thesauri, etc.). I also feel more confident about evaluating existing thesauri for application to my contexts.

Having consolidated my thoughts from yesterday, my next actions on this matter are identifying potential existing thesauri for our use and then looking for anyone else who is looking to introduce or has recently introduced, a rail industry-specific thesaurus. Having seen what is involved in creating one from scratch, doing so is definitely my last choice!

1 comment:

Leonard Will said...

David Bruce said "...looking for anyone else who is looking to introduce or has recently introduced, a rail industry-specific thesaurus."

There is a railway object name thesaurus on the MDA web site at

It is mainly concerned with objects rather than activities and the industry, but it might be some help, as you don't say exactly what your needs are.

Leonard Will