Wednesday, February 28, 2007

Rail Industry Metadata Standard (RIMS)

Having done some research, it looks like there isn’t yet a metadata standard that is either designed for, or in common use by, the rail industry. Although it wasn’t a surprise, it was a bit of a setback, as it meant having to assemble a proposed metadata schema for the description of materials within the rail industry, consult on it (starting with the R&D senior managers), and test it to validate it.

Building the metadata standard: the birth of RIMS
I have chosen to draw from the Dublin Core Metadata Standard (DCMS), and from the Dublin Core Terms (DCTerms) to flesh it out a little. I have also drawn from the e-Government Metadata Standard (eGMS) to ensure that any public sector aspects are covered. Now, the eGMS is itself based on the DCMS, so what I have described so far would pretty much describe the eGMS. The rail industry, and our R&D work in particular, requires a little more granularity than the eGMS currently provides, something the Cabinet Office acknowledges by encouraging enhancement and refinement to suit different contexts. So I have added some elements that are specific to the rail industry (e.g. asset type) and to our organisation (e.g. research topic). The complete picture is what we will consider to be the Rail Industry Metadata Standard (RIMS).
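To make the layering a little more concrete, here is a rough sketch of how the element sets might stack up. It is only an illustration: the fifteen Dublin Core elements are the standard ones, but the DCTerms and eGMS lists are partial, the eGMS names are assumed examples that should be checked against the published standard, and the layer labels and the `build_schema` helper are names I have made up for the purpose.

```python
# Illustrative sketch only: not the actual RIMS definition.

# The fifteen core Dublin Core elements.
DUBLIN_CORE = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]

# A partial selection of DCTerms refinements.
DCTERMS = ["abstract", "issued", "modified"]

# Assumed examples of eGMS-specific elements (check against the eGMS itself).
EGMS = ["status", "disposal"]

# Additions described in this post.
RAIL_INDUSTRY = ["asset_type"]
ORGANISATION = ["research_topic"]


def build_schema(*layers):
    """Merge (layer_name, elements) pairs into one element -> layer mapping.

    If an element appears in more than one layer, the earliest layer wins.
    """
    schema = {}
    for layer_name, elements in layers:
        for element in elements:
            schema.setdefault(element, layer_name)
    return schema


RIMS = build_schema(
    ("dublin_core", DUBLIN_CORE),
    ("dcterms", DCTERMS),
    ("egms", EGMS),
    ("rail_industry", RAIL_INDUSTRY),
    ("organisation", ORGANISATION),
)
```

Recording which layer each element came from should make it easier to see, during consultation, which parts are inherited from established standards and which are our own additions.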

Identifying, assembling and building controlled vocabularies
Once I had a proposed set of elements, I set about sorting out the required controlled vocabularies. In my, albeit relatively limited, experience, this is the most difficult part. For many of the fields drawn from established standards, it was pretty straightforward (e.g. date formats use the W3C-recommended date-time format). For some of the elements that I had to create (e.g. research topic), it was also pretty straightforward because such lists were specific to the company and, in many cases, already in use. Others, from both the established standards and the new set, were much more difficult. One such example is the asset type element: how granular do you go? For most of us, the term ‘locomotive’ is sufficiently descriptive but for our engineers it’s just too broad. My approach to these controlled vocabularies has been to put together a starting point and seek input and comments. So far, I have only engaged the R&D team, and the lists have been heavily refined and accepted by them.
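As a rough idea of how controlled vocabularies could be enforced when metadata is entered, here is a minimal sketch. The vocabulary terms are hypothetical placeholders rather than our agreed lists, the function names are my own, and the date check only covers the date-only forms of the W3C format (YYYY, YYYY-MM and YYYY-MM-DD), not the variants with a time.

```python
import re
from datetime import datetime

# Hypothetical placeholder terms, not the lists agreed with the R&D team.
VOCABULARIES = {
    "asset_type": {"diesel locomotive", "electric locomotive", "signal", "track"},
    "research_topic": {"safety", "engineering"},
}

# Date-only shapes of the W3C date-time format, keyed by string length.
_DATE_SHAPE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")
_FORMAT_BY_LENGTH = {4: "%Y", 7: "%Y-%m", 10: "%Y-%m-%d"}


def is_w3c_date(value):
    """Return True if value is a valid YYYY, YYYY-MM or YYYY-MM-DD date."""
    if not _DATE_SHAPE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, _FORMAT_BY_LENGTH[len(value)])
    except ValueError:
        return False  # right shape, but not a real calendar date
    return True


def validate(record):
    """Return a list of problems found in a metadata record (a dict)."""
    problems = []
    for element, value in record.items():
        if element == "date":
            if not is_w3c_date(value):
                problems.append(f"date: '{value}' is not in W3C date format")
        elif element in VOCABULARIES and value not in VOCABULARIES[element]:
            problems.append(f"{element}: '{value}' is not a controlled term")
    return problems
```

With these placeholder lists, a record with a date of 2007-02-28 and an asset type of ‘electric locomotive’ passes, while one with a date of 28/02/2007 and the too-broad ‘locomotive’ is reported twice.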

My other challenge has been sorting out a subject matter controlled vocabulary, and it is proving to be a somewhat daunting task. The Integrated Public Service Vocabulary (IPSV), recommended as the controlled vocabulary for DCMS Subject, treats everything to do with the rail industry as ‘Rail Transport’. Clearly, this isn’t going to be sufficient for our requirements. I started on this task in the same way as I approached some of the other controlled vocabularies, but it has proven to be too big. At the moment, it’s on hold: I am moving the rest of the project forward with a space reserved for subject tags, and starting to look for other initiatives, both here and around Europe, that are working towards a controlled vocabulary of some sort for the rail industry.

Handling the metadata
There are basically two different ways of managing document metadata. You can hold the metadata in a table that includes the location of each document described, and then use this table to search for and retrieve documents. Alternatively, you can embed the metadata in the documents themselves and search that. (In reality, the search software or engine will most likely create its own table of metadata, as in the first method, but this is a temporary table that is understood to need regular updating, so it is not the source of the metadata.) Each method has its strengths and weaknesses: the table is quicker and simpler to deliver, while embedded metadata travels with the document when someone downloads it to a local space and so isn’t lost.
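For the table approach, the idea can be sketched in a few lines using an in-memory SQLite database. This is just an illustration of the principle, not how any particular system does it: the table name, the column names and the helper functions are all hypothetical, and only a handful of elements are shown.

```python
import sqlite3


def make_catalogue():
    """Create an in-memory metadata table keyed on document location."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE document_metadata ("
        " location TEXT PRIMARY KEY,"  # where the described document lives
        " title TEXT NOT NULL,"
        " asset_type TEXT,"
        " research_topic TEXT)"
    )
    return conn


def add_document(conn, location, title, asset_type, research_topic):
    """Record the metadata for one document."""
    conn.execute(
        "INSERT INTO document_metadata VALUES (?, ?, ?, ?)",
        (location, title, asset_type, research_topic),
    )


def find_locations(conn, asset_type):
    """Return the locations of all documents with the given asset type."""
    rows = conn.execute(
        "SELECT location FROM document_metadata"
        " WHERE asset_type = ? ORDER BY location",
        (asset_type,),
    )
    return [row[0] for row in rows]
```

The search only ever touches the table; the documents themselves are fetched afterwards from the locations returned. That is also the weakness noted above: a downloaded copy of a document carries none of this with it.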

At the moment, we are also in the process of introducing a business process management system (we are calling it the Research Management System or RMS). The RMS will allow us to store documents as well as manage their production and approval. As a result, it makes sense to piggyback the metadata assignment on the RMS work, meaning that we will be going down the table route. This isn’t my preferred option but it is the one that will get metadata gathered and stored sooner. Once that process is embedded, we can look at technologies that will enable us to embed the gathered metadata into the files so that users downloading them from our website take the metadata with them.

One challenge that remains, and for which we have a few options but haven’t yet decided on one, is what we do with the legacy collection. It has been decided that past projects and their associated documents will not be uploaded into the RMS. So the RMS presents us with a solution for future publications but it doesn’t deal with the existing collection. It is most likely that we will upload the previous publications to a separate segment of the RMS, which will store the metadata in the same way, but there are a couple of alternative solutions…more on this as it develops.

Where from here?
The next thing to do with this standard is to confirm that it works in practice, which will be part of the embedding process for the RMS. We will then look to consult the rest of the organisation on the suitability of the metadata schema and its associated controlled vocabularies for wider use in the company. I guess you could think of our work as a bit of a pilot for the rest of the company.

I’d like to publish our schema and controlled vocabularies under a Creative Commons licence and invite other organisations to comment on them or use them in their own organisations.

Monday, February 26, 2007

PPDP on its way to CILIP

Karen has confirmed today that she has signed and sent over my PPDP to CILIP - yay!

Thursday, February 22, 2007

Personal Professional Development Plan Submitted

This morning, I finalised my PPDP, bound it, signed it, and sent it over to Karen for her to sign before she sends it in to CILIP.

It feels quite good to have that first 'assignment' completed and submitted. Neither Karen nor I is sure whether there is any feedback or acknowledgement of it so, in a couple of weeks, if I haven't heard anything, I might give them a ring to check that they have received it and that they are happy with it.

Friday, February 16, 2007

February mentoring meeting

Although it hasn’t been long since our January meeting, Karen and I met up last night in order to go through the personal professional development plan (PPDP) draft that I had sent over to her earlier in the week. This thing needs to be submitted in the first six months of the process and my six months are up in a few weeks so it was important that we get together to go over it. Aside from a few changes to the content and layout (and shamefully, a couple of typos), it looks about ready to go.

I have used the CDL PPDP for CILIP Chartership Candidates document as an outline for the different categories in my PPDP. At first, I found creating the PPDP quite daunting. With nothing more than a pretty generic template to guide me, I was struggling to get my head around how to structure it and what to include in it. The CDL document, though, provided me with a structure and some description of the sort of thing that should be included under each heading and sub-heading. This really helped to focus my effort and to generate the content. I haven’t followed the CDL document completely; I have left a couple of sub-points out, mainly because they just didn’t apply to me.

One of the most glaring (worrying?) blank spaces in the document is under the ethics heading. I just couldn't think of anything that I could put under that heading that fit with my role. We had a lengthy discussion about ethics in librarianship in general and then talked a bit about how that might apply to my current role. In the end, it looked like, unless something particularly out of the ordinary occurs, I would need to fill that space with actions to the effect of discussing it, reading about it, and abiding by CILIP's code of conduct. That code states that CILIP will pursue disciplinary measures against professionals acting in breach of it. I'm not too clear on quite how that would work, and one of the actions for me coming out of our conversation was:

Action: D - to post a question about ethics and CILIP's disciplinary actions

So the plan is for me to make the changes that we talked about, agree the final version, print it out and sign it before sending it over to her to sign and submit to CILIP. I’ll feel a lot better once this first ‘assignment’ is completed!