Classification and cataloguing principles

17 April 2005, 20:00

lexpress.mu | All the news from Mauritius in real time.

Part III

In a library, all documents are classified and catalogued. They are classified according to a classification scheme, such as the Dewey Decimal Classification, and catalogued according to rules such as the Anglo-American Cataloguing Rules. Both classification and cataloguing support better information retrieval. However, despite the efficiency and effectiveness of both practices, no library would be able to classify and catalogue the vast amount of information found on the Web. As Gill (n.d.) puts it, "the Web is simply too big for any single organisation or service to catalog, irrespective of whether they use people or computers to generate their indices".

Indexation of information

"In order for queries to be matched to documents, there must be an indexing language, which is a set of descriptors that describe the contents of documents and can be entered by users to retrieve them" (Hersh, 1998). Indexing is clearly a very important aspect of information retrieval. Indexing on the World Wide Web, however, differs from indexing in catalogues. In traditional information retrieval, indexing was performed manually by human indexers; catalogues, for instance, were indexed by people. On the Web this is not possible, because the sheer volume of Web-based documents is too great. As Gudivada et al. (1997) put it, "the sheer size of the Web together with the diversity of subject matter make manual indexing impractical".

Thus, with the advent of the Web, automatic indexing was introduced. Automatic indexing can help to improve retrieval results, since it "offers the potential to represent many more aspects of a document than manual indexing can" (Gudivada et al., 1997) and relies on full-text indexing. Unlike manual indexing, "automatic indexing does not require the tightly controlled vocabularies that manual indexes use" (Gudivada et al., 1997); instead, it makes use of natural language indexing.
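As a rough illustration of full-text automatic indexing, the sketch below builds a small inverted index in Python: every word of every document becomes an index term, with no controlled vocabulary involved. The sample documents, the function names and the simple tokenisation rule are assumptions made for this example, not a description of how any particular search engine works.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents, invented for this illustration.
documents = {
    1: "Cataloguing rules for library collections",
    2: "Automatic indexing of Web documents",
    3: "Manual indexing in library catalogues",
}

index = build_index(documents)
print(sorted(search(index, "library indexing")))  # [3]
```

Note that a query for "catalog" finds nothing here, even though two documents are about cataloguing: with natural language indexing, the searcher has to guess the exact word forms used in the documents.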

The absence of a controlled indexing vocabulary can be a difficulty for users, since they have no list of terms to consult while carrying out a search. As Rowley (1993) puts it, "natural language imposes the burden of vocabulary control on the user". Some would add that a controlled vocabulary helps to avoid ambiguity in a search, since it provides standardised search terms. Opinions therefore differ as to whether natural language or controlled vocabulary is better.

Vocabulary problem

Even though opinions differ as to whether natural language or controlled vocabulary is better, vocabulary undeniably remains a problem when searching the Web, because a diverse set of terms can be used for the same object. We may know what we are looking for, but when it comes to formulating the query we are not sure which terms to use. This sort of problem does not arise with a library catalogue, an electronic database or an OPAC, since these use controlled indexing standards and are not as vast as the Web.
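The vocabulary problem can be sketched in a few lines of Python. Below, a small hand-made thesaurus (an assumption standing in for a controlled vocabulary) maps variant terms onto one preferred descriptor, so that different words for the same object retrieve the same records; without it, a query only matches records that happen to use the same word. All terms, record titles and function names are invented for illustration.

```python
# Hypothetical controlled vocabulary: variant term -> preferred descriptor.
PREFERRED = {
    "car": "automobile",
    "cars": "automobile",
    "motorcar": "automobile",
    "automobile": "automobile",
}

# Hypothetical records, each written with a different word for the same object.
RECORDS = {
    "A": "history of the motorcar",
    "B": "buying a used car",
    "C": "automobile engineering",
}


def natural_language_search(query):
    """Match only records whose text contains the query word itself."""
    word = query.lower()
    return sorted(rid for rid, text in RECORDS.items() if word in text.lower().split())


def controlled_search(query):
    """Normalise query and record words to preferred descriptors before matching."""
    target = PREFERRED.get(query.lower(), query.lower())
    hits = []
    for rid, text in RECORDS.items():
        descriptors = {PREFERRED.get(w, w) for w in text.lower().split()}
        if target in descriptors:
            hits.append(rid)
    return sorted(hits)


print(natural_language_search("car"))  # ['B']
print(controlled_search("car"))        # ['A', 'B', 'C']
```

The trade-off is the one described above: the controlled search finds all three records, but only because someone has built and maintained the list of preferred terms, which is practical for a catalogue and not for the Web as a whole.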

Tara Héléna LAM