Taxonomy FAQs

Added by llempert, edited by llempert on May 18, 2014

Taxonomy FAQs are provided courtesy of the Taxonomy Community's Mentoring Committee for the purpose of supporting those new to the field of taxonomies.





Taxonomy Basics

What is the difference between a thesaurus and a taxonomy?

A thesaurus usually includes more kinds of relationships for each term than a simple taxonomy does, namely associative relationships. A thesaurus is also more term-focused, with an alphabetical arrangement of terms and their details, while a taxonomy is focused on the hierarchy.

Thesauri and taxonomies are both knowledge organization systems/controlled vocabularies that help information searchers find and retrieve the desired content. They both aim to support retrieval through improving precision and recall, and both allow the user to browse/navigate and explore for terms, based on their relationships with other terms. Both designations, "thesauri" and "taxonomies," are sometimes used interchangeably, and in some cases the distinctions are blurred, but usually there are differences.

It may be said that a thesaurus is like a taxonomy with the additional feature of associative relationships between terms, but the distinction is more complex than that. A thesaurus often has the immediate purpose of supporting consistent manual indexing, whereas a taxonomy supports end-user navigation and browsing. A thesaurus focuses on terms and their relationships, whereas a taxonomy focuses on hierarchy and structure. Thesauri are expected to conform to standards, either ANSI/NISO Z39.19 or ISO 2788 (since superseded by ISO 25964), whereas taxonomies do not necessarily conform; they may comply only with the standards' requirements for hierarchical relationships.

Thesauri are more commonly used for indexing and retrieving articles, reference documents, images, and multimedia records. Taxonomies are more commonly used for indexing and retrieving web pages, intranet pages and content, and content management system pages and content. Thesauri are usually displayed alphabetically, but often offer a hierarchical display option; an individual term record display is also an option. Taxonomies are either hierarchical or faceted, but not alphabetical, and they do not have term record displays.
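To make the structural difference concrete, here is a minimal Python sketch. The class names (`TaxonomyNode`, `ThesaurusTerm`), the helper functions, and the sample terms are illustrative assumptions and do not come from any standard or tool; the BT/NT/RT/UF labels follow common thesaurus display conventions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaxonomyNode:
    """A simple taxonomy node: hierarchy only (hypothetical structure)."""
    label: str
    children: List["TaxonomyNode"] = field(default_factory=list)


@dataclass
class ThesaurusTerm:
    """A thesaurus term record: hierarchy plus equivalence and associative links."""
    label: str
    used_for: List[str] = field(default_factory=list)   # UF: non-preferred synonyms
    broader: List[str] = field(default_factory=list)    # BT: broader terms
    narrower: List[str] = field(default_factory=list)   # NT: narrower terms
    related: List[str] = field(default_factory=list)    # RT: associative relationships


def hierarchy_lines(node, depth=0):
    """Render a taxonomy as an indented hierarchical display."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(hierarchy_lines(child, depth + 1))
    return lines


def term_record_lines(term):
    """Render one thesaurus term as a term record display."""
    lines = [term.label]
    for code, values in (("UF", term.used_for), ("BT", term.broader),
                         ("NT", term.narrower), ("RT", term.related)):
        for value in values:
            lines.append(f"  {code} {value}")
    return lines


# Sample data, invented for illustration.
beverages = TaxonomyNode("Beverages", [TaxonomyNode("Coffee"), TaxonomyNode("Tea")])
coffee = ThesaurusTerm("Coffee", used_for=["Java"], broader=["Beverages"],
                       narrower=["Espresso"], related=["Caffeine"])

print("\n".join(hierarchy_lines(beverages)))
print("\n".join(term_record_lines(coffee)))
```

The taxonomy node knows only its children, while the thesaurus term carries the extra relationship types that a term record display lists.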

How do taxonomies relate to metadata?

Taxonomies can support metadata and thus are the source for terms in many descriptive metadata fields. Not all metadata uses taxonomies, and not all taxonomies are used for metadata support, though.

If a taxonomy is created solely for the purpose of hierarchical organization and navigation, such as for folders on shared drives or libraries and folders in SharePoint, where users "put" a document in a folder, or as part of a web site information architecture, then metadata may not be involved. Public web pages do have meta tags, however, and the position of a document in a hierarchical taxonomy could be a trigger to use that taxonomy term as a meta tag descriptor for the web page.

If, however, a taxonomy is used as a source of terms to index or tag documents, whereby these taxonomy terms are then attached to and associated with the document, then the taxonomy terms do become part of the metadata for the document (along with the document title, date, author, description, and other metadata). This is quite common. Cataloging systems, content management systems (including web content management systems for web sites and intranets), document management systems, records management systems, digital asset management systems, etc. all support metadata for documents and digital assets.

Taxonomies, or simply other forms of controlled vocabularies, can be used as a source of terms in various separate metadata fields, such as Topic/Subject, Location, Organization, Person Name, and Document Type. More specific, customized metadata fields may also be desired, such as Product Type, Brand Name, Regulation, Issuing Department, etc.
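As a rough illustration of taxonomy terms acting as metadata, the following Python sketch checks a document's metadata record against one controlled vocabulary per field. The field names, vocabularies, and the `invalid_values` helper are all hypothetical; a real system would load its vocabularies from the taxonomy management tool.

```python
# Hypothetical controlled vocabularies, one per taxonomy-backed metadata field.
CONTROLLED_FIELDS = {
    "topic": {"Budgeting", "Recruiting", "Product safety"},
    "document_type": {"Policy", "Report", "Form"},
    "location": {"Boston", "London"},
}


def invalid_values(record):
    """Return {field: [values not found in that field's vocabulary]}."""
    problems = {}
    for field_name, vocabulary in CONTROLLED_FIELDS.items():
        bad = [value for value in record.get(field_name, []) if value not in vocabulary]
        if bad:
            problems[field_name] = bad
    return problems


record = {
    "title": "2014 Hiring Policy",      # free-text metadata, not taxonomy-controlled
    "author": "J. Smith",               # free-text metadata
    "topic": ["Recruiting", "Hiring"],  # "Hiring" is not a preferred term in this sketch
    "document_type": ["Policy"],
}

print(invalid_values(record))
```

Title and author stay free text, while the topic and document type fields only accept terms from their vocabularies.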

How do taxonomies relate to ontologies?

Ontologies are more complex knowledge organization systems than taxonomies. In ontologies, terms are categorized by classes and customized semantic relationships exist between terms of different pairs of classes. Ontologies are also supported by different technology.

How does taxonomy relate to Google and to other search engines?

Web search engines, such as Google, don't use a true taxonomy, but rather term equivalency tables to help support common matches, such as misspellings. Enterprise search engines, on the other hand, may incorporate a taxonomy for a more limited scope of content.
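The idea of a term equivalency table can be sketched in a few lines of Python. This is only a toy illustration under assumed data; the `EQUIVALENTS` table and `normalize_query` function are invented here and say nothing about how any actual search engine is implemented.

```python
# Hypothetical equivalency table: variant or misspelled word -> common form.
EQUIVALENTS = {
    "taxonmy": "taxonomy",
    "taxonomies": "taxonomy",
    "thesarus": "thesaurus",
    "thesauri": "thesaurus",
}


def normalize_query(query):
    """Replace each query word with its equivalent form, if one is listed."""
    words = query.lower().split()
    return " ".join(EQUIVALENTS.get(word, word) for word in words)


print(normalize_query("Taxonmy tools"))
```

Note that the table is flat: it maps variants to a common form but carries no hierarchy, which is what separates it from a taxonomy.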

Is a taxonomy that supports search different from a taxonomy for browsing/navigation, or could they be one and the same taxonomy?

It's better if they are different taxonomies. Although they may share many of the same terms, they are constructed differently for different purposes.

How is content tagged with a taxonomy?

It can be done manually, by human indexers, or automatically, by an autocategorization or autoclassification system, which is a component of some search software.

How does taxonomy relate to social media, like Twitter and Facebook?

Social networking, publishing and resource sharing sites such as Facebook, Twitter, Delicious and Flickr use a less formal kind of taxonomy dubbed a 'folksonomy' by Thomas Vander Wal. Users tag content with whatever words, real or self-generated, they believe will describe the content and improve its findability.

How does taxonomy relate to my corporate Intranet?

Many enterprise search tools, portals and digital asset management systems use taxonomies to better organize content and improve the chances it is found when needed. A taxonomy customized for your organization can increase efficiency, reduce duplication of content and reduce costs associated with recreating and storing redundant content.

 

Practicalities and Taxonomy Project Planning


I see the need for and would like to create a taxonomy in my organization. How do I make the case?

Research case studies of taxonomy projects in similar or complementary industries and propose a pilot program that will study the feasibility of doing a similar project in your own organization. One way to start is to do a baseline study to ascertain how long it takes to find information, how much content is recreated, and how satisfied users are with content management practices in your organization.

How do I get started creating a taxonomy in my organization?

Assuming you have made the case for it, the first step is to gather information by interviewing stakeholders and taking a survey or audit of the content that the taxonomy is supposed to cover. Before taxonomy development begins, a taxonomy project plan needs to be drawn up.

My IT department thinks that we don't need taxonomy and that our search engine is good enough; how can I convince them otherwise?

Try running some searches on terms that would have a See Also or synonymous relationship. Capture the precision and recall of such searches to show them the poor performance. Educate them on those qualities of controlled vocabularies. Be familiar with the capabilities of your particular search engine before you go in so you can have an intelligent conversation about how capable it is of integrating a taxonomy of some kind.
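Precision and recall for such a test search can be computed with a short Python sketch like the one below. The document IDs and the two result sets are made up for illustration; in practice you would judge relevance by hand for a small sample of content.

```python
def precision_recall(retrieved, relevant):
    """Precision = relevant hits / all retrieved; recall = relevant hits / all relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgment: these four documents are truly about cars.
relevant_docs = ["d1", "d2", "d3", "d4"]

# A plain keyword search for "cars" misses documents that say "automobiles".
plain_search = ["d1", "d2", "d9"]
# The same search with the synonym added, as a taxonomy could supply it.
synonym_search = ["d1", "d2", "d3", "d4", "d9"]

print(precision_recall(plain_search, relevant_docs))
print(precision_recall(synonym_search, relevant_docs))
```

In this invented case the plain search has a recall of 0.5, and adding the synonym raises recall to 1.0, which is the kind of before-and-after comparison that can make the argument.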

Can you point me to any good taxonomy management tools?

Tools include Data Harmony Thesaurus Master, Synaptica, SmartLogic, Mondeca, Wordmap, and MultiTes, among others. The choice of taxonomy management tool depends on whether you need additional capabilities, such as supporting indexing, search, ontology creation, etc.

How do I estimate the number of hours needed to create a taxonomy?

Don't. Gain experience in your organization and become a practiced taxonomist; then you will have a much clearer idea of how to estimate the time required for release 1 of any new taxonomy. Otherwise you'll just frustrate yourself and your organization.

Can I have my organization's taxonomy evaluated by someone external?

Definitely. Taxonomy consultants, especially independent consultants, will take on both small evaluation projects comprising as few as 20 hours and bigger taxonomy design and development projects.

If I have a relatively large taxonomy to create, is it practical to contract out to freelance taxonomy editors while still managing the project internally?

Yes, it is. If managing more than one taxonomy editor, then the internal project manager also needs to be skilled as a taxonomist to provide instruction and guidance to the taxonomy editors and reconcile any differences.

How and when should I engage subject matter experts in the creation and review of a taxonomy?

The initial taxonomy design plan stage, even for a technical subject, would not likely involve subject matter experts. If a specialty subject area is to be covered, subject matter experts may be consulted when preferred term names are being developed. They definitely should be included in the later stage of the review of a taxonomy.

Technical Aspects of Taxonomies

How do I connect a taxonomy to an indexing user interface? How do indexers access the taxonomy?

While some commercial taxonomy management software systems include an indexing interface, others do not, and many organizations with regular indexing operations develop their own simple indexing user interfaces that allow access to the imported taxonomy.

Commercial taxonomy management systems often include application programming interfaces (APIs) that developers use to access the terms in the taxonomies for presentation and for use in other systems. Some commercial products include modules that export the taxonomy in formats like XML or HTML that are easy to browse or search. Often, indexing systems convert XML outputs from taxonomy management systems to formats that they can use.
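As one possible sketch of such a conversion, the Python below flattens a nested XML export into simple rows that an indexing interface could load. The XML layout (`<taxonomy>` and nested `<term>` elements with `id` and `label` attributes) is an assumption made up for this example; each real product has its own export schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical export format, invented for illustration.
SAMPLE_EXPORT = """
<taxonomy>
  <term id="1" label="Beverages">
    <term id="2" label="Coffee"/>
    <term id="3" label="Tea"/>
  </term>
</taxonomy>
"""


def flatten(xml_text):
    """Convert nested <term> elements into flat rows with a parent_id column."""
    root = ET.fromstring(xml_text.strip())
    rows = []

    def walk(element, parent_id):
        # findall("term") returns only direct children, so recursion follows the hierarchy.
        for term in element.findall("term"):
            rows.append({"id": term.get("id"),
                         "label": term.get("label"),
                         "parent_id": parent_id})
            walk(term, term.get("id"))

    walk(root, None)
    return rows


for row in flatten(SAMPLE_EXPORT):
    print(row)
```

The flat rows keep the hierarchy through `parent_id`, so the indexing interface can rebuild the tree or simply offer a searchable term list.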

How is a taxonomy integrated into a content management system?

Depends on the content management system! Many CMSs include their own taxonomy and metadata modules or tools, although these tools usually don't have as many features as commercial taxonomy management systems. Since you often use subsets of your taxonomy to populate specified metadata fields in the CMS, some CMSs include tools that you use to define a metadata schema, some allow you to import an XML file, and others allow an API connection to a stand-alone taxonomy and metadata management package. Determine how you want to manage your taxonomy, and research the capabilities of your CMS, or of the systems being considered, before you decide how to integrate.

How is a taxonomy utilized and integrated into SharePoint?

Taxonomy can be utilized in SharePoint in different ways: (1) as a browsable navigation taxonomy in the hierarchy and names of content libraries and lists, (2) in the metadata of content items, such as the content types, and (3) through third-party integration, in search. SharePoint 2010 has a feature for the development and shared use of hierarchical taxonomies.

How do I import or export taxonomies from one system to another?

All commercial taxonomy management software packages offer at least some choices of import and export formats. CSV and variations of XML are the most common export formats. You will often work with developers to determine the formats you need to provide to the other systems.
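For a sense of what a CSV exchange might look like, here is a minimal Python round trip using the standard `csv` module. The column names (`id`, `label`, `parent_id`) and the sample terms are assumptions for this sketch; the columns a given tool expects will differ and should be agreed with your developers.

```python
import csv
import io

# Hypothetical column layout for the exchange file.
COLUMNS = ["id", "label", "parent_id"]


def to_csv(terms):
    """Serialize term rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(terms)
    return buffer.getvalue()


def from_csv(text):
    """Read CSV text back into a list of term rows."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


# Top-level terms have an empty parent_id in this sketch.
terms = [
    {"id": "1", "label": "Beverages", "parent_id": ""},
    {"id": "2", "label": "Coffee", "parent_id": "1"},
]

exported = to_csv(terms)
print(exported)
print(from_csv(exported) == terms)
```

A flat file like this carries hierarchy only through the parent column, so relationship types beyond broader/narrower usually call for an XML-based format instead.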

What is automatic categorization software? How does it relate to taxonomy?

Automatic categorization, or auto-classification, is a form of automated indexing that associates appropriate taxonomy terms with a document, based on one or more technologies (such as rules or machine learning) that automatically analyze the text and compare it with data stored with a given taxonomy term and possibly other data. Commercial auto-categorization tools often include their own taxonomy management modules.
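The rules-based approach can be sketched very simply in Python, assuming each taxonomy term stores a list of trigger keywords. The `RULES` table, the `categorize` function, and the `min_hits` threshold are hypothetical; commercial tools use far richer rules or trained models.

```python
import re

# Hypothetical rule data stored with each taxonomy term: keywords that trigger it.
RULES = {
    "Coffee": ["coffee", "espresso", "latte"],
    "Tea": ["tea", "oolong"],
}


def categorize(text, min_hits=1):
    """Assign every taxonomy term whose keywords occur at least min_hits times."""
    # Whole-word matching, so "team" does not trigger the keyword "tea".
    words = re.findall(r"[a-z]+", text.lower())
    assigned = []
    for term, keywords in RULES.items():
        hits = sum(words.count(keyword) for keyword in keywords)
        if hits >= min_hits:
            assigned.append(term)
    return assigned


print(categorize("Our espresso and latte menu"))
print(categorize("Team meeting notes"))
```

Raising `min_hits` trades recall for precision: fewer documents get the term, but those that do mention it more than in passing.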

Taxonomy Design Best Practices

When should I create a single taxonomy versus multiple taxonomies for the same project?

If there are to be any interrelations between the terms, even if they are in different hierarchies or facets, then they should be in the same taxonomy, but they can have different classification tags. Also consider the possibilities for future use of the terms.

How do I balance depth and breadth in creating a taxonomy? How many levels deep is ideal?

For consumers or the general public, usually not more than three levels, unless it is in an area that is naturally or logically hierarchical, such as product categories or geographic locations. For specialized expert users, more levels, up to four or five, may be appropriate and expected.
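If the taxonomy is available as data, its depth is easy to check against a guideline like this one. The Python sketch below assumes a nested-dictionary representation and a sample tree invented for illustration.

```python
def max_depth(tree):
    """Count the levels in a nested-dict taxonomy of the form {label: {child: {...}}}."""
    if not tree:
        return 0
    return 1 + max(max_depth(children) for children in tree.values())


def exceeds_limit(tree, limit):
    """True if the taxonomy is deeper than the chosen number of levels."""
    return max_depth(tree) > limit


# Hypothetical sample taxonomy with three levels.
sample = {
    "Beverages": {
        "Coffee": {"Espresso": {}},
        "Tea": {},
    },
    "Food": {},
}

print(max_depth(sample))
print(exceeds_limit(sample, 3))
```

A check like this flags only the deepest branch; whether an extra level is acceptable still depends on the audience and subject area.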

How do I decide whether to build a faceted taxonomy or a hierarchical taxonomy?

Faceted taxonomies are better suited to content that is of a consistent type, so that most data records have the same kinds of attributes that could serve as facets. Hierarchical taxonomies are more suitable when the taxonomy concepts can logically be categorized into hierarchies, such as product categories or geographic locations.
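The point about consistent attributes can be seen in a small Python sketch of faceted filtering. The product records, facet names, and helper functions are all invented for illustration; every record has the same attributes, which is what lets each attribute serve as a facet.

```python
from collections import Counter

# Hypothetical records of a consistent type: each has the same three attributes.
PRODUCTS = [
    {"name": "Trail Runner", "category": "Shoes", "color": "Blue", "brand": "Acme"},
    {"name": "City Loafer", "category": "Shoes", "color": "Brown", "brand": "Zenith"},
    {"name": "Rain Shell", "category": "Jackets", "color": "Blue", "brand": "Acme"},
]


def filter_by_facets(records, **selected):
    """Keep records that match every selected facet value."""
    return [record for record in records
            if all(record.get(facet) == value for facet, value in selected.items())]


def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    return Counter(record[facet] for record in records)


blue_acme = filter_by_facets(PRODUCTS, color="Blue", brand="Acme")
print([record["name"] for record in blue_acme])
print(facet_counts(PRODUCTS, "category"))
```

Users combine facet values in any order, whereas a hierarchy would force one fixed path, such as category first and color second.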

Should I have related terms in my taxonomy or just broader/narrower terms?

If the purpose is to support navigation or browsing for topics, then just broader/narrower terms are often sufficient. Faceted taxonomies also don't use associative relationships. If you are trying to support expert users, cover a domain of knowledge, or build a thesaurus for literature retrieval, then the addition of associative relationships is recommended.

For a faceted taxonomy, how many facets are ideal?

This depends on the type of content, but 3-6 is common. Ironically, the narrower the subject scope, the greater the number of facets that can be supported. A faceted taxonomy for a specific type of product may be able to support more than 6 and still be easy to use.