Metadata to enrich content
Stakeholders brainstorming everything they need to know about their content
Tutorial of custom metadata schema
Metadata schema guidelines
Metadata schema in CMS
Scenario
We had more than 12,000 pages of unstructured content across dozens of help projects, and authors couldn't easily see the full scope of that content or its granular administrative details.
We needed to find an easy way for project leads and contributors to see the different types of content that existed, the subject matter relationships between articles, the status of articles, and more. Comprehensive reports of such information would help us identify gaps, redundancies, and inconsistencies across our body of content, as well as help us manage our content throughout its development cycle.
By embedding detailed metadata directly in our content as HTML, project contributors could conveniently reference the data while working on their projects and we could leverage the data in the future to create more sophisticated, user-friendly content for our customers.
Process
I led team meetings to identify the different pieces of information that we needed to know about our content, which included native metadata such as file names and topic titles, as well as custom metadata such as subjects, content types (e.g., task, reference, or concept), outputs (i.e., the product and market for which the content exists), pending tasks, and others.
I researched existing metadata schemas (e.g., DITA and Dublin Core) to learn about best practices and to see if an existing schema fit our needs. Ultimately, we determined a custom schema was a better fit for us, so I designed one to accommodate the administrative, descriptive, and structural information that we needed.
We created a checklist to help project leads add metadata to their content. Our team's technical lead wrote a customizable BAT file and Perl script that crawled entire help projects and extracted all metadata into a spreadsheet, which authors could quickly generate and use to track and hone their content. For example, authors could filter the spreadsheet to view a list of all topics that were tagged with a particular subject or that required updates before being published.
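The crawl-and-extract step can be sketched roughly as follows. This is not the original BAT file and Perl script; it is a minimal Python illustration that assumes each topic is an HTML file carrying its metadata in <meta name="..." content="..."> tags. The tag names used in the usage note (subject, content-type), the report file name, and the function names are hypothetical.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class MetaCollector(HTMLParser):
    """Collects the <title> text and <meta name/content> pairs of one topic."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            name = attrs["name"]
            value = attrs.get("content") or ""
            # Repeated tags (e.g., several subjects) are joined with "; ".
            if name in self.fields:
                self.fields[name] += "; " + value
            else:
                self.fields[name] = value

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()


def extract_metadata(project_dir):
    """Walk a help project and return one dict of metadata per HTML topic."""
    root = Path(project_dir)
    rows = []
    for path in sorted(root.rglob("*.htm*")):
        if not path.is_file():
            continue
        collector = MetaCollector()
        collector.feed(path.read_text(encoding="utf-8", errors="replace"))
        row = dict(collector.fields)
        # Native metadata: file name and topic title.
        row["file"] = path.relative_to(root).as_posix()
        row["title"] = collector.title
        rows.append(row)
    return rows


def write_report(rows, csv_path):
    """Write the rows to a CSV file that opens directly in a spreadsheet."""
    custom = sorted({key for row in rows for key in row} - {"file", "title"})
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["file", "title"] + custom, restval="")
        writer.writeheader()
        writer.writerows(rows)


def topics_with(rows, field, value):
    """Return the topics whose metadata field contains the given value."""
    return [row for row in rows if value in row.get(field, "").split("; ")]
```

Under those assumptions, write_report(extract_metadata("my_help_project"), "metadata_report.csv") would produce the spreadsheet, and topics_with(rows, "subject", "printing") would mimic filtering it for every topic tagged with a particular subject.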
I led meetings and wrote guidelines to educate colleagues about metadata basics and how to use our custom metadata schema. I also created a tracking spreadsheet that recorded the precise syntax and values of metadata elements for all products so that any contributor could tag content accurately.
Results
We used the metadata reports to identify content gaps, eliminate redundancies across related articles, improve the consistency of article titles and file names, improve navigation between related topics, and more.
A few years later (after upgrading our tools and publishing our content to a website), we used the metadata in our Google Search Appliance to create search filters.
For more details about the subject metadata, please see how I created a taxonomy to organize and improve access to content.