Taxonomy Boot Camp 2024 Themes & Takeaways
Two weeks ago it was again my pleasure and privilege to attend Taxonomy Boot Camp (TBC) in Washington, DC. This was TBC's 20th year and, of the four times I've attended since 2016, it made the strongest argument yet for the enduring importance of well-designed and integrated taxonomies.
For those who have not had a chance to attend TBC, the "Boot Camp" title may sell the range of content covered short. In addition to tracks for "taxonomy fundamentals," this year's event also featured tracks for taxonomy applications, taxonomy operations, and advanced data and semantic layers. If that sounds like a lot of content ... it was!
Here is a roundup of the top themes and insights I took away from the sessions I attended. Many speakers have shared their slides and TBC has graciously made these available for anyone to download, so I’ve included links to decks where available. You can access the whole list here.
Top Themes & Takeaways
1. Taxonomy = Context
The most typical use of taxonomy is, arguably, to provide real-world context and meaning to collections of data. Taxonomies capture the domain-specific ways classes of instances relate and express those relationships in machine- and human-readable ways. This context is a critical element for "bigger picture" findability, discovery, and understanding.
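As a minimal sketch of what "machine-readable context" can look like, the Python below stores a tiny taxonomy as child-to-parent ("broader") links and walks them to build a breadcrumb for a term. The terms, the `BROADER` mapping, and the `context_path` function are all hypothetical, made up for illustration rather than taken from any of the talks.

```python
# Hypothetical mini-taxonomy: each term maps to its broader (parent) term.
BROADER = {
    "Architectural plans": "Drawings",
    "Drawings": "Records",
    "Aperture cards": "Microforms",
    "Microforms": "Records",
    "Records": None,  # top concept, no broader term
}


def context_path(term):
    """Return the chain of terms from the top concept down to `term`."""
    if term not in BROADER:
        raise KeyError(f"unknown term: {term}")
    path = []
    current = term
    while current is not None:
        path.append(current)
        current = BROADER[current]
    return list(reversed(path))


# The breadcrumb is the "context" a bare label lacks on its own.
print(" > ".join(context_path("Aperture cards")))
```

The same links serve both audiences: people read the breadcrumb, while software can use the parent chain to broaden a search or group related items.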
Joseph Busch presented a spot-on example of this use case in a study of a taxonomy project his firm, Taxonomy Strategies, performed on a huge set of government archives. In order to identify the correct attributes to make this content findable to the 3,000 employees who potentially needed access to it, Busch and team identified use cases for all the document types—which included architectural plans, binders, aperture cards, and more—and then identified the attributes that needed to be present to support those use cases. This taxonomy became the standard specification used by the company that did the work of digitizing the records.
Similarly, Richard Huffine, Assistant Director of Enterprise Information & Records at the FDIC, presented a case study of a Data Catalog project he undertook with the help of Holly Maykow and Madeleine Adamec, both consultants with the firm Enterprise Knowledge. As in the Taxonomy Strategies study above, the crucial element of the work at the FDIC wasn't so much technical as it was semantic: getting the metadata profile right was key. This was the FDIC's third attempt to bring a data catalog online, so the focus on the semantics (as opposed to just the technology) was a lesson they learned through direct experience.
Talks
- Using Taxonomy to Move From Paper to Knowledge Graph, Joseph Busch, Taxonomy Strategies
- Mastering Metadata With a Data Catalog, Richard Huffine, FDIC & Holly Maykow & Madeleine Adamec, Enterprise Knowledge
Bonus Content: Michele Ann Jenkins and Katherine Black from Dovecot Studio have shared the deck from their Taxonomy 101 workshop. I was moderating the Taxonomy Applications track at the time so couldn't look in, but their deck is a fantastic resource for folks getting started in taxonomy development—or looking for ways to improve an existing taxonomy.
2. Embedding Taxonomy Across the Organization
While many taxonomists, I suspect, would be perfectly happy immersed in a heads-down world of fine-tuning, measuring, and iterating terms and relationships, this (sadly) isn't a good strategy for helping an organization get the most out of its taxonomy efforts. Several presenters this year spoke directly to the challenges of making taxonomy "stick" in an organization and presented strategies for helping professional peers and colleagues see (and benefit from) the value of taxonomy work.
Ahren Lehnert, Principal Taxonomist at Nike, emphasized that advocating for taxonomy integration requires a vocal and sustained communication effort. Lehnert also recommended positioning taxonomy as a necessity (even a dependency) for projects. One does this, he said, by communicating not only what taxonomy is, but what it does for folks: how taxonomy makes their lives and work easier. We already know this as taxonomists, of course, but it is often far from obvious to our colleagues and collaborators.
In her rich talk on best practices for the solo taxonomist, Bonnie Griffin, a taxonomy consultant with Enterprise Knowledge, amplified the message of communicating the benefits of taxonomy work clearly, early, often, and loudly. Some of my favorite tips:
- find someone who can echo you, who can repeat your ideas as well as you can—and encourage them
- help people understand that working with you will make them look good
- provide different options for how to work, with different levels of value for different levels of involvement
- don't just say that AI can be improved by taxonomy; show that that's the case (for example, tagging risky content makes it possible to exclude that content from RAG responses)
- communicate early and often that taxonomy work is never done
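To make the tip about tagging risky content concrete, here is a rough sketch of filtering a corpus by taxonomy tag before anything reaches a retrieval step. It is not Griffin's implementation: the `Document` class, the tag names in `EXCLUDED_TAGS`, and the `eligible_for_retrieval` function are assumptions invented for this example, and a real RAG pipeline would apply the same filter inside its own retriever.

```python
from dataclasses import dataclass, field

# Hypothetical taxonomy tags that mark content as unsafe to surface in answers.
EXCLUDED_TAGS = {"risk/legal-hold", "risk/outdated"}


@dataclass
class Document:
    doc_id: str
    text: str
    tags: set = field(default_factory=set)


def eligible_for_retrieval(docs, excluded=EXCLUDED_TAGS):
    """Keep only documents that carry none of the excluded tags."""
    return [d for d in docs if not (d.tags & excluded)]


corpus = [
    Document("a1", "Current refund policy.", {"topic/refunds"}),
    Document("a2", "Superseded refund policy.", {"topic/refunds", "risk/outdated"}),
    Document("a3", "Draft under legal review.", {"risk/legal-hold"}),
]

# Only untagged-as-risky documents are handed to the retriever.
safe = eligible_for_retrieval(corpus)
print([d.doc_id for d in safe])
```

A before-and-after of the retrieved set is the kind of small demonstration that shows, rather than asserts, what the tagging work buys.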
In a Q&A exchange after her talk, Griffin recommended a strategy for showing that taxonomy work is never actually "finished": find a use case in the organization where things change a lot, and in which it would be embarrassing not to reflect those changes. The talk following hers, by two speakers from game company Electronic Arts, offered a pithy example: patterns of game player language are constantly evolving. Tracking and reflecting those changes in the interface can help players engage in and discover new content on their own terms.
Talks
- Stand Still Like the Hummingbird: Enterprise Taxonomy Strategy When Nothing Stands Still, Ahren Lehnert, Nike
- Consulting From Within: Best Practices for the Solo Taxonomist, Bonnie Griffin, Enterprise Knowledge
3. The Role of AI in Taxonomy Design
Weirdly, there were no talks about AI related to taxonomy design.
LOL ... As if, right?
Refreshingly, talks that had AI elements were much more balanced and focused on lessons learned than the breathless AI hype so common over the last couple of years. Presenters also moved past the fantasy that AI can "do all the things," and instead focused on a few explicit places "AI" (or, more properly, language models) can help: filling in for tedious tasks, augmenting human-led work, and expanding capabilities to outcomes that were simply not possible before.
AI for Tedious Tasks
Getting AI to help out with the repetitive tasks humans would rather pass on was the focus of my talk at this year's Boot Camp. I presented an example and a set of heuristics for "deskilling" language models to make them more consistent and more reliable for taxonomy's more mundane, mind-numbing slog work. I offered that by paying close attention to the composition, granularity, and observability of language model orchestration, we can use AI to accomplish the kinds of tasks we traditionally outsource to press-ganged client teams or grad students.
AI for Augmenting Human-Led Work
In addition to using AI to do work most humans would rather avoid in the first place, several presentations focused on ways to use language models to increase productivity and jump-start taxonomy projects. Rachael Maddison, a product manager with Adobe's Taxonomy as a Service Platform, gave a demonstration of how they're using AI in their "MetaHealth" content audit tools to dramatically improve tagging and targeting productivity for content creators (in one case measuring a 75% increase!).
Shannon Moore and Max Gaibort, taxonomists with Electronic Arts (EA), presented a case study of how they worked with EA's data science team to dramatically cut the time it takes to create new taxonomy terms from data in logs and customer support channels. By training an LLM to match candidate terms to their existing taxonomy, they were able to narrow 2,400 candidate terms down to 500. They emphasized, however, that human review remains essential. Sometimes the answers the model gave looked right, but didn't stand up to professional scrutiny. As Moore and Gaibort put it, "taxonomy shaped responses are not a taxonomy."
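The shape of that workflow (match candidates against existing terms, then send what's left to a human) can be sketched in a few lines. To be clear, this is not EA's method: the sketch swaps the trained LLM for plain string similarity from Python's standard-library `difflib`, and the labels, candidate terms, function names, and the 0.85 threshold are all invented for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical existing taxonomy labels and candidate terms mined from logs.
TAXONOMY = ["Season Pass", "Ranked Match", "Co-op Mode"]
CANDIDATES = ["season pass", "ranked macth", "crossplay"]


def best_match(candidate, labels):
    """Return (label, score) for the existing label most similar to candidate."""
    scored = [
        (label, SequenceMatcher(None, candidate.lower(), label.lower()).ratio())
        for label in labels
    ]
    return max(scored, key=lambda pair: pair[1])


def triage(candidates, labels, threshold=0.85):
    """Split candidates into likely duplicates and terms needing human review."""
    likely_duplicates, needs_review = [], []
    for term in candidates:
        label, score = best_match(term, labels)
        if score >= threshold:
            likely_duplicates.append((term, label))
        else:
            needs_review.append(term)
    return likely_duplicates, needs_review


duplicates, review_queue = triage(CANDIDATES, TAXONOMY)
print("Likely duplicates:", duplicates)
print("For human review:", review_queue)
```

Nothing in the review queue is accepted automatically, and the "likely duplicates" list deserves spot checks too: a high similarity score is exactly the kind of answer that can look right without being right.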
For a broader view of getting started with AI for taxonomy task augmentation, IA consultants Erik Lee from Factor Firm and Michele Ann Jenkins from Dovecot Studio presented a joint session on AI approaches to taxonomy and tagging. Their decks are full of valuable pointers and best practices. Perhaps my favorite takeaway was Jenkins's point that "for most projects, the net number of people you need won't go down — they'll just be able to focus on things that people are best at doing."
AI that Expands Human Capabilities
Like many folks in my professional circle, I'm a cautiously optimistic skeptic about the degree to which language models might improve the work we do as knowledge professionals. Most of the benefits I've seen have, when put to the test of real-world data and business conditions, shown incremental improvements over current practice. Benefits, to be sure, but benefits that should be carefully weighed against the ecological and human costs of the "AI" industry.
Unanimous.ai CEO Louis Rosenberg's opening keynote of TBC day two, however, showed me a language model application that stood out as categorically different from anything I'd ever seen. Unanimous.ai's Thinkscape application uses language models to orchestrate real-time large-scale conversations among people, bridging two otherwise intractable conversation barriers: humans can't "converse" in large groups, and humans can't participate in multiple conversations at once.
According to Rosenberg, the tool his company is building doesn't replace human intelligence; it channels it to support collective decision-making. Rosenberg notes that "people aren't data points; they are data processors." Thinkscape extends this processing capacity.
Rosenberg has not shared his presentation, which usually makes me reluctant to share an idea in a roundup like this, but this idea has kept my skeptic's brain humming since he laid it out. If it piques your interest at all, the "Introducing Thinkscape" demo on Unanimous.ai's website provides a good starting point for learning more.
Talks
- Orchestrating the Mundane: Deskilling AI for Consistency & Reliability, Andy Fitzgerald, Andy Fitzgerald Consulting
- Beyond Chat Bots: LLMs & ‘Human-in-the-Loop’ Taxonomy Development at EA Games, Shannon Moore & Max Gaibort, Electronic Arts
- Aligning AI Approaches for Taxonomy & Tagging (Part I), Erik Lee, Factor Firm
- Aligning AI Approaches for Taxonomy & Tagging (Part II), Michele Ann Jenkins, Dovecot Studio
- Collective Superintelligence: Humans in the Loop, Louis Rosenberg, Unanimous.ai
And that's it! ... for the shared talks I saw personally. Do have a look at TBC's Presentations page for a full list of shared talks—or, better yet, put the event on your calendar for next year: they'll be back in DC on the 17th and 18th of 2025.
In the meantime, if you're curious about previous TBC topics, check out Taxonomy Boot Camp 2022 Themes & Takeaways here.