Besides what the article goes into about auto-curation of social feeds reducing self-curation, the counterintuitive answer is that decentralized tagging requires strong centralization to work.
You need:
- agreement on what should be and what should not be tagged in a given domain
- standardized terminology (no multiple variants of tags)
- consistent grammar and formatting across all tags
- software support for tag editing that makes it easy to adhere to established tagging rules
- mechanisms to explain tagging rules to new users, at scale
- mechanisms to punish malicious/spam tagging (e.g. user history/reputation + bans)
Usually, all of these conditions together are only found in highly niche and specialized forums that care a lot about the quality of their content. While most large social platforms today do have some kind of tagging system (e.g. hash tags on Twitter/Instagram), the usefulness of these systems is generally limited due to the inherent difficulties of co-ordinating so many diverse users who have varying interests.
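To make the "standardized terminology" and "consistent formatting" points concrete, here is a minimal sketch of the kind of normalization step a tag editor might run before saving. The alias table, the underscore convention, and the function names are all assumptions made up for illustration, not any particular site's rules.

```python
import re

# Hypothetical alias table: variant spelling -> canonical tag.
# A real site would maintain this through moderation, not hard-code it.
ALIASES = {
    "js": "javascript",
    "java_script": "javascript",
    "ml": "machine_learning",
}


def normalize_tag(raw: str) -> str:
    """Lowercase, turn runs of spaces/hyphens into underscores, resolve aliases."""
    tag = re.sub(r"[\s\-]+", "_", raw.strip().lower())
    return ALIASES.get(tag, tag)


def normalize_tags(raw_tags):
    """Normalize a list of tags, dropping duplicates but keeping first-seen order."""
    seen = []
    for raw in raw_tags:
        tag = normalize_tag(raw)
        if tag and tag not in seen:
            seen.append(tag)
    return seen


if __name__ == "__main__":
    print(normalize_tags(["JS", "Java Script", "machine-learning", "ML"]))
```

The point is less the code than who owns the alias table: someone has to decide that "js" and "javascript" are the same thing, which is exactly the centralization being argued for.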
These are very nearly the exact opposite of the tagging ideas/motives on del.icio.us, an early popular system with tagging. There were lots of people who made similar arguments at the time as well! I thought they were wrong then and nothing has really appeared along the lines of what you're describing to convince me otherwise but it's probably worth taking more whacks at.
Ah, interesting! I never used del.icio.us myself, but from what I understand, it's fairly similar to Instagram (for example) in how tags work, optimizing for ease of use rather than ease of finding specific content. In my opinion, this is almost certainly the right decision for any platform where "absurdly detailed search" is not job #1, and I'm pretty sure I would have argued the same way as you did.
That said, having seen some of the centralized, intricate tagging systems out there, that let you filter down from Earth to one specific ant in the blink of an eye, that's what I think of when I think of "tagging" that's really effective. YMMV, but I would argue that if you can't type in 10 different tags and get 1 result that's exactly what you're looking for, tags aren't really delivering on their promise.
no. tags were a separate field, and you were shown the tags you used as prominently as the things themselves.
note that it was built as a memory aid so that you had a chance of finding something again later. your idea that it needs be exact, perfect, and precise or it won’t work is silly.
Ok, after reading this and also your top-level comment, seems we're talking about the two different types of tagging ("personal curation" vs "collaborative") which you list in that comment. That makes more sense.
Definitely agree that if you're mainly tagging for your future self, you can derive tons of value without constraints. Whereas if you have people tagging for others, that needs more support.
Although, if you're so inclined, you could also view these as a continuum --- with "personal" tagging being the apex of centralized consensus, and having to increasingly labor for consensus with every added user in the system.
Hmm yeah, kinda, or at least I remember it slightly differently - that the sort of 'winning point' for it was that it was useful for individual users because it helps you pick out tags you already have or add a new tag you didn't have, plus it tells you something about the url. The purposes overlap, of course and it's been a while.
The thing above is much closer to what some of the librarians were really into.
image boards like danbooru and similar websites are an example of the things mentioned by the parent comment. and from my personal experience they are the best implementation of tags I've seen on the internet. they are not perfect and still have a lot of room to improve, but they are way better than what's used and available elsewhere.
tags have their own descriptions and get moderated; other people can add tags and report them; you can alias tags, see related tags, and get feedback on them.
It is, but nobody involved in making del.icio.us used it (del predates it, too). The term was popular for a bit at the time and has now mostly gone out of use. Unsurprisingly, being a bit of a clunker.
It's two things, really. One is that 'folksonomy' was a term of its time, like, I dunno, 'blogosphere' or 'microformats', and is similarly obsolete. The other is that there were lots of people who thought or hoped that del.icio.us-style tags would lead to some sort of useful or interesting taxonomy or ontology, either emergent or by more prescriptive means (as in the initiating comment above), and that didn't really happen, perhaps because it wasn't (for the most part) what the tags were for. 'Folksonomy' is terminology that came from that line of thinking.
I've been generally inclined toward adopting an existing taxonomy (or at least a usefully-sized portion of it). Unfortunately, many of the more common ones are copyright-encumbered (e.g., Dewey classification). Library of Congress classification and subject headings are available. If somewhat unwieldy.
If you haven't come across it before, this is still a fun read, a kind of manifesto of the 'emergent ontology is going to be better than designed ontology' notion.
It made librarians pretty mad. At the end of the day, though, putting things like 'ontology' and 'keywords you type so you can find your bookmarks again later' in tension was a category error.
Thanks, excellent reference. I have seen that before, and Shirky hits it on the head here:
> What's being optimized is number of books on the shelf.
Libraries organise their content. And in a print-and-paper world, that content has locational specificity.
You don't have to organise materials by topic, but in an open-stacks model that's almost always preferable, and it has utility with closed stacks as well.[1] The indexing system can provide a space-transgressing capability through cross-references, but even that was originally bound into printed volumes or journals. As the 19th century progressed, it increasingly took the form of the more random-access, but still locationalised, index card within cabinets.
Having read A.R. Spofford's reports, the physicality of the archive dominates --- during his tenure the collection was housed in the North wing of the US Capitol, adjacent to what is now the Old Senate Chambers (directly west, best I can tell, with three floors and presumably some basement space). The collection had burnt shortly previously (1851), and beginning in the 1870s, Spofford was urging Congress to dedicate a building to the archive, and complained incessantly of the challenges in even enumerating the holdings due to crowding of books and other materials --- 800,000 volumes in a space meant for 200,000.
The new building opened the year of Spofford's retirement, in 1897. (Spofford himself lived to 1908, remaining as Chief Assistant Librarian, and presumably enjoying the main fruit of his labours in the Jefferson Building.)
Following Spofford, attention seems to turn to cataloguing. I'm reading those reports now, which expand greatly from the earlier brief form (about 6 pages during Spofford's term). I'm interested to see how that discussion develops.
It is made clear that the organisation borrows heavily from Francis Bacon's trinary distinction of history (or memory), poetry (or creative works), and philosophy (or reason), acquired through the Thomas Jefferson collection's organisation. (Jefferson's personal library re-established the Library after an earlier fire of British origin in 1814, during the War of 1812.)
I've also looked at other ontologies --- Diderot, the Encyclopaedia Britannica, Wikipedia, and several library classifications (Dewey, LoC, Colon, ...).
As with principles of truth, classifications should be useful, serving a purpose. Organising and using a collection should presumably be key amongst those purposes.
Hierarchical classifications, like metaphors, melt if pushed loudly enough.
I've been kicking around some ideas for an information management ... thing ... which I variously call KFC (Krell Functional/Fucking Context), docfs, and/or webfs (the latter two should be self-evident; see Plan 9's 9P for strong precedent). One notion is that search affords identity, in the sense that a search which is sufficiently defined to result in a single work is an identity function for that work. That's mated with the notion that a search can return different quantities of results: 0, 1, a few, or many. "Few" and "many" are relative to the ability to work with the results; they're not fixed quantities, and will vary by characteristics of the system and its user. For a skilled researcher with good tools, I'd suggest "few" might span orders of magnitude from 10 -- 1,000, capable of being winnowed further with effort. "Many" might be 100 and above (there's overlap, yes), to millions, billions, or more.
For purposes of this discussion, values from 2 -- 9 equal 10 ;-)
A search result is nothing (empty set), one (identity), or > 1 (list). Where a list is presented, further subdivisions might be suggested: publication dates, authors, subjects, publishers, concepts, statistically significant words or phrases, titles... At some point, those subdivisions would likely provide an identity.
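As a rough sketch of that none / identity / list distinction, plus the "suggest subdivisions" step: the names `classify` and `suggest_subdivisions`, the dict-shaped records, and the `few_limit` default are my own assumptions for illustration. The threshold is a parameter precisely because "few" depends on the user and the tools.

```python
from collections import Counter


def classify(results, few_limit=1000):
    """Label a result list as 'none', 'identity', 'few', or 'many'.

    few_limit is an assumed, user-dependent cutoff, not a fixed quantity.
    """
    n = len(results)
    if n == 0:
        return "none"
    if n == 1:
        return "identity"
    return "few" if n <= few_limit else "many"


def suggest_subdivisions(results, field_name):
    """Count how many results share each value of one field, most common first."""
    return Counter(r[field_name] for r in results).most_common()


if __name__ == "__main__":
    # Made-up records purely for demonstration.
    hits = [
        {"author": "a", "year": 1925},
        {"author": "a", "year": 1934},
        {"author": "b", "year": 1925},
    ]
    print(classify(hits))
    print(suggest_subdivisions(hits, "author"))
```

Picking one of the suggested values and searching again is the winnowing step; repeat until `classify` says "identity".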
My thinking is that a filesystem-like expression of qualifiers might identify a given work, or set of works. Or the empty set. Something like:
/docfs/au:fitzgerald/ti:gatsby
Or:
/docfs/dt:600-799/kw:transubstantiation
If you're interested in texts on a mediaeval theological concept.
There might be more paths to Gatsby:
/docfs/dt:1915-1930/kw:west egg/kw:daisy
Again: so long as you can come up with constructs which usefully winnow down the possible results set, you can find what you're looking for.
/docfs/dt:1980-1989/su:machine learning
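Here is a minimal sketch of how paths like these could be resolved, treating each `key:value` segment as one more filter over a catalogue. None of this exists yet: the `Work` record, the meaning I've given to `au`, `ti`, `kw`, `su`, and `dt`, and the two-entry catalogue are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Work:
    author: str
    title: str
    year: int
    keywords: set = field(default_factory=set)   # stored lowercase
    subjects: set = field(default_factory=set)   # stored lowercase


def parse_path(path):
    """Split '/docfs/au:x/ti:y' into [('au', 'x'), ('ti', 'y')]."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "docfs":
        raise ValueError("path must start with /docfs")
    qualifiers = []
    for part in parts[1:]:
        key, sep, value = part.partition(":")
        if not sep or not value:
            raise ValueError(f"bad qualifier: {part!r}")
        qualifiers.append((key, value.lower()))
    return qualifiers


def matches(work, key, value):
    """Test one qualifier against one work."""
    if key == "au":
        return value in work.author.lower()
    if key == "ti":
        return value in work.title.lower()
    if key == "kw":
        return value in work.keywords
    if key == "su":
        return value in work.subjects
    if key == "dt":  # 'start-end' year range, or a single year
        start, _, end = value.partition("-")
        return int(start) <= work.year <= int(end or start)
    raise ValueError(f"unknown qualifier: {key!r}")


def lookup(path, catalogue):
    """Return every work satisfying all qualifiers in the path."""
    qualifiers = parse_path(path)
    return [w for w in catalogue if all(matches(w, k, v) for k, v in qualifiers)]


CATALOGUE = [
    Work("F. Scott Fitzgerald", "The Great Gatsby", 1925, {"west egg", "daisy"}, {"fiction"}),
    Work("F. Scott Fitzgerald", "Tender Is the Night", 1934, {"riviera"}, {"fiction"}),
]

if __name__ == "__main__":
    for hit in lookup("/docfs/dt:1915-1930/kw:west egg/kw:daisy", CATALOGUE):
        print(hit.title)
```

Because every segment is an independent filter, segment order doesn't matter, which is the sense in which there can be many paths to the same work.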
The advantage of a controlled vocabulary is not that it is a strict hierarchy, which seems to be what many people get bogged down in, but that it is a useful and defined vocabulary.
And amongst the reasons why the US LoC classification and subject headings are useful is not that they are perfect and a single authority, but because over more than a century of use and adaptation they've acquired the institutional tools and processes to manage change and ambiguity reasonably well.
That, and the fact that they're freely available. Albeit in inconvenient forms.
(Another project I've been working on in fits and starts.)
________________________________
Notes:
1. Among alternatives I'm aware of is the SuDoc classification, used by the U.S. Superintendent of Documents, which is arranged by government department and date. Which turns out to be a useful way for grouping that corpus physically.
> Usually, all of these conditions together are only found in highly niche and specialized forums that care a lot about the quality of their content.
Ooh do any of these still exist? If you know of any I'd love the links to look at how they're doing.
I was an inveterate tagger, debating taxonomies and ontologies late into the night (I have now forgotten the difference between the two!) and tried to run a curated forum. Eventually I gave up for most of the reasons you highlight - but mainly because I realised no one was as OCD about classification as I was.
In another life I would have run and catalogued a university library.
Stack Overflow exhibits (or exhibited) all the points that parent mentioned. If you look at [the discussion of tags on the Meta site][0], and especially what's called ["burnination"][1] you'll see these issues being hashed out over time.
To sustain a tagging system like that it takes dedicated and invested individuals, and the corollary of that is that such people tend to generate a lot of discussion.
The social cataloging site Rate Your Music has a very in-depth genre tagging system. For each album and track, users debate and vote on which primary genres and secondary genres apply. For example Radiohead's OK Computer has Alternative Rock and Art Rock primary genres and a highly controversial Space Rock Revival secondary.
Each genre has a lineage of parent genres so each release tagged with a genre must also be a part of each parent genre. For example: Electronic > Electronic Dance Music > House > Tribal House. Also: Rock > Metal > Thrash Metal > Technical Thrash Metal.
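The "tagged with a genre implies every parent genre" rule is easy to sketch. This assumes each genre has exactly one parent, which is my simplification and may not match how the site actually models it; the `PARENT` table only contains the two chains given above.

```python
# Child genre -> parent genre, taken from the two example lineages above.
PARENT = {
    "Tribal House": "House",
    "House": "Electronic Dance Music",
    "Electronic Dance Music": "Electronic",
    "Technical Thrash Metal": "Thrash Metal",
    "Thrash Metal": "Metal",
    "Metal": "Rock",
}


def lineage(genre):
    """Return the chain from the root genre down to the given genre."""
    chain = [genre]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain[::-1]


def implied_genres(tagged):
    """Expand a set of tagged genres to include all of their ancestors."""
    implied = set()
    for genre in tagged:
        implied.update(lineage(genre))
    return implied


if __name__ == "__main__":
    print(" > ".join(lineage("Tribal House")))
```

With this in place a release only needs its most specific genre stored; a filter for "Rock" can match anything whose implied set contains it.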
There's a queue for submitting proposals for new genres and modifying the definitions of existing ones. There's also a complex chart system for filtering releases by genres, types, and descriptors. I think I last heard there were ~1300 music genres on the site.
Some good examples have been posted prior to my reply here --- I'll reiterate Archive of Our Own (fanfiction) and Danbooru (anime porn) as two fairly big sites with well-maintained tagging systems.
Both sites have abundant guides and documentation about their systems and it's very interesting to see how they manage the real-world complexity of their domains.
Here are some good entry points if you're interested: