Syllabuses and Sfadb.com rankings

This op-ed item in Sunday’s New York Times, What a Million Syllabuses Can Teach Us, caught my eye initially because part of my long-range plans for sfadb.com involves continuing to add ‘citation’ and ‘anthology’ references to complement the awards data. I’ve long thought that an interesting set of citations could come from college syllabuses, to see which SF/F titles are most frequently assigned in courses about SF (or anything else). But there are perhaps thousands of such college courses, so how would one do such research? Now the authors of this essay have done it for me, and posted the results online as their Open Syllabus Explorer beta project.

(The data on that site can be filtered by academic field, but by nothing finer for my purposes than “English”, where I see that Mary Shelley’s Frankenstein ranks 2nd, Shirley Jackson’s story “The Lottery” ranks 33rd, and William Gibson’s Neuromancer ranks 183rd. I think I can still compile much useful data from the site, just by using the search box on individual authors, and doing that for all authors with any kind of standing in my statistics so far.)

But what caught my eye especially in the NYT essay is their description of their “teaching score” metric. Rather than tally books and essays by the raw number of appearances on college curricula, they scale those counts so that the most frequently assigned title, “The Elements of Style”, gets a score of 100, any title with four or five class citations gets a score of 1, and everything in between is ranked on a percentage scale. Thus, on their overall list, Plato’s “Republic” has a score of 99.9, even though its raw count of 3573 is only 90.8% of the 3934 count for “The Elements of Style”. Whatever.
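For comparison, here is a minimal sketch of what a straight proportional scale would give, using only the two raw counts quoted in the essay. The function name `linear_score` is my own for illustration, and this is not the Open Syllabus formula (which the essay does not spell out); it just shows where the 90.8% figure comes from.

```python
# Raw counts quoted in the NYT essay.
counts = {"The Elements of Style": 3934, "Republic": 3573}


def linear_score(count, top_count):
    """Scale a raw count so the top title scores 100 (simple proportion)."""
    return 100.0 * count / top_count


top = max(counts.values())
scores = {title: round(linear_score(n, top), 1) for title, n in counts.items()}

for title, score in scores.items():
    print(f"{title}: {score}")
```

A simple proportion puts “Republic” at 90.8 rather than 99.9, so whatever their teaching score does, it is not a plain ratio of raw counts.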

Still, their metric is somewhat similar to the metric, or index, or score, I’ve been setting up to combine and rank the data in sfadb.com — except mine is a little more complex. (Note that these sfadb metrics are in-work and not yet posted, but they’ve been in development for several months.)

I’ll preview this content, which I expect to go online in the next few months, as follows:

1) Each book or story accumulates various references to awards (nominations and wins), citations (references in academic volumes or expert lists, almost all about books), and anthologies and collections (almost all about short fiction).

2) Rather than just tallying raw counts of these three kinds of references, the various references are weighted by (my editorial opinion of) their importance and contribution to the site’s overall goal of identifying the most significant books and stories for each year over the past century or more. This is done by awarding each title not just a single tally for each reference, but a weighted score depending on the reference. To take a simple example, a Hugo or Nebula award is worth more than any number of relatively minor awards; a reprint in an anthology that presents itself as a definitive volume (e.g. The SF Hall of Fame, or any number of Hartwell tomes) is worth more than a reprint in any individual anthology characterized by theme (e.g. cats, or time travel).

3) Next, the combined scores of every book and story are weighted against the combined *possible* score for any work published in that particular year. This is to remove the advantage recent books and stories have, among awards, given that there are so many awards in recent decades, compared to books and stories that appeared in earlier decades, especially the 1950s and ’40s, when there were few or no awards. Citation and anthology references are similarly allotted depending on the scope of each source.

4) Finally, after the previous steps have produced a percentage score for each book or story, the highest ranking book or story is awarded a 100% score, and all lower ranking items are scaled accordingly. This is where my metric is similar to the syllabus ranking… except that my notion was to score lower titles, percentage-wise, by the accumulated score, not some adjusted score based on the span of the complete list.
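The four steps above can be sketched roughly as follows. Everything specific here is an assumption for illustration: the reference categories, the weight values, the per-year “possible” scores, and the two example titles are all made up, and the real sfadb weighting is an editorial judgment that isn’t posted yet.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical weights per kind of reference (step 2); not the site's values.
WEIGHTS: Dict[str, float] = {
    "major_award": 10.0,          # e.g. a Hugo or Nebula
    "minor_award": 2.0,
    "definitive_anthology": 5.0,  # e.g. a "best of all time" volume
    "theme_anthology": 1.0,
    "citation": 3.0,
}

# Hypothetical maximum weighted score available to a work from each year (step 3).
POSSIBLE_BY_YEAR: Dict[int, float] = {1953: 20.0, 2015: 60.0}


@dataclass
class Title:
    name: str
    year: int
    references: List[str]  # step 1: accumulated references, by kind


def weighted_score(title: Title) -> float:
    """Step 2: sum the weight of every reference the title has earned."""
    return sum(WEIGHTS[ref] for ref in title.references)


def year_adjusted(title: Title) -> float:
    """Step 3: fraction of the score that was possible in the title's year."""
    return weighted_score(title) / POSSIBLE_BY_YEAR[title.year]


def scaled_scores(titles: List[Title]) -> Dict[str, float]:
    """Step 4: give the top title 100 and scale the rest proportionally."""
    adjusted = {t.name: year_adjusted(t) for t in titles}
    top = max(adjusted.values())
    return {name: round(100.0 * value / top, 1) for name, value in adjusted.items()}


# Two made-up titles: an older one with few references, a recent one with more.
titles = [
    Title("Older Novel", 1953, ["major_award", "definitive_anthology"]),
    Title("Recent Novel", 2015,
          ["major_award", "minor_award", "minor_award", "citation"]),
]
result = scaled_scores(titles)
print(result)
```

In this toy example the recent title has the higher raw weighted score (17 against 15), but the older title comes out on top once each is measured against what was possible in its year, which is the point of step 3.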

My latest notion (which I recorded in my development log on November 1st) is to call the overall score an ‘ARC Score’, or perhaps an ‘ARC Index’, where ARC is Awards/Reprints/Citations. This would entail renaming the tab on the main sfadb.com site from ‘anthologies’ to ‘reprints’.

Some of this weighting already appears on the current site, as on the annual Ranked Awards Titles page, e.g. for 2015, with the weighting of various awards implicit and not explicitly defined.
