Features and moans.

Monitoring the URLs that people submit to Greycite helps me find new sources of metadata and, of course, new quirks and ways that people ignore standards. So now Greycite supports title tags that are embedded in the body rather than the head (I’m looking at you, phdcomics [1]); Parsely metadata [2], which is supported by some publications such as The Atlantic [3]; and a “title” meta tag, as used by the American University Law Review [4] to encode paper titles (they don’t have any other useful metadata, though). Interestingly, Squidoo [5] appears to have supported Parsely in the past but no longer does so, for some reason.
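To give a flavour of the three cases, here is a rough sketch in Python. It is not Greycite's actual code: the class and function names are made up for illustration, and I am assuming that Parsely metadata arrives as a meta tag named "parsely-page" whose content is a JSON object with a "title" field.

```python
import json
from html.parser import HTMLParser


class TitleSniffer(HTMLParser):
    """Hypothetical helper: collect candidate titles wherever they occur."""

    def __init__(self):
        super().__init__()
        self.candidates = {}  # source name -> title string
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            # No check for <head>: a title tag in the body counts too.
            self._in_title = True
            self._title_parts = []
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "title" and content:
                self.candidates.setdefault("meta_title", content)
            elif name == "parsely-page" and content:
                # Assumption: the content attribute holds a JSON object.
                try:
                    data = json.loads(content)
                except ValueError:
                    return
                if isinstance(data, dict) and data.get("title"):
                    self.candidates.setdefault("parsely", data["title"])

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            text = "".join(self._title_parts).strip()
            if text:
                self.candidates.setdefault("title_tag", text)

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def sniff_titles(html):
    """Return a dict of candidate titles found in an HTML string."""
    parser = TitleSniffer()
    parser.feed(html)
    parser.close()
    return parser.candidates
```

Each source is kept under its own key rather than merged, so whatever calls this can decide which one to trust when they disagree.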

References

  1. "PHD Comic: What is Aging?" http://www.phdcomics.com/
  2. "Parse.ly Crawler — parselyapi 0.1 documentation" http://www.parsely.com/api/crawler.html
  3. "The Atlantic — News and analysis on politics, business, culture, technology, national, international, and life – TheAtlantic.com", The Atlantic. http://www.theatlantic.com/
  4. "American University Law Review", American University Law Review. http://www.aulawreview.org/
  5. "Squidoo is now HubPages", Squidoo. http://www.squidoo.com/