Metadata scraper for Danish books
I have been looking for a plugin that can scrape metadata from Danish sites, but didn't find any.
So I downloaded a few other plugins to look at the code, but it is a bit too complex for me - I have never tried coding in Python.
I browsed a few possible sites, and would recommend mofibo.dk or saxo.dk.
Mofibo.dk seems to have the most metadata - there is a category, tags, series information and ratings. The search engine is a bit odd, though. If I search for a specific ISBN, the book shows up among a few others. If I click on the book, it expands within the page with some information. If I instead right-click and open the book in a new tab, all the information is available, including the ISBN.
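Since an ISBN search returns the book among a few others, a plugin would probably have to open each hit and keep only the one whose ISBN actually matches. A rough sketch of that matching step, for whoever picks this up: the `results` list of dicts with an "isbn" key is purely hypothetical and stands in for whatever a scraper would pull from each book's own page. Nothing here touches the real site.

```python
import re


def normalize_isbn(raw):
    """Drop hyphens, spaces and other separators; keep digits and X."""
    return re.sub(r"[^0-9Xx]", "", raw).upper()


def is_valid_isbn13(isbn):
    """Check the ISBN-13 check digit (weights alternate 1, 3, 1, 3, ...)."""
    if len(isbn) != 13 or not isbn.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(isbn))
    return total % 10 == 0


def pick_matching_result(wanted_isbn, results):
    """Return the first result whose ISBN equals the wanted one, else None.

    `results` is a hypothetical list of dicts with an "isbn" key; the real
    structure depends on how the page is parsed.
    """
    wanted = normalize_isbn(wanted_isbn)
    for result in results:
        if normalize_isbn(result.get("isbn", "")) == wanted:
            return result
    return None
```

Comparing normalized ISBNs means it does not matter whether the site prints the number with or without hyphens.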
Saxo.dk has a better search engine, but the metadata seems inferior. Often it only contains the name of a series, but not the number. Other times the tags include things like age recommendations.
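The kind of cleanup this would need could look roughly like the sketch below: drop tags that are really age recommendations, and split a series string into a name and an optional number. The tag formats ("Fra 9 år", "9-12 år") and the series formats ("Name 3", "Name, bind 3", "Name #3") are my assumptions, not something verified against the actual pages, so the patterns would need adjusting to real examples.

```python
import re

# Assumed shape of age-recommendation tags, e.g. "Fra 9 år" or "9-12 år".
AGE_TAG = re.compile(r"^(fra\s+)?\d+(\s*-\s*\d+)?\s*år$", re.IGNORECASE)

# Assumed shape of a series string ending in a number,
# e.g. "Name 3", "Name, bind 3" or "Name #3".
SERIES = re.compile(
    r"^(?P<name>.*?)[\s,]*(?:#|nr\.?\s*|bind\s*)?(?P<index>\d+)$",
    re.IGNORECASE,
)


def clean_tags(tags):
    """Strip whitespace and drop tags that look like age recommendations."""
    stripped = [t.strip() for t in tags]
    return [t for t in stripped if not AGE_TAG.match(t)]


def split_series(text):
    """Return (name, number); number is None when no number is present."""
    text = text.strip()
    match = SERIES.match(text)
    if match and match.group("name"):
        return match.group("name"), int(match.group("index"))
    return text, None
```

A series name that itself ends in a number would be split wrongly by this, so real examples from the site would have to decide how strict the pattern should be.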
If anyone is up for the task, I will gladly help with everything I can - crawling through the HTML tags, translating stuff, finding good examples and testing the plugin.