Old 01-30-2012, 12:08 AM   #1
sk1
Member
sk1 began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Dec 2011
Device: Nook Simple Touch
Rules of Civil Procedure epub from website

I am looking for a way to convert a website containing the Tennessee Rules of Civil Procedure to a periodical-like epub, with each Rule (e.g. Rule 4) treated as a Section and each sub-rule (e.g. Rule 4.01) treated as an article.

The Table of Contents is here:
http://www.tncourts.gov/courts/supre...ivil-procedure

I am not sure whether using a recipe (or calibre) is the best or correct way to go about this, and would love some guidance. Thanks much!
Old 01-30-2012, 10:01 AM   #2
a.peter
Enthusiast
a.peter began at the beginning.
 
Posts: 28
Karma: 10
Join Date: Sep 2011
Device: Sony PRS-350, Kindle Touch
Quote:
Originally Posted by sk1 View Post
I am not sure whether using a recipe (or calibre) is the best or correct way to go about this, and would love some guidance. Thanks much!
I took a look at the website using Firebug. I just had to modify an old newspaper recipe to make it work with the Tennessee rules page.

Your idea is right. I've used the rule number and the optional text as a 'ressort' (German for section) and each sub-rule as an 'article' inside that 'ressort'.

Here it is:

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe
import re

class TennesseeRulesCivilProcedure(BasicNewsRecipe):
    __author__    = 'ape'
    __copyright__ = 'ape'
    __license__   = 'GPL v3'
    language      = 'en'  # the rules pages are in English
    description   = 'Tennessee State Courts: Rules of civil procedure'
    version       = 1
    title         = u'TN Courts: Rules of civil procedure'
    timefmt       = ' [%d.%m.%Y]' 

    no_stylesheets = True
    remove_javascript = True
    use_embedded_content = False
    publication_type = 'newspaper'
    
    keep_only_tags = [dict(name='div', attrs={'id':'main-content'})]
    
    INDEX = 'http://www.tncourts.gov/courts/supreme-court/rules/rules-civil-procedure/'
    
    def parse_index(self):
        base = 'http://www.tncourts.gov'
        ressorts = []   # section titles, in the order they appear on the index page
        articles = {}   # section title -> list of article dicts
        
        soup = self.index_to_soup(self.INDEX)
        
        # Get list of links to ressorts from index page
        rules = soup.findAll('table', attrs={'class': re.compile('views-table')})
        for rule in rules:
            caption = rule.findAll('caption')[0].string
            articles[caption] = []
            ressorts.append(caption)
            sub_rules = rule.findAll('td', attrs={'class': re.compile('views-field')})
            for sub_rule in sub_rules:
                article = {
                    'title': sub_rule.contents[0].strip() + ' ' + sub_rule.a.string,
                    'date': u'',
                    'url': base + sub_rule.a['href'],
                    'description': sub_rule.a.string,
                }
                articles[caption].append(article)
        answer = [(ressort, articles[ressort]) for ressort in ressorts if ressort in articles]
        # answer structure:
        # [('genre1', [{'date': ..., 'url': ..., 'description': ..., 'title': ...},
        #              {'date': ..., 'url': ..., 'description': ..., 'title': ...}]),
        #  ('genre2', [{'date': ..., 'url': ..., 'description': ..., 'title': ...}])]
        # list[ tuple(genre, list[{article}, ...]), tuple(genre, list[{article}, ...]) ]
        return answer
        
    def get_masthead_url(self):
        return 'http://www.tncourts.gov/sites/all/themes/tncourts/assets/images/logo-new.png'


The file is here: TennesseeRulesCivilProcedure.recipe.txt.

I hope you don't mind that I've written the whole recipe for you. Don't hesitate to ask any questions.
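
For anyone adapting the recipe to another site, the list that parse_index returns can be sketched without calibre installed. This is only a minimal illustration and not part of the attached recipe: the helper name build_index, the placeholder section titles, and the placeholder paths are all made up for the example. The real recipe scrapes these values from the index page.

```python
BASE = 'http://www.tncourts.gov'


def build_index(rows):
    """Group (section, number, heading, href) rows into the
    [(section_title, [article_dict, ...]), ...] shape used by parse_index."""
    sections = {}  # dicts keep insertion order in Python 3.7+
    for section, number, heading, href in rows:
        article = {
            'title': number + ' ' + heading,  # e.g. 'Rule 4.01 ' + heading
            'url': BASE + href,               # index links are site-relative
            'date': '',
            'description': heading,
        }
        sections.setdefault(section, []).append(article)
    return list(sections.items())


# Placeholder rows: titles and paths are hypothetical, not the real ones.
sample_rows = [
    ('Rule 4 (placeholder)', 'Rule 4.01', 'First sub-rule', '/placeholder/rule-4-01'),
    ('Rule 4 (placeholder)', 'Rule 4.02', 'Second sub-rule', '/placeholder/rule-4-02'),
    ('Rule 5 (placeholder)', 'Rule 5.01', 'Another sub-rule', '/placeholder/rule-5-01'),
]

index = build_index(sample_rows)
for section, articles in index:
    print(section)
    for article in articles:
        print('   ', article['title'], '->', article['url'])
```

Each tuple becomes one section of the periodical and each dict becomes one article inside it, with the article body downloaded from its 'url'.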
Old 01-30-2012, 11:53 AM   #3
sk1
Member
sk1 began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Dec 2011
Device: Nook Simple Touch
a.peter: you're my hero. I'm at work right now, but I will test this when I'm home. You rock, seriously.
Old 01-31-2012, 01:53 PM   #4
sk1
Member
sk1 began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Dec 2011
Device: Nook Simple Touch
This worked extremely well. Thanks so much!