Old 01-30-2012, 12:08 AM   #1
sk1
Member
sk1 began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Dec 2011
Device: Nook Simple Touch
Rules of Civil Procedure epub from website

I am looking for a way to convert a website containing the Tennessee Rules of Civil Procedure to a periodical-like epub, with each Rule (e.g. Rule 4) treated as a Section and each sub-rule (e.g. Rule 4.01) treated as an article.

The Table of Contents is here:
http://www.tncourts.gov/courts/supre...ivil-procedure

I am not sure whether using a recipe (or calibre) is the best or correct way to go about this, and would love some guidance. Thanks much!
Old 01-30-2012, 10:01 AM   #2
a.peter
Enthusiast
a.peter began at the beginning.
 
Posts: 28
Karma: 10
Join Date: Sep 2011
Device: Sony PRS-350, Kindle Touch
Quote:
Originally Posted by sk1
I am not sure whether using a recipe (or calibre) is the best or correct way to go about this, and would love some guidance. Thanks much!
I took a look at the website using Firebug. I just had to modify an old recipe for a newspaper to make it work with the Tennessee rules page.

Your idea is right. I've used the rule number and the optional text as a 'ressort' (section) and each sub-rule as an 'article' inside that 'ressort'.

Here it is:

Code:
from calibre.web.feeds.news import BasicNewsRecipe
import re

class TennesseeRulesCivilProcedure(BasicNewsRecipe) :
    __author__    = 'ape'
    __copyright__ = 'ape'
    __license__   = 'GPL v3'
    language      = 'en'
    description   = 'Tennessee State Courts: Rules of civil procedure'
    version       = 1
    title         = u'TN Courts: Rules of civil procedure'
    timefmt       = ' [%d.%m.%Y]' 

    no_stylesheets = True
    remove_javascript = True
    use_embedded_content = False
    publication_type = 'newspaper'
    
    keep_only_tags = [dict(name='div', attrs={'id':'main-content'})]
    
    INDEX = 'http://www.tncourts.gov/courts/supreme-court/rules/rules-civil-procedure/'
    
    def parse_index(self):
        base = 'http://www.tncourts.gov'
        ressorts = []   # section titles, in page order
        articles = {}   # section title -> list of article dicts
        
        soup = self.index_to_soup(self.INDEX)
        
        # Each 'views-table' on the index page is one rule (ressort);
        # its caption is the section title and its cells link to the sub-rules
        rules = soup.findAll('table', attrs={'class': re.compile('views-table')})
        for rule in rules:
            caption = rule.findAll('caption')[0].string
            articles[caption] = []
            ressorts.append(caption)
            sub_rules = rule.findAll('td', attrs={'class': re.compile('views-field')})
            for sub_rule in sub_rules:
                article = {
                    # text before the link (the sub-rule number) plus the link text
                    'title': sub_rule.contents[0].strip() + ' ' + sub_rule.a.string,
                    'date': u'',
                    'url': base + sub_rule.a['href'],
                    'description': sub_rule.a.string,
                }
                articles[caption].append(article)
        # `in` works on Python 2 and 3; dict.has_key() was removed in Python 3
        answer = [(ressort, articles[ressort]) for ressort in ressorts if ressort in articles]
        # answer structure:
        # [('section1', [{'date': ..., 'url': ..., 'description': ..., 'title': ...},
        #                {'date': ..., 'url': ..., 'description': ..., 'title': ...}]),
        #  ('section2', [{'date': ..., 'url': ..., 'description': ..., 'title': ...}])]
        # i.e. a list of (section title, list of article dicts) tuples
        return answer
        
    def get_masthead_url(self):
        return 'http://www.tncourts.gov/sites/all/themes/tncourts/assets/images/logo-new.png'


The file is here: TennesseeRulesCivilProcedure.recipe.txt.

I hope you don't mind that I've done the whole recipe for you. Don't hesitate to ask any questions.
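
For anyone adapting the recipe to another site, here is a minimal sketch of what parse_index has to produce: a list of (section title, list of article dicts) tuples. It is not the recipe itself. To run outside calibre it uses the standard library's html.parser instead of the BeautifulSoup object returned by index_to_soup, and urljoin instead of string concatenation for the URLs. SAMPLE_HTML, RuleIndexParser and build_index are hypothetical names, and the markup and the /sample/... links are made up from the selectors in the recipe, not copied from the real page. Unlike the recipe, the sketch starts a section at every caption tag and does not check the table class.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Made-up markup modeled on the recipe's selectors (assumption, not the real page)
SAMPLE_HTML = """
<table class="views-table">
  <caption>Rule 4: Sample section</caption>
  <tr><td class="views-field views-field-title">Rule 4.01: <a href="/sample/rule-401">Sample sub-rule</a></td></tr>
  <tr><td class="views-field views-field-title">Rule 4.02: <a href="/sample/rule-402">Another sample sub-rule</a></td></tr>
</table>
<table class="views-table">
  <caption>Rule 5: Another sample section</caption>
  <tr><td class="views-field views-field-title">Rule 5.01: <a href="/sample/rule-501">Third sample sub-rule</a></td></tr>
</table>
"""


class RuleIndexParser(HTMLParser):
    """Collects one section per <caption> and one article per linked cell."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.sections = []        # [[caption, [article dicts]], ...]
        self._in_caption = False
        self._in_cell = False
        self._in_link = False
        self._cell_text = ''      # text in the cell outside the link
        self._link_text = ''      # text inside the link
        self._href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'caption':
            self._in_caption = True
            self.sections.append(['', []])
        elif tag == 'td' and 'views-field' in (attrs.get('class') or ''):
            self._in_cell = True
            self._cell_text = self._link_text = ''
            self._href = None
        elif tag == 'a' and self._in_cell:
            self._in_link = True
            self._href = attrs.get('href')

    def handle_data(self, data):
        if self._in_caption:
            self.sections[-1][0] += data
        elif self._in_link:
            self._link_text += data
        elif self._in_cell:
            self._cell_text += data

    def handle_endtag(self, tag):
        if tag == 'caption':
            self._in_caption = False
        elif tag == 'a':
            self._in_link = False
        elif tag == 'td' and self._in_cell:
            self._in_cell = False
            if self._href and self.sections:
                link = self._link_text.strip()
                self.sections[-1][1].append({
                    'title': (self._cell_text.strip() + ' ' + link).strip(),
                    'date': '',
                    'url': urljoin(self.base, self._href),
                    'description': link,
                })


def build_index(html, base):
    """Return [(section title, [article dicts]), ...] in page order."""
    parser = RuleIndexParser(base)
    parser.feed(html)
    parser.close()
    return [(caption.strip(), articles) for caption, articles in parser.sections]


if __name__ == '__main__':
    for section, articles in build_index(SAMPLE_HTML, 'http://www.tncourts.gov'):
        print(section)
        for article in articles:
            print('   ', article['title'], '->', article['url'])
```

To try the real recipe rather than this sketch, the calibre manual suggests running something like ebook-convert myrecipe.recipe .epub --test -vv, which fetches only a couple of articles per section and prints verbose output.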
Old 01-30-2012, 11:53 AM   #3
sk1
a.peter: you're my hero. I'm at work right now, but I will test this when I'm home. You rock, seriously.
Old 01-31-2012, 01:53 PM   #4
sk1
This worked extremely well. Thanks so much!