Old 03-21-2013, 08:45 PM   #1
realbase
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kobo Glo
Dutch Weekly Newspaper "De Groene Amsterdammer" Recipe

This is the recipe I built for "De Groene Amsterdammer" (for readers with a subscription only). They publish the new edition every Wednesday evening, so I scheduled mine for Thursday morning. Have fun with it!

Code:
#!/usr/bin/env  python2
# -*- coding: utf-8 -*-
#Based on veezh's original recipe and Kovid Goyal's New York Times recipe and Snaab's NRC-epub recipe

__license__   = 'GPL v3'
__copyright__ = '2013, RealBase'

'''
www.groene.nl
'''
import os, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile
from calibre.ebooks.conversion.cli import main

class GroeneAmsterdammer(BasicNewsRecipe):

    title = u'De Groene Amsterdammer'
    description = u'De ePub-versie van de Groene Amsterdammer'
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True

    __author__ = 'Realbase'

    conversion_options = {
        'no_default_epub_cover' : True
    }

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('https://www.groene.nl/sessie/new')
            print [form for form in br.forms()][1]          
            br.select_form(nr=1)
            br['user_session[login]']   = self.username
            br['user_session[password]'] = self.password
            br.submit()
        return br

    def build_index(self):

        domain = "http://www.groene.nl"

        url = domain + "/deze-week.epub"
        #print url

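        try:
            br = self.get_browser()
            f = br.open(url)
        except Exception:
            # Dutch: "Cannot log in to download the edition"
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            # Dutch: "This week's Groene is not available yet"
            raise ValueError('Groene van deze week nog niet beschikbaar')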

        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        f.close()
        br.close()
        tmp.close()

        # convert
        self.report_progress(0.2,_('Converting to OEB'))
        oebdir = self.output_dir + '/INPUT/'
        main(['ebook-convert', tmp.name, oebdir])
        index = os.path.join(oebdir, 'content.opf')
        self.report_progress(1,_('epub downloaded and extracted'))


        return index
Please leave a comment if you use it!
Old 03-22-2013, 12:28 AM   #2
kovidgoyal
creator of calibre
Posts: 26,359
Karma: 5382313
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You should not call ebook-convert like that; it won't work if ebook-convert is not in the path. If you want to run a conversion, do it like this:

from calibre.ebooks.conversion.cli import main
main(['ebook-convert', input_file, output_file])
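
To illustrate the path problem: starting a program by its bare name only works if the system can find it on PATH. Below is a minimal sketch using Python 3's standard library (`shutil.which` is not available in the Python 2 that calibre 0.9 recipes run under). The helper name `find_on_path` is made up for this example and is not part of calibre.

```python
import shutil


def find_on_path(program):
    """Return the full path of `program` if it can be found on PATH, else None.

    Hypothetical helper for illustration only; not a calibre function.
    """
    return shutil.which(program)


location = find_on_path("ebook-convert")
if location is None:
    # On such a machine, launching "ebook-convert" by name would fail,
    # while the in-process main([...]) call needs no PATH lookup at all.
    print("ebook-convert is not on PATH")
else:
    print("ebook-convert found at", location)
```

On many installs (especially OS X and Windows) the command-line tools are not on PATH by default, which is why the in-process call is the safer choice inside a recipe.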
Old 03-22-2013, 08:44 AM   #3
realbase
The recipe above works in calibre.
My version is just the result of trial and error, combining bits of code.

But how do you suggest implementing your part of the code to make it cleaner? I tried a few different ways, but none of them worked out. Errors all the way.
Old 03-22-2013, 10:19 AM   #4
kovidgoyal
What error do you get?
Old 03-22-2013, 01:34 PM   #5
realbase
I just tried to remove the 'convert' part, since I don't understand why I need it.
(I'm a newbie, that's why.)

This is the code I tried to use.

Code:
#!/usr/bin/env  python2
# -*- coding: utf-8 -*-
#Based on veezh's original recipe and Kovid Goyal's New York Times recipe and Snaab's NRC-epub recipe

__license__   = 'GPL v3'
__copyright__ = '2013, RealBase'

'''
www.groene.nl
'''
import os, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile
from calibre.ebooks.conversion.cli import main

class GroeneAmsterdammer(BasicNewsRecipe):

    title = u'De Groene Amsterdammer'
    description = u'De ePub-versie van de Groene Amsterdammer'
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True

    __author__ = 'Realbase'

    conversion_options = {
        'no_default_epub_cover' : True
    }

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('https://www.groene.nl/sessie/new')
            print [form for form in br.forms()][1]          
            br.select_form(nr=1)
            br['user_session[login]']   = self.username
            br['user_session[password]'] = self.password
            br.submit()
        return br

    def build_index(self):

        domain = "http://www.groene.nl"

        url = domain + "/deze-week.epub"
        #print url

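        try:
            br = self.get_browser()
            f = br.open(url)
        except Exception:
            # Dutch: "Cannot log in to download the edition"
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            # Dutch: "This week's Groene is not available yet"
            raise ValueError('Groene van deze week nog niet beschikbaar')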

        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        f.close()
        br.close()
        tmp.close()

        index = os.path.join(self.output_dir, 'metadata.opf')

        return index

Then I get this error:
Code:
Download nieuws van De Groene Amsterdammer
Resolved conversion options
calibre version: 0.9.23
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0,
 'book_producer': None,
 'change_justification': 'original',
 'chapter': None,
 'chapter_mark': 'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': False,
 'dont_download_recipe': False,
 'dont_split_on_page_breaks': True,
 'duplicate_links_in_toc': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'epub_flatten': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': None,
 'fix_indents': True,
 'flow_size': 260,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x108249590>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0,
 'linearize_tables': False,
 'lrf': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.KoboReaderOutput object at 0x108249d10>,
 'page_breaks_before': None,
 'prefer_metadata_cover': False,
 'preserve_cover_aspect_ratio': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': None,
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': '',
 'search_replace': None,
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': '',
 'sr1_search': '',
 'sr2_replace': '',
 'sr2_search': '',
 'sr3_replace': '',
 'sr3_search': '',
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'test': False,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
Python function terminated unexpectedly: 'NoneType' object has no attribute 'rfind'
InputFormatPlugin: Recipe Input running
Using custom recipe
<POST https://www.groene.nl/sessie application/x-www-form-urlencoded
  <HiddenControl(authenticity_token=*****=) (readonly)>
  <TextControl(user_session[login]=)>
  <PasswordControl(user_session[password]=)>
  <SubmitControl(commit=Login) (readonly)>>
<POST https://www.groene.nl/sessie application/x-www-form-urlencoded
  <HiddenControl(authenticity_token=*****=) (readonly)>
  <TextControl(user_session[login]=)>
  <PasswordControl(user_session[password]=)>
  <SubmitControl(commit=Login) (readonly)>>
Parsing all content...
Traceback (most recent call last):
  File "/Applications/calibre.app/Contents/Resources/Python/lib/python2.7/site.py", line 147, in main
    return run_entry_point()
  File "/Applications/calibre.app/Contents/Resources/Python/lib/python2.7/site.py", line 116, in run_entry_point
    return getattr(pmod, func)()
  File "site-packages/calibre/utils/ipc/worker.py", line 189, in main
  File "site-packages/calibre/gui2/convert/gui_conversion.py", line 25, in gui_convert
  File "site-packages/calibre/ebooks/conversion/plumber.py", line 1018, in run
  File "site-packages/calibre/ebooks/conversion/plumber.py", line 1183, in create_oebbook
  File "site-packages/calibre/ebooks/oeb/reader.py", line 67, in __call__
  File "site-packages/calibre/ebooks/oeb/base.py", line 458, in __init__
  File "lib/python2.7/posixpath.py", line 96, in splitext
  File "lib/python2.7/genericpath.py", line 91, in _splitext
AttributeError: 'NoneType' object has no attribute 'rfind'
Old 03-22-2013, 01:56 PM   #6
kovidgoyal
You need to put the convert code back. What you want to do is get at the contents of the EPUB and return the path to the OPF file inside it from build_index(). You can do that either by unzipping the EPUB yourself or by converting it to OEB.
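
For the unzip route, here is a minimal sketch using only the standard library. It assumes the EPUB follows the usual container layout, where META-INF/container.xml names the OPF file in a `full-path` attribute. The function name `unzip_epub_and_find_opf` is made up for this example, and none of this has been tested inside calibre.

```python
import os
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def unzip_epub_and_find_opf(epub_path, dest_dir):
    """Extract the EPUB into dest_dir and return the absolute path of its OPF.

    Hypothetical helper for illustration only; not a calibre function.
    """
    with zipfile.ZipFile(epub_path) as zf:
        zf.extractall(dest_dir)

    # container.xml points at the package (OPF) file via rootfile/@full-path.
    container = os.path.join(dest_dir, "META-INF", "container.xml")
    root = ET.parse(container).getroot()
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None or not rootfile.get("full-path"):
        raise ValueError("container.xml does not name an OPF file")

    # full-path always uses forward slashes, whatever the host OS.
    opf = os.path.join(dest_dir, *rootfile.get("full-path").split("/"))
    if not os.path.isfile(opf):
        raise ValueError("OPF listed in container.xml is missing: %s" % opf)
    return os.path.abspath(opf)
```

In a recipe, build_index() could then return something like `unzip_epub_and_find_opf(tmp.name, os.path.join(self.output_dir, 'INPUT'))` instead of running the conversion, with `tmp` being the downloaded temporary file.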
Old 04-19-2013, 05:07 PM   #7
realbase
I had another go at applying your feedback.

(I followed the same approach as this recipe: http://l_uka.pentax.org.pl/calibre/biweekly.recipe )

Now the code looks like this, and it really works:
Code:
#!/usr/bin/env  python2
# -*- coding: utf-8 -*-
#Based on veezh's original recipe and Kovid Goyal's New York Times recipe and Snaab's NRC-epub recipe

__license__   = 'GPL v3'
__copyright__ = '2013, RealBase'

'''
www.groene.nl
'''
import os, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile
from calibre.ebooks.conversion.cli import main

class GroeneAmsterdammer(BasicNewsRecipe):

    title = u'De Groene Amsterdammer'
    description = u'De ePub-versie van de Groene Amsterdammer'
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True

    __author__ = 'Realbase'

    conversion_options = {
        'no_default_epub_cover' : True
    }

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('https://www.groene.nl/sessie/new')
            print [form for form in br.forms()][1]          
            br.select_form(nr=1)
            br['user_session[login]']   = self.username
            br['user_session[password]'] = self.password
            br.submit()
        return br

    def build_index(self):

        domain = "http://www.groene.nl"

        url = domain + "/deze-week.epub"
        #print url

        try:
            br = self.get_browser()
            f = br.open(url)
        except Exception:
            # Dutch: "Cannot log in to download the edition"
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            # Dutch: "This week's Groene is not available yet"
            raise ValueError('Groene van deze week nog niet beschikbaar')

        self.report_progress(0,_('downloading epub'))
        book_file = PersistentTemporaryFile(suffix='.epub')
        book_file.write(f.read())
        f.close()
        br.close()
        book_file.close()

        # convert
        self.report_progress(0.2,_('Converting to OEB'))
        oebdir = self.output_dir + '/INPUT/'
        main(['ebook-convert', book_file.name, oebdir])

        #feed calibre
        index = os.path.join(oebdir, 'content.opf')
        return index
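
The difference from the attempt in post #5 is that content.opf actually exists on disk by the time build_index() returns. My reading of that traceback (an assumption, not confirmed in this thread) is that calibre ended up without any OPF path. A small guard like the sketch below would report that situation in plain words. The helper name `require_opf` is made up for this example and is not part of calibre.

```python
import os


def require_opf(path):
    """Return `path` unchanged if it names an existing .opf file, else raise.

    Hypothetical helper for illustration only; not a calibre function.
    """
    if not path or not path.lower().endswith(".opf"):
        raise ValueError("expected the path to an .opf file, got %r" % (path,))
    if not os.path.isfile(path):
        raise ValueError("OPF file does not exist: %s" % path)
    return path
```

With it, the last line of build_index() could read `return require_opf(index)`.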
All times are GMT -4.