10-11-2010, 03:34 PM | #16 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Here's the list in my notes: Code:
['change_justification', 'extra_css', 'base_font_size', 'font_size_mapping', 'line_height', 'linearize_tables', 'smarten_punctuation', 'disable_font_rescaling', 'insert_blank_line', 'remove_paragraph_spacing', 'remove_paragraph_spacing_indent_size', 'input_encoding', 'asciiize', 'keep_ligatures'] |
|
10-11-2010, 03:43 PM | #17 | |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
Quote:
any other ideas? BTW, thank you very much for the help, Kovid. i have some silly questions that i don't want sidetracking my maya recipe. Spoiler:
Last edited by marbs; 10-11-2010 at 03:53 PM. Reason: wanted to ask starson17 a question |
|
10-11-2010, 03:55 PM | #18 |
creator of calibre
Posts: 44,017
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You're out of luck if your publication is only available as PDF, I'm afraid.
|
10-11-2010, 04:18 PM | #19 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Mostly, linearize_tables replaces <table>, <tr> and <td> tags with <div>. Instead of a table, you get a single column, which small-screen devices handle better. Most of the other options are available on the Conversion screen. You add them to the recipe as follows:
conversion_options = {'linearize_tables': True}
You can add further options to the dictionary as additional key: value pairs, separated by commas. |
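A minimal sketch of how several options could be combined, assuming the option names in the list quoted earlier in this thread are the accepted ones. The helper make_conversion_options and the class MyRecipe are hypothetical names for illustration only. In a real recipe the dictionary would be a class attribute of the BasicNewsRecipe subclass; a plain class stands in here so the snippet runs without calibre installed.

```python
# Option names copied from the list quoted earlier in this thread.
KNOWN_OPTIONS = [
    'change_justification', 'extra_css', 'base_font_size',
    'font_size_mapping', 'line_height', 'linearize_tables',
    'smarten_punctuation', 'disable_font_rescaling', 'insert_blank_line',
    'remove_paragraph_spacing', 'remove_paragraph_spacing_indent_size',
    'input_encoding', 'asciiize', 'keep_ligatures',
]


def make_conversion_options(**options):
    """Hypothetical helper: build the dict and reject misspelled option names."""
    unknown = sorted(set(options) - set(KNOWN_OPTIONS))
    if unknown:
        raise ValueError('unknown conversion options: %s' % ', '.join(unknown))
    return dict(options)


class MyRecipe(object):
    # Stand-in for a recipe class (assumption: a real one would subclass
    # calibre's BasicNewsRecipe instead of object).
    title = 'Example recipe'
    conversion_options = make_conversion_options(
        linearize_tables=True,
        smarten_punctuation=True,
    )
```

The helper is optional. Writing the dictionary literal directly, as in the post above, has the same effect; the check only guards against typos in option names.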
|
10-11-2010, 04:39 PM | #20 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
noooooooooooooooooooooooo
|
10-11-2010, 04:42 PM | #21 |
creator of calibre
Posts: 44,017
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
PDF is not a format that will convert well; that's just the way it is. Don't think of PDF as an ebook format, think of it as a printed page. Now try to imagine writing an algorithm to convert a printed page to an ebook (that is essentially what all PDF conversion algorithms do).
|
10-11-2010, 05:02 PM | #22 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
but if my output is pdf in any case, and i think i read somewhere that calibre converts all the articles and then merges them (i think i saw something like that in the log file from the recipe), then why do anything? all i need is something that can merge the pdf files (the HTML articles that were converted, plus the pdf file) in order. maybe?
|
10-11-2010, 05:37 PM | #23 |
creator of calibre
Posts: 44,017
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Because calibre's recipe system (and calibre's conversion system) work by first converting the input to HTML.
|
10-12-2010, 04:59 AM | #24 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
still working on this.
i understand. i'll try to get around it. thanks Kovid.
so i found this web site: http://www.pdfdownload.org/free-pdf-to-html.aspx it converts pdf to pics, page by page. you can do it without the form like this: http://www.pdfdownload.org/pdf2html/pdf2html.php?url=your url here&images=yes i wrote code to see how this works out. it doesn't. any bright thoughts as to why i don't get the new HTML version of my articles? Spoiler:
|
10-18-2010, 05:01 AM | #25 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
i still need a bit more help on this one.
so i have changed what i want from my recipe. i will try to write the full version in pure python later, but for now i want to do this as a recipe.
if you take a look here you will see a list of links on the page. i want the article title to be "the 1st link text" - "the 2nd link text". right now it is just "the 2nd link text". the id of the 1st link is "CompNmHref". i just don't know how to do it with the for loop and soup. is there a "tag before" command in soup? because we are talking about the tag before the "item" in my code... Spoiler:
|
10-18-2010, 10:58 AM | #26 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
See here: http://www.crummy.com/software/Beaut...reviousSibling |
|
10-20-2010, 08:36 AM | #27 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
if i understand correctly
which i don't, i want something like this:
print soup.item.previous.previousSibling
i want to go to the previous <tr> tag and then i want the sibling before that. not working. |
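One way to get a "1st link text - 2nd link text" title without navigating siblings at all is to scan the links in document order and remember the most recent one whose id is CompNmHref. The following is only a sketch of that idea, written for Python 3 with the standard library's html.parser module instead of BeautifulSoup. The sample markup and the names LinkPairTitles and build_titles are assumptions for illustration, not taken from the actual page or recipe.

```python
from html.parser import HTMLParser


class LinkPairTitles(HTMLParser):
    """Collect '1st link text - 2nd link text' titles in document order."""

    def __init__(self):
        super().__init__()
        self.titles = []        # finished article titles
        self._in_link = False   # True while inside an <a> element
        self._is_first = False  # True if the current <a> has id="CompNmHref"
        self._text = []         # text fragments of the current <a>
        self._pending = None    # text of the most recent "1st" link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._is_first = dict(attrs).get("id") == "CompNmHref"
            self._text = []

    def handle_data(self, data):
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or not self._in_link:
            return
        self._in_link = False
        text = "".join(self._text).strip()
        if self._is_first:
            self._pending = text  # remember it for the link that follows
        elif self._pending is not None:
            self.titles.append(self._pending + " - " + text)
            self._pending = None


def build_titles(html):
    parser = LinkPairTitles()
    parser.feed(html)
    parser.close()
    return parser.titles


# Assumed page layout: each "1st" link sits in the row before its "2nd" link.
SAMPLE = """
<table>
  <tr><td><a id="CompNmHref" href="/c/1">Acme Ltd</a></td></tr>
  <tr><td><a href="/r/1">Quarterly report</a></td></tr>
  <tr><td><a id="CompNmHref" href="/c/2">Beta Corp</a></td></tr>
  <tr><td><a href="/r/2">Immediate report</a></td></tr>
</table>
"""

titles = build_titles(SAMPLE)
```

Because the scan only depends on the order of the links, it is not affected by whitespace nodes between tags, which is a common reason a chain of previous/previousSibling lookups lands on something unexpected.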
10-20-2010, 09:35 AM | #28 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
That's the question you're asking, and to answer it, you just print the entire soup, or the previous element, or the previous sibling to figure out where you've gone wrong. Be aware that you should look at the soup, and not just the page source. BeautifulSoup parses the page source into its own tree, and as it does that, it fixes errors and makes other modifications that may not be apparent in the page source itself. |
|
10-23-2010, 01:55 PM | #29 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
ha ha!
got it!
it works now. Spoiler:
now i want to build a loop like in the C language. i'll write it in pseudocode:
index = 1
if the article count reaches 30, then send the POST request with rsSearchRes_pgNo = index + 1
my instinct says i would do it with recursion, but i am not sure that is wise in python.... can you point me in the right direction? |
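The pseudocode above can be sketched as a plain while loop rather than recursion, which avoids Python's recursion depth limit when there are many pages. This is only an illustration under stated assumptions: fetch_page is a hypothetical stand-in for the POST request with rsSearchRes_pgNo set to the page number, it serves fake data here, and the rule "a page with fewer than 30 articles is the last one" is inferred from the post.

```python
ARTICLES_PER_PAGE = 30  # from the post: a full page holds 30 articles


def fetch_page(page_no):
    """Hypothetical stand-in for the POST request (rsSearchRes_pgNo=page_no).

    Returns the article titles on that page. Fake data: 70 articles in total.
    """
    all_articles = ["article %d" % i for i in range(1, 71)]
    start = (page_no - 1) * ARTICLES_PER_PAGE
    return all_articles[start:start + ARTICLES_PER_PAGE]


def collect_articles(fetch, max_pages=50):
    """Request page after page until one comes back less than full."""
    articles = []
    page_no = 1
    while page_no <= max_pages:  # safety cap so a misbehaving site cannot loop forever
        page = fetch(page_no)
        articles.extend(page)
        if len(page) < ARTICLES_PER_PAGE:
            break  # a short (or empty) page means there is nothing further
        page_no += 1
    return articles


articles = collect_articles(fetch_page)
```

Passing the fetch function in as an argument keeps the loop separate from the network code, so the paging logic can be checked with fake data before it is wired into a recipe.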
10-23-2010, 02:17 PM | #30 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Congratulations.
Quote:
|
|
|