Old 07-16-2012, 07:43 AM   #1
Steven630
Zealot
Steven630 began at the beginning.
 
Posts: 142
Karma: 10
Join Date: May 2012
Device: Kindle Paperwhite2
Capturing all articles, but they all end up under the first section

Here is what the webpage looks like:

Code:
<div class='module'>

<h3>Section1</h3>

<ul>
<li>
<h4>articles links and article titles</h4>
</li>
……(other articles)
</ul>

<h3>Section2</h3>

<ul>
<li>
<h4>articles links and article titles</h4>
</li>
…… (other articles)
</ul>
So I tried to parse it, treating the 'module' div as the section.

It turned out that although all the articles were fetched correctly, they all ended up in the first section. I am at a loss as to what to do: all the section names are in "h3" tags, and unlike the webpages of the built-in recipes, <div class='module'> appears only once, before the first section, rather than before every section (which I think explains the failure). Can anyone help me out? Even a quick answer would be appreciated.

Last edited by Steven630; 07-19-2012 at 03:52 AM.
Old 07-17-2012, 12:10 AM   #2
Steven630
I think it has something to do with section.find('h3'), which only finds the first one. But when I changed it to section.findAll('h3'), it failed to fetch anything (not even the first section).
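
One plausible explanation, not confirmed anywhere in this thread, is that find returns a single tag while findAll returns a list of tags, so code written for one tag breaks when handed the list. A minimal sketch of the difference, assuming the standalone bs4 package (where the methods are spelled find and find_all) rather than the BeautifulSoup copy bundled with Calibre; the HTML and its URLs are made-up stand-ins for the page described above:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: one 'module' div holding every section.
html = """
<div class='module'>
<h3>Section1</h3>
<ul><li><h4><a href='/a1'>Article 1</a></h4></li></ul>
<h3>Section2</h3>
<ul><li><h4><a href='/a2'>Article 2</a></h4></li></ul>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')
module = soup.find('div', attrs={'class': 'module'})

# find() returns only the first matching tag.
first_heading = module.find('h3')
print(first_heading.get_text())

# find_all() (findAll in older BeautifulSoup) returns a list-like result,
# so it has to be looped over rather than treated as a single tag.
headings = [h3.get_text() for h3 in module.find_all('h3')]
print(headings)
```

Looping over the headings gives every section name, but it still does not say which articles belong to which heading.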
Old 07-17-2012, 01:44 AM   #3
Steven630
I just edited the first post to include a string I hadn't mentioned.
Old 07-18-2012, 09:40 AM   #4
Steven630
I changed the recipe to treat every single "ul" as a section, leaving out the section title. This time, Calibre correctly divided the articles into several sections. The drawback is that the sections have no names. I'm thinking about grabbing each h3 and using it as the section name. Is that possible? Or can I just enter the section names in the recipe one by one?
Old 07-19-2012, 01:18 AM   #5
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 25,899
Karma: 5035037
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
When you find a ul, use ul.findPreviousSibling('h3') to get the title. Read the BeautifulSoup documentation; it will teach you how to do this, and many other things.
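
The suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not the recipe from this thread: it uses the standalone bs4 package, where the method is spelled find_previous_sibling, and the sample HTML, its URLs, and the helper name parse_sections are all hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page described in the first post.
SAMPLE_HTML = """
<div class='module'>
<h3>Section1</h3>
<ul><li><h4><a href='/a1'>Article 1</a></h4></li></ul>
<h3>Section2</h3>
<ul><li><h4><a href='/a2'>Article 2</a></h4></li></ul>
</div>
"""


def parse_sections(html):
    """Return [(section_title, [article_dict, ...]), ...] for the page."""
    soup = BeautifulSoup(html, 'html.parser')
    module = soup.find('div', attrs={'class': 'module'})
    sections = []
    for ul in module.find_all('ul'):
        # The closest <h3> that precedes this <ul> at the same level
        # is taken as the section heading.
        h3 = ul.find_previous_sibling('h3')
        title = h3.get_text(strip=True) if h3 is not None else 'Untitled'
        articles = [
            {'title': a.get_text(strip=True), 'url': a['href']}
            for a in ul.find_all('a', href=True)
        ]
        sections.append((title, articles))
    return sections


sections = parse_sections(SAMPLE_HTML)
for title, articles in sections:
    print(title, articles)
```

The list of (section title, list of article dicts) tuples mirrors the shape a Calibre recipe's parse_index method returns, so a loop like this could plausibly be adapted there, with relative URLs turned into absolute ones.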
Old 07-19-2012, 02:10 AM   #6
Steven630
Quote:
Originally Posted by kovidgoyal
When you find a ul, use ul.findPreviousSibling('h3') to get the title. Read the BeautifulSoup documentation; it will teach you how to do this, and many other things.
Thank you so much! I've been looking for something like that.
All times are GMT -4. The time now is 10:03 PM.