Junior Member
Posts: 9
Karma: 10
Join Date: Dec 2009
Device: Nook
Python/Calibre question
The website I am trying to parse has lists on it, and they are tripping up my Python script. I get only the first item of what I expected to receive. In the following example I correctly get "Page one" and "TextName1", but I do not get "TextName2" or "TextName3". Later on I also need to be able to tell whether each item is a sectionHeader or a linklist. Thanks for any help!
The Python is:

    for div in soup.findAll(True, attrs={'class':['sectionHeader', 'linklist']}, recursive=True):

The basic structure of the HTML is:

    <h3 class="sectionHeader">Page one</h3>
    <ul class=linklist>
    <li><a href="......">TextName1</a> <span class="attr">More Text</span></li>
    <li><a href="......">TextName2</a> <span class="attr">More Text</span></li>
    <li><a href="......">TextName3</a> <span class="attr">More Text</span></li>
    </ul>
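For what it's worth, the loop above matches the <ul> only once, as a single tag, so a loop body that reads just one link from it would yield only "TextName1". The loop body isn't shown, so that is a guess at the cause. Below is a minimal sketch of one way to handle both cases, assuming the standalone bs4 package (BeautifulSoup 4) and its html.parser backend rather than whatever BeautifulSoup copy a Calibre recipe imports. In BeautifulSoup 3 the method is spelled findAll and class comes back as a plain string. The names parse_sections and SAMPLE_HTML are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the post; the href values are left elided as posted.
SAMPLE_HTML = """
<h3 class="sectionHeader">Page one</h3>
<ul class=linklist>
<li><a href="......">TextName1</a> <span class="attr">More Text</span></li>
<li><a href="......">TextName2</a> <span class="attr">More Text</span></li>
<li><a href="......">TextName3</a> <span class="attr">More Text</span></li>
</ul>
"""


def parse_sections(html):
    """Return a list of (header_text, [link_text, ...]) tuples (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    sections = []
    # Same search as in the post: any tag whose class is one of the two names.
    for tag in soup.find_all(True, attrs={"class": ["sectionHeader", "linklist"]}):
        classes = tag.get("class", [])  # bs4 returns class as a list of names
        if "sectionHeader" in classes:
            # A header starts a new section with an empty list of links.
            sections.append((tag.get_text(strip=True), []))
        elif "linklist" in classes:
            if not sections:
                # A list that appears before any header gets a placeholder section.
                sections.append((None, []))
            # The <ul> is matched once, so walk every <li> inside it.
            for li in tag.find_all("li"):
                link = li.find("a")
                if link is not None:
                    sections[-1][1].append(link.get_text(strip=True))
    return sections


result = parse_sections(SAMPLE_HTML)
for header, links in result:
    print(header, links)
```

Checking the class list inside the loop is what separates a sectionHeader from a linklist, and the inner find_all("li") is what reaches the second and third links.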