Old 04-19-2010, 03:05 PM   #1808
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by gambarini View Post
Yes, it is a powerful solution... but not simple.
Agreed. Once you understand BeautifulSoup and the pre- and post-processing of HTML, you can do almost anything with a page. During the Olympics I used it to parse a Flash-based slideshow of photos. The Flash code on page 1 included a URL that pointed to XML data elsewhere on the web, and BeautifulSoup let me extract that URL from the scripting on page 1. The XML data had pointers to the photo images, plus a title and a comment for each photo for the Flash code to use. BeautifulSoup then let me parse the XML data and build a custom virtual page with each photo labeled and captioned with its comment. That custom page, despite not really existing anywhere, was passed to Calibre's recipe handler to build the EPUB.
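
For anyone who wants to try the same trick, here is a rough sketch of the two steps described above: pull the data URL out of the script on page 1, then turn the XML into a made-up page. It is not the original recipe. The page markup, the XML layout, the URLs, and the helper names find_data_url and build_virtual_page are all invented for illustration; it uses the current bs4 package, and it swaps in the standard library's ElementTree for the XML step instead of BeautifulSoup.

```python
import re
import xml.etree.ElementTree as ET

from bs4 import BeautifulSoup

# Hypothetical page 1: the Flash player is configured from an inline script.
PAGE_ONE = """
<html><body>
<script type="text/javascript">
  var player = loadSlideshow({dataUrl: "http://example.com/slides/data.xml"});
</script>
</body></html>
"""

# Hypothetical XML that the Flash code would have fetched from that URL.
SLIDES_XML = """
<slideshow>
  <photo src="http://example.com/img/1.jpg">
    <title>Opening ceremony</title>
    <comment>Fireworks over the stadium.</comment>
  </photo>
  <photo src="http://example.com/img/2.jpg">
    <title>Downhill final</title>
    <comment>The winning run.</comment>
  </photo>
</slideshow>
"""


def find_data_url(html):
    """Return the first quoted .xml URL found in any <script> block, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script"):
        match = re.search(r'"(https?://[^"]+\.xml)"', script.string or "")
        if match:
            return match.group(1)
    return None


def build_virtual_page(xml_text):
    """Build an HTML page with a heading, image, and caption per <photo>."""
    soup = BeautifulSoup("<html><body></body></html>", "html.parser")
    for photo in ET.fromstring(xml_text.strip()).iter("photo"):
        block = soup.new_tag("div")
        heading = soup.new_tag("h2")
        heading.string = photo.findtext("title", default="")
        image = soup.new_tag("img", src=photo.get("src", ""))
        caption = soup.new_tag("p")
        caption.string = photo.findtext("comment", default="")
        for tag in (heading, image, caption):
            block.append(tag)
        soup.body.append(block)
    return soup


data_url = find_data_url(PAGE_ONE)
# In a real recipe the XML would be downloaded from data_url; here it is inline.
virtual_page = build_virtual_page(SLIDES_XML)
```

In a recipe, the markup of a page built this way would be handed on in place of the fetched page. Exactly where to hook that in depends on the recipe, so that part is left out here.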

Basically, BeautifulSoup will let you find elements, remove them, swap or add elements, construct new pages, etc. IIRC, multipage recipes grab the article text from subsequent pages and paste it into the first page before the recipe processes that first page.
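
A small sketch of those basic operations, assuming made-up markup for two pages of one article. The element ids and class names are hypothetical, and the last step only imitates the multipage idea; it is not Calibre's own code. It uses the bs4 method names; older BeautifulSoup 3 code spells some of them differently (findAll, replaceWith).

```python
from bs4 import BeautifulSoup

# Hypothetical first and second pages of the same article.
PAGE_ONE = """
<html><body>
<div id="article"><p>First half of the story.</p></div>
<div class="ad">Buy now!</div>
<font color="red">Breaking</font>
<a class="next" href="page2.html">Next page</a>
</body></html>
"""
PAGE_TWO = """
<html><body>
<div id="article"><p>Second half of the story.</p></div>
</body></html>
"""

first = BeautifulSoup(PAGE_ONE, "html.parser")
second = BeautifulSoup(PAGE_TWO, "html.parser")

# Find: locate the article body and the link to the next page.
article = first.find("div", id="article")
next_link = first.find("a", class_="next")
next_href = next_link["href"]

# Remove: take the ads and the navigation link out of the tree.
for ad in first.find_all("div", class_="ad"):
    ad.extract()
next_link.extract()

# Swap: replace each <font> tag with a <strong> tag holding the same text.
for font in first.find_all("font"):
    strong = first.new_tag("strong")
    strong.string = font.get_text()
    font.replace_with(strong)

# Add: move the paragraphs from page 2 onto the end of page 1's article.
for para in second.find("div", id="article").find_all("p"):
    article.append(para.extract())
```

After this runs, first holds a single cleaned-up page containing the text of both pages, which is the shape a recipe wants before conversion.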

Last edited by Starson17; 04-19-2010 at 03:08 PM.