Old 05-19-2010, 08:46 AM   #1941
Starson17
Quote:
Originally Posted by gambarini View Post
First Question:
Is there an option to add one or more lines to the downloaded article (for example the article's signature, when the signature is a GIF inside a table cell (td) without a tag)?
I'm not 100% certain what you are asking. Either preprocess_html or postprocess_html will let you add anything you want: you can add tags with any content, including images, to the HTML. On your question about the table, are you asking how to put things into a table, or how to extract them from a table? Generally, both are possible with BeautifulSoup.
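
To make that a bit more concrete, here is a rough sketch of both directions. It assumes the soup handed to the recipe hook follows the bs4 API (soup.new_tag, find_all); older BeautifulSoup 3 builds construct tags differently. The helper names, the example URL and the caption text are all made up for illustration, so adjust them to the site you are working with.

```python
# Sketch only: helper names and the URL below are hypothetical.
# Assumes the soup object follows the bs4 API (soup.new_tag, find_all).

def append_signature(soup, image_url, caption):
    """Append a <p> holding an <img> plus caption text to the end of <body>."""
    body = soup.find('body')
    if body is None:
        # Nothing to attach to; return the soup unchanged.
        return soup
    para = soup.new_tag('p')
    para.append(soup.new_tag('img', src=image_url))
    para.append(caption)  # a plain string becomes a text node
    body.append(para)
    return soup


def images_in_table_cells(soup):
    """Return the src of every <img> found inside a <td>."""
    sources = []
    for cell in soup.find_all('td'):
        for img in cell.find_all('img', src=True):
            sources.append(img['src'])
    return sources


# Inside a recipe, the first helper would be called from the hook, e.g.:
#
# class MyRecipe(BasicNewsRecipe):
#     def preprocess_html(self, soup):
#         return append_signature(soup, 'http://example.com/sig.gif', 'Signature')
```

The same helper could be called from postprocess_html instead; which hook suits you depends on whether you want the change made before or after calibre's own cleanup of the page.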

Quote:
Second Question:
Some newspapers let you read the entire paper in various formats (a JPG for every page, or a single PDF file for every page) directly in the browser. Is it possible to download these files?
Right now I use the first JPG (PDF) for the cover image, so I am able to find the correct page and the correct date, but it is only the first page, and at a fixed resolution.
At least this is a good way to get an overall image of the whole newspaper, though it does not make for comfortable reading.
Are you asking how to split up pdfs to get images found on pages 2 and beyond, or how to use content you already have access to?