#1 |
Enthusiast
Posts: 27
Karma: 10
Join Date: Nov 2024
Device: Tolino 4HD
|
help on Heise CT recipe
Hi there,
The Heise CT recipe works so far, after I slightly modified the login part, which was outdated. However, every article ends after the first page and the rest of the article is missing. In a normal browser there is no paging, and no JavaScript is used for the article text itself. When I comment out the remove tags part, the pages really do look as if they were not completely downloaded. I am not sure what to do.

I tried setting browser_type = 'webengine' to check whether a bot check might be blocking the download, but with this parameter I get the following error:

AttributeError: 'WebEngineBrowser' object has no attribute 'select_form'

Any help would be appreciated,
MP
|
#2 |
Enthusiast
|
OK, I can see now that the DOM may not be ready in webengine or Qt. Did I miss something about how to use it when the browser is not mechanize?
    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            loginURL = 'https://www.heise.de/sso/login?forward=%2Fselect'
            br.open(loginURL)
            br.select_form(action='/sso/login/login/nojs')
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        return br

|
#3 |
Enthusiast
|
OK, I updated from 7.19 to 7.20, and now the pages are fully downloaded.
I would still like to understand why the other browser_type values do not work, so any help on this would be appreciated.
|
#4 |
creator of calibre
Posts: 45,320
Karma: 27111242
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's because I never got around to implementing select_form() for the non-mechanize browsers.
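
A minimal sketch of how a recipe could guard against this, assuming only what is reported in this thread: the mechanize browser has select_form() and the WebEngine browser does not. The helper name login_with_form and its parameters are hypothetical, not part of calibre. It skips the form login instead of raising AttributeError when the method is missing.

```python
def login_with_form(br, login_url, form_action, username, password):
    """Log in through an HTML form if the browser supports select_form().

    Returns True when the login steps ran, False when they were skipped.
    This helper is hypothetical and not part of calibre.
    """
    if not hasattr(br, 'select_form'):
        # Per the reply above, the non-mechanize browsers lack select_form(),
        # so the form-based login cannot be performed with them.
        return False
    br.open(login_url)
    br.select_form(action=form_action)
    br['username'] = username
    br['password'] = password
    br.submit()
    return True
```

Inside get_browser() this could be called with the login URL and form action from the recipe. A False result means the login was skipped, so subscriber-only content would presumably still be missing.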
|
#5 |
Enthusiast
|
Is there a way to stop the created temporary files from being deleted? Then I could check their content to find the reason, or perhaps build a different way to access the DOM.
|
#6 |
creator of calibre
|
Not sure what you are asking. If you want to read the html you can always do so from the browser object.
|
#7 |
Enthusiast
|
During the process several files are created in the temp folder, but they are deleted at the end of the process (only some final files, such as the epub, remain). A switch to keep them would help show what is happening at every stage (also during the cleanup, for example).
|
#8 |
creator of calibre
|
As I said, read the html you want from the browser object and save it wherever you like on the filesystem.
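
A rough sketch of that suggestion, assuming the page source is already available as a string or bytes (for example from br.open(url).read() with the mechanize browser, or from the recipe's preprocess_raw_html hook). The helper name dump_html and the dump folder are hypothetical choices, not part of calibre.

```python
import hashlib
import os
import tempfile


def dump_html(raw_html, url, dump_dir):
    """Write one downloaded page to dump_dir and return the file path.

    Hypothetical helper, not part of calibre.
    """
    os.makedirs(dump_dir, exist_ok=True)
    # Hash the URL so each article gets a stable, filesystem-safe name.
    name = hashlib.sha1(url.encode('utf-8')).hexdigest()[:16] + '.html'
    path = os.path.join(dump_dir, name)
    # Accept both str and bytes, since either may be handed over.
    data = raw_html if isinstance(raw_html, bytes) else raw_html.encode('utf-8')
    with open(path, 'wb') as f:
        f.write(data)
    return path


# Assumed use inside a recipe class (not executed here):
#
#     def preprocess_raw_html(self, raw_html, url):
#         dump_html(raw_html, url, DUMP_DIR)  # DUMP_DIR: any folder you choose
#         return raw_html

if __name__ == '__main__':
    # Placeholder URL and folder, for illustration only.
    out = dump_html('<html><body>test</body></html>',
                    'https://example.com/article',
                    os.path.join(tempfile.gettempdir(), 'heise_dump'))
    print(out)
```

Because the files are written outside calibre's own temp handling, they stay on disk after the download finishes. ebook-convert also has a --debug-pipeline option that saves the output of the conversion stages to a folder, which may cover part of what is asked here.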
|