OK, davidfor: with some help from your settings file, some tweaking, and a bit of testing on other pages, I think I've got pretty solid settings. My only hangup is with the comments section.

When a comment is only text, it grabs it fine, but some comments are followed by .gifs that mark new or highly rated entries. For those, it displays the comment, then a big space, then (Â). When I look at the source HTML it shows (& #194;) without the space. Looking at the page source in the browser, the image tags sit on a new line under the comment text. So I'm assuming it's a carriage return or newline that doesn't get encoded right, so the plugin sees it as (& #194;) and then calibre turns it into (Â).

Ideally I'd get something unique for each image that I could later search and replace to add that info into the comments. But I'd also settle for just the comments with the images dropped, or with the (& #194;) stuff removed.

I have trouble understanding the strip/include text settings: which characters have to be escaped, and what kinds of wildcards can be used. For example, is there a way to say that once it reaches the (& #194;) part, it should strip everything from there on? Or that once it reaches punctuation (. ? !), it should strip everything after that? That second option might cause problems if a comment contains more than one punctuation mark.
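To make the first idea concrete, here is a rough Python sketch of the behaviour I'm after. I'm assuming the strip setting takes Python-style regular expressions, which I haven't confirmed. The sample strings and the function name `strip_from_marker` are made up for illustration and are not anything from the plugin.

```python
import re

# Matches optional whitespace, then either the literal "Â" character or the
# "&#194;" entity, then everything after it (DOTALL lets "." cross newlines).
MARKER_AND_REST = re.compile(r"\s*(?:\u00c2|&#194;).*", flags=re.DOTALL)

def strip_from_marker(text):
    """Hypothetical helper: cut the comment at the first stray marker."""
    return MARKER_AND_REST.sub("", text)

# Made-up samples standing in for what the plugin returns.
with_char = "Great story, loved it!   \u00c2"
with_entity = "Nice one. Really?\n&#194;<img src='new.gif'>"
text_only = "Plain comment. No images here!"

print(strip_from_marker(with_char))
print(strip_from_marker(with_entity))
print(strip_from_marker(text_only))
```

Cutting at the marker instead of at the first punctuation mark keeps comments with several sentences intact. The catch is that a comment that legitimately contains an Â would get cut short too.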
So what are your thoughts or comments? Do you have any information or tips to help me?
Thanks