Old 01-23-2016, 09:07 AM   #1
gummihuhn
Enthusiast
Posts: 35
Karma: 28904
Join Date: Aug 2015
Device: none
A Do It Yourself "Read It Later" Service for KOReader

In addition to reading a lot of books, I read a lot of news. I love KOReader and like to read with it as much as I can. I've experimented with a lot of tools: Pocket, Wallabag, Calibre, Calibre2OPDS, COPS and many more. But none of them provided the simple, seamless integration of my reading list with KOReader that I wanted. So I pieced together my own using some really nice open source tools.

The tools:

Syncthing
I use Syncthing to sync my books between my computers and my devices running KOReader. I also use it to sync a lot of other things between devices. Syncthing is open source, peer-to-peer (no server required) sync software available for a wide variety of platforms. Even if you aren't interested in the "Read It Later" solution I describe in this post, you should consider using Syncthing to sync your device(s) running KOReader. There is an Android app, instructions for Kindle Touch and, just this week, thanks to tshering, a simple installer for Kobos running KSM. It should be fairly easy to put Syncthing on other e-reader devices. Even if you can't or don't want to install Syncthing on your device, you can use Syncthing for a very easy USB sync solution.

Five Filters
Five Filters offers a variety of content-related tools that may be of interest. The one I use most heavily (and the one used in my "Read It Later" solution) is called "Push to Kindle". Don't worry, despite the name a Kindle is not required. If you submit the URL of a web page to this tool, "Push to Kindle" creates a nicely formatted .epub, .mobi or .pdf which can be emailed to your Kindle device (hence the "Push to Kindle" name) or downloaded to your computer. (Note that if you prefer to run "Push to Kindle" on your own server, an open source release is coming soon.)

Pandoc (optional)
Pandoc is an open source document converter. It is a very powerful (albeit complicated) tool. I use it as a backup to the Five Filters downloads, since for some unknown reason images are stripped from the Five Filters epubs. For some of the websites I follow, the images are very important (e.g., financial charts), so I use Pandoc to generate epubs for them. The downside is that the resulting epubs are not nearly as pretty as their Five Filters counterparts. I'm sure this could be fixed with stylesheets etc., but I have not looked into it. Again, using Pandoc is entirely optional. It is available for a wide variety of platforms. If you need to install from source (this won't apply to most people), I recommend creating a "relocatable binary".

A Simple Script
Here is a simple script I wrote to use these tools together:

Code:
#!/bin/bash

# a simple script to download an epub version of a given web page from http://fivefilters.org/kindle-it/
# or (optionally) generate an epub version of the given web page using Pandoc (http://pandoc.org/)

# change the next line to the absolute output path where you would like the epub to be saved, including the trailing '/'
savepath="$HOME/Documents/"

# OPTIONAL: the absolute path to the list of domains for which you want epubs with images (less pretty output)
# Use one fully qualified domain name (https://en.wikipedia.org/wiki/Fully_qualified_domain_name) per line.
# Pandoc must be installed to use this feature.
pandoclist="$HOME/.config/pandoclist"

now=$(date +"%s")                       # store the current time
url=$1                                  # store the input URL
furl=${url#*://}                        # remove the 'http://' or 'https://' from the input URL
domain=${furl%%[/:]*}                   # get the domain (drop any path or port) for checking against Pandoc list

# the next line contains the options to pass to Five Filters
durl='http://fivefilters.org/kindle-it/send.php?context=download&format=epub&url='
durl+=$furl                             # construct the full URL of the epub request URL

oname=$(basename "$url")                # save the last part of the URL, which we will use to name the epub
oname="${oname%.*}"                     # remove the file extension (eg .html)
oname+="-$now"                          # add a timestamp to prevent overwriting of files with same name
oname+='.epub'                          # add the .epub file extension to the output name
opath="$savepath$oname"                 # define the absolute path to the output file

if grep -Fxq -- "$domain" "$pandoclist"         # check for an exact match in the Pandoc list
then
    pandoc -r html "$url" -t epub -o "$opath"   # generate the epub and store it in the specified directory
else
    wget -b -q "$durl" -O "$opath"              # download the epub (in the background) and store it in the specified directory
fi
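
The script above appends the page URL to the Five Filters request as-is, so a page URL that itself contains characters such as '&' or '#' could be cut short when the request is parsed. A minimal sketch of percent-encoding the value first is below. The helper names urlencode and fivefilters_url are hypothetical (they are not part of the original script), the input is assumed to be plain ASCII, and whether the Five Filters endpoint needs or expects an encoded value is an assumption that has not been confirmed in this thread.

```shell
#!/bin/bash

# urlencode: hypothetical helper, not part of the original script.
# Percent-encodes everything except unreserved characters (ASCII input assumed).
urlencode() {
    local LC_ALL=C
    local s=$1 out='' c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9.~_-]) out+=$c ;;
            *) printf -v c '%%%02X' "'$c"; out+=$c ;;
        esac
    done
    printf '%s\n' "$out"
}

# fivefilters_url: hypothetical helper that builds the same request URL as the
# script, but with the page URL percent-encoded.
fivefilters_url() {
    local furl=${1#*://}    # strip 'http://' or 'https://', as the script does
    printf '%s%s\n' \
        'http://fivefilters.org/kindle-it/send.php?context=download&format=epub&url=' \
        "$(urlencode "$furl")"
}
```

For example, urlencode 'a b&c' prints a%20b%26c. To try this in the script, the two lines that build durl could be replaced by durl=$(fivefilters_url "$url").
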
Putting It All Together
  1. Install Syncthing on a computer. Optionally, also install Pandoc on the same computer.
  2. Install Syncthing on your KOReader device(s), or set up your KOReader device(s) for simple USB sync.
  3. Configure the folders to be synced between your computer(s) and your KOReader device(s). See http://docs.syncthing.net/intro/getting-started.html. I use one folder ("Books", with subfolders) for my books, and another folder ("News") for epubs gathered with the above tools.
  4. Put my simple script on your computer and make sure it is executable. Make sure you edit it to set where the epubs should be saved (this should be the same as one of your synced folders), and optionally, the location of your list of websites for which Pandoc should be used instead of Five Filters.
  5. Now test it. From the command line, in the folder where your script is:
    ./<name of script> <URL of web page>
  6. Assuming it is working as you like it, set up your browser, RSS aggregator etc to pass a URL to the script with a simple keyboard shortcut. This is left as an exercise for the reader.
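
Before wiring the script to a keyboard shortcut, it can help to see what it would do with a given URL without touching the network. The sketch below mirrors the script's string handling in three small functions. The function names are hypothetical, and the second argument (the path to the domain list) defaults to the same location the script uses.

```shell
#!/bin/bash
# Dry-run helpers mirroring the script's string handling; nothing is downloaded.

# Print the host part of a URL (scheme, path and port stripped).
url_domain() {
    local furl=${1#*://}
    printf '%s\n' "${furl%%[/:]*}"
}

# Print the epub file name the script would use, given a URL and a timestamp.
epub_name() {
    local name
    name=$(basename "$1")
    printf '%s-%s.epub\n' "${name%.*}" "$2"
}

# Report which converter would handle the URL, given a domain list file.
# A missing list file is treated as "no match".
converter_for() {
    if grep -Fxq -- "$(url_domain "$1")" "$2" 2>/dev/null; then
        echo pandoc
    else
        echo fivefilters
    fi
}

# When called with a URL, print a short report.
if [ "$#" -ge 1 ]; then
    echo "domain:    $(url_domain "$1")"
    echo "file name: $(epub_name "$1" "$(date +%s)")"
    echo "converter: $(converter_for "$1" "${2:-$HOME/.config/pandoclist}")"
fi
```

Saved as, say, dryrun.sh (a made-up name), it can be run the same way as the real script: ./dryrun.sh <URL of web page>.
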

Now, at the press of a couple of buttons on your computer, any URL you desire will be turned into an epub and automatically sent to your KOReader device(s).

Enjoy! Suggested improvements or alternative approaches welcome.

Last edited by gummihuhn; 01-23-2016 at 09:10 AM. Reason: formatting
Old 01-24-2016, 06:26 AM   #2
Markismus
Guru
Posts: 895
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
I usually use rsync on Linux. Had a look at Pandoc quite some time ago. Is it already a viable solution for LaTeX to epub without loss of formatting?
Old 01-24-2016, 08:05 AM   #3
gummihuhn
Quote:
Originally Posted by Markismus View Post
I usually use rsync on linux.
rsync is a nice tool, which I use a lot. But once you want to do two-way sync and/or more than two devices are involved, I find Syncthing works really well.

Quote:
Originally Posted by Markismus View Post
Had a look at pandoc quite some time ago. Is it already a viable solution for Latex to epub without loss of formatting?
Sorry to say, I don't know. I've only used Pandoc for converting HTML to epub (haven't looked into what intermediate formats are used for that), and in that use case there is definitely a loss of formatting. I only use it for a couple of sites I follow where images are important, and for me the output is "good enough", at least until I find a tool that works better for this. Generating PDFs from the HTML is another option, which I may experiment with when I get some free time.
Old 01-25-2016, 04:26 AM   #4
Alan_S
Evangelist
Posts: 440
Karma: 1084584
Join Date: Aug 2007
Location: Sisak, Croatia
Device: Kobo Aura H2O, Kobo Aura ONE
You can check https://dotepub.com/

They also offer easy conversion of web pages into epub with images, but this doesn't give great results either. I haven't tried Pandoc and don't know how bad its results are, so dotepub may give the same or similar results.

Please check and share how it works for you.
Old 01-25-2016, 06:56 AM   #5
gummihuhn
Quote:
Originally Posted by Alan_S View Post
You can check https://dotepub.com/

They also offer easy conversion of web pages into epub with images, but this doesn't give great results either. I haven't tried Pandoc and don't know how bad its results are, so dotepub may give the same or similar results.

Please check and share how it works for you.
Thanks for that suggestion.

I did look at dotepub. Using the bookmarklet, I got better results than I've been getting with Pandoc. Unfortunately, it appears that the only way to use dotepub programmatically is to use their API, which requires you to parse the HTML yourself. If I parse the HTML, I've already solved the formatting issues with Pandoc, so dotepub doesn't offer much of an advantage.

If I can figure out how to grab the "Printer Friendly Format" link from sites where images are important and pass that URL to Pandoc, that should solve the problem. This probably requires site-specific configurations or "recipes", which I may play with at some point, but this isn't a huge priority for me at the moment.
Old 01-27-2016, 05:22 PM   #6
gummihuhn
Quote:
Originally Posted by gummihuhn View Post
If I can figure out how to grab the "Printer Friendly Format" link from sites where images are important and pass that URL to Pandoc, that should solve the problem. This probably requires site-specific configurations or "recipes", which I may play with at some point, but this isn't a huge priority for me at the moment.
I've figured out how to make the output of Pandoc much nicer, including easy per-site configuration settings.

An example of the current output is attached. To customize output content for the source website of that epub, here is all I needed:

Code:
.date {display:none;}
.tophat {display:none;}
.persistent-header-placeholder {display:none;}
.lede-headline {display:none;}
.social-share {display:none;}
.article-rail {display:none;}
.terminal-tout {display:none;}
.read-this-next {display:none;}
.article-tags__tag {display:none;}
.article-tags__tag-link {display:none;}
.unsupported-browser {display:none;}
.footer {display:none;}
.footer__container {display:none;}
This is representative of the number of lines necessary for most other websites I've set up. Those settings won't change until the website gets redesigned (a fairly rare occurrence), so once a site is set up it should just work. Getting the relevant CSS classes is pretty easy in any modern browser, even if you don't know CSS. And of course those per-site settings can be shared between people.
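
How these per-site rules are handed to Pandoc is not shown in this post. One possible arrangement, offered only as a sketch, is to keep each site's rules in a file named after its domain and pass that file to Pandoc with its --css option (pandoc 2.0 or later is assumed; older releases used --epub-stylesheet for epub output). The css_dir location, the <domain>.css naming scheme and both function names are hypothetical, and whether a given reader honours the display:none rules is not guaranteed.

```shell
#!/bin/bash
# css_dir and the <domain>.css naming scheme are hypothetical.
css_dir="${css_dir:-$HOME/.config/pandoc-css}"

# Print the stylesheet path for a domain if one exists; print nothing otherwise.
site_css() {
    local file="$css_dir/$1.css"
    if [ -f "$file" ]; then
        printf '%s\n' "$file"
    fi
}

# Convert a page to epub, adding the per-site stylesheet when one is available.
# usage: convert_page URL DOMAIN OUTPUT
convert_page() {
    local css
    local args=(-f html -t epub -o "$3")
    css=$(site_css "$2")
    if [ -n "$css" ]; then
        args+=(--css "$css")
    fi
    pandoc "${args[@]}" "$1"
}
```

With this layout, sharing a site's settings with someone else amounts to sharing a single small .css file.
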

There are still some obvious improvements to be made, but I like the progress. After I've had some time to clean up my script and write up a how-to (probably this weekend), I'll post the updated script with instructions in case anyone is interested in trying it.
Old 01-30-2016, 10:48 AM   #7
gummihuhn
I've reworked my script to make it quite a bit more flexible. As part of that it has become two different scripts.

I haven't yet had time to write up a how-to for customizing website-specific output from Pandoc. I hope to do that over the next week or so, and to push out a few more website-specific formatting rules.

I'm mainly doing this to meet my own needs (other tools just weren't cutting it for me), but if you do give it a try, feedback and suggestions are welcome.

You can see the latest iteration and follow future developments here: https://github.com/0r0/klemheist
Old 02-10-2016, 07:09 PM   #8
loviedovie
Addict
Posts: 295
Karma: 2139988
Join Date: Nov 2014
Device: bookeen
wallabag works great for me, but epub export is not automatic. I tend to use the wallabag client on Android.