Hello!
I'm trying to figure out how to convert a couple of websites into eBooks for easy reading on my Kindle Paperwhite.
For example, there are two in particular that I would like to convert:
1. Mozilla Developer Network's JavaScript guides:
https://developer.mozilla.org/en-US/docs/Web/JavaScript (all pages on the left panel).
2. Facebook's React Native guide:
https://facebook.github.io/react-nat...d.html#content
I've tried setting up a custom news source in calibre, but I get garbled text, presumably from the JavaScript. I've also had trouble keeping the code snippets properly formatted (instead of being reduced to plain text).
Can someone help me set up my scripts to make this work for MDN & FB?
I'll include the script for MDN & a part of the output below, so you can see what's happening.
Thanks a lot!
SCRIPT (partial)
#!/usr/bin/env python2
# vim:fileencoding=utf-8
from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1436894783(BasicNewsRecipe):
    title = 'MDN JavaScript Guide'
    oldest_article = 9999
    max_articles_per_feed = 100
    auto_cleanup = True
    feeds = [
        ('Introduction', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Introduction'),
        ('Grammar & Types', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Grammar_and_Types'),
        ('Control flow and error handling', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Control_flow_and_error_handling'),
        ('Loops and iteration', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Loops_and_iteration'),
        ('Functions', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Functions'),
        ('Expressions and operators', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Expressions_and_Operators'),
        ('Numbers and dates', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Numbers_and_dates'),
        ('Text formatting', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Text_formatting'),
        ('Regular Expressions', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions'),
        ('Indexed collections', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Indexed_collections'),
        ('Keyed collections', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Keyed_collections'),
        ('Working with objects', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Working_with_Objects'),
        ('Details of the object model', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Details_of_the_Object_Model'),
        ('Iterators and generators', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Iterators_and_generators'),
        ('Meta programming', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Meta_programming'),
    ]
# MORE FEEDS WOULD GO HERE, UNLESS WE COULD AUTOMATE THE DOWNLOAD OF EVERYTHING CONTAINED WITHIN developer.mozilla.org/en-US/docs/Web/JavaScript/
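For the "automate the download of everything" part, here is a rough sketch of what I imagine it could look like. It is standalone Python 3 (no calibre import) and only shows the link-gathering step: it collects every link under /en-US/docs/Web/JavaScript/ from a page's HTML and arranges the links in the (section title, list of article dicts) shape that, as far as I understand the calibre docs, a recipe's parse_index() method is expected to return. The names LinkCollector, build_index and PREFIX are my own invention, not part of calibre, and I have not tried wiring this into a real recipe.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: only links under this path should end up in the eBook.
PREFIX = '/en-US/docs/Web/JavaScript/'


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for <a> tags whose href starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = []        # (link text, href) in document order
        self._current = None   # href of the matching <a> being read, if any
        self._text = []        # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href') or ''
            if href.startswith(self.prefix):
                self._current = href
                self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._current is not None:
            self.links.append((''.join(self._text).strip(), self._current))
            self._current = None


def build_index(html, base_url, section='MDN JavaScript'):
    """Return [(section, [article dict, ...])], skipping duplicate URLs."""
    collector = LinkCollector(PREFIX)
    collector.feed(html)
    seen = set()
    articles = []
    for text, href in collector.links:
        # Make the URL absolute and drop any #fragment so duplicates collapse.
        url = urljoin(base_url, href).split('#')[0]
        if url in seen:
            continue
        seen.add(url)
        articles.append({'title': text or url, 'url': url,
                         'description': '', 'date': ''})
    return [(section, articles)]
```

If this is the right idea, I assume a recipe would fetch the index page's HTML itself and return build_index(...) from parse_index() instead of defining feeds, but that part is a guess on my side.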
OUTPUT (partial)
US/docs/Web/JavaScript/Reference/Statements/for...of" title="The for...of statement creates a loop Iterating over iterable objects (including Array, Map, Set, arguments object and so on), invoking a custom iteration hook with statements to be executed for the value of each distinct property."><code>for..of</code></a> construct. Some built-in types, such as <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array" title="The JavaScript Array global object is a constructor for arrays, which are high-level, list-like objects."><code>Array</code></a> or <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map" title="The Map object is a simple key/value map. Any value (both objects and primitive values) may be used as either a key or a value."><code>Map</code></a>, have a default iteration behavior, while other types (such as <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object" title="The Object constructor creates an object wrapper."><code>Object</code></a>) do not.</p> <p>In order to be <strong>iterable</strong>, an object must implement the <strong>@@iterator</strong> method, meaning that the object (or one of the objects up its <a href="/en-US/docs/Web/JavaScript/Guide/Inheritance_and_the_prototype_chain">prototype chain</a>) must have a property with a <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/iterator" title="The Symbol.iterator well-known symbol specifies the default iterator for an object. 
Used by for...of."><code>Symbol.iterator</code></a> key:</p> <h3 id="User-defined_iterables">User-defined iterables</h3> <p>We can make our own iterables like this:</p> <pre class="brush: js">var myIterable = {} myIterable[Symbol.iterator] = function* () { yield 1; yield 2; yield 3; }; [...myIterable] // [1, 2, 3] </pre> <h3 id="Built-in_iterables">Built-in iterables</h3> <p><a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/String" title="The String global object is a constructor for strings, or a sequence of characters."><code>String</code></a>, <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array" title="The JavaScript Array global object is a constructor for arrays, which are high-level, list-like objects."><code>Array</code></a>, <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray" title="A TypedArray object describes an array-like view of an underlying binary data buffer. There is no global property named TypedArray, nor is there a directly visible TypedArray constructor. Instead, there are a number of different global properties, whose values are typed array constructors for specific element types, listed below. On the following pages you will find common properties and methods that can be used with any typed array containing elements of any type."><code>TypedArray</code></a>, <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map" title="The Map object is a simple key/value map. 
Any value (both objects and primitive values) may be used as either a key or a value."><code>Map</code></a> and <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set" title="The Set object lets you store unique values of any type, whether primitive values or object references."><code>Set</code></a> are all built-in iterables, because the prototype objects of them all have a <a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/iterator" title="The Symbol.iterator well-known symbol specifies the default iterator for an object. Used by for...of."><code>Symbol.iterator</code></a> method.</p> <h3 id="Syntaxes_expecting_iterables">Syntaxes expecting iterables</h3> <p>Some statements and expressions are expecting iterables, for example the <code><a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for...of">for-of</a></code> loops, <a href="https://developer.mozilla.org/en
// "b" // "c" [..."abc"] // ["a", "b", "c"] function* gen(){ yield* ["a", "b", "c"] } gen().next() // { value:"a", done:false } [a, b, c] = new Set(["a", "b", "c"]) a // "a" </pre> <h2 id="Generators">Generators</h2> <p>While custom iterators are a useful tool, their creation requires careful programming due to the need to explicitly maintain their internal state. Generators provide a powerful alternative: they allow you to define an iterative algorithm by writing a single function which can maintain its own state.</p> <p>A generator is a special type of function that works as a factory for iterators. A function becomes a generator if it contains one or more <code>yield</code> expressions and if it uses the <code>function*</code> syntax.</p> <pre class="brush: js">function* idMaker(){ var index = 0; while(true) yield index++; } var gen = idMaker(); console.log(gen.next().value); // 0 console.log(gen.next().value); // 1 console.log(gen.next().value); // 2 // ...</pre> <h2 id="Advanced_generators">Advanced generators</h2> <p>Generators compute their yielded values on demand, which allows them to efficiently represent sequences that are expensive to compute, or even infinite sequences as demonstrated above.</p> <p>The <code>next()</code> method also accepts a value which can be used to modify the internal state of the generator. 
A value passed to <code>next()</code> will be treated as the result of the last <code>yield</code> expression that paused the generator.</p> <p>Here is the fibonacci generator using <code>next(x)</code> to restart the sequence:</p> <pre class="brush: js">function* fibonacci(){ var fn1 = 1; var fn2 = 1; while (true){ var current = fn2; fn2 = fn1; fn1 = fn1 + current; var reset = yield current; if (reset){ fn1 = 1; fn2 = 1; } } } var sequence = fibonacci(); console.log(sequence.next().value); // 1 console.log(sequence.next().value); // 1 console.log(sequence.next().value); // 2 console.log(sequence.next().value); // 3 console.log(sequence.next().value); // 5 console.log(sequence.next().value); // 8 console.log(sequence.next().value); // 13 console.log(sequence.next(true).value); // 1 console.log(sequence.next().value); // 1 console.log(sequence.next().value); // 2 console.log(sequence.next().value); // 3</pre> <div class="note"><strong>Note:</strong> As a point of interest, calling <code>next(undefined)</code> is equivalent to calling <code>next()</code>. However, starting a newborn generator with any value other than undefined when calling <code>next()</code> will result in a <code>TypeError</code> exception.</div> <p>You can force a generator to throw an exception by calling its <code>throw()</code> method and passing the exception value it should throw. 
This exception will be thrown from the current suspended context of the generator, as if the <code>yield</code> that is currently suspended were instead a <code>throw <em>value</em></code> statement.</p> <p>If a <code>yield</code> is not encountered during the processing of the thrown exception, then the exception will propagate up through the call to <code>throw()</code>, and subsequent calls to <code>next()</code> will result in the <code>done</code> property being <code>true</code>.</p> <p>Generators have a <code>return(value)</code> method that returns the given value and finishes the generator itself.</p> <h2 id="Generator_comprehensions">Generator comprehensions</h2> <p>A significant drawback of <a href="/en-US/docs/Web/JavaScript/Reference/Operators/Array_comprehensions" title="en/JavaScript/Guide/Predefined Core Objects#Array comprehensions">array comprehensions</a> is that they cause an entire new array to be constructed in memory. When the input to the comprehension is itself a small array the overhead involved is insignificant — but when the input is a large array or an expensive (or indeed infinite) generator the creation of a new array can be problematic.</p> <p>Generators enable lazy computation of sequences, with items calculated on-demand as they are needed. <a href="/en-