Old 01-23-2015, 09:58 AM   #46
theducks
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Nice
Old 01-23-2015, 09:59 AM   #47
phossler
Wizard
Posts: 1,076
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Quote:
Barbecued Bear Paws
And it's SO hard to get fresh bear at the local supermarket these days
Old 01-23-2015, 10:56 AM   #48
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
First of all, congrats on finally adding in the Reports functionality! I will have to mess around with it in the next few weeks. It is quite helpful on some of the extremely large projects I have been working on lately (Sigil chugs on these absolutely massive files).

Quote:
Originally Posted by kovidgoyal View Post
I have no plans to add a links report. The Check Book tool already checks for broken links and allows you to jump to them, and the editor autocompletes href attributes.
I have come up with 4 Use Cases off of the top of my head on why the Sigil Links Report is extremely helpful (and why it should probably be done in Calibre's Reports as well).

Use Case #1:

The Links Report is extremely helpful when you are cleaning up HTML files. I use it all the time when I pull a series of HTML articles off of a website to convert into an EPUB.

Say I want to strip every link in the book, or remove all of the amazon.com links while keeping the ones that point to cde.com and xyz.com: with the report I can easily sort the links, spot the ones I want, and remove them.
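
For readers who want to script this kind of cleanup, here is a minimal Python sketch. It is not how Sigil or calibre do it: `strip_links` is a hypothetical helper, the sample markup is made up, and a regular expression is only adequate for simple, well-formed `<a>` tags.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper: unwrap <a> elements whose href points at an unwanted
# domain, keeping the link text. A regex is enough for simple, well-formed
# markup; a real cleanup should use an HTML parser.
A_TAG = re.compile(r'<a\b[^>]*\bhref="([^"]*)"[^>]*>(.*?)</a>', re.S | re.I)

def strip_links(html, remove_domains):
    def repl(match):
        host = (urlparse(match.group(1)).hostname or "").lower()
        # Match the domain itself and any subdomain (www.amazon.com, etc.)
        if any(host == d or host.endswith("." + d) for d in remove_domains):
            return match.group(2)   # keep the text, drop the link
        return match.group(0)       # leave every other link untouched
    return A_TAG.sub(repl, html)

sample = ('<p><a href="http://www.amazon.com/dp/1">Buy</a> or read at '
          '<a href="http://xyz.com/a">xyz</a>.</p>')
cleaned = strip_links(sample, {"amazon.com"})
```

Passing a set of domains to remove, rather than a set to keep, matches the "remove amazon.com, keep the rest" case described above.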

Use Case #2:

Also, if you are working on newer books that were exported from OCR software (Finereader), it does its best to digitize the links from the original PDF, but it sometimes gets them wrong when a link was broken across lines. On the visual surface the link looks perfectly fine, but the link itself is broken.

For example, a link might look like this:

Code:
<a href="http://www.sample.com/">http://www.sample.com/</a><a href="sample/sample.html">sample/sample.html</a>
You would be able to easily spot this error in the Links Report. (Ok ok, I know, I know, horrible sample I came up with! )

Here are four real-life "OCR errors" I caught with the Links Report:

Code:
<p>———. 1936. Liquidity. Minnesota Bankers Assoc. Available at: <a href="http://www">http://www</a>.</p>

  <p>24hgold.com/viewcompanyarticle.aspx?langue = en&amp;articleId = 217737</p>

<p>Nobelprize.org. 2008. John Nash interview, September, 2004. Retrieved January 15, 2008 from <a href="http://nobelprize.org/mediaplayer/index.php?id">http://nobelprize.org/mediaplayer/index.php?id</a> = 429</p>

<p>Montaigne, Michel de. “The Profit of One Man is the Damage of Another.” <span style="font-style:italic;">Essays.</span> Chapter XXI. <a href="http://www.uoregon.edu/%7Erbear/montaigne/">http://www.uoregon.edu/%7Erbear/montaigne/</a>&nbsp;1xxi.htm</p>

<p>Development.” Free-Market News Network, February 14 and 15, at <a href="http://www.freemarketnews.com/Analysis/241/6939/notes.asp?wid">http://www.freemarketnews.com/Analysis/241/6939/notes.asp?wid</a>=241&amp;&nbsp;nid=6939 and http:// <a href="http://www.freemarketnews.com/">www.freemarketnews.com/</a> Analysis/241/6949/notes.asp?&nbsp;wid=241&amp;nid= 6949.</p>
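
Some of these truncations can also be flagged mechanically. The sketch below is only a heuristic, and `suspicious_href` is a hypothetical name: it catches a host with no dot (`http://www`) and a query key with no value (`...index.php?id`), but it cannot catch a link like the Montaigne one, where the href is a valid URL and only the trailing text shows that it was cut short.

```python
from urllib.parse import urlparse

def suspicious_href(href):
    """Return a reason string if an absolute http(s) href looks truncated, else None."""
    parts = urlparse(href)
    if parts.scheme not in ("http", "https"):
        return None  # relative and non-web links are out of scope here
    host = parts.hostname or ""
    if "." not in host:
        return "host has no dot"
    # A query key with no '=' often means the value was split off by a line break
    if parts.query and any("=" not in pair for pair in parts.query.split("&")):
        return "query parameter without a value"
    return None
```
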
Use Case #3:

It is also extremely helpful for catching inconsistencies in what text is actually wrapped in the <a> tags. For example, in an entire journal I digitized, it might say something like this at the bottom:

Code:
<p>Please contact the <a href="http://samplesite.com">Sample Site</a>.</p>
and in another section of the book, it might say:

Code:
<p>Please contact the <a href="http://samplesite.com">Sample Sit</a>e.</p>
and:

Code:
<p>Please contact the <a href="http://samplesite.com">sample site</a>.</p>
If you sort the Links Report, you can also easily spot that something odd happened, because you would see "Sample Site" and "Sample Sit" and "sample site".

These are typically very hard to catch with just your naked eye, or even a quick perusal over the code, unless you knew EXACTLY what you were looking for (and even then, easy to miss).
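
The same check can be sketched in a few lines of Python using only the standard library. This is an illustration of the idea, not the report's actual implementation; `LinkCollector` and `inconsistent_link_text` are hypothetical names.

```python
from collections import defaultdict
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""
    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href, self._text = href, []

    def handle_data(self, data):
        # Only text between <a> and </a> is recorded
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None

def inconsistent_link_text(html):
    """Map each href to its link texts, keeping only hrefs with more than one text."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    texts = defaultdict(set)
    for href, text in collector.links:
        texts[href].add(text)
    return {href: sorted(t) for href, t in texts.items() if len(t) > 1}
```

Fed the three sample paragraphs above, the function reports one href with three different link texts, which is exactly the oddity that sorting the report makes visible.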

Use Case #4:

It is VERY helpful in catching absolutely useless links. For example, Finereader exports a lot of phantom "bookmark##" links:

[Attached image: LinksReport.png]

When you are cleaning out all the cruft, the Links Report makes it very easy (as you can see, Finereader also exports "footnote##" links). This is helpful when you want to get rid of as much useless code as possible, and to spot if you actually did remove it all.

Finereader 12 even introduced this cursed "caption#" class... which in all cases I have seen, is 100% worthless. Most of the time I forget to even look for it, and I just accidentally stumble on it when I am looking at the Links Report.
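
A rough way to list such leftovers outside the report is sketched below. It assumes the phantom anchors are `<a>` elements whose `id` or `name` follows the "bookmark##" / "footnote##" pattern mentioned above, and it only looks at links within a single file; `unreferenced_anchors` is a hypothetical helper.

```python
import re

# Assumed naming pattern, based on the "bookmark##" and "footnote##" names above
GENERATED = re.compile(r'^(bookmark|footnote)\d+$')

def unreferenced_anchors(html):
    """Return generated-looking anchor names that no href="...#name" in this text targets."""
    anchors = set(re.findall(r'<a\b[^>]*\b(?:id|name)="([^"]+)"', html, re.I))
    targets = set(re.findall(r'href="[^"#]*#([^"]+)"', html, re.I))
    return sorted(a for a in anchors - targets if GENERATED.match(a))
```

Anchors that are still referenced (a real footnote, for instance) are left out of the result, so only the candidates for deletion are listed.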

Last edited by Tex2002ans; 01-23-2015 at 11:40 AM.
Old 01-23-2015, 11:09 PM   #49
kovidgoyal
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Sigh, it never ends...

Here you go:

https://github.com/kovidgoyal/calibr...27759f74e8d188

It even has a live preview of the link destination, and you can double click to jump to either the link definition or its destination in the editor.
Old 01-24-2015, 12:39 AM   #50
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Hi

I was unable to intervene in the last few weeks due to severe bandwidth (...) constraints in China, which lasted a little more than one month. Your new report feature is absolutely brilliant and I must congratulate you for implementing it so neatly.

I hurried to check one EPUB. It worked beautifully! What astounded me most was the report on the used characters (112 as a grand total).

As it happens, this EPUB had four subsetted OTF fonts totalling 525k, including one regular (164k) and one italic (153k). Two of them (bold and bold-italic) were hardly used at all (only for some titles) but occupied 108k and 100k respectively.

I systematically subset fonts with the Editor, but now I can't help thinking that the next logical step would be to downsize each font to the characters it really uses...

I am sorry to make you suffer...

Last edited by roger64; 01-24-2015 at 12:43 AM. Reason: otf
Old 01-24-2015, 05:15 AM   #51
kovidgoyal
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Subsetting a font does reduce it to only the used characters.
Old 01-24-2015, 06:05 AM   #52
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Thanks for your reply, which surprised me. So OTF fonts seem to be heavier than TTF ones. I use the OTF files from here:
http://sourceforge.net/projects/linu...bertine/5.3.0/

I thought that the downsizing was possibly not complete, because 525k for 110 characters seemed a little bloated.

Two years ago I prepared a regular TTF web font of Linux Libertine using the Font Squirrel online service. This web font was only 60k in size and usually had enough characters for a French novel. It is attached here. There was a downside, though: any time I needed a Spanish or other foreign accent, I had to make a new web font, and this was a tedious process.

So, as soon as you began subsetting fonts with the Editor, I stopped using that web font, also because OTF fonts are nicer (ligatures). I will follow up...
Attached Files
File Type: zip linlibertine_c-4.0.4ro-webfont.ttf.zip (32.2 KB, 104 views)
Old 01-24-2015, 08:29 AM   #53
kovidgoyal
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Font files can have arbitrary amounts/types of data associated with every character. Font subsetting removes only the most common/standardised types of data. Some fonts can have extra, font foundry specific data tables, which font subsetting leaves alone since it knows nothing about them.
Old 01-24-2015, 01:32 PM   #54
Divingduck
Wizard
Posts: 1,161
Karma: 1404241
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
Quote:
Originally Posted by kovidgoyal View Post
@Divingduck:As for reporting what fonts are used for what characters, it is possible, but it would be fairly slow, and would only work for embedded fonts, since otherwise the font used is system dependent. Basically, it would use the code from the font subsetting tool.
Maybe a switch could be used for this, so that the main functionality works without font detection and the user can enable font detection if needed. I guess this feature isn't relevant for everyone, and it makes no sense to slow down the general report for all users.
My main use case is to figure out which characters are covered by the embedded font(s) and which are missing and will be supplied by system fonts and, if possible, what general font information is available for a character when there is no embedded font in an eBook (because a font was deleted, was not embedded, or whatever).

The idea behind this is to have a tool that makes it possible to run a valid check for possible font-related problems with devices. I know this approach isn't perfect, but it gives a bit more control.
My ultimate wish is a tool where I can select a set of fonts (e.g. the fonts installed in a reader) and compare it with the fonts and characters used in an eBook. That is something I haven't seen in any other program so far.
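
The comparison itself is simple once both character sets are known. The sketch below is not part of calibre: `missing_characters` is a hypothetical helper, and `supported` stands for the set of characters a chosen font covers. How that set is obtained (for example from the font's character map) is left out here.

```python
from html.parser import HTMLParser

class TextChars(HTMLParser):
    """Accumulate the set of characters that appear in text nodes."""
    def __init__(self):
        super().__init__()
        self.chars = set()

    def handle_data(self, data):
        self.chars.update(data)

def missing_characters(html, supported):
    """Non-whitespace characters used in the text that are absent from `supported`."""
    parser = TextChars()
    parser.feed(html)
    parser.close()
    used = {c for c in parser.chars if not c.isspace()}
    return sorted(used - set(supported))
```

An empty result would mean the chosen font covers every character the book uses; anything listed would fall back to a system font on the device.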


As for my checks, I can say I am very happy.

The Report shows all kinds of files in an eBook, and the character analysis shows all characters involved, including all control characters. In the picture report I miss information on the picture type (bw, grayscale, color).

One wish comes up: the possibility to mark and copy selected elements of the information columns or line entries to the clipboard (via the context menu).

About the Links Report of Sigil: I do not use it often, but if I have link problems and can't find them quickly, I take a first look with Sigil too. For simple book structures I can do those things by hand, but for complex structures this report helps me get a better overview. This is mostly the case when I work with complex web documents with a lot of crosslinks between files.
Old 01-24-2015, 06:40 PM   #55
kovidgoyal
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by Divingduck View Post
The Report shows all kinds of files in an eBook, and the character analysis shows all characters involved, including all control characters. In the picture report I miss information on the picture type (bw, grayscale, color).
Surely if the picture is grayscale the thumbnail will be gray, so it should be easy to spot grayscale images.

Quote:
One wish comes up: the possibility to mark and copy selected elements of the information columns or line entries to the clipboard (via the context menu).
Use the Save button, which exports all data to a CSV file, from which you can extract whatever you want using any spreadsheet program.
Old 01-24-2015, 09:35 PM   #56
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by kovidgoyal View Post
Sigh, it never ends...

Here you go:

https://github.com/kovidgoyal/calibr...27759f74e8d188

It even has a live preview of the link destination, and you can double click to jump to either the link definition or its destination in the editor.
Awesome, thanks!

Screenshot if anyone wants to know what it looks like.
[Attached image: report-links.png]
Old 01-26-2015, 12:07 PM   #57
icallaci
Guru
Posts: 769
Karma: 6528026
Join Date: Sep 2012
Device: Kobo Elipsa
Do you (would you, could you) have plans to add a report that shows orphan classes used in html files that are NOT in the CSS?
Old 01-26-2015, 12:16 PM   #58
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Orphaned CSS can already be removed with a longstanding tool.
Old 01-26-2015, 12:35 PM   #59
icallaci
Guru
Posts: 769
Karma: 6528026
Join Date: Sep 2012
Device: Kobo Elipsa
Quote:
Originally Posted by eschwartz View Post
Orphaned CSS can already be removed with a longstanding tool.
Which tool? I don't want them removed. I want to know which classes are orphaned in the HTML so that I can add them to the CSS.

Last edited by icallaci; 01-27-2015 at 10:01 AM.
Old 01-26-2015, 12:45 PM   #60
theducks
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by icallaci View Post
Which tool? I don't want it removed. I want to know which classes are orphaned in the HTML so that I can add them to the CSS.


Sigil's report "Style classes in HTML files": in that case there is no entry in the 'Found in' column.
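
For anyone who wants the same list without opening Sigil, the check can be approximated in Python. This is a rough sketch, not Sigil's or calibre's implementation: `classes_missing_from_css` is a hypothetical helper, and the regular expressions only handle double-quoted `class` attributes and plain class selectors.

```python
import re

def classes_missing_from_css(html, css):
    """Return class names used in the HTML that no CSS class selector mentions."""
    used = set()
    for value in re.findall(r'\bclass="([^"]*)"', html):
        used.update(value.split())   # class="a b" uses two classes
    # Strip comments, then collect every ".name" found in the selector part of a rule
    css = re.sub(r'/\*.*?\*/', '', css, flags=re.S)
    defined = set()
    for selector in re.findall(r'([^{}]+)\{', css):
        defined.update(re.findall(r'\.(-?[A-Za-z_][\w-]*)', selector))
    return sorted(used - defined)
```

The result is the list of classes to add to the stylesheet; nothing is removed from the HTML.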