Old 09-24-2013, 03:40 PM   #1
hsawires
Problems with very huge collection in Arabic

Hi,
This is my first time on the calibre forum. I am really impressed by what the software can do.

I have some problems using Calibre. I have a very large collection of PDFs, almost 100 GB and about 12,000 books, and most of them have file names written in Arabic script (right to left).

My problems are:

Calibre takes a very long time to import this large collection into the separate location where the Calibre library is stored.
When Calibre builds its library, it does not keep the Arabic letters of the files and folders I import from. It makes another copy of the files and folders with nonsense Latin characters, although when I save a book from Calibre to another location it is saved correctly. The same problem occurs when I generate a cover.
The covers extracted from the PDFs are very large. They are extracted at the same size as stored in the PDF, which costs me extra space and makes my collection even bigger.


My questions are: how can I import my PDFs into the Calibre database without Calibre making another copy of the collection on my disk? If there is such a method, does it also make Calibre faster at collecting data?
How can I configure Calibre to recognize Arabic (RTL) letters? And how can I reduce the size of the extracted cover pages?

Thank you.
Old 09-24-2013, 07:57 PM   #2
theducks
Moderator Notice
Please read the sticky before posting in Development. https://www.mobileread.com/forums/sho...d.php?t=122042

Moved
Old 09-24-2013, 10:07 PM   #3
BetterRed
@hsawires

To reduce the size of covers, install the Resize Cover plugin, which is described here: https://www.mobileread.com/forums/sho...d.php?t=150982
The simplest way to install plugins is via Preferences->Plugins->Get New Plugins.

The default mode of operation is for Calibre to store your books in its library folders, which are organised in a Library/Author/Book hierarchy. That's not going to change in the foreseeable future; there are numerous discussions in this sub-forum on the issue.

I believe some people keep the 'books' (PDF, MOBI, EPUB etc.) in their own folder hierarchy and use Calibre as a conversion tool and a searchable catalogue. However, my understanding is that if you want the integration with devices like e-readers, phones and tablets, then you have to allow Calibre to retain the book files.

It's probably a silly question, but have you selected Arabic in Preferences->Look & Feel->Main Interface->Choose language?

Regarding the time required to add the books: I can't think of anything obvious to make it significantly faster apart from high-speed disks. I recently added a USB 3 adapter and a USB 3.0 disk dock with two WD Caviar Blacks; they're faster than my internal Seagates on SATA.

BR

Old 09-25-2013, 05:41 AM   #4
hsawires
@BetterRed


Thank you for your reply.

Yes, the Resize Cover plugin is awesome, thank you.

I hope that Calibre can manage an existing collection without importing the files into its own library, for example by importing book records from an XML, TXT or even CSV file. Managing this huge collection is more important to me than importing it into the Calibre library.
As you may know, importing books into the Calibre library is not the only problem. I find some books duplicated in specific folders, which makes the library itself very large, perhaps from something like a virtual library or a series.

What I am trying to describe is the growth of the Calibre library for no logical reason.

As for the Arabic characters: what you suggested only localizes the Calibre interface; it does not affect the database or the output. Below is what I get in my Calibre Library folders when I choose to save books to my device or to my hard disk.

[2]
[mthnwy mwln jll ldyn lrwmy (35)]
cover.jpg
metadata.opf
mthnwy mwln jll ldyn lrwmy - 2.pdf
[brhym sht]
mthnwy jll ldyn lrwmy sh`r lSwfy@ l' (34)
cover.jpg
metadata.opf
mthnwy jll ldyn lrwmy sh`r lSwf - brhym sht.pdf
and so on ...

As you can see, records written in Arabic letters get nonsense folder and file names on disk.

I will probably take your advice and switch to a hard disk faster than my internal one.

Again, thank you for your reply.
Old 09-25-2013, 06:32 AM   #5
BetterRed
The first issue to resolve is getting the database to handle Arabic RTL script. The best person to help you with that is Kovid Goyal; hopefully he will contribute here. If he doesn't, I suggest you send him a PM.

Assuming you can get that issue resolved, there is another plugin that may be relevant, Import List. Read about it here: https://www.mobileread.com/forums/sho...d.php?t=187831

How might this be of use? If you could create a list of books as a TXT, CSV or XML file, you could use this plugin to 'register' their existence in the calibre database. That way you could leave the book files where they are. The list can include metadata, which may enable you to use calibre to detect duplicates, perhaps via the Find Duplicates plugin. I would anticipate that importing a list of thousands of books will be a lot faster than importing the books themselves.
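A minimal sketch of how such a list could be produced from an existing folder tree, assuming Python 3 is available. The function name `write_book_list`, the column names `title` and `path`, and the choice of taking the title from the file name are all illustrative assumptions; nothing here is prescribed by the Import List plugin. The file is written as UTF-8 so that Arabic file names survive unchanged.

```python
import csv
import os
from pathlib import Path


def write_book_list(root, out_csv, extensions=(".pdf",)):
    """Walk `root` and write one CSV row (title, path) per matching file.

    The title is just the file name without its extension. Returns the
    number of book rows written (the header row is not counted).
    """
    count = 0
    # newline="" lets the csv module control line endings itself.
    with open(out_csv, "w", encoding="utf-8", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["title", "path"])  # hypothetical column names
        for folder, _dirs, files in os.walk(root):
            for name in sorted(files):
                if Path(name).suffix.lower() in extensions:
                    writer.writerow([Path(name).stem, os.path.join(folder, name)])
                    count += 1
    return count


# Example call (paths are placeholders):
# write_book_list(r"d:\My Ebooks", "book_list.csv")
```

The resulting file can then be opened in the plugin and its columns mapped to calibre fields by hand; whether a path column can be mapped to anything useful is a separate question.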

But the first issue to resolve is to get the database storing and presenting Arabic data.

BR

Old 09-25-2013, 07:28 AM   #6
Sabardeyn
Quote:
Originally Posted by hsawires View Post
I hope that Calibre can manage an existing list without importing them to its own Library, like importing books from XML, TXT or even CSV file. Managing this huge library is more important to me than importing them in Calibre library.
You can add book records to the database without adding any e-book files via Add Books>Add Empty Book, but as far as I know this has to be done for each book individually.

There is a command-line tool that can add books, but I am not sure whether it can add empty books. I would suggest reading through the Command Line>Add section of the calibre manual.
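For reference, current calibre documentation lists an `--empty` option for `calibredb add`, together with `--title`, `--authors` and the global `--with-library` option. Whether the release discussed in this thread already had them is an assumption to check with `calibredb add --help`. The sketch below only builds the command line; the helper name `empty_book_command` and the library path are placeholders, and an empty record created this way does not store any link to a file kept elsewhere.

```python
import shlex


def empty_book_command(title, authors, library):
    """Build (but do not run) a calibredb command that adds an empty record."""
    return [
        "calibredb", "add", "--empty",
        "--title", title,
        "--authors", authors,
        "--with-library", library,  # placeholder path to the calibre library
    ]


cmd = empty_book_command("Basic Rendring", "Robert W. Gill", "/path/to/Calibre Library")
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)`, and looping over the rows of a book list would add one empty record per book.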
Old 09-25-2013, 08:13 AM   #7
kovidgoyal
creator of calibre
All non-English characters in the metadata are transliterated into English when creating files and folders in the calibre library. This is true of all scripts, not just Arabic, and is not going to change. However, if you go to Preferences->Saving to disk and uncheck the box that says convert non-English to English characters, then when you export files from calibre using Save to disk, the Arabic characters will be preserved.
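To illustrate what transliteration means for a folder name, here is a toy sketch. It is not calibre's code: the five-letter table `ARABIC_TO_LATIN` and the function `transliterate` are hypothetical and far smaller than any real mapping. It simply replaces each mapped letter with a Latin equivalent and drops any other non-ASCII character.

```python
# Hypothetical, deliberately tiny mapping; a real table covers whole scripts.
ARABIC_TO_LATIN = {
    "\u0645": "m",   # meem
    "\u062b": "th",  # theh
    "\u0646": "n",   # noon
    "\u0648": "w",   # waw
    "\u064a": "y",   # yeh
}


def transliterate(text):
    """Replace mapped characters; keep ASCII; drop anything else."""
    out = []
    for ch in text:
        if ch in ARABIC_TO_LATIN:
            out.append(ARABIC_TO_LATIN[ch])
        elif ch.isascii():
            out.append(ch)
    return "".join(out)


# The five letters meem, theh, noon, waw, yeh in order:
print(transliterate("\u0645\u062b\u0646\u0648\u064a"))  # mthnwy
```

Because short vowels are not written as letters in Arabic, a letter-by-letter mapping of this kind yields vowel-poor strings such as the `mthnwy` seen in the folder listing earlier in the thread.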
Old 09-25-2013, 08:35 AM   #8
hsawires
Quote:
Originally Posted by BetterRed View Post
The first issue to resolve is to get the database to handle Arabic RTL script, the best person to help you with that is Kovid Goyal, hopefully he will make a contribution here - if he doesn't then I suggest you send him a PM.

Assuming you can get that issue resolved then there is another plugin that may be relevant - Import List - read about it here =>> https://www.mobileread.com/forums/sho...d.php?t=187831

How might this be of use - well if you could create a list of books as a TXT, CSV or XML file then you could use this plug in to 'register' their existence into the calibre database - that way you could leave the book files where they are. The list can include metadata which may enable you to use calibre to detect duplicates - perhaps via the Find Duplicates plugin. I would anticipate that importing a list of thousands of books will be a lot faster than importing the books themselves.

But the first issue to resolve is to get the database storing and presenting Arabic data

BR
For the second time in one day, your help is very valuable to me. I gave the Import List plugin a try and it works very smoothly. This is what I meant. There are a few small bugs, which I will report in its thread later.

Now I can generate a list of all my books as a .txt file, and with a few more steps I can generate an XML file containing each book's title and its local path on the hard disk.

My question is:
What is the field name behind "Choose Format for", or where does Calibre store the path of the book, that is, the containing folder it opens when the book is clicked? If I had one of those pieces of information, I could supply it from my TXT or XML file via a regular expression and import it into Calibre without losing the link to my file, and the book would no longer be empty.
I am thinking of something like the following expression:

(?P<title>.*?) \- (?P<authors>.*)\.(?P<formats>.*) \; (?P<path>.*)

Sample:

Basic Rendring - Robert W. Gill.pdf ; d:\My Ebooks\Arts\Basic_Rendring_Robert_W._Gill.pdf
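
As a quick check outside calibre, the expression above can be run against the sample line with Python's `re` module. This only shows how the line is split into the four named groups; whether the plugin accepts `formats` or `path` as field names is exactly the open question and is not assumed here.

```python
import re

# The expression from the post, unchanged.
PATTERN = re.compile(
    r"(?P<title>.*?) \- (?P<authors>.*)\.(?P<formats>.*) \; (?P<path>.*)"
)

# Raw string so the Windows backslashes are kept literally.
sample = r"Basic Rendring - Robert W. Gill.pdf ; d:\My Ebooks\Arts\Basic_Rendring_Robert_W._Gill.pdf"

match = PATTERN.match(sample)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

The greedy `authors` group backtracks to the last dot that is still followed by ` ; `, so the dots in "W." and in the path do not confuse the split for this sample. A title that itself contains " - " would be cut at the first occurrence.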

I hope this is possible, because if it is, the cover page could also be imported as a link, which would make the database lighter. What do you think?

Thank you, BR, for your valuable help.

Old 09-25-2013, 08:43 AM   #9
hsawires
Thank you, Kovid Goyal.

You said:
Quote:
Originally Posted by kovidgoyal View Post
All non-English characters in the metadata are transliterated into English when creating files and folders in the calibre library, this is true of all scripts, not just Arabic and is not going to change.
But why? I could delete my old local files if Calibre stored them the way it offers, which is great, but I don't want to lose the file and folder names.

I hope the Calibre team may change their mind.
Old 09-25-2013, 04:15 PM   #10
Sabardeyn
Hsawires,
Calibre's library file structure, as BetterRed posted in message #3, is hard-coded into the program because various internal functions need to know exactly where the e-book files can be found. Because of this requirement, calibre's library structure cannot and will not be changed. (Kovid has said this repeatedly, and there are multiple variants of this conversation available in these forums.)

More importantly, you should not perform any actions on the calibre library (structure or files), either manually or through any other software. Any manipulation of the files outside of calibre could lead to data loss. This means that if you don't have a copy of your books elsewhere, either exported through calibre or in a separate backup, you might suffer permanent loss.
Old 09-25-2013, 05:48 PM   #11
BetterRed
@hsawires I hope you won't mind me asking a couple of questions.
  1. Are you able to enter Authors, Titles etc. as RTL Arabic text in the Edit Metadata dialogue and see it in the Book List, Book Details etc.?
  2. If the answer to 1 is 'Yes', and given Kovid's input, why does it matter how Calibre creates its library folder and file names?

    Provided that what one gets within Calibre itself, and on anything that Calibre exports, has the real author and book names etc. in the appropriate character set, its naming of folders and files in the library is 'theoretically' irrelevant. I sometimes wonder if it might have been 'better' if Kovid had decided to use the Author and Book identifiers (integers) to create folder and file names.
  3. Is Calibre's ability to integrate with phones, e-book readers and tablets one of your key requirements?

BR
Old 09-26-2013, 02:39 AM   #12
hsawires
Quote:
Originally Posted by Sabardeyn View Post
Hsawires,
Calibre's library file structure, as BetterRed posted in message #3, is hard-coded into the program due to various internal functions that need to know exactly where the e-book files can be found. Because of this requirement calibre's library structure cannot and will not be changed. (Kovid has said this repeatedly and there are multiple variants of this conversation available in these forums.)

More importantly, you should not be performing any actions to the calibre library (structure or files) either manually or through any other software. Any manipulation of the files outside of calibre could lead to data loss. Which means, if you don't have a copy of your books elsewhere, either through calibre or a separate backup, you might suffer permanent loss.
I realized that I can't depend 100% on Calibre. I have to manage my local files as well as the Calibre library. My complaint about the growth of the Calibre library is no longer important, because in any case I have to deal with two libraries.
Old 09-26-2013, 03:11 AM   #13
hsawires
Quote:
Originally Posted by BetterRed View Post
@hsawires I hope you won't mind me asking a couple of questions
Not at all, please ask whatever you like. I appreciate your help very much.

Quote:
  1. Are you able to enter Authors, Titles etc as RTL Arabic text into the Metadata Edit dialogue etc and see it on the Book List, Book Details etc?
Yes, without any problems. You can see a screenshot of one of my books.

[Mod: Very large image converted to an attachment]

Quote:
  2. If the answer to 1 is 'Yes' and given Kovid's input, why does it matter how Calibre creates it's library folder and file names.
Redundancy, my friend: hard disk space and duplication of the same file. I think it is an important issue, especially since I decided to use data management software.

Quote:
Providing what one gets within Calibre itself, and what ones gets on anything that Calibre exports has real author and book names etc in the appropriate character set then its naming of folders and files in the library is 'theoretically' irrelevant. I sometimes wonder if it might have been 'better' if Kovid had decided to use the Author and Book identifiers (integers) to create folder and file names.
I am afraid I do not understand your point.

If you mean that how the books are stored in the library does not affect using Calibre, then yes, that is true, except that it makes another copy of each file. That leads me back to the duplication issue. I echo Sabardeyn's last post: I have to deal with two places storing the books, one for Calibre and the other, safer one.

Quote:
  3. Is Calibre's ability to integrate with phones, e-book readers and tablets one of your key requirements.
Yes, of course, and there is no problem with that, except that exporting an Arabic document to EPUB has some problems with encoding the characters, vowels and special characters, and some issues with the RTL direction. That does not affect me, though, because most of my e-books are text-on-image or image-only PDFs. If you like, I would be pleased to start a thread about all the Arabic issues in Calibre.
Old 09-26-2013, 03:15 AM   #14
hsawires
@BR

I posted a reply to you, but it has to be approved by a moderator, probably because I attached an image to my post.
Old 09-26-2013, 05:10 AM   #15
BetterRed
null operator (he/him)
BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.BetterRed ought to be getting tired of karma fortunes by now.
 
Posts: 21,724
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Ack. By the way, which OS do you use?

BR