Old 03-08-2011, 07:59 AM   #1
Farhad
Zealot
Posts: 139
Karma: 1248
Join Date: Feb 2011
Location: Florida
Device: N800/Nokia 5230/PB IQ701/PB 360
Need help with integrating files with non-descript names and no metadata

Hi,

Here is my situation: I have downloaded a little less than 100 very interesting articles from a website. They are now on my hd in htm format. I know I can easily import them into calibre, but since htm files have no metadata and since their names are totally non-descript, I will not be able to know "who is who" (my calibre collection has 16'000+ books by now).

But all the files came from only one main page from which I downloaded them. And this page does have the correct names. Is there a way to batch rename them by somehow grabbing the names from this first menu page (which I can also download if needed) and applying them in the correct order to the files?

If not, can I at least keep them in a separate "sub-library" inside calibre and rename them all like this: book1, book2, book3, etc.?

Or what other way could you recommend to identify these articles? (they are all on the same topic).

Many thanks,

Farhad
Old 03-08-2011, 08:18 AM   #2
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Moderator Notice
Moved to appropriate subforum.

Does that "main page" you mention link to the files by their current name? If so, try downloading that, verify that the links are intact and lead to the local copies of the articles, then stick that main page into Calibre. It should then gather up all the articles and you can convert a single book containing all articles at once.
There's no way you can batch rename those files with the tools Calibre provides.
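Outside calibre, a small script could probably do the renaming. What follows is only a minimal sketch, assuming the menu page links to each article with an ordinary <a href="..."> tag whose link text is the article title, and that the file names on disk match the last part of each href. The helper names (titles_by_filename, safe_name, plan_renames) are made up for this example. It only builds a list of proposed renames, so the plan can be checked before anything is touched.

```python
import os
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class LinkTitleParser(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []     # list of (href, text)
        self._href = None   # href of the <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((self._href, text))
            self._href = None


def titles_by_filename(index_html):
    """Map each linked .htm/.html file name to its link text."""
    parser = LinkTitleParser()
    parser.feed(index_html)
    mapping = {}
    for href, text in parser.links:
        name = os.path.basename(unquote(urlparse(href).path))
        if name.lower().endswith((".htm", ".html")):
            mapping.setdefault(name, text)  # first title wins
    return mapping


def safe_name(title):
    """Replace characters that are awkward in file names."""
    keep = "".join(c if c.isalnum() or c in " -_" else "_" for c in title)
    return keep.strip() or "untitled"


def plan_renames(index_html, folder):
    """Return (old_path, new_path) pairs; nothing is renamed here."""
    plan = []
    for name, title in titles_by_filename(index_html).items():
        old = Path(folder) / name
        if old.exists():
            plan.append((old, old.with_name(safe_name(title) + old.suffix)))
    return plan
```

To apply the plan after looking it over, loop over the pairs and call old.rename(new) on each. Two articles with the same title would collide, so that case needs checking first.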

Edit to add: You can, of course, simply use your "sub-library" idea by using a saved search that finds those articles to restrict the library view.
Old 03-08-2011, 10:02 AM   #3
Farhad
Zealot
Quote:
Originally Posted by Manichean View Post
Try downloading that, verify that the links are intact and lead to the local copies of the articles, then stick that main page into Calibre. It should then gather up all the articles and you can convert a single book containing all articles at once.
Excellent idea!

Thanks a lot!!

Cheers,

Farhad
Old 03-08-2011, 10:40 AM   #4
Farhad
Zealot
Bummer :-(

My article download application (the DownThemAll! FF extension) does not download the directory tree from the target site, so all my files are now in one subdirectory while the links in the main page point to the various subdirectories in which the files were originally stored.

I could, of course, manually edit the 200+ hrefs and get rid of the offending parts one by one, but before I do that can I ask you if you know of a fast way to download all the files from one page AND respect the remote directory structure?

Thanks,

Farhad
Old 03-08-2011, 10:53 AM   #5
Manichean
Wizard
Quote:
Originally Posted by Farhad View Post
I could, of course, manually edit the 200+ hrefs and get rid of the offending parts one by one, [...]
Why not use search & replace in your favourite text editor?
Old 03-08-2011, 11:28 AM   #6
Farhad
Zealot
because most subdirs are different. I might as well do that by hand :-(
Old 03-08-2011, 11:28 AM   #7
chaley
Grand Sorcerer
Posts: 12,408
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Does calibre's web2disk (http://calibre-ebook.com/user_manual/cli/web2disk.html) do the job you want?
Old 03-08-2011, 11:34 AM   #8
Farhad
Zealot
dunno, but I will gladly try and let you know. thanks!!
Old 03-08-2011, 12:07 PM   #9
Farhad
Zealot
web2disk did not do the trick either. It created its own directory tree with names like link1, link2, etc.

I don't think I did something wrong (like missing a flag).

maybe I should try wget (not used it in 10 years LOL)
Old 03-08-2011, 12:14 PM   #10
Manichean
Wizard
Quote:
Originally Posted by Farhad View Post
because most subdirs are different. I might as well do that by hand :-(
Hm. An editor with regex (or some other pattern-matching) search & replace should do the trick. If you're on Windows (which I suspect you're not because of your signature), I'd suggest Notepad++.
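The same regex idea can also be scripted. Here is a rough sketch, assuming the goal is to cut every relative link to a .htm/.html file down to its bare file name, so the menu page points at files sitting next to it in one folder. The function name flatten_hrefs and the pattern are illustrative only. Links to other sites and to non-HTML files are left untouched.

```python
import posixpath
import re

# Matches href="..." or href='...' and captures the quote and the target.
HREF_RE = re.compile(r'''href\s*=\s*(["'])(.*?)\1''', re.IGNORECASE)


def flatten_hrefs(html):
    """Strip the directory part from local .htm/.html link targets."""

    def repl(match):
        quote, target = match.group(1), match.group(2)
        path, sep, fragment = target.partition("#")
        is_remote = "://" in path or path.startswith("//")
        if is_remote or not path.lower().endswith((".htm", ".html")):
            return match.group(0)  # leave external and non-HTML links alone
        return f"href={quote}{posixpath.basename(path)}{sep}{fragment}{quote}"

    return HREF_RE.sub(repl, html)
```

Run it on a copy of the menu page, since two articles with the same file name in different subdirectories would end up pointing at the same file.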
Old 03-08-2011, 12:18 PM   #11
Farhad
Zealot
yep - I will have to go to an editor (and yes, on my Linux box).
thanks!
Old 03-08-2011, 01:45 PM   #12
Farhad
Zealot
found the trick! here is how to do it (in case somebody else has this issue):

from the CLI run

wget -r --level=2 http://nameoftargetsite

then make sure that no non-htm/html files have been downloaded (I had a few zipped ones).

Next, find the exact page which has the URLs of the files on the remote site. Point calibre to that very same page on the local machine and import it.

Finally, convert the book into EPUB , FB2 or any other format :-)

HTH!
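
The "no non-htm/html files" check can be done by eye, but for a bigger download a few lines of script may be quicker. This is a small sketch, assuming the wget output sits in one local folder. The function name non_html_files is made up here, and it only lists the stray files without deleting anything.

```python
from pathlib import Path


def non_html_files(root):
    """Return files under root whose extension is not .htm or .html."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() not in {".htm", ".html"}
    )
```

Calling non_html_files on the download folder gives back the paths worth a look before the menu page is imported.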