#1 |
Zealot
Posts: 139
Karma: 1248
Join Date: Feb 2011
Location: Florida
Device: N800/Nokia 5230/PB IQ701/PB 360
|
Need help with integrating files with nondescript names and no metadata
Hi,
Here is my situation: I have downloaded a little under 100 very interesting articles from a website. They are now on my hard drive in htm format. I know I can easily import them into calibre, but since htm files have no metadata and their names are totally nondescript, I will not be able to know "who is who" (my calibre collection has 16,000+ books by now).

All the files came from a single main page, and that page does have the correct names. Is there a way to batch rename them by grabbing the names from this menu page (which I can also download if needed) and applying them in the correct order to the files? If not, can I at least keep them in a separate "sub-library" inside calibre and rename them book1, book2, book3, etc.? Or what other way would you recommend to identify these articles? (They are all on the same topic.)

Many thanks, Farhad
#2 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Moderator Notice: Moved to appropriate subforum.

Does that "main page" you mention link to the files by their current name? If so, try downloading it, verify that the links are intact and lead to the local copies of the articles, then add that main page to calibre. It should then gather up all the articles, and you can convert them as a single book containing all articles.

There's no way to batch rename those files with the tools calibre provides.

Edit to add: You can, of course, simply use your "sub-library" idea by using a saved search that finds those articles to restrict the library view.
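Outside calibre, the renaming asked about in the first post could be scripted. What follows is only a minimal sketch, assuming the saved menu page links to each article by its file name and that the link text is the desired title; `LinkCollector`, `safe_name` and `plan_renames` are hypothetical helper names, not part of calibre. It only builds a list of proposed renames and touches nothing on disk.

```python
import re
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse, unquote


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> tags, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())
            if title:
                self.links.append((self._href, title))
            self._href = None


def safe_name(title):
    """Replace characters that are awkward in file names."""
    return re.sub(r'[\\/:*?"<>|]+', "_", title).strip(" ._")


def plan_renames(index_html, folder):
    """Return (old_path, new_path) pairs for files in folder that the index links to."""
    parser = LinkCollector()
    parser.feed(index_html)
    plan = []
    for href, title in parser.links:
        # Only the last path segment matters: the files sit in one flat folder.
        basename = unquote(urlparse(href).path).rsplit("/", 1)[-1]
        old = Path(folder) / basename
        if basename and old.is_file():
            new_stem = safe_name(title) or old.stem  # keep old name if title is unusable
            plan.append((old, old.with_name(new_stem + old.suffix)))
    return plan
```

To use it, read the saved menu page into a string, call `plan_renames(html, folder)`, inspect the pairs, and only then call `old.rename(new)` on each. Two links with the same text would produce the same target name, so the plan is worth checking for duplicates first.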
#3 | |
Zealot
Thanks a lot!! Cheers, Farhad
#4 |
Zealot
Bummer :-(
My download tool (the DownThemAll! Firefox extension) does not reproduce the directory tree of the target site, so all my files are now in one directory, while the links in the main page point to the various subdirectories in which the files were originally stored. I could, of course, manually edit the 200+ hrefs and strip the offending parts one by one, but before I do that: do you know of a fast way to download all the files linked from one page AND preserve the remote directory structure? Thanks, Farhad
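The manual href edit described here could also be scripted. This is a minimal sketch, assuming every article link ends in .htm or .html and that the downloaded files kept their original base names in one flat directory; `flatten_hrefs` is a hypothetical helper name. It reduces each matching link to its bare file name and leaves all other links untouched.

```python
import re
from urllib.parse import urlparse

# Matches href="..." or href='...' (any letter case), capturing the link target.
HREF_RE = re.compile(r'(href\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def flatten_hrefs(html, extensions=(".htm", ".html")):
    """Rewrite links to article files so they point at the bare file name."""

    def repl(match):
        prefix, quote, target = match.groups()
        # Last path segment only; any query string or #fragment is dropped.
        name = urlparse(target).path.rsplit("/", 1)[-1]
        if name.lower().endswith(extensions):
            return f"{prefix}{quote}{name}{quote}"
        return match.group(0)  # leave other links (stylesheets, anchors, ...) alone

    return HREF_RE.sub(repl, html)
```

Running the saved main page through `flatten_hrefs` and writing the result next to the downloaded files should make the links resolve locally. If two articles in different remote subdirectories shared a file name, the download tool will have renamed one of them, and that link would still need fixing by hand.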
#5 |
Wizard
#6 |
Zealot
because most subdirs are different. I might as well do that by hand :-(
#7 |
Grand Sorcerer
Posts: 12,408
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Does calibre's web2disk (http://calibre-ebook.com/user_manual/cli/web2disk.html) do the job you want?
#8 |
Zealot
dunno, but I will gladly try and let you know. thanks!!
#9 |
Zealot
web2disk did not do the trick either. It created its own directory tree with names like link1, link2, etc.
I don't think I did something wrong (like missing a flag). Maybe I should try wget (not used it in 10 years LOL).
#10 | |
Wizard
#11 |
Zealot
yep - I will have to go to an editor (and yes, on my Linux box). thanks!
#12 |
Zealot
found the trick! here is how to do it (in case somebody else has this issue):
From the CLI, run wget -r --level=2 http://nameoftargetsite and then make sure that no non-htm/html files have been downloaded (I had a few zipped ones). Next, find the exact page which has the URLs of the files on the remote site. Point calibre to that very same page on the local machine and import it. Finally, convert the book into EPUB, FB2 or any other format :-) HTH!
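The check for stray non-htm/html files can be tedious by eye on a larger mirror. A minimal sketch of how it might be automated, assuming the mirror sits in a local directory (with wget -r this is typically a folder named after the host); `non_html_files` is a hypothetical helper name. It only lists the files and deletes nothing.

```python
from pathlib import Path


def non_html_files(root, keep=(".htm", ".html")):
    """Return files under root whose extension is not in keep, sorted by path."""
    # Files without any extension are listed too, since their suffix is "".
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() not in keep
    )
```

Calling `non_html_files("nameoftargetsite")` and printing the result gives a list to review before importing the index page into calibre.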