01-10-2017, 01:19 PM   #166
Teom@n
Enthusiast
Teom@n began at the beginning.
 
Posts: 44
Karma: 10
Join Date: Dec 2014
Location: Lyon
Device: Kindle PW3
Quote:
Originally Posted by Doitsu
That depends on your technical skills. If your MS Excel spreadsheet contains only two columns you could save it as a tab-delimited text file and process it with tab2opf.py.
If you're familiar with regular expressions, you could also convert the tab-delimited text file directly using a couple of regular expressions.
I have some computer skills, so I will do my best.

Thanks.

01-10-2017, 03:10 PM   #167
Teom@n
Quote:
Originally Posted by Doitsu
That depends on your technical skills. If your MS Excel spreadsheet contains only two columns you could save it as a tab-delimited text file and process it with tab2opf.py.
If you're familiar with regular expressions, you could also convert the tab-delimited text file directly using a couple of regular expressions.
Do you know of any script (Python, etc.) for converting huge HTML files to other formats?

01-10-2017, 03:14 PM   #168
Doitsu
Wizard
Doitsu ought to be getting tired of karma fortunes by now.
Posts: 3,953
Karma: 12058512
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by Teom@n
Do you know of any script (Python, etc.) for converting huge HTML files to other formats?
I usually use the Python bs4 library for HTML parsing.

01-10-2017, 04:50 PM   #169
Teom@n
Quote:
Originally Posted by Doitsu
I usually use the Python bs4 library for HTML parsing.
Ok, I will try it. Thanks.

01-22-2017, 03:37 PM   #170
Teom@n
Quote:
Originally Posted by Doitsu
I usually use the Python bs4 library for HTML parsing.
Hi,

Could you share the script or commands that you use for converting HTML to text? I have an HTML file (124 MB) that I extracted from a MOBI file. I have installed Python and Beautiful Soup.

Could you guide me?

Thanks

01-22-2017, 06:26 PM   #171
Doitsu
Quote:
Originally Posted by Teom@n
Could you share the script or commands that you use for converting HTML to text? I have an HTML file (124 MB) that I extracted from a MOBI file. I have installed Python and Beautiful Soup.
Unfortunately, I can't help you with that, because scripts and commands will vary depending on the exact input and output formats. However, BS4 is well documented. For example:

Code:
soup.get_text()
will strip all tags from an HTML file.
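
For anyone following along, here is a minimal sketch of how that call might be used in a complete script. It assumes the bs4 package is installed and uses Python's built-in "html.parser" backend. The helper name html_to_text and the file names in the comments are only illustrative, not something from a particular tool.

```python
from bs4 import BeautifulSoup


def html_to_text(html):
    """Parse an HTML string and return its text with all tags removed."""
    # "html.parser" ships with Python, so no extra parser has to be installed.
    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text()


sample = "<h1>apple</h1>\n<p>A round <b>fruit</b>.</p>"
print(html_to_text(sample))

# For a real file (hypothetical file names):
# with open("book.html", encoding="utf-8") as src:
#     text = html_to_text(src.read())
# with open("book.txt", "w", encoding="utf-8") as dst:
#     dst.write(text)
```

Note that this approach builds the whole document tree in memory, so a very large input file will probably need a lot of RAM and some patience.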

If you're not a Python programmer, you could also use a text editor with regular expressions support, e.g. Notepad++, to remove unwanted tags or convert them to a different format.
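
The same kind of regular expression can also be run from Python's standard re module instead of a text editor. The sketch below is only an illustration: the helper names remove_tag and rename_tag and the sample markup are made up, and regular expressions are a rough tool that suits simple, regular markup rather than arbitrary HTML.

```python
import re


def remove_tag(html, name):
    """Delete the opening and closing tags called `name`, keeping their contents."""
    pattern = re.compile(r"</?%s\b[^>]*>" % re.escape(name), re.IGNORECASE)
    return pattern.sub("", html)


def rename_tag(html, old, new):
    """Rename `old` tags to `new`, keeping any attributes and the contents."""
    pattern = re.compile(r"<(/?)%s\b([^>]*)>" % re.escape(old), re.IGNORECASE)
    return pattern.sub(lambda m: "<%s%s%s>" % (m.group(1), new, m.group(2)), html)


sample = '<p><font size="2">apple</font> <b>fruit</b></p>'
cleaned = remove_tag(sample, "font")       # drop unwanted <font> tags
cleaned = rename_tag(cleaned, "b", "strong")  # convert <b> to <strong>
print(cleaned)
```

In an editor's regex search-and-replace, the first pattern would correspond to searching for something like `</?font\b[^>]*>` and replacing it with nothing.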

01-23-2017, 05:51 AM   #172
Teom@n
Quote:
Originally Posted by Doitsu
Unfortunately, I can't help you with that, because scripts and commands will vary depending on the exact input and output formats. However, BS4 is well documented. For example:

Code:
soup.get_text()
will strip all tags from an HTML file.

If you're not a Python programmer, you could also use a text editor with regular expressions support, e.g. Notepad++, to remove unwanted tags or convert them to a different format.
Thanks mate.

Tags
ebook tools, kindle tools