12-07-2014, 01:17 PM | #1 |
Enthusiast
Posts: 26
Karma: 10
Join Date: Jul 2010
Device: none
|
Best workflow for data->database->epub?
Hello,
I am planning a free epub (and perhaps a related website) on Japanese writing, with a lot of detailed information for each ideogram taken from various books. I would like to collect all the information in a consistent way, create a database, and then easily put the information together into an epub with a proper layout (and perhaps also a website), with an entry for each ideogram (perhaps also adding specific tags for the website).
Could you give me advice on how to do this while minimizing the copying and pasting of the same information, since I would like to dedicate most of my time to the research proper? Specifically, how can I construct a "simple" database (including pictures of the various versions of the ideograms) whose contents can be exported easily/directly into the epub, as "factsheets" for each ideogram (whether each one fits on one page depends on the reader)?
The layout I have in mind is quite standard (for East Asian languages); you can get an idea from these pictures:
http://myweb.facstaff.wwu.edu/yusa/basickanji.shtml
http://www.neilsattin.com/wp-content...anjiuni001.JPG (that's not me)
Basically a bigger picture on the left and smaller pictures on the right, the various readings of the ideogram, its meanings, and various compound words that use the ideogram, together with their meanings.
Thank you very much in advance for your kind advice.
Cheers,
Clemens
Last edited by clemens14; 12-07-2014 at 01:19 PM. |
12-09-2014, 12:45 PM | #2 |
Addict
Posts: 398
Karma: 96448
Join Date: Dec 2013
Device: iPad
|
Well, the most popular language today is PHP and the most popular content management system (CMS) is WordPress. However, what you're asking for is quite specific. Do you have any coding knowledge at all?
I recently created a project called Social DRM, where people can upload their epubs and have some information embedded in them. It is about 80% complete relative to the initial prototype and, as far as I can tell from your message, contains some of the code you're looking for. If you elaborate on what you need, maybe I can help you out. |
12-10-2014, 10:00 AM | #3 |
Enthusiast
Posts: 26
Karma: 10
Join Date: Jul 2010
Device: none
|
Hi,
thank you for your kind reply. Indeed, the situation is very specific. The main idea is that I want to go through various books, collecting data (both text and images) related to ideograms, and put them together in a coherent way. This is why I was thinking about a database, which doesn't necessarily have to be online. The idea is to have a well-defined input mask, so that the data is uniform.
Then, from the data inserted in the database, I would like to make a digital text that can be easily formatted (CSS?). I was thinking about epub, since I have worked with it quite a bit, although not with the latest standards, but other formats would be fine as well. With the same data, I would also like to make a website (this reinforces my idea that epub is perhaps the ideal format for the ebook).
I would like to accomplish this in the smoothest way possible, i.e. without having to insert the data and then copy and paste the same information around many times (for example once for the database, once when making the epub and once more for the website).
I know a little bit of coding, or rather, I can follow and reuse code that others have written, although my knowledge is very limited. I have used a bit of PHP (written by others) to batch-fix various epubs made with InDesign 5.5 (mainly footnote problems), since it didn't export too well to epub, and then fixed some more with Sigil, mainly by modifying the CSS files.
If you wish you can PM me, but it is also fine to keep the discussion public if you prefer.
Thank you again,
Clemens
p.s. Another option that was suggested to me was MultiMarkdown, making one "factsheet" for each ideogram considered, but I am a bit worried about the images and the internal references, so I thought a proper database might be more solid (I am talking about 5-6 thousand ideograms). |
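As a rough illustration of the "input mask" idea, here is a minimal sketch in Python using the built-in `sqlite3` module rather than any particular CMS. Everything in it is an assumption made for the example: the `KanjiEntry` class, the table and column names, the image path, and the sample values are all hypothetical. It only shows how one validation step can keep every record uniform before it reaches the database, with scanned images stored as file paths next to the text.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class KanjiEntry:
    """One factsheet. Field names are illustrative, not a standard."""
    character: str
    readings: list[str]
    meanings: list[str]
    image_paths: list[str] = field(default_factory=list)  # scanned calligraphy

    def validate(self) -> None:
        # The "input mask": reject records that would break uniformity.
        if len(self.character) != 1:
            raise ValueError("character must be a single code point")
        if not self.readings or not self.meanings:
            raise ValueError("at least one reading and one meaning required")


# Hypothetical schema: one row per ideogram, one row per scanned image.
SCHEMA = """
CREATE TABLE IF NOT EXISTS kanji (
    character TEXT PRIMARY KEY,
    readings  TEXT NOT NULL,
    meanings  TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS image (
    character TEXT NOT NULL REFERENCES kanji(character),
    path      TEXT NOT NULL
);
"""


def save(conn: sqlite3.Connection, entry: KanjiEntry) -> None:
    """Validate the entry, then store it and its image paths together."""
    entry.validate()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO kanji VALUES (?, ?, ?)",
            (entry.character, ";".join(entry.readings), ";".join(entry.meanings)),
        )
        conn.executemany(
            "INSERT INTO image VALUES (?, ?)",
            [(entry.character, path) for path in entry.image_paths],
        )


conn = sqlite3.connect(":memory:")  # use a file name to keep the data
conn.executescript(SCHEMA)
save(conn, KanjiEntry("傍", ["ボウ", "かたわら"], ["bystander", "side"],
                      ["img/bou_seal.png"]))
```

Because the epub and the website would both be generated from the same tables, each piece of information is typed in only once.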
12-13-2014, 06:08 PM | #4 |
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Hey, guys:
What about something in XML? Wouldn't that work for him? Put all the data into XML, and then use an XSLT to transform it? Or...can anyone think of a way to slap it into a DB, and export THAT into XML? From which he could then use an XSLT to get to ePUB? Thoughts, gang? Hitch |
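The "slap it into a DB, and export THAT into XML" route could look roughly like the sketch below. It is only an illustration under stated assumptions: the `kanji` table, its two columns, and the element names `kanjilist`, `entry`, `kanji`, and `meaning` are made up for the example. It uses Python's built-in `sqlite3` and `xml.etree.ElementTree` modules. The XSLT step itself is not shown, since the Python standard library has no XSLT processor; an external tool such as `xsltproc` or the `lxml` package would be needed for that part.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_xml(conn: sqlite3.Connection) -> str:
    """Dump every row of the (hypothetical) kanji table as one XML document."""
    root = ET.Element("kanjilist")
    rows = conn.execute("SELECT character, meaning FROM kanji ORDER BY character")
    for character, meaning in rows:
        entry = ET.SubElement(root, "entry")
        ET.SubElement(entry, "kanji").text = character
        ET.SubElement(entry, "meaning").text = meaning
    # encoding="unicode" returns a str and keeps the kanji as real characters
    return ET.tostring(root, encoding="unicode")


# Tiny in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kanji (character TEXT PRIMARY KEY, meaning TEXT)")
conn.executemany("INSERT INTO kanji VALUES (?, ?)", [("一", "one"), ("二", "two")])

xml_text = export_xml(conn)
print(xml_text)
```

The resulting XML would then be the single input for one stylesheet that produces the epub pages and another that produces the website pages.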
12-13-2014, 11:19 PM | #5 |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
For example, here is a site that I found that lists many kanji characters in UTF-8:
http://www.rikai.com/library/kanjita....unicode.shtml
Or here is a list of something similar to what you want in HTML (kanji, unicode codepoint, Henshall number, meanings):
http://www.aule-browser.com/kanji/he...y-unicode.html
Each Kanji on that site looked to be split like this:
Code:
<span class="kanji">傍</span> <span class="UCS"> 508D </span> <span class="kid">1815 </span> <span class="m1">bystander</span> <span class="mngs"> · side, besides, while, nearby, 3rd person</span><br />
Code:
<div class="whole">
  <p class="kanji">傍</p>
  <p class="altKanji">侀 侁 侂</p>
  <p class="mainmeaning">bystander</p>
  <p class="altmeaning">side, besides, while, nearby, 3rd person</p>
  <p class="thoroughexplanation">Blah blah blah, blah blah blah, this Kanji was used from the time period of ABCD-WXYZ.</p>
  <p class="thoroughexplanation">This is commonly used in business terms.</p>
  <p class="examplesentence">"This is an example sentence with this word."</p>
</div>
侀 侁 侂 侃 侄 侅 來 侇 侈 侉 侊 例 侌 侍 侎 侏
偐 偑 偒 偓 偔 偕 偖 偗 偘 偙 做 偛 停 偝 偞 偟
劐 劑 劒 劓 劔 劕 劖 劗 劘 劙 劚 力 劜 劝 办 功
I would probably split each Kanji into its own file, and do my own organizing/combining elsewhere. For example, if I then wanted to create a giant HTML file of all of the words dealing with "numbers", I would just be able to create an outside program, which would say: merge the HTML files for: 一 (one), 二 (two), 三 (three), 四 (four), 五 (five), 六 (six), 七 (seven), 八 (eight), 九 (nine), 十 (ten).
Quote:
If the entire book was just to match the look of the images linked in Post #1, I don't see a problem with merging individual HTML files together... If you wanted to do crazy cross-references + other madness... that might be a different story. Sadly, I don't know enough about Kanji, to know how exactly best this could be organized... all I know is UTF-8 codepoints. Are these books organized in some sort of "alphabetical" order? Or do they organize by themes (numbers, weather, business, etc. etc.)? |
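The "outside program" that merges per-kanji factsheets by theme could be sketched along these lines in Python. This is a minimal illustration, not a finished tool: the dictionary keys, the `render_factsheet` and `merge` helpers, and the sample entries are hypothetical, and only the class names (`whole`, `kanji`, `mainmeaning`, `examplesentence`) are taken from the markup in this post. Entries are held in memory here; in practice each could just as well come from its own file.

```python
from html import escape


def render_factsheet(entry: dict) -> str:
    """Render one kanji entry as a <div class="whole"> block."""
    parts = [
        '<div class="whole">',
        f'<p class="kanji">{escape(entry["kanji"])}</p>',
        f'<p class="mainmeaning">{escape(entry["mainmeaning"])}</p>',
    ]
    for sentence in entry.get("examples", []):
        parts.append(f'<p class="examplesentence">{escape(sentence)}</p>')
    parts.append("</div>")
    return "\n".join(parts)


def merge(entries: list[dict], wanted: list[str]) -> str:
    """Concatenate the factsheets for the requested kanji, in the given order."""
    by_kanji = {entry["kanji"]: entry for entry in entries}
    return "\n".join(render_factsheet(by_kanji[k]) for k in wanted)


# Hypothetical sample data standing in for the real collection.
entries = [
    {"kanji": "一", "mainmeaning": "one"},
    {"kanji": "二", "mainmeaning": "two", "examples": ["Example sentence."]},
    {"kanji": "傍", "mainmeaning": "bystander"},
]

numbers_page = merge(entries, ["一", "二"])
print(numbers_page)
```

The same `merge` call with a different list of kanji would give a page for another theme, so the grouping can change without touching the individual factsheets.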
12-14-2014, 05:08 PM | #6 |
Enthusiast
Posts: 26
Karma: 10
Join Date: Jul 2010
Device: none
|
Thank you for the replies and the links.
By images, I actually meant images: I would use the Unicode characters (and their variants), but I would also like to add scanned versions of some calligraphic styles. I didn't know about Droid Sans Fallback, but it looks like an interesting font to embed.
XML+XSLT sounds interesting, although I am not too familiar with it (I understand the concept, but have never worked with it directly). How should it be implemented (and specifically, with what software)? I am on Debian, but Windows 7 is also fine.
I chatted with Odedta about the problem (the need for a way to input the data, both text and images, in a coherent way and then produce both an epub and a website). He suggested a MySQL database plus a CMS (WordPress) that allows the needed data to be inserted (through some sort of form/input mask) and offers export capabilities through plugins, and he is kindly looking into that path. Any idea of what could be used in this regard?
Cheers,
Clemens |
12-14-2014, 10:02 PM | #7 |
Connoisseur
Posts: 68
Karma: 786508
Join Date: Aug 2014
Location: Great Lakes
Device: K4PC, PW2, HD7, calibre
|
I just stumbled on a couple of sites you might find useful while I was trying to interpret some characters in a different forum. The first interprets Unicode:
http://www.fileformat.info/info/unic...57fa/index.htm
The second includes meanings and the strokes used to write the character:
http://www.jisho.org/kanji/details/%E5%9F%BA
I thought they might be useful for reference. Good luck with your project. |
12-20-2014, 06:55 AM | #8 |
Enthusiast
Posts: 26
Karma: 10
Join Date: Jul 2010
Device: none
|
Thank you for the links
|