Old 05-01-2008, 02:04 PM   #1
46137
Member
Posts: 11
Karma: 40
Join Date: May 2008
Location: Lima, Peru
Device: Sony PRS 505
Reformatting untidy text files macro

I use the Sony Reader 505 and really don't like its PDF function. I actually use the reader a lot for educational PDFs (I'm a teacher). I also like to use Arial 16, fully justified, so as not to get a headache. Because of this I really have to get the PDFs into RTF.

Converting PDFs: I use Adobe, ABC PDF converter, and cut and paste. It all depends on which gives the best result for each document.

Cut and paste from HTML also gives these problems.

As you know, a PDF or HTML document converted to text is often very messy, with double returns at each line, page numbers etc. It makes editing a pain. I’ve been struggling with this for ages and have come up with a few techniques to make it easier. If I’m telling anybody anything they already know… sorry!!

You often need to use one, two or more of these, but I have found you can generally make a readable rtf/doc/plain text file with minimum fuss.

First… look at the document. This is really important. Look for patterns. Click the reveal formatting button (backwards P on the toolbar) and see what you have. Look for double, triple or multiple carriage returns and repeated formatting. Sometimes AutoFormat will sort it straight away, but if it doesn’t……

I nearly always do this first:

Replace <soft space> with <space>. This replaces each soft space with an ordinary space.

Replace <space><space> with <space>. This gets rid of double spaces. Keep hitting Replace All until no changes are made. When I first started I had lots of trouble with extra spaces, broken words, separated lines etc. Getting each word to be separated by a single space really helped.
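
For anyone who would rather clean the saved plain-text file outside Word, the two replacements above can be sketched in Python. This is only a rough sketch and not part of the original method: the function name is made up, and treating the soft space as the non-breaking space U+00A0 is an assumption.

```python
def collapse_spaces(text: str) -> str:
    """Turn soft spaces into ordinary ones, then squeeze runs of spaces."""
    # Assumption: the "soft space" is the non-breaking space U+00A0.
    text = text.replace("\u00a0", " ")
    # Same as hitting Replace All until no more changes are made.
    while "  " in text:
        text = text.replace("  ", " ")
    return text


if __name__ == "__main__":
    print(collapse_spaces("wrote  the\u00a0timetable    in two days"))
```

The loop matters because one pass over four spaces in a row still leaves two behind.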

The easiest documents to fix quickly are those where each paragraph ends with a double or triple return. Use find and replace to replace each double carriage return (a double carriage return is ^p^p) with any long random string that does not occur in the text. I use xxxxxxxxxx. If this is the case then it takes seconds to get a readable document.

If you have a mix of double and triple returns, it is easy to replace each single ^p with a double ^p^p and then replace 3 or 4 returns with xxxxxxxxxx.

When this is done, replace each remaining return with nothing. Finally replace xxxxxxxxxx with ^p^p.

This leaves each paragraph ending with a double return, which is how I like it!!
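
The placeholder trick above can be sketched in Python on a plain-text copy of the document. It is a rough illustration only: "\n" stands in for Word's ^p, the function name is hypothetical, and runs of three or more returns are simply squeezed down to two, which is a simplification of the doubling step described above.

```python
PLACEHOLDER = "xxxxxxxxxx"  # any long string that never occurs in the text


def rejoin_paragraphs(text: str) -> str:
    """Keep paragraph breaks (two or more returns), drop single returns."""
    # Simplification: squeeze runs of three or more returns down to two.
    while "\n\n\n" in text:
        text = text.replace("\n\n\n", "\n\n")
    text = text.replace("\n\n", PLACEHOLDER)  # park the paragraph breaks
    text = text.replace("\n", "")             # delete the line-end returns
    return text.replace(PLACEHOLDER, "\n\n")  # put the paragraph breaks back


if __name__ == "__main__":
    sample = "The cat sat \non the mat.\n\nA second \nparagraph.\n\n\nThird."
    print(rejoin_paragraphs(sample))
```

Single returns are replaced with nothing, so this only reads well when each broken line already ends in a space.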

If you use PDF converter from ABC then the page number is in blue. This is very easy to deal with using the find and replace formatting function. In Find and Replace hit More, then Format, then Font, and set the font colour to blue. Just replace anything blue with nothing. All page numbers gone.

This also works with subscript, superscript, underline, bold etc.

Often if you autoformat, words can join together, which makes spellchecking a pain. This happens because at the end of each line the carriage return sits right next to the final word without a <space>.

Example

wrote the timetable^p
in two days

becomes:

wrote the timetablein two days

Replace ^p with <space>^p for the whole document.

Then replace <space><space> with <space>
Finally replace ^p<space>^p with ^p^p. This will get rid of extra spaces between carriage returns
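
The same three replacements, sketched in Python for a plain-text copy. This is an illustration under assumptions: "\n" stands in for ^p and the function name is made up.

```python
def pad_returns(text: str) -> str:
    """Make sure every return has a space before it, without doubling spaces."""
    text = text.replace("\n", " \n")      # ^p -> <space>^p
    while "  " in text:
        text = text.replace("  ", " ")    # <space><space> -> <space>
    return text.replace("\n \n", "\n\n")  # ^p<space>^p -> ^p^p


if __name__ == "__main__":
    padded = pad_returns("wrote the timetable\nin two days")
    # With the space in place, deleting the return no longer joins the words.
    print(padded.replace("\n", ""))
```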

Use find and replace for headings you wish to remove, such as titles appearing on each page.

An example might be Book Name<space>1, Book Name<space>2, …… Book Name<space>27 etc.

Replace Book Name<space>1 with Book Name<space>. Hit Replace All until no more are found. Change 1 to 2 and repeat. Do this until you get to 0, then start again from 1 and keep going until no more changes are made. Finally replace Book Name<space> with nothing. This works for any repeated text followed by a number.

Remember any repeated string is your friend. It even works for page numbers on their own. Just remember to replace the formatting each time. Look for repeated patterns, so:
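
The digit-by-digit stripping above could be sketched like this in Python, again on a plain-text copy. The function name and the sample text are hypothetical. Like the manual method, it removes the header wherever it appears, so it assumes the header text does not occur in the body of the book.

```python
def strip_running_header(text: str, header: str) -> str:
    """Remove a repeated header such as 'Book Name 27' by eating its digits."""
    prefix = header + " "
    changed = True
    while changed:                      # keep going until a full round changes nothing
        changed = False
        for digit in "1234567890":
            target = prefix + digit
            if target in text:
                text = text.replace(target, prefix)  # drop one digit
                changed = True
    return text.replace(prefix, "")     # finally drop the bare header


if __name__ == "__main__":
    page = "Book Name 1\nChapter text\nBook Name 27\nmore text"
    print(strip_running_header(page, "Book Name"))
```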

^p
^p
123
^p

and

^p
^p
124
^p

etc etc are easy.
For all the page numbers just do multiple runs until no more changes are made. If the book has over a hundred pages you have to find and replace each digit at least 3 times.

Replace ^p1 with ^p: gives 23 and 24
Replace ^p2 with ^p: gives 3 and 4
Replace ^p3 with ^p: gives <nothing> and 4
Replace ^p4 with ^p: gives <nothing> and <nothing>
Etc etc

Keep going until you end up with the easy to change ^p^p^p^p, then just change it to ^p^p^p. This will give you your return between paragraphs.
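
Outside Word, a regular expression can stand in for the repeated digit passes. To be clear, this swaps in a different technique from the one described above: it removes only lines that consist of digits and nothing else, so a line of real text that happens to start with a number is left alone. The function name is made up and "\n" again stands in for ^p.

```python
import re


def drop_page_number_lines(text: str) -> str:
    """Delete lines that hold nothing but digits (assumed to be page numbers)."""
    return re.sub(r"^[0-9]+\n", "", text, flags=re.MULTILINE)


if __name__ == "__main__":
    sample = "end of page\n\n123\n\nnext page\n124\nlast"
    print(repr(drop_page_number_lines(sample)))
```

The returns that surrounded each number are left in place, so the extra blank lines still need squeezing afterwards, as in the manual method.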

For really stubborn documents I use this macro. First I find all headings and make sure that I have a triple carriage return after them. Example

Heading^p
^p
^p

I make sure that any words and carriage returns are separated by a space.

Macro

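Sub newcarr()
'
' newcarr Macro
' Macro recorded 26/04/2008 by SEC
'
' Runs a series of Replace All passes over the whole document.
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
' Pass 1: soft space -> ordinary space. As posted, both strings show as a
' plain space, so the soft-space character has probably been lost from .Text.
With Selection.Find
.Text = " "
.Replacement.Text = " "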
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
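' Pass 2: WARNING - as posted, this finds a plain space and deletes it, which
' would remove every space in the document. The original search character has
' probably been lost in posting, so check the attached file before running.
With Selection.Find
.Text = " "
.Replacement.Text = ""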
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Pass 3: double quote -> single quote (clears up speech marks).
With Selection.Find
.Text = """"
.Replacement.Text = "'"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
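' Pass 4: presumably double space -> single space (the double space shows as
' a single space in this post).
With Selection.Find
.Text = " "
.Replacement.Text = " "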
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Protect the triple returns placed after headings with a placeholder string.
With Selection.Find
.Text = "^p^p^p"
.Replacement.Text = "xxxxxxxxxx"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
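' Next six passes: remove a space sitting between punctuation (. ? ' ) : ;)
' and the return that follows it.
With Selection.Find
.Text = ". ^p"
.Replacement.Text = ".^p"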
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "? ^p"
.Replacement.Text = "?^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "' ^p"
.Replacement.Text = "'^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = ") ^p"
.Replacement.Text = ")^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = ": ^p"
.Replacement.Text = ":^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "; ^p"
.Replacement.Text = ";^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
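' Next six passes: punctuation followed by a return is treated as a paragraph
' end and swapped for a numbered placeholder (1xyz to 6xyz).
With Selection.Find
.Text = ".^p"
.Replacement.Text = "1xyz"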
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "?^p"
.Replacement.Text = "2xyz"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "'^p"
.Replacement.Text = "3xyz"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = ")^p"
.Replacement.Text = "4xyz"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = ":^p"
.Replacement.Text = "5xyz"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = ";^p"
.Replacement.Text = "6xyz"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Delete every remaining return, which joins the wrapped lines together.
With Selection.Find
.Text = "^p"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
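' Restore the heading returns. The six passes after this one turn each
' numbered placeholder back into its punctuation plus a double return.
With Selection.Find
.Text = "xxxxxxxxxx"
.Replacement.Text = "^p^p^p"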
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "1xyz"
.Replacement.Text = ".^p^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "2xyz"
.Replacement.Text = "?^p^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "3xyz"
.Replacement.Text = "'^p^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "4xyz"
.Replacement.Text = ")^p^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "5xyz"
.Replacement.Text = ":^p^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "6xyz"
.Replacement.Text = ";^p^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
End Sub
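Sub arial()
'
' arial Macro
' Macro recorded 26/04/2008 by SEC
'
' Note: despite the name, this only selects the whole document, resets the
' font colour to automatic and justifies the text. It does not set the font.
Selection.WholeStory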
Selection.Font.Color = wdColorAutomatic
Selection.ParagraphFormat.Alignment = wdAlignParagraphJustify
End Sub



This basically removes double spaces, removes extra spaces between punctuation and returns, and clears up speech marks.

I then replace any terminal punctuation that has a ^p after it with strings (the macro as posted covers . ? ' ) : ; but not !). I basically follow the rule that if the sentence ends at the end of the line then this is where a paragraph should end. Not always true, but it gives a readable document.

Finally I delete all remaining carriage returns, then replace the strings with the punctuation and a double return.
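
To make the order of the passes easier to follow, here is a rough Python sketch of the same logic working on plain text. It is not the macro itself: "\n" stands in for ^p, the function name is made up, the first step assumes the macro's fourth pass was meant to collapse double spaces, and the two passes whose search text is unclear in the post are left out.

```python
PUNCTUATION = ".?'):;"  # the six characters the macro tests, in macro order


def newcarr_sketch(text: str) -> str:
    """Rebuild paragraphs: a line ending in punctuation ends a paragraph."""
    text = text.replace("  ", " ")               # double space -> single (assumed)
    text = text.replace('"', "'")                # clear up speech marks
    text = text.replace("\n\n\n", "xxxxxxxxxx")  # protect heading returns
    for mark in PUNCTUATION:                     # ". ^p" -> ".^p" and so on
        text = text.replace(mark + " \n", mark + "\n")
    for n, mark in enumerate(PUNCTUATION, start=1):
        text = text.replace(mark + "\n", f"{n}xyz")    # mark paragraph ends
    text = text.replace("\n", "")                # join the wrapped lines
    text = text.replace("xxxxxxxxxx", "\n\n\n")  # restore heading returns
    for n, mark in enumerate(PUNCTUATION, start=1):
        text = text.replace(f"{n}xyz", mark + "\n\n")  # restore paragraph ends
    return text


if __name__ == "__main__":
    raw = 'Heading\n\n\nHe said "stop" and \nthen left. \nNext day came.\nThe end'
    print(newcarr_sketch(raw))
```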

I must stress that none of these work on their own and none of them always work, but by combining these techniques I am able to convert a text document made from a PDF much more quickly. The labour-intensive job of deleting each extra return used to take hours. In 95% of cases I can get a readable document in about 10 minutes for an average-size book, where doing it manually might take 2 or 3 hours (Yes we've all done it!!!). Clever use of find and replace really helps.

Oh... also always save as a plain text document before doing the cosmetic formatting, makes for a smaller file!!!

Hope it helps and please don't flame me if you already know all these!!!!!!

Old 05-01-2008, 07:11 PM   #2
Ervserver
Wizard
Posts: 2,624
Karma: 1008294
Join Date: Dec 2007
Location: Iowa, USA
Device: Nook Simple Touch
I used ABC PDF converter once and the output had a ? in place of every ". Never did figure out why.
Old 05-01-2008, 09:16 PM   #3
JSWolf
Resident Curmudgeon
Posts: 73,657
Karma: 127838196
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Is there actually any PDF converter that can convert a text-based PDF to some other format without error? Even Adobe Acrobat Pro makes plenty of mistakes.
Old 05-02-2008, 12:36 AM   #4
pilotbob
Grand Sorcerer
Posts: 19,832
Karma: 11844413
Join Date: Jan 2007
Location: Tampa, FL USA
Device: Kindle Touch
Quote:
Originally Posted by JSWolf
Is there actually any PDF converter that can convert a text-based PDF to some other format without error? Even Adobe Acrobat Pro makes plenty of mistakes.
The Able PDF extractor does a very good job. It creates a perfect .DOC or .RTF if you open them in Word... but on the reader not so much. But if you convert to .txt it is very good. Of course, it does text only!

BOb
Old 05-02-2008, 02:52 AM   #5
rambler
Junior Member
Posts: 3
Karma: 10
Join Date: Feb 2008
Location: UK
Device: None yet
I've also made a few quick macros to sort out text.

The first is intended to format text as italic when it has been formatted using underscores (ie: this is _italic_ text) - lots of text files seem to use this. Note that if there are an uneven number of underscores you'll get interesting results...

The second macro is to fix text that has extraneous carriage returns in it, as often happens, like this:

"this is one line of
text
but somehow we have a new carriage return in it..."

Note that you must first ensure you have edited the macro to indicate how many carriage returns are in the text (sometimes it will be two, but usually it's one).

Both macros are quick hacks, and can benefit from some tweaking, but work fine for my purposes. Here they are:

Sub FixBadText()

' THIS MACRO WILL REPLACE EXTRA LINE BREAKS IN TEXT WITH A SPACE

Dim sReplaceParas As String
' NOTE: change the value in the double quotes below to ^13^13 if there are two
' carriage returns in the document
sReplaceParas = "^13"

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
' need to use wildcards here
.Text = "[A-z]" & sReplaceParas & "[A-z]" ' ^13 is paragraph char
.Forward = True
.Wrap = wdFindContinue 'wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = True 'False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Replacement.Text = ""
End With
While Selection.Find.Execute
'Do something within the found text
' here i need to replace just the middle chars, ie the paragraph marks

Selection.TypeText (Selection.Characters.First & " " & Selection.Characters.Last)

Wend

'Now do the same but for commas!
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
' need to use wildcards here
.Text = "," & sReplaceParas & "[A-z]" ' ^13 is paragraph char
.Forward = True
.Wrap = wdFindContinue 'wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = True 'False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Replacement.Text = ""
End With
While Selection.Find.Execute
'Do something within the found text
' here i need to replace just the middle chars, ie the paragraph marks
'MsgBox "Value found: " & Selection.Characters.First & Selection.Characters.Last, vbCritical

Selection.TypeText (Selection.Characters.First & " " & Selection.Characters.Last)

Wend

'Now do the same but for hyphens!
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
' need to use wildcards here
.Text = "-" & sReplaceParas & "[A-z]" ' ^13 is paragraph char
.Forward = True
.Wrap = wdFindContinue 'wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = True 'False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Replacement.Text = ""
End With
While Selection.Find.Execute
'Do something within the found text
' here i need to replace just the middle chars, ie the paragraph marks
'MsgBox "Value found: " & Selection.Characters.First & Selection.Characters.Last, vbCritical

Selection.TypeText (Selection.Characters.First & " " & Selection.Characters.Last)

Wend

End Sub
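
For comparison, the same clean-up on a plain-text file can be sketched in Python. This is an illustration rather than a port of the macro: the function name and the `returns` parameter are made up, "\n" stands in for the paragraph mark, and the letter test is spelled out as [A-Za-z].

```python
import re


def fix_bad_text(text: str, returns: int = 1) -> str:
    """Replace stray line breaks after a letter, comma or hyphen with a space."""
    breaks = "\n" * returns  # use returns=2 if the text has doubled returns
    # The lookahead keeps the following letter available for the next match.
    pattern = "([A-Za-z,-])" + breaks + "(?=[A-Za-z])"
    return re.sub(pattern, r"\1 ", text)


if __name__ == "__main__":
    broken = "this is one line of\ntext\nbut somehow we have a new carriage return in it..."
    print(fix_bad_text(broken))
```

A break that follows a full stop is left alone, so genuine paragraph ends survive.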

---------------------------------------------------


Sub ChangeToItalics()

' THIS WILL REPLACE _some text_ into italics
' Note: if there is an uneven number of underscores, you will encounter problems!

Dim iBookMark As Integer
Dim lStart As Long
Dim lEnd As Long

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
' Plain search for a single underscore (no wildcards needed)
.Text = "_"
.Forward = True
.Wrap = wdFindContinue 'wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Replacement.Text = ""
End With

iBookMark = 0

While Selection.Find.Execute
' Here we replace the underscores with blank string, and ensure any text between
' them is formatted as italic

ActiveDocument.Bookmarks.Add "temp" & iBookMark

'Selection.MoveRight
Selection.MoveUntil "_"
ActiveDocument.Bookmarks.Add "temp" & (iBookMark + 1)

' Note that first char in story is 0, not 1...
lStart = ActiveDocument.Bookmarks("temp0").Start
lEnd = ActiveDocument.Bookmarks("temp1").Start

' Now make the first bookmark select the whole text between the two underscores
ActiveDocument.Bookmarks("temp0").Start = lStart + 1
ActiveDocument.Bookmarks("temp0").End = lEnd

' Now select the bookmark text
ActiveDocument.Bookmarks("temp0").Select
' And make it italic
Selection.ItalicRun

' Now delete the underscores
ActiveDocument.Bookmarks("temp0").Select
Selection.MoveLeft wdCharacter, 2
Selection.Delete
ActiveDocument.Bookmarks("temp1").Select
Selection.Delete


Wend

' Delete the bookmarks we created
Dim iCount As Integer
iCount = 0

Do While iCount <= 2
If ActiveDocument.Bookmarks.Exists("temp" & iCount) = True Then
ActiveDocument.Bookmarks("temp" & iCount).Delete
End If
iCount = iCount + 1
Loop

End Sub
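
A plain-text file cannot hold italics, so an equivalent outside Word has to write some kind of markup instead. The Python sketch below assumes HTML-style <i> tags as the output, which is purely an assumption for illustration, and the function name is made up. It also refuses to run on an uneven number of underscores rather than giving interesting results.

```python
import re


def underscores_to_italics(text: str) -> str:
    """Turn _some text_ into <i>some text</i> (HTML tags are an assumption)."""
    if text.count("_") % 2:
        raise ValueError("uneven number of underscores, fix the text first")
    return re.sub(r"_([^_]*)_", r"<i>\1</i>", text)


if __name__ == "__main__":
    print(underscores_to_italics("this is _italic_ text"))
```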
Old 05-02-2008, 09:29 AM   #6
RWood
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
I have had great results with ABC Amber PDF Converter and ABBYY PDF Transformer. I have also made extensive use of Stingo's Word Macro (found in the MobileRead Wiki.) Your macro deserves to be there too.

I have found that the quality of the conversion from PDF depends on the content of the PDF. If it is text based then the conversion is very good and the problems are limited to headers, footers, and other physical page artifacts. The only other real issue is the encoding scheme -- Unicode vs. 8 bit vs. ASCII etc. If it is an image inside the PDF then the output quality is directly related to the OCR properties.
Old 05-02-2008, 10:20 AM   #7
46137
I prefer ABC to Adobe itself. Adobe leaves lots of artefacts, while ABC's blue page numbers really help a lot.

Often if a character has a non-standard code then you will get problems such as it turning into ?.

The other thing you find is losing the fl and fi ligatures, so flag becomes .ag and fixing becomes .xing

I always save the output as an rtf or Word doc, then save as a txt to edit. However, always look at the raw rtf first. If there is a repeatable pattern for problems based on formatting then this is the time to solve it. This has the advantage of keeping the non-standard stuff. When you save as txt it then asks you to decide on substitute characters, page returns etc. You can experiment with substitutions and encoding, and often after a few tries you get better results.

For headers and footers don't forget you can find/replace on font size, font, position etc etc.

The ideal output will have headers/footers in a different size, page numbers as Page Number and any footnotes in subscript. This is then a doddle to clear up.
Old 05-02-2008, 10:22 AM   #8
JSWolf
46137, would you mind actually attaching the macro instead of posting it as text inside a message please?
Old 05-02-2008, 09:27 PM   #9
46137
Yer tiz
Attached Files
File Type: rar newcarr.rar (922 Bytes, 305 views)