09-12-2019, 03:36 PM   #1
kboogie222
Join Date: Aug 2019
Location: New Jersey
Device: Kindle Oasis 2
Regex to count line wraps?

I'm finding that a lot of files converted from PDF have line wrap issues: tons of line breaks in the middle of sentences.

The number of paragraphs that start with a lowercase letter would be a great indicator of PDF conversion linewrap issues.

Is it possible to create a regex that counts those occurrences and saves the count in a column?

This would be a great measure of quality. Perhaps even the ratio of lower/uppercase paragraph starts.
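To make the idea concrete, here is a minimal sketch of the counting step in Python. It assumes the converted book text is HTML in which each paragraph is a `<p>` element. The names `PARA_START`, `count_paragraph_starts`, and `lowercase_ratio` are hypothetical, made up for illustration. Saving the result in a column is not covered here.

```python
import re

# Assumption: paragraphs are <p ...> elements. The pattern skips whitespace and
# any inline tags (<i>, <span>, ...) that open the paragraph, then captures the
# first character only if it is a letter.
PARA_START = re.compile(r"<p\b[^>]*>\s*(?:<[^>]+>\s*)*([^\W\d_])")

def count_paragraph_starts(html):
    """Return (lowercase_starts, uppercase_starts) for the given HTML text."""
    lower = upper = 0
    for match in PARA_START.finditer(html):
        first = match.group(1)
        if first.islower():
            lower += 1
        elif first.isupper():
            upper += 1
    return lower, upper

def lowercase_ratio(html):
    """Share of letter-initial paragraphs that start with a lowercase letter."""
    lower, upper = count_paragraph_starts(html)
    total = lower + upper
    return lower / total if total else 0.0

sample = (
    '<p>The quick brown fox</p>\n'
    '<p>jumped over the</p>\n'
    '<p class="x"><i>lazy</i> dog.</p>\n'
    '<p>"Quoted start."</p>\n'
)
print(count_paragraph_starts(sample))
print(lowercase_ratio(sample))
```

Paragraphs that open with a quotation mark, digit, or other non-letter are ignored by both counts, so the ratio only reflects paragraphs whose first character is a letter.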

Please help