08-14-2016, 04:30 AM   #21
Tex2002ans
 
Every time this gets brought up, I always bust out the same few Regex:

Regex #1

I use this one to catch a hyphen at the end of a paragraph:

Search: -</p>\s+<p>
Replace: (leave empty)

Never do a "Replace All". Look at each of these manually and decide whether it is a hard hyphen (keep it, as in "non-alignment") or a soft hyphen (remove it, as in "establishing").

Code:
<p>"This is an example of wrong non-</p>

<p>alignment."</p>

<p>This is an example of estab-</p>

<p>lishing the sentence.</p>
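
For anyone who would rather script this than click through an editor, here is a minimal Python sketch of the same idea. The function name `join_hyphen_breaks` and the `keep_hyphen` callback are hypothetical, made up for this example; the callback stands in for the manual hard-or-soft decision on each match, so nothing is replaced blindly.

```python
import re

# Same pattern as Regex #1: a hyphen right before a paragraph break.
HYPHEN_BREAK = re.compile(r"-</p>\s+<p>")

def join_hyphen_breaks(html, keep_hyphen):
    """Join paragraphs split at a hyphen, one decision per match.

    keep_hyphen(left, right) is a hypothetical callback: it gets the text
    fragments on both sides of the break and returns True for a hard
    hyphen (keep the "-") or False for a soft hyphen (drop it).
    """
    def decide(match):
        # Fragment just before the hyphen, and just after the new <p>.
        left = re.search(r"[^\s>]*$", html[:match.start()]).group()
        right = re.match(r"[^\s<]*", html[match.end():]).group()
        return "-" if keep_hyphen(left, right) else ""
    return HYPHEN_BREAK.sub(decide, html)

sample = (
    '<p>"This is an example of wrong non-</p>\n\n<p>alignment."</p>\n\n'
    "<p>This is an example of estab-</p>\n\n<p>lishing the sentence.</p>"
)

# Stand-in for a human reviewer: only "non-" is treated as a hard hyphen.
print(join_hyphen_breaks(sample, lambda left, right: left == "non"))
```

In real use the callback would show the two fragments and ask, which keeps the "look at each one" rule intact.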
Regex #2

This one catches any paragraph that ends in a character NOT listed inside the square brackets.

In this case, any paragraph that ends WITHOUT a "greater-than sign" (most likely the end of an HTML tag), a "right double quote", a "question mark", an "exclamation point", or a "period". (These are valid paragraph endings for the most part).

Search: ([^>”\?\!\.])</p>\s+<p>
Replace: \1

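Note: Make sure there is a space after "\1" in the Replace field. Without it, the last word of one paragraph runs straight into the first word of the next.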

Code:
<p>This is an example</p>

<p>sentence.</p>

<p>“This is an example of</p>

<p>dialogue.”</p>

<p>This is a long, very long,</p>

<p>very very long example.</p>

<p>And this is an example of one of Tex's (amazing)</p>

<p>Side Notes.</p>
Side Note: Colons are a different beast. Depending on the book, they can be valid paragraph endings or not. Create a separate Search to handle those individually.
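
As a rough illustration, Regex #2 and its replacement translate to Python's `re` module like this. The helper name `merge_broken_paragraphs` and its `valid_endings` parameter are assumptions made up for the sketch; the parameter is just a convenient way to try the colon case both ways.

```python
import re

def merge_broken_paragraphs(html, valid_endings=">”?!."):
    """Merge a paragraph into the next one unless it ends in a valid character."""
    # Build ([^...])</p>\s+<p> from the characters allowed to end a paragraph.
    char_class = "".join(re.escape(ch) for ch in valid_endings)
    pattern = re.compile(r"([^" + char_class + r"])</p>\s+<p>")
    # Keep the captured character and add the single joining space.
    return pattern.sub(r"\1 ", html)

sample = (
    "<p>This is an example</p>\n\n<p>sentence.</p>\n\n"
    "<p>This is a long, very long,</p>\n\n<p>very very long example.</p>"
)
print(merge_broken_paragraphs(sample))

# Colons: merged by default, left alone once ":" counts as a valid ending.
colon = "<p>He picked three out of the hat:</p>\n\n<p>one, two, three.</p>"
print(merge_broken_paragraphs(colon))
print(merge_broken_paragraphs(colon, valid_endings=">”?!.:"))
```

This runs the replacement on every match at once, so it is best tried on a copy and checked afterwards.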

Regex #3

Typically a paragraph does not begin with a lowercase letter... so this one catches most stragglers.

Search: <p>[a-z]

Code:
<p>He picked three out of the hat:</p>

<p>one, two, three.</p>
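
Since Regex #3 is search-only, a script version would just report the hits for manual review. A small Python sketch, where `find_lowercase_starts` is a hypothetical helper name and the file is assumed to have one paragraph per line:

```python
import re

# Same pattern as Regex #3. Python's re is case-sensitive by default,
# so [a-z] really does mean lowercase only.
LOWERCASE_START = re.compile(r"<p>[a-z]")

def find_lowercase_starts(html):
    """Return (line number, line) for every paragraph starting in lowercase."""
    hits = []
    for number, line in enumerate(html.splitlines(), start=1):
        if LOWERCASE_START.search(line):
            hits.append((number, line.strip()))
    return hits

sample = "<p>He picked three out of the hat:</p>\n\n<p>one, two, three.</p>"
for number, line in find_lowercase_starts(sample):
    print(number, line)  # prints: 3 <p>one, two, three.</p>
```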
Regex #4

This one typically occurs with dialogue... a quotation that gets split from its attribution.

Search: ,”</p>

Code:
<p>“Hey! Get back over here,”</p>

<p>Tex said. “You are a buffoon.”</p>
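
To make these easier to review in bulk, one option is to list what follows each hit. The Python sketch below extends the search pattern so it also captures the start of the next paragraph; that extension and the name `split_dialogue_candidates` are assumptions for illustration, not part of the search above.

```python
import re

# Regex #4 plus a capture of the text that opens the following paragraph.
SPLIT_DIALOGUE = re.compile(r",”</p>\s+<p>([^<]*)")

def split_dialogue_candidates(html):
    """Return the text that follows each paragraph ending in ,” for review."""
    return [match.group(1) for match in SPLIT_DIALOGUE.finditer(html)]

sample = (
    "<p>“Hey! Get back over here,”</p>\n\n"
    "<p>Tex said. “You are a buffoon.”</p>"
)
print(split_dialogue_candidates(sample))
```

If the listed text reads like an attribution ("Tex said."), the two paragraphs most likely belong together.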
Those Regex should cover most cases. The rest has to be done by thorough manual checking.

Quote:
Originally Posted by Hitch
BUT, I gotta comment on this.

I have wondered if there's some way for me to donate my time for subtitling. Mr. Hitch needs the subtitles for all the UK and Aussie stuff that we watch, and the subtitles/closed captioning are simply DREADFUL. I mean, dreadful. I don't know how the hell anyone can manage, if they can't fill in the blanks through hearing. It's unbelievable.

It's not any better for US TV; it's pretty much as awful/worse. I know that some of the services are PAID, so that's the worst part. If, like DP and PG, it was all donated time, okay...I could wince and ignore it, but a commercial service? Appalling.
I remember running across an article talking about this:

https://www.viki.com/

Or I remember running across volunteer subtitling for YouTube videos:

https://amara.org/en/

I swear I ran across another one that was similar to Distributed Proofreaders... but I can't seem to find the article/interview in a quick search. I remember the woman started it because her husband was deaf, and the CC + automated subtitles were complete crap. I can't remember for the life of me where I even read the article though. Maybe it was years ago when I was looking into Transcription.

Side Note: Or you could always join the "dark side" and join the Fansubbing community.

Quote:
Originally Posted by Hitch
It's annoying as hell. I mean, fine, PBS, I get it, you want to keep your access and all that good crap. So then, why NOT offer the "unedited" versions at 2:00 a.m., or something, so that those of us who don't want to miss plot points don't? Is that too much to ask? I forget which show it was, in the past...year or two? But something happened, that made NO sense. I even thought, "oh, I must not have been paying attention," (as I'm frequently also reading at the same time). When I went back and watched it? Ixnay. They'd chopped out the plot point, which resolved something. I was not a happy camper.
Reminds me of them cutting minutes off of shows... and to top it off, speeding the episodes up by a few percent:

https://www.techdirt.com/articles/20...ery-hour.shtml

Stuff like this is partially why I don't watch TV any more... it is absolutely unbearable.

Last edited by Tex2002ans; 08-14-2016 at 04:44 AM.