02-10-2018, 07:12 AM   #6
BeckyEbook
Quote:
Originally Posted by Tex2002ans
Just wondering what the use-case is?

Are you trying to pull out all n-grams?

It's just an idea that's been on my mind.

Code:
I bought a new smart phone.
I have a very smart phone.
I check every pair of neighboring words, and if the two words joined together exist in the dictionary, I display the result.

In this case:
Code:
smart + phone = smartphone
(The first sentence should use "smartphone"; the second is fine as written, with the words kept separate.)
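
The pair check described above could be sketched in Python roughly like this. It is only an illustration under assumptions: `DICTIONARY` is a tiny hypothetical word list standing in for a real spelling dictionary, and `find_joinable_pairs` is a made-up helper name. Like the idea itself, it cannot tell which hits are real mistakes; it only lists candidates.

```python
import re

# Hypothetical stand-in for a real dictionary / word list.
DICTIONARY = {"smartphone", "notebook", "bookshelf"}

def find_joinable_pairs(text, dictionary=DICTIONARY):
    """Return (first, second, joined) for every pair of neighboring words
    whose concatenation exists in the dictionary."""
    words = re.findall(r"\w+", text)
    hits = []
    # zip(words, words[1:]) walks over each pair of adjacent words.
    for first, second in zip(words, words[1:]):
        joined = (first + second).lower()
        if joined in dictionary:
            hits.append((first, second, joined))
    return hits

for sentence in ("I bought a new smart phone.", "I have a very smart phone."):
    print(sentence, find_joinable_pairs(sentence))
```

Both sentences are reported, because the dictionary lookup alone knows nothing about context; deciding which one to join is still a manual step.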

Of course, EVERYTHING depends on the context, and I can't manage to "catch" that context correctly.

My dream is to get results using something close to:
Code:
(.{0,10})(?=(\b\w+\b[,;.\s]*\b\w+\b))(.{0,10})
The expected result:
Code:
ght a new smart phone.
ve a very smart phone.
(0-10 characters of "context" on either side of the two words.)
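
One possible way to get that kind of output without a single regex is to find the words with their positions and slice the context out of the text. This is a sketch, not a tested solution for any particular editor: `DICTIONARY` is a hypothetical one-word list, and `joinable_with_context` and its `width` parameter are names made up for the example.

```python
import re

# Hypothetical stand-in for a real dictionary / word list.
DICTIONARY = {"smartphone"}

def joinable_with_context(text, dictionary=DICTIONARY, width=10):
    """Return a snippet for each neighboring word pair whose concatenation
    is in the dictionary, with up to `width` characters of context on
    each side of the pair."""
    # finditer keeps match objects, so each word's start/end offsets are known.
    words = list(re.finditer(r"\w+", text))
    snippets = []
    for a, b in zip(words, words[1:]):
        if (a.group() + b.group()).lower() in dictionary:
            start = max(0, a.start() - width)
            end = min(len(text), b.end() + width)
            snippets.append(text[start:end])
    return snippets

for sentence in ("I bought a new smart phone.", "I have a very smart phone."):
    for snippet in joinable_with_context(sentence):
        print(snippet)
```

Each sentence is passed in separately here; on a longer text the context window would simply run into the neighboring sentence.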

I can then jump to the first sentence and manually join the words.