
[ regex and bigrams ]

I have a list of bigrams:

['only wish', 'please fix', 'please add', 'only request', 'just hope']

and a list of strings:

['this is a wonderful utility. My only wish is to get a new sync feature.', 'Does not work well, please fix the problem.', 'Great, works fine. just hope they keep adding new utilities.', 'My only request is they add a new ui']

I need to search for these bigrams in the list of strings (assume I can handle upper/lower case myself). I am not sure whether regexes are the best way to look them up. Any help will be appreciated.

Answer 1


Here's one way to do it without regex:

bigrams = ['only wish', 'please fix', 'please add', 'only request', 'just hope']
text = ['this is a wonderful utility. My only wish is to get a new sync feature.', 'Does not work well, please fix the problem.', 'Great, works fine. just hope they keep adding new utilities.', 'My only request is they add a new ui']

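for string in text:
    lowered = string.lower()  # lowercase once so the match is case-insensitive
    for bigram in bigrams:
        # plain substring test; the bigrams are already lowercase
        if bigram in lowered:
            print(bigram + ' in ' + string)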
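
Note that the substring test above also matches inside longer words, so 'only wish' would be found in 'only wishes'. If that matters, a regex with word boundaries is one option. The following is only a minimal sketch of that idea, not part of the approach above; the names pattern and find_bigrams are made up for illustration:

```python
import re

bigrams = ['only wish', 'please fix', 'please add', 'only request', 'just hope']
text = ['this is a wonderful utility. My only wish is to get a new sync feature.',
        'Does not work well, please fix the problem.',
        'Great, works fine. just hope they keep adding new utilities.',
        'My only request is they add a new ui']

# One alternation over all bigrams. re.escape guards against regex
# metacharacters, and \b requires a word boundary on both sides.
pattern = re.compile(
    r'\b(?:' + '|'.join(re.escape(b) for b in bigrams) + r')\b',
    re.IGNORECASE,
)

def find_bigrams(strings):
    """Return a (bigram, string) pair for every bigram occurrence found."""
    matches = []
    for s in strings:
        for m in pattern.finditer(s):
            # lowercase the hit so it compares equal to the bigram list entry
            matches.append((m.group(0).lower(), s))
    return matches

for bigram, s in find_bigrams(text):
    print(bigram + ' in ' + s)
```

For a handful of bigrams the plain `in` test is simpler and likely just as fast. The regex version is mainly useful when whole-word matching is needed or when the case handling should live in the pattern itself.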