
[ Use OrderedDict or ordered list? (novice) ]

(Using Python 3.4.3) Here's what I want to do: I have a dictionary where the keys are strings and the values are the number of times each string occurs in a file. I need to output which string(s) occur with the greatest frequency, along with their frequencies (if there's a tie for the most frequent, output all of the tied strings).

I tried to use OrderedDict. I can create it fine, but I struggle to get it to output only the most frequently occurring strings. I can keep trying, but I'm not sure an OrderedDict is really what I should be using, since I'll never need the actual OrderedDict once I've determined and output the most frequent strings and their frequencies. A fellow student recommended an ordered list, but I don't see how I'd preserve the link between the keys and values as I currently have them.

Is OrderedDict the best tool to do what I'm looking for, or is there something else? If it is, is there a way to filter/slice (or equivalent) the OrderedDict?

Answer 1

You can simply use sorted with a suitable key function. In this case you can use operator.itemgetter(1), which sorts the items by their values. With reverse=True, the most frequent entries come first.

from operator import itemgetter

print(sorted(my_dict.items(), key=itemgetter(1), reverse=True))
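The sorted list above still contains every entry, so the tied leaders have to be picked out of it. A minimal sketch of one way to do that, assuming a non-empty dictionary (the sample data and the names ranked, top_count and most_frequent are made up for illustration):

```python
from operator import itemgetter

# Hypothetical sample data, not taken from the question.
my_dict = {"a": 8, "d": 3, "c": 8, "b": 2, "e": 2}

# Sort (key, count) pairs by count, highest first.
ranked = sorted(my_dict.items(), key=itemgetter(1), reverse=True)

# The first pair holds the highest count (assumes my_dict is not empty).
top_count = ranked[0][1]

# Keep every pair whose count ties with the highest one.
most_frequent = [(word, count) for word, count in ranked if count == top_count]
print(most_frequent)
```

The key/value link is preserved because each list element is a (key, count) tuple.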

Answer 2

This can be solved in two steps. First, sort your dictionary entries by their frequency so that the highest frequency comes first.

Second, use itertools.groupby to collect the entries that share a frequency. groupby only groups consecutive items with equal keys, which is why the list has to be sorted first. As you are only interested in the highest frequency, you stop after one iteration. For example:

from itertools import groupby
from operator import itemgetter

my_dict = {"a" : 8, "d" : 3, "c" : 8, "b" : 2, "e" : 2}

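# Sort by frequency (highest first), then group entries with equal frequency
for k, g in groupby(sorted(my_dict.items(), key=itemgetter(1), reverse=True), key=itemgetter(1)):
    print(list(g))
    break  # stop after the first group, which holds the highest frequency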

This would display:

[('a', 8), ('c', 8)]

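This is because a and c are tied for the highest frequency. On Python versions before 3.7, dict order is not guaranteed, so the tied entries may print in either order.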

If you remove the break statement, you would get the full list:

[('a', 8), ('c', 8)]
[('d', 3)]
[('b', 2), ('e', 2)]
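If the full ranking is never needed, sorting can be skipped altogether. The following is only a sketch of an alternative, not part of either answer: it finds the highest count with max() and then filters the plain dictionary in a single pass. The function name most_frequent is hypothetical, and the result is sorted by key only to make the output order predictable.

```python
def most_frequent(counts):
    """Return the (key, count) pairs that tie for the highest count."""
    if not counts:
        return []  # nothing to report for an empty dictionary
    top = max(counts.values())  # highest frequency
    # Keep every entry tied with the top; sort by key for a stable order.
    return sorted((key, n) for key, n in counts.items() if n == top)


my_dict = {"a": 8, "d": 3, "c": 8, "b": 2, "e": 2}
print(most_frequent(my_dict))
```

No OrderedDict is required here, since the answer is built directly from the ordinary dictionary.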