[ comparing dynamic number of lists (non-equal length) for common entries in Python ]
I'm trying to compare lists of different number and length that have been generated dynamically by user input and pattern matching. I haven't included all the matching code, but you should get the idea of what I'm trying to do.
Following suggestions from another Stack Overflow post, I've used a 'list of lists'. I've used the number of queries inputted by the user to name lists and access them.
At the end of the program I want to do some comparison between the lists, but I can't get my head around how to do this. To start, I'd just like to compare list elements and find those that match in all of the lists, however I'd also like to perform other list comparisons at a later date. I just can't figure out how to access individual lists once I'm outside of the 'for query in dom_queries' loop.
I'm super stuck and would really appreciate some help!
Thanks,
# set dom_count and initialise query_list
dom_count = 0
dom_queries = []
# get the number of query domains
domain_number = raw_input('How many domains do you want to find intersects for? ')
# Grab query ID's
while dom_count < int(domain_number):
    dom_count += 1
    query_domain = raw_input('domain ID query ' + str(dom_count) + ': ')
    dom_queries.append(query_domain)
# initialise lists for query_matches
list_of_lists = []
for i in range(len(dom_queries)):
    list_of_lists.append([])
list_pos = 0
# do some matching here for each dom_query, incrementing list position for each query
# and put matches into the list
for query in dom_queries:
    some_match = re.search(r'XYZ', some_line)
    list_of_lists[list_pos].append(some_match.group())
    list_pos += 1
# HERE IS WHERE I'M STUCK!!!
# I would like to compare all list's generated and find list entries
# that exist in each list (can be any number of lists with different lengths).
for i in range (len(dom_queries)):
    common = list(set(list_of_lists[i] & .... \/^.^\/ ??
Answer 1
From all your lists you can create one set containing the items that are present in every list by using the intersection() method. You'll have to convert the lists to sets first. Passing several sets to a single intersection() call works starting with Python 2.6.
http://docs.python.org/2/library/stdtypes.html#set.intersection
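As a minimal sketch of that idea (the sample data and the helper name common_entries are made up for illustration, not taken from the question), converting each list to a set and passing them all to one intersection() call could look like this:

```python
# Hypothetical sample data standing in for the question's list_of_lists.
list_of_lists = [
    ["PF001", "PF002", "PF003"],
    ["PF002", "PF003"],
    ["PF003", "PF002", "PF009", "PF010"],
]


def common_entries(lists):
    """Return the set of items that occur in every list in `lists`."""
    if not lists:
        return set()  # nothing to intersect when no lists were given
    sets = [set(lst) for lst in lists]
    # intersection() accepts any number of other sets (Python 2.6+)
    return sets[0].intersection(*sets[1:])


common = common_entries(list_of_lists)
print(sorted(common))  # ['PF002', 'PF003']
```

Because the lists are unpacked with *, this works for any number of lists of any length, which is the dynamic case described in the question.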
Answer 2
First, just a simplification. You can use a list comprehension to create the empty list of lists (just a bit more Pythonic). Also, let's make it a list of sets instead of a list of lists.
list_of_sets = [set() for i in range(int(domain_number))]
Then we can do something like this:
common_set = set(list_of_sets[0]) if list_of_sets else set()
for s in list_of_sets[1:]:
    common_set.intersection_update(s)
So, you start with a copy of the first set and then, for each of the remaining sets in the list, you keep only the items it shares with the running result (intersection: all the shared items between the two). intersection_update changes common_set in place and returns None, so don't assign its result back to common_set. Merging pairwise intersections with update would not give the right answer here: it would collect items shared by any two neighbouring sets rather than items present in every set. Later, if you want to manually add an item to the common set, you would use the add method.
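Here is a small worked sketch on made-up data (the set contents and the names others and only_in_first are assumptions for illustration). It finds the entries present in every set with a running intersection, then shows that each set is still reachable by index after the loop, which is what you need for other comparisons later:

```python
# Hypothetical sample data: one set of matches per query.
list_of_sets = [
    {"a", "b", "c", "d"},
    {"b", "c", "d"},
    {"c", "d", "e"},
]

# Running intersection: copy the first set, then narrow it with each other set.
common_set = set(list_of_sets[0]) if list_of_sets else set()
for s in list_of_sets[1:]:
    common_set.intersection_update(s)  # modifies common_set in place

print(sorted(common_set))  # ['c', 'd']

# Individual sets remain accessible by index outside the loop,
# e.g. items that only the first query matched:
others = set().union(*list_of_sets[1:])
only_in_first = list_of_sets[0] - others
print(sorted(only_in_first))  # ['a']

# Manually add a single item to the common set.
common_set.add("z")
```

Copying the first set before the loop matters: intersecting in place on list_of_sets[0] itself would destroy the original matches for that query.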