In Python I use itertools.combinations to get all unique combinations of 5 items:
from itertools import combinations
lst = ["a", "b", "c", "d","e", "f"]
combinations_of_5 = list(combinations(lst, 5))
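I then have a list of lists, which I have already filtered so that it only contains items from lst. Now I need to count the most common combinations found in those lists.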
lst_of_lsts = [
["e", "d", "c", "b", "a", "c", "d", "b"],
["e", "b", "c", "a", "b", "e", "c"],
["a", "b", "a", "d", "b", "a", "a"],
["a", "b", "c", "d", "e", "c", "d"],
["e", "d", "c", "b","a"]
]
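I thought this would be fairly simple, but I have been struggling to find a reliable way to do it and to be sure the results are accurate. My actual list of lists is much larger (a few million lists), and I want to know the 10 most common combinations found in it. My idea was to perhaps use a sliding window and check whether any of the combinations occurs inside it.
Any help is appreciated.
My attempt at the suggested solution: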
from itertools import combinations
from collections import Counter
lst = ["a", "b", "c", "d","e"]
lst_of_lsts = [
["e", "d", "c", "b", "a", "c", "d", "b"],
["e", "b", "c", "a", "b", "e", "c"],
["a", "b", "a", "d", "b", "a", "a"],
["a", "b", "c", "d", "e", "c", "d"],
["e", "d", "c", "b","a"]
]
combinations_of_5 = set(combinations(lst, 5))
data = map(tuple, lst_of_lsts)
c = Counter(comb for comb in data if comb in combinations_of_5)
print(c.most_common(10))
This returns [].
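One likely reason the attempt above comes back empty: combinations(lst, 5) yields tuples in the order of lst, while the rows are in arbitrary order and most of them are longer than 5 items, so tuple(row) is never a member of the set. The following minimal sketch illustrates this with a single row (the name row is introduced here for illustration only):

```python
from itertools import combinations

lst = ["a", "b", "c", "d", "e"]
combinations_of_5 = set(combinations(lst, 5))

row = ["e", "d", "c", "b", "a"]

# combinations() keeps the order of lst, so only this one tuple exists.
print(combinations_of_5)  # {('a', 'b', 'c', 'd', 'e')}

# The raw row is in a different order, so the membership test fails.
print(tuple(row) in combinations_of_5)  # False

# Sorting the row first makes the comparison independent of order.
print(tuple(sorted(row)) in combinations_of_5)  # True
```

Rows with more than 5 items still fail this test even after sorting, so they need to be deduplicated or split into windows before comparing.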
You can simply use a Counter for each list, i.e. [Counter(l) for l in lst_of_lsts]:
from collections import Counter
lst_of_lsts = [
["e", "d", "c", "b", "a", "c", "d", "b"],
["e", "b", "c", "a", "b", "e", "c"],
["a", "b", "a", "d", "b", "a", "a"],
["a", "b", "c", "d", "e", "c", "d"],
["e", "d", "c", "b", "a"]
]
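# One Counter per row: how often each item occurs in that row.
counters = [Counter(l) for l in lst_of_lsts]
# The most frequent item of each row, as an (item, count) pair.
res = [counter.most_common(1)[0] for counter in counters]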
print(counters)
print(res)
[Counter({'d': 2, 'c': 2, 'b': 2, 'e': 1, 'a': 1}), Counter({'e': 2, 'b': 2, 'c': 2, 'a': 1}), Counter({'a': 4, 'b': 2, 'd': 1}), Counter({'c': 2, 'd': 2, 'a': 1, 'b': 1, 'e': 1}), Counter({'e': 1, 'd': 1, 'c': 1, 'b': 1, 'a': 1})]
[('d', 2), ('e', 2), ('a', 4), ('c', 2), ('e', 1)]
from collections import Counter
lst_of_lsts = [
["e", "d", "c", "b", "a", "c", "d", "b"],
["e", "b", "c", "a", "b", "e", "c"],
["a", "b", "a", "d", "b", "a", "a"],
["a", "b", "c", "d", "e", "c", "d"],
["e", "d", "c", "b", "a"]
]
# Count whole rows, ignoring order but keeping duplicate items.
counter = Counter([tuple(sorted(l)) for l in lst_of_lsts])
print(counter.most_common(1)[0])
(('a', 'b', 'b', 'c', 'c', 'd', 'd', 'e'), 1)
from collections import Counter
lst_of_lsts = [
["e", "d", "c", "b", "a", "c", "d", "b"],
["e", "b", "c", "a", "b", "e", "c"],
["a", "b", "a", "d", "b", "a", "a"],
["a", "b", "c", "d", "e", "c", "d"],
["e", "d", "c", "b", "a"]
]
# Count rows by their set of distinct items, ignoring order and duplicates.
counter = Counter([tuple(sorted(set(l))) for l in lst_of_lsts])
print(counter)
print(counter.most_common(1)[0])
Counter({('a', 'b', 'c', 'd', 'e'): 3, ('a', 'b', 'c', 'e'): 1, ('a', 'b', 'd'): 1})
(('a', 'b', 'c', 'd', 'e'), 3)
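When lst has six items, as in the question, a row containing all six holds several 5-combinations at once. Assuming "a combination found in a list" means "a 5-item subset of the row's distinct items" (an interpretation, not something the question states), a sketch along these lines counts each of them. The function name count_contained_combinations and the sample rows are hypothetical:

```python
from collections import Counter
from itertools import combinations


def count_contained_combinations(rows, size=5):
    """Count every size-item combination of each row's distinct items."""
    counts = Counter()
    for row in rows:
        # Sort the distinct items so equal combinations map to the same tuple.
        distinct = sorted(set(row))
        # Yields nothing when the row has fewer than `size` distinct items.
        counts.update(combinations(distinct, size))
    return counts


rows = [
    ["a", "b", "c", "d", "e", "f"],  # six distinct items: six 5-combinations
    ["a", "b", "c", "d", "e", "a"],  # five distinct items: one 5-combination
]
counts = count_contained_combinations(rows)
print(counts.most_common(1))  # [(('a', 'b', 'c', 'd', 'e'), 2)]
```

Because each row is processed independently and only the Counter is kept, rows can be streamed from a file or generator, which may matter for a few million lists.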
import string
from collections import Counter
from itertools import combinations
lst = string.ascii_lowercase[0:6]
combinations_of_5 = tuple(combinations(lst, 5))
# print(combinations_of_5)
lst_of_lsts = [
["e", "d", "c", "b", "a", "c", "d", "b"],
["e", "b", "c", "a", "b", "e", "c"],
["a", "b", "a", "d", "b", "a", "a"],
["a", "b", "c", "d", "e", "c", "d"],
["e", "d", "c", "b", "a"]
]
# Keep only rows that consist of exactly 5 distinct items, then count them ignoring order.
res = Counter([tuple(sorted(l)) for l in lst_of_lsts if len(l) == len(set(l)) and len(l) == 5])
print(res)
Counter({('a', 'b', 'c', 'd', 'e'): 1})
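The sliding-window idea mentioned in the question could be sketched as follows. This assumes a combination only counts when 5 distinct items appear next to each other in a row, which is one possible reading of the question rather than a confirmed requirement. The function name count_windows is hypothetical:

```python
from collections import Counter


def count_windows(rows, size=5):
    """Count order-independent windows of `size` consecutive distinct items."""
    counts = Counter()
    for row in rows:
        # range() is empty when the row is shorter than the window.
        for start in range(len(row) - size + 1):
            window = row[start:start + size]
            # Skip windows that contain a repeated item.
            if len(set(window)) == size:
                counts[tuple(sorted(window))] += 1
    return counts


lst_of_lsts = [
    ["e", "d", "c", "b", "a", "c", "d", "b"],
    ["e", "b", "c", "a", "b", "e", "c"],
    ["a", "b", "a", "d", "b", "a", "a"],
    ["a", "b", "c", "d", "e", "c", "d"],
    ["e", "d", "c", "b", "a"],
]
print(count_windows(lst_of_lsts).most_common(10))
# [(('a', 'b', 'c', 'd', 'e'), 3)]
```

Unlike the length-based filter above, this also picks up the first and fourth rows, each of which contains one window of five distinct items.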