After finding all combinations of 5 unique elements, how do I find which combination occurs most often in a list of lists?

Android Control, 3 months ago

In Python I use itertools combinations to get all the unique combinations of 5:

from itertools import combinations
lst = ["a", "b", "c", "d","e", "f"]
combinations_of_5 = list(combinations(lst, 5))

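Then I have a list of lists, which I have already filtered so that it only contains items from lst. Now I need to count the most popular combinations found in those lists.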

lst_of_lsts = [
               ["e", "d", "c", "b", "a", "c", "d", "b"],
               ["e", "b", "c", "a", "b", "e", "c"],
               ["a", "b", "a", "d", "b", "a", "a"],
               ["a", "b", "c", "d", "e", "c", "d"],
               ["e", "d", "c", "b","a"]
              ]

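I thought this would be fairly simple, but I have been struggling to find a reliable way to do it and to be sure the results are accurate. My actual list of lists is much larger (a few million lists), and I want to know the 10 most popular combinations found in it. My idea was to maybe use a sliding-window technique and check whether any of the combinations are present.

Any help is appreciated.

Attempt at the suggested solution: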

from itertools import combinations
from collections import Counter

lst = ["a", "b", "c", "d","e"]

lst_of_lsts = [
               ["e", "d", "c", "b", "a", "c", "d", "b"],
               ["e", "b", "c", "a", "b", "e", "c"],
               ["a", "b", "a", "d", "b", "a", "a"],
               ["a", "b", "c", "d", "e", "c", "d"],
               ["e", "d", "c", "b","a"]
              ]

combinations_of_5 = set(combinations(lst, 5))

data = map(tuple, lst_of_lsts)
c = Counter(comb for comb in data if comb in combinations_of_5)
print(c.most_common(10))

Returns []
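
A minimal sketch of why the attempt above probably comes back empty. `combinations(lst, 5)` yields tuples in the order of `lst`, and `comb in combinations_of_5` compares each whole row by exact tuple equality, so a row only matches if it has exactly 5 items in exactly that order. The names `row` and `long_row` are illustrative and not part of the original post.

```python
from itertools import combinations

lst = ["a", "b", "c", "d", "e"]
combinations_of_5 = set(combinations(lst, 5))

# With exactly 5 items there is a single combination, in the order of lst.
print(combinations_of_5)  # {('a', 'b', 'c', 'd', 'e')}

# A whole row is compared by exact tuple equality, so order and length matter.
row = ["e", "d", "c", "b", "a"]
print(tuple(row) in combinations_of_5)          # False: same items, different order
print(tuple(sorted(row)) in combinations_of_5)  # True: sorting normalizes the order

# Longer rows can never equal a 5-tuple, even after sorting.
long_row = ["e", "d", "c", "b", "a", "c", "d", "b"]
print(tuple(sorted(long_row)) in combinations_of_5)  # False: 8 items, not 5
```

So the membership test itself works; the rows simply need to be normalized (and, for longer rows, split into 5-item pieces) before they are compared.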

Latest replies
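  • @JohnGordon You are absolutely right. The actual list is 100 items; I just threw that together quickly. Updated, thanks!

  • @user24714692 I want the output to return the most common combinations found in the list of lists, along with the number of times each was found. It seems simple, but for some reason it is a bit tricky.

  • @user24714692 No, because a combination pattern should only contain unique characters, with no repeats. The pattern length is at most 5.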

  • First, store the combinations generated by itertools in a set, for fast membership testing.

       combinations_of_5 = set(combinations(lst, 5))
    

    Next, filter your lists into a collections.Counter:

       data = map(tuple, lst_of_lsts)
       c = Counter(comb for comb in data if comb in combinations_of_5)
       print(c.most_common(10))
    

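    That said, I think you may actually be looking for permutations rather than combinations.

  • For this, order does not matter; I just want to count how many times a pattern appears in the list of lists. This runs against a few million lists, so I need it to be as efficient as possible. Each letter represents an action, and I am trying to find the most common sequences of actions.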

  • You can simply use a Counter, like this: [Counter(l) for l in lst_of_lsts]

    Code 1

    from collections import Counter
    
    lst_of_lsts = [
        ["e", "d", "c", "b", "a", "c", "d", "b"],
        ["e", "b", "c", "a", "b", "e", "c"],
        ["a", "b", "a", "d", "b", "a", "a"],
        ["a", "b", "c", "d", "e", "c", "d"],
        ["e", "d", "c", "b", "a"]
    ]
    
    counters = [Counter(l) for l in lst_of_lsts]
    res = [counter.most_common(1)[0] for counter in counters]
    print(counters)
    print(res)
    
    
    

    Prints

    [Counter({'d': 2, 'c': 2, 'b': 2, 'e': 1, 'a': 1}), Counter({'e': 2, 'b': 2, 'c': 2, 'a': 1}), Counter({'a': 4, 'b': 2, 'd': 1}), Counter({'c': 2, 'd': 2, 'a': 1, 'b': 1, 'e': 1}), Counter({'e': 1, 'd': 1, 'c': 1, 'b': 1, 'a': 1})]
    [('d', 2), ('e', 2), ('a', 4), ('c', 2), ('e', 1)]
    

    Code 2

    from collections import Counter
    
    lst_of_lsts = [
        ["e", "d", "c", "b", "a", "c", "d", "b"],
        ["e", "b", "c", "a", "b", "e", "c"],
        ["a", "b", "a", "d", "b", "a", "a"],
        ["a", "b", "c", "d", "e", "c", "d"],
        ["e", "d", "c", "b", "a"]
    ]
    
    counter = Counter([tuple(sorted(l)) for l in lst_of_lsts])
    print(counter.most_common(1)[0])
    
    

    Prints

    (('a', 'b', 'b', 'c', 'c', 'd', 'd', 'e'), 1)
    

    Code 3

    from collections import Counter
    
    lst_of_lsts = [
        ["e", "d", "c", "b", "a", "c", "d", "b"],
        ["e", "b", "c", "a", "b", "e", "c"],
        ["a", "b", "a", "d", "b", "a", "a"],
        ["a", "b", "c", "d", "e", "c", "d"],
        ["e", "d", "c", "b", "a"]
    ]
    
    counter = Counter([tuple(sorted(set(l))) for l in lst_of_lsts])
    print(counter)
    print(counter.most_common(1)[0])
    

    Prints

    Counter({('a', 'b', 'c', 'd', 'e'): 3, ('a', 'b', 'c', 'e'): 1, ('a', 'b', 'd'): 1})
    (('a', 'b', 'c', 'd', 'e'), 3)
    

    Code 4

    import string
    from collections import Counter
    from itertools import combinations
    
    lst = string.ascii_lowercase[0:6]
    combinations_of_5 = tuple(combinations(lst, 5))
    # print(combinations_of_5)
    
    lst_of_lsts = [
        ["e", "d", "c", "b", "a", "c", "d", "b"],
        ["e", "b", "c", "a", "b", "e", "c"],
        ["a", "b", "a", "d", "b", "a", "a"],
        ["a", "b", "c", "d", "e", "c", "d"],
        ["e", "d", "c", "b", "a"]
    ]
    
    res = Counter([tuple(sorted(l)) for l in lst_of_lsts if len(l) == len(set(l)) and len(l) == 5])
    print(res)
    
    
    

    Prints

    Counter({('a', 'b', 'c', 'd', 'e'): 1})
    
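  • If I wanted to use permutations instead of combinations, would Code 4 be very different? Or do I just swap combinations for permutations from itertools?

  • @joelion2 No, you just replace it with tuple(permutations(lst, 5)). The problem is that I do not really know what you are trying to do. lst_of_lsts is defined by hand, and this is pretty much the core part: Counter([tuple(sorted(l)) for l in lst_of_lsts if len(l) == len(set(l)) and len(l) == 5]). Just ask a new question and clearly define the expected output using these code samples.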
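
The sliding-window idea from the question can be sketched roughly as follows. This is one possible reading of the requirement, not a confirmed answer from the thread: it assumes a "pattern" is a run of 5 consecutive items with no repeats. The helper `count_windows` and its `ordered` flag are hypothetical names introduced here; `ordered=False` treats a window like a combination (order ignored), while `ordered=True` treats it like a permutation (order kept).

```python
from collections import Counter

def count_windows(rows, size=5, ordered=False):
    """Count windows of `size` consecutive, all-distinct items across rows.

    Hypothetical helper: with ordered=False a window is keyed by its sorted
    items (combination-like); with ordered=True the original order is kept
    (permutation-like).
    """
    counts = Counter()
    for row in rows:
        # Rows shorter than `size` produce an empty range and are skipped.
        for start in range(len(row) - size + 1):
            window = row[start:start + size]
            if len(set(window)) != size:
                continue  # skip windows that repeat an item
            key = tuple(window) if ordered else tuple(sorted(window))
            counts[key] += 1
    return counts

lst_of_lsts = [
    ["e", "d", "c", "b", "a", "c", "d", "b"],
    ["e", "b", "c", "a", "b", "e", "c"],
    ["a", "b", "a", "d", "b", "a", "a"],
    ["a", "b", "c", "d", "e", "c", "d"],
    ["e", "d", "c", "b", "a"],
]

unordered = count_windows(lst_of_lsts)
ordered = count_windows(lst_of_lsts, ordered=True)

print(unordered.most_common(10))  # [(('a', 'b', 'c', 'd', 'e'), 3)]
print(ordered.most_common(10))    # [(('e', 'd', 'c', 'b', 'a'), 2), (('a', 'b', 'c', 'd', 'e'), 1)]
```

Because `rows` is only iterated once, it can be a generator that streams the few million lists instead of holding them all in memory. The work is roughly proportional to the total number of items times the window size, and no combinations need to be generated up front.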
