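A few things on the algorithm side can help you...
If you stick with your existing code, you should make filtered_words a set.
You are doing 3.6M lookups against a list of 80K elements, and each of those lookups scans the list, so it will be slow. If that fails, you do another 3M+ lookups for P(10,9), and so on.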
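To get a feel for the list-versus-set difference, here is a minimal sketch. The 80K entries are synthetic placeholders rather than your real word file, and the timings will vary by machine.

```python
import timeit

# Synthetic stand-in for the ~80K-word vocabulary (placeholder data, not the real file)
words_list = [f"word{i}" for i in range(80_000)]
words_set = set(words_list)

# A miss is the worst case for a list: every element has to be compared
probe = "not-a-word"

list_time = timeit.timeit(lambda: probe in words_list, number=200)
set_time = timeit.timeit(lambda: probe in words_set, number=200)

print(f"list: {list_time:.4f}s for 200 lookups")
print(f"set:  {set_time:.6f}s for 200 lookups")
```

Both containers give the same answer; only the cost per lookup changes, because a set hashes the key instead of scanning.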
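Consider doing some "work" on the words beforehand (as shown below). If you sort the letters of each of the ~80K candidate words (which is very fast for a list this small), you can avoid permuting the sample letters altogether. Because you can take combinations and sort them, you only need about a thousand lookups at most: sum(C(10,10) = 1, C(10,9) = 10, C(10,8) = 45, ...) = 1,023, which is trivial compared with the sum of the permutations.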
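As a quick check of that arithmetic, this small sketch counts both search spaces for 10 letters. It assumes Python 3.8 or later for math.comb and math.perm, and the variable names are only illustrative.

```python
from math import comb, perm

n = 10  # number of letters in the random sample

# Sorted-key approach: one lookup per non-empty subset of the letters
subset_lookups = sum(comb(n, k) for k in range(1, n + 1))

# Permutation approach: one lookup per ordered arrangement of every length
permutation_lookups = sum(perm(n, k) for k in range(1, n + 1))

print(subset_lookups)       # 1023
print(permutation_lookups)  # 9864100
```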
This modification of your code runs in under a second:
import random  # To build the random string given to the player
from collections import defaultdict
from itertools import combinations  # For drawing letter subsets of each length

# Define the file path
file_path = 'palabras.txt'

# Accented letters and the unaccented letters that replace them
accented_to_unaccented = {
    'é': 'e', 'í': 'i', 'ú': 'u', 'á': 'a', 'ó': 'o', 'ü': 'u', 'è': 'e'
}

# Map each sorted-letter key to the list of words made of exactly those letters
filtered_words = defaultdict(list)

# Replace every accented letter in a word with its unaccented counterpart
def replace_accented_letters(word):
    return ''.join(accented_to_unaccented.get(char, char) for char in word)

# Open and read the file
with open(file_path, 'r', encoding='utf-8') as file:
    # Read each line (word) in the file
    for line in file:
        # Strip any leading/trailing whitespace and force lowercase
        word = line.strip().lower()
        # Check the length after stripping, so the newline is not counted
        if not word or len(word) > 10:
            continue
        word = replace_accented_letters(word)  # Call function to replace accents
        filtered_words[''.join(sorted(word))].append(word)

def find_longest_word(letters, word_list: dict):
    # Iterate over lengths from len(letters) down to 1
    for length in range(len(letters), 0, -1):
        # Try every subset of this length; its sorted letters are the lookup key
        for sample in combinations(letters, length):
            sorted_letters = ''.join(sorted(sample))
            matches = word_list.get(sorted_letters)
            if matches:
                return matches
    return None

random_string = ''.join(random.sample('abcdefghijklmnopqrstuvwxyz', 10))

# Use the longest word function
longest_word = find_longest_word(random_string, filtered_words)
print(f"La palabra mas larga de las letras random '{random_string}' posible era: {longest_word}")
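If you want to try the idea without the word file, here is a small self-contained sketch of the same sorted-key lookup. The six-word list and the helper names build_index and longest_words are made up for illustration and are not part of your code.

```python
from collections import defaultdict
from itertools import combinations

def build_index(words):
    """Group words by their sorted letters, so anagrams share one key."""
    index = defaultdict(list)
    for word in words:
        index[''.join(sorted(word))].append(word)
    return index

def longest_words(letters, index):
    """Return the words for the longest letter subset that has a match."""
    for length in range(len(letters), 0, -1):
        for sample in combinations(letters, length):
            matches = index.get(''.join(sorted(sample)))
            if matches:
                return matches
    return None

# Tiny hypothetical vocabulary standing in for palabras.txt
index = build_index(["sol", "los", "casa", "saco", "cosa", "asco"])

print(longest_words("ascolx", index))  # ['saco', 'cosa', 'asco']
print(longest_words("zzz", index))     # None
```

Note that "casa" is not returned for "ascolx": its key needs two a's, and the sample only has one.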