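"logits and labels must have the same first dimension, got logits shape [3,21] and labels shape [15]" when training a chatbot built with TensorFlow's SimpleRNN layer
When I try to train a chatbot that I built with the SimpleRNN layer provided by TensorFlow, I get this error: logits and labels must have the same first dimension, got logits shape [3,21] and labels shape [15]. I have made sure the inputs are the same size by using padding and proper tokenization, but I still get the error. My code is below.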
import tensorflow as tf
import numpy as np
import json
# Step 1: Prepare the Data
questions = ["What is your name?", "How are you?", "What is the time?"]
answers = ["My name is ChatBot.", "I am fine, thank you.", "I do not have a watch."]
with open('q.txt', 'w') as q_file, open('a.txt', 'w') as a_file:
    for q, a in zip(questions, answers):
        q_file.write(q + '\n')
        a_file.write(a + '\n')
# Step 2: Preprocess the Data
with open('q.txt', 'r') as q_file:
    questions = q_file.readlines()
with open('a.txt', 'r') as a_file:
    answers = a_file.readlines()
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(questions + answers)
q_sequences = tokenizer.texts_to_sequences(questions)
a_sequences = tokenizer.texts_to_sequences(answers)
q_sequences = tf.keras.preprocessing.sequence.pad_sequences(q_sequences, padding='post')
a_sequences = tf.keras.preprocessing.sequence.pad_sequences(a_sequences, padding='post')
q_sequences = np.array(q_sequences)
a_sequences = np.array(a_sequences)
# Step 3: Build the Model
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(tokenizer.word_index) + 1, output_dim=64, input_length=q_sequences.shape[1]),
    tf.keras.layers.SimpleRNN(64, return_sequences=True),
    tf.keras.layers.SimpleRNN(64),
    tf.keras.layers.Dense(len(tokenizer.word_index) + 1, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()
# Step 4: Train the Model
# Adjust labels to match the model's output
# Since we are predicting the next token, slice answers accordingly
model.fit(q_sequences, a_sequences[:, 1:], epochs=10)
# Step 5: Save the Weights and Biases
model.save('chatbot_model.h5')
weights_and_biases = {}
for layer in model.layers:
    if isinstance(layer, tf.keras.layers.SimpleRNN) or isinstance(layer, tf.keras.layers.Dense):
        weights_and_biases[layer.name] = {
            'weights': layer.get_weights()[0].tolist(),
            'biases': layer.get_weights()[1].tolist()
        }
with open('weights_and_biases.json', 'w') as json_file:
    json.dump(weights_and_biases, json_file)
# Step 6: Custom Feedforward Function
def custom_feedforward(input_sequence, weights_and_biases):
    def relu(x):
        return np.maximum(0, x)
    def softmax(x):
        e_x = np.exp(x - np.max(x))
        return e_x / e_x.sum(axis=-1, keepdims=True)
    w_rnn1 = np.array(weights_and_biases['simple_rnn']['weights'])
    b_rnn1 = np.array(weights_and_biases['simple_rnn']['biases'])
    hidden_state = np.zeros((w_rnn1.shape[1],))
    for t in range(input_sequence.shape[0]):
        hidden_state = relu(np.dot(input_sequence[t], w_rnn1) + b_rnn1 + np.dot(hidden_state, w_rnn1.T))
    w_dense = np.array(weights_and_biases['dense']['weights'])
    b_dense = np.array(weights_and_biases['dense']['biases'])
    output = softmax(np.dot(hidden_state, w_dense) + b_dense)
    return output
# Example usage of the custom feedforward function
input_sequence = q_sequences[0]
output = custom_feedforward(input_sequence, weights_and_biases)
print(output)
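For comparison with the custom feedforward function in Step 6, here is a minimal NumPy sketch of a single SimpleRNN layer's forward pass. It assumes the Keras defaults: SimpleRNN uses a tanh activation, and `layer.get_weights()` on a SimpleRNN returns `[kernel, recurrent_kernel, bias]` in that order. The function name `simple_rnn_forward` and the random weights are purely illustrative, and the input is assumed to be already-embedded vectors rather than raw token ids.

```python
import numpy as np

def simple_rnn_forward(embedded_sequence, kernel, recurrent_kernel, bias):
    """Return the final hidden state of one SimpleRNN layer (tanh assumed)."""
    # Keras starts from an all-zero hidden state by default.
    hidden_state = np.zeros(recurrent_kernel.shape[0])
    for x_t in embedded_sequence:
        # h_t = tanh(x_t . W + h_{t-1} . U + b)
        hidden_state = np.tanh(x_t @ kernel + hidden_state @ recurrent_kernel + bias)
    return hidden_state

# Illustrative stand-ins for trained weights (not values from the real model).
rng = np.random.default_rng(0)
embedded = rng.normal(size=(4, 64))                 # 4 timesteps, 64-dim embeddings
kernel = rng.normal(size=(64, 64)) * 0.1            # input -> hidden
recurrent_kernel = rng.normal(size=(64, 64)) * 0.1  # hidden -> hidden
bias = np.zeros(64)

final_state = simple_rnn_forward(embedded, kernel, recurrent_kernel, bias)
print(final_state.shape)
```

Under these assumptions, index 1 of a SimpleRNN's `get_weights()` would be the recurrent kernel rather than the bias, so the exported JSON may need a separate entry for each of the three arrays.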
The error occurs on the model.fit line.
I expected the program to train for the full 20 epochs without any size errors. Thanks for your help!
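The two shapes in the error message can be reproduced from the data above with a small NumPy sketch. It assumes that the sparse categorical cross-entropy loss flattens the labels before comparing them with the logits, which is what the reported shapes suggest. The arrays are zero-filled placeholders, not real model outputs, and the variable names are illustrative.

```python
import numpy as np

batch_size = 3        # three question/answer pairs
vocab_size = 21       # 20 distinct words + 1 for the padding index
answer_length = 6     # longest answer ("I do not have a watch.") has 6 tokens

# The last SimpleRNN(64) has no return_sequences=True, so the Dense layer
# produces one prediction per example: shape (batch, vocab).
logits = np.zeros((batch_size, vocab_size))

# Labels passed to fit(): a_sequences[:, 1:] has shape (3, 5).
a_sequences = np.zeros((batch_size, answer_length), dtype=int)
labels = a_sequences[:, 1:]

# Flattening the labels yields 3 * 5 = 15 entries, against 3 rows of logits.
flat_labels = labels.reshape(-1)
print(logits.shape, flat_labels.shape)

# If the model instead emitted one prediction per label timestep,
# the flattened first dimensions would agree (15 and 15).
per_step_logits = np.zeros((batch_size, labels.shape[1], vocab_size))
flat_per_step = per_step_logits.reshape(-1, vocab_size)
print(flat_per_step.shape[0] == flat_labels.shape[0])
```

One possible direction, under these assumptions, is to have the final recurrent layer return a full sequence and to pad questions and answers to the same length, so that the model yields one prediction for every label token.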