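Overall: I am trying to build a BERT model with Cloud Build and Cloud Run. I store the model (parameters) and the metadata (labels) in GCP Cloud Storage. However, loading the metadata.bin file with joblib.load() fails. According to the error, my metadata.bin file contains UTF-8 characters where joblib.load expects ASCII characters. In my version the default pickle protocol is 4, but the error message says the protocol is 0.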
Relevant dependencies: Python 3.8.0, joblib 1.1.1 (I have also tried upgrading to the most recent version), google-api-core==2.19.1, google-auth==2.32.0, google-cloud-core==2.4.1, google-cloud-storage==2.18.0
What I tried: I tested two cases.
- Locally. In this case, downloading both model.bin and metadata.bin from GCP Cloud Storage works.
- In Docker. In this case, loading the metadata.bin and model.bin files inside the dockerized container also works.
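Since the same files work locally and in Docker, one thing worth ruling out is that the copy downloaded inside Cloud Run differs from the object in the bucket. The following is a minimal sketch under that assumption, not a confirmed diagnosis. The helper name `local_md5_b64` is hypothetical. It produces the base64-encoded MD5 digest of a local file, which is the encoding google-cloud-storage uses for `blob.md5_hash` (composite objects may not have an MD5 recorded).

```python
import base64
import hashlib


def local_md5_b64(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the base64-encoded MD5 digest of a local file."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so a large model file does not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```

After calling `metadata_blob.reload()`, comparing `local_md5_b64(metadata_path)` with `metadata_blob.md5_hash` in each environment would show whether Cloud Run is reading the same bytes as the local and Docker runs.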
Error details:
```
  File "./src_review/model_server.py", line 70, in load_bert_model
    metadata = joblib.load(metadata_path)
  File "/usr/local/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 658, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/usr/local/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 577, in _unpickle
    obj = unpickler.load()
  File "/usr/local/lib/python3.8/pickle.py", line 1210, in load
    dispatch[key[0]](self)
  File "/usr/local/lib/python3.8/pickle.py", line 1244, in load_persid
    raise UnpicklingError(
_pickle.UnpicklingError: persistent IDs in protocol 0 must be ASCII strings
```
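The "protocol 0" wording does not necessarily mean the file was written with protocol 0. In CPython's pure-Python unpickler (the `pickle.py` code path shown in the traceback), the byte `P` is the protocol-0 PERSID opcode, so any input whose first byte happens to be `P` is dispatched to `load_persid`. The sketch below is an illustration, not a diagnosis: it feeds made-up bytes that are not a pickle at all and reproduces the same message. Whether the file read in Cloud Run actually starts this way is an assumption that would need checking. `try_pure_python_unpickle` is a hypothetical helper, and `pickle._Unpickler` is a private CPython class.

```python
import io
import pickle


def try_pure_python_unpickle(data: bytes) -> str:
    """Unpickle with CPython's pure-Python unpickler and report the outcome."""
    try:
        # pickle._Unpickler is the pure-Python implementation in pickle.py;
        # it is a CPython detail, used here only for illustration.
        pickle._Unpickler(io.BytesIO(data)).load()
    except pickle.UnpicklingError as exc:
        return str(exc)
    return "ok"


# Hypothetical input: not a pickle, just bytes that start with b"P"
# (as a ZIP signature b"PK\x03\x04" does) followed by non-ASCII bytes.
not_a_pickle = b"PK\x03\x04\xff\xfe\n"
print(try_pure_python_unpickle(not_a_pickle))

# A genuine protocol-4 pickle goes through the same helper without error.
print(try_pure_python_unpickle(pickle.dumps({"label": "positive"}, protocol=4)))
```

If this reading applies, the question becomes what file actually ends up at the metadata path in Cloud Run, rather than which pickle protocol was used to write it.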
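My code:
```python
import argparse

import joblib

# storage_client, log, bucket_name, model_file and metadata_file are
# defined elsewhere in the module (not shown here).

def load_bert_model(config: argparse.Namespace):
    bucket = storage_client.bucket(bucket_name)
    model_blob = bucket.blob(model_file)
    metadata_blob = bucket.blob(metadata_file)
    local_model_path = '/tmp/pytorch_model.bin'
    metadata_path = '/tmp/meta.bin'

    # Download the model weights and the label metadata from the bucket.
    print(f"Downloading model to {local_model_path}")
    model_blob.download_to_filename(local_model_path)
    log.info(f"Model downloaded to {local_model_path}")
    metadata_blob.download_to_filename(metadata_path)
    log.info(f"Metadata (label) downloaded to {metadata_path}")

    # This is the call that fails in Cloud Run (line 70 in the traceback).
    metadata = joblib.load(metadata_path)
    ...
```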
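To see what kind of file actually reaches `joblib.load`, the first few bytes of the downloaded file can be logged in each environment. The sketch below is only a rough heuristic, and both function names (`read_head`, `describe_header`) are hypothetical. It relies on two facts: pickle protocol 2 and later begins with the byte `0x80` followed by the protocol number, and a ZIP archive with at least one member begins with `PK\x03\x04`. A file written by `joblib.dump` without compression would be expected to fall in the first category, but that is an assumption about how metadata.bin was produced.

```python
def read_head(path: str, n: int = 8) -> bytes:
    """Return the first n bytes of a file."""
    with open(path, "rb") as fh:
        return fh.read(n)


def describe_header(head: bytes) -> str:
    """Guess the file type from its leading bytes (heuristic only)."""
    if head[:4] == b"PK\x03\x04":
        return "zip archive"
    if head[:2] == b"\x1f\x8b":
        return "gzip stream"
    if len(head) >= 2 and head[0] == 0x80:
        # PROTO opcode: 0x80 followed by the protocol version byte.
        return f"pickle, protocol {head[1]}"
    return "unknown"
```

Logging `describe_header(read_head(metadata_path))` just before the `joblib.load(metadata_path)` call, locally and in Cloud Run, would show whether the two environments see the same kind of file.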
From the official GCP documentation:
```python
from pathlib import Path

from google.cloud.storage import transfer_manager

# create_bucket_if_not_exists is a helper defined elsewhere (not shown here).

def upload_directory_with_transfer_manager(bucket_name, source_directory, workers=1):
    bucket = create_bucket_if_not_exists(bucket_name)

    # Collect every *.bin file under source_directory, relative to that directory.
    directory_as_path_obj = Path(source_directory)
    paths = directory_as_path_obj.rglob("*.bin")
    file_paths = [path for path in paths if path.is_file()]
    relative_paths = [path.relative_to(source_directory) for path in file_paths]
    string_paths = [str(path) for path in relative_paths]
    print("Found {} files.".format(len(string_paths)))

    results = transfer_manager.upload_many_from_filenames(
        bucket, string_paths, source_directory=source_directory, max_workers=workers, skip_if_exists=False
    )

    # Each result is either None (success) or the exception that was raised.
    for name, result in zip(string_paths, results):
        if isinstance(result, Exception):
            print("Failed to upload {} due to exception: {}".format(name, result))
        else:
            print("Uploaded {} to {}.".format(name, bucket.name))
```
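Suspected cause: I think the Cloud Run configuration differs significantly from my test environments, but I cannot pin down the main cause.
Thanks for your help!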