[NLP]. Converting a Sentence-Transformer model to ONNX format
Acdong
2022. 12. 12. 23:23
Loading a model registered on HuggingFace and saving it as an ONNX file
from pathlib import Path
from transformers.convert_graph_to_onnx import convert
convert(framework="pt", model="j5ng/sentence-klue-roberta-base", output=Path("onnx_models/trfs-model.onnx"), opset=11)
Logs
`transformers.convert_graph_to_onnx` package is deprecated and will be removed in version 5 of Transformers
  warnings.warn(
ONNX opset version set to: 11
Loading pipeline (model: j5ng/sentence-klue-roberta-base, tokenizer: j5ng/sentence-klue-roberta-base)
Creating folder onnx_models
Using framework PyTorch: 1.13.0
Found input input_ids with shape: {0: 'batch', 1: 'sequence'}
Found input token_type_ids with shape: {0: 'batch', 1: 'sequence'}
Found input attention_mask with shape: {0: 'batch', 1: 'sequence'}
Found output output0 with shape: {0: 'batch', 1: 'sequence'}
Found output output1 with shape: {0: 'batch'}
Ensuring inputs are in correct order
position_ids is not present in the generated input list.
Generated inputs order: ['input_ids', 'attention_mask', 'token_type_ids']
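The log lists three graph inputs (input_ids, attention_mask, token_type_ids), and onnxruntime rejects a feed dict that contains names the graph does not know. A small sketch of a filter that keeps only the expected entries, in the logged order, might look like the following. The helper name select_onnx_inputs and the toy token ids are made up for illustration; with a real onnxruntime InferenceSession the expected names could be read with [i.name for i in sess.get_inputs()].

```python
def select_onnx_inputs(encoded, expected_names):
    """Keep only the entries the ONNX graph expects, in the graph's input order."""
    missing = [name for name in expected_names if name not in encoded]
    if missing:
        raise KeyError(f"tokenizer output lacks required inputs: {missing}")
    return {name: encoded[name] for name in expected_names}


# Input order copied from the conversion log above.
expected = ["input_ids", "attention_mask", "token_type_ids"]

# Toy stand-in for tokenizer output (placeholder ids, not real tokens).
encoded = {
    "input_ids": [[0, 5, 2]],
    "token_type_ids": [[0, 0, 0]],
    "attention_mask": [[1, 1, 1]],
    "extra_field": [[9, 9, 9]],  # hypothetical key the graph does not accept
}

feed = select_onnx_inputs(encoded, expected)
print(list(feed))
```

Failing early on a missing name gives a clearer message than the runtime error you would otherwise get at sess.run time.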
Notes:
- The output argument must be passed as a Path object. (This seems to have changed in a newer version.)
- The output folder "onnx_models" must be empty. (If the folder contains even one file, an error is raised.)
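Both notes can be checked before calling convert. Here is a minimal sketch, assuming you would rather fail early with a readable message than hit the error mid-conversion. The helper name prepare_output_path is my own, not part of transformers.

```python
from pathlib import Path


def prepare_output_path(folder, filename):
    """Return folder/filename as a Path, refusing a folder that already has files."""
    out_dir = Path(folder)
    if out_dir.is_dir() and any(out_dir.iterdir()):
        raise FileExistsError(
            f"{out_dir} is not empty; clear it or choose another folder"
        )
    # A missing folder is fine: the conversion log shows it being created.
    return out_dir / filename
```

The returned value is already a Path, so it can be passed as output=prepare_output_path("onnx_models", "trfs-model.onnx").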
Loading the ONNX model and computing embeddings
from onnxruntime import InferenceSession
import torch
sess = InferenceSession("onnx_models/trfs-model.onnx", providers=["CPUExecutionProvider"])
def mean_pooling(model_output, attention_mask):
    # First element of model_output contains all token embeddings
    token_embeddings = torch.from_numpy(model_output[0])
    attention_mask = torch.from_numpy(attention_mask)
    # Expand the mask to the embedding shape; cast to float so the division below is done in floating point
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    # Sum only the non-padding token embeddings, then divide by the number of real tokens
    sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)
    sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)
    return sum_embeddings / sum_mask, input_mask_expanded, sum_mask
from transformers import AutoTokenizer
# Load the tokenizer from the same model repository that was exported to ONNX
tokenizer = AutoTokenizer.from_pretrained("j5ng/sentence-klue-roberta-base")
query = "안녕하세요"
model_inputs = tokenizer(query, return_tensors="pt")
inputs_onnx = {k: v.cpu().detach().numpy() for k, v in model_inputs.items()}
sequence = sess.run(None, inputs_onnx)
sentence_embeddings = mean_pooling(sequence, inputs_onnx['attention_mask'])
print(sentence_embeddings[0][0][:5])
tensor([-0.2527, -0.0596, 0.1684, 0.1522, 0.5282]), which matches the values obtained by embedding with SentenceTransformer.
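Eyeballing five numbers works, but a tolerance check is easier to repeat. A minimal sketch, assuming both embeddings have been converted to plain Python lists (for example with .tolist()): embeddings_match is a hypothetical helper, and the tolerances are arbitrary starting points rather than recommended values. The reference side would presumably come from something like SentenceTransformer("j5ng/sentence-klue-roberta-base").encode(query).

```python
import math


def embeddings_match(a, b, rel_tol=1e-4, abs_tol=1e-4):
    """Return True when two equal-length vectors agree element-wise within tolerance."""
    if len(a) != len(b):
        return False
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


# Toy numbers only, to show the behaviour of the check.
print(embeddings_match([0.1, 0.2], [0.1, 0.2000001]))  # tiny float noise
print(embeddings_match([0.1, 0.2], [0.1, 0.3]))        # a real mismatch
```

Small differences between the PyTorch and ONNX Runtime outputs are normal, which is why an exact equality test is not used here.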
* Additional notes
ONNX to Keras model
import onnx
from onnx2keras import onnx_to_keras
# Load ONNX model
onnx_model = onnx.load('onnx_models/trfs-model.onnx')
input_all = [node.name for node in onnx_model.graph.input]
print(input_all)
# Call the converter (input - is the main model input name, can be different for your model)
k_model = onnx_to_keras(onnx_model, input_all)
ValueError: '/Unsqueeze_output_0/' is not a valid root scope name. A root scope name has to match the following pattern: ^[A-Za-z0-9.][A-Za-z0-9_.\\/>-]*$
The error comes from TensorFlow's naming rules.. Sigh... Torch and TF, can you two please get along?
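To see what the rule rejects, the pattern from the error message can be applied directly. The sketch below rewrites a name so that it passes the pattern. sanitize_scope_name is a hypothetical helper, and I have not verified that renaming the nodes in the ONNX graph this way is enough to make onnx2keras succeed.

```python
import re

# Pattern copied from the ValueError message above.
VALID_ROOT_SCOPE = re.compile(r"^[A-Za-z0-9.][A-Za-z0-9_.\\/>-]*$")


def sanitize_scope_name(name):
    """Rewrite a node name so that it matches the root scope pattern."""
    # Replace characters that are not allowed anywhere in the name.
    cleaned = re.sub(r"[^A-Za-z0-9_.\\/>-]", "_", name)
    # Drop leading characters that may not start a root scope name (such as "/").
    cleaned = re.sub(r"^[^A-Za-z0-9.]+", "", cleaned)
    if not cleaned:
        raise ValueError(f"cannot derive a valid scope name from {name!r}")
    return cleaned


bad = "/Unsqueeze_output_0/"
good = sanitize_scope_name(bad)
print(bool(VALID_ROOT_SCOPE.match(bad)), good, bool(VALID_ROOT_SCOPE.match(good)))
```

The offending name starts with "/", which the pattern only allows after the first character, so stripping the leading slash is the smallest change that satisfies it.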