We previously introduced the Transformer model platform in “[Hugging Face] Ep.1 An AI platform that ordinary people can play”. Many readers will probably run into the following error while working with it, so this post records the problem and its solution for anyone who needs them.
Problem
Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
Background
The story goes like this: Xiao Ming is a software engineer who specializes in speech recognition. While running recognition with a wav2vec2 speech recognition model, he hit an error at a critical moment. Other people may run into the same error, so I decided to write up the process to help others on the same technical road get past it.
First, Xiao Ming used the wav2vec2 speech recognition model, loaded the Chinese model “wav2vec2-large-xlsr-53-chinese-zh-cn-gpt”, and set DEVICE to cuda, expecting the GPU to speed up recognition.
import soundfile as sf
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
SRC_MODEL = 'ydshieh/wav2vec2-large-xlsr-53-chinese-zh-cn-gpt'
DEVICE = 'cuda'
processor = Wav2Vec2Processor.from_pretrained(SRC_MODEL)
# Load the model and move its weights to the GPU
model = Wav2Vec2ForCTC.from_pretrained(SRC_MODEL).to(DEVICE)
Then he ran recognition directly on the audio file.
audio_buffer, _ = sf.read('test.wav')
input_values = processor(audio_buffer, sampling_rate=16000, return_tensors="pt").input_values
logits = model(input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.decode(predicted_ids[0])
transcription
But something went wrong… What should he do?
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) …
Cause
The error message says that the input is a CPU tensor (torch.FloatTensor) while the model weights are GPU tensors (torch.cuda.FloatTensor). The model was moved to the GPU, but the input produced by the processor is still on the CPU, so the input has to be moved to the GPU as well.
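One way to confirm this diagnosis is to print where the weights and the input actually live. The following is only a minimal sketch, not the code from the story: it uses a tiny `nn.Conv1d` as a hypothetical stand-in for the wav2vec2 model so that it runs without downloading anything, and it falls back to the CPU when no GPU is available (an assumption added here).

```python
import torch
from torch import nn

# Hypothetical stand-in for the real model: a single 1-D convolution.
model = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3)

# Assumption: fall back to the CPU when CUDA is not available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Tensors are created on the CPU unless a device is given explicitly.
input_values = torch.randn(1, 1, 16)

# A module has no single device; inspect one of its parameters instead.
model_device = next(model.parameters()).device

print("model weights on:", model_device)
print("input tensor on :", input_values.device)

# Compare device types ("cpu" vs "cuda") to detect a mismatch.
devices_match = model_device.type == input_values.device.type
print("same device type:", devices_match)
```

On a machine with a GPU this reports a mismatch, which is the situation the error message describes. On a CPU-only machine both sides are on the CPU and no error would occur.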
Solution
We move the audio input tensor to the GPU so that it becomes a “torch.cuda.FloatTensor”, matching the model weights.
input_values = input_values.to(DEVICE)
Now the model and its input live on the same device, and the recognition step can be run again. PyTorch does not move tensors between the CPU and the GPU automatically, so every tensor passed to a model must be on the same device as the model's weights…
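To avoid repeating this mistake, the move can be tied to the model itself rather than to a separate DEVICE constant. The sketch below illustrates the idea under stated assumptions: `run_on_model_device` is a hypothetical helper name, and the `nn.Conv1d` is again only a small stand-in for the wav2vec2 model so that the example runs anywhere, including on a CPU-only machine.

```python
import torch
from torch import nn


def run_on_model_device(model: nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    """Move `inputs` to the device that holds the model's weights, then run inference.

    Hypothetical helper for illustration; not part of PyTorch or transformers.
    """
    device = next(model.parameters()).device
    # Inference only, so gradient tracking is disabled.
    with torch.no_grad():
        return model(inputs.to(device))


# Assumption: use the GPU when available, otherwise the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in model (hypothetical) and a CPU input, mirroring the situation in the story.
model = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3).to(device)
inputs = torch.randn(1, 1, 16)

logits = run_on_model_device(model, inputs)
print(logits.shape, logits.device)
```

Because the helper reads the device from the model's own parameters, the input follows the model wherever it was placed, and the same code works unchanged with or without a GPU.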