Getting Started

Prerequisites

If you don’t already have local credentials set up for your AWS account, you can follow this guide to configure them using the AWS CLI.

Note

  • Because this project uses the amazon-transcribe SDK, which is built on top of the AWS Common Runtime (CRT), users on non-standard operating systems may need to compile these libraries themselves.

  • At a minimum, set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, and add the same keys to the [default] profile in ~/.aws/credentials.
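
The credential locations named in the note can be checked before running the translator. The following is a minimal sketch using only the Python standard library; the helper name find_aws_credentials is hypothetical and not part of this package, and the key names aws_access_key_id and aws_secret_access_key are the ones the AWS CLI writes to the credentials file.

```python
import configparser
import os
from pathlib import Path


def find_aws_credentials(env=None, credentials_path=None):
    """Report where an access key pair is visible (hypothetical helper)."""
    env = os.environ if env is None else env
    path = Path(credentials_path) if credentials_path else Path.home() / ".aws" / "credentials"

    # Both variables must be non-empty to count as configured.
    in_env = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))

    in_file = False
    if path.is_file():
        # Disable interpolation so a "%" inside a secret is read literally.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(path)
        if parser.has_section("default"):
            section = parser["default"]
            in_file = bool(section.get("aws_access_key_id")) and bool(
                section.get("aws_secret_access_key")
            )

    return {"environment": in_env, "default_profile": in_file}


if __name__ == "__main__":
    print(find_aws_credentials())
```

The check only confirms that the values are present; it does not verify that AWS accepts them.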

Also, ensure that PortAudio is installed:

sudo apt install portaudio19-dev # linux
brew install portaudio # macos
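
To see whether the PortAudio shared library is discoverable after installation, a quick check along these lines may help. It is a sketch built on ctypes.util.find_library from the standard library; the function name portaudio_library is hypothetical. A None result is only a hint, since libraries installed in non-default locations can be missed by this lookup.

```python
import ctypes.util


def portaudio_library():
    """Return the name or path the system resolves for PortAudio, or None."""
    return ctypes.util.find_library("portaudio")


if __name__ == "__main__":
    found = portaudio_library()
    if found:
        print(f"PortAudio found: {found}")
    else:
        print("PortAudio not found; install it with apt or brew first.")
```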

Installation

Install the latest version from pip

pip install openai-game-translator

Install from the GitHub repository

git clone https://github.com/Erisae/openai-game-translator
cd openai-game-translator
make install

Uninstallation

Uninstall a version installed from pip

pip uninstall openai-game-translator

Uninstall a version installed from the GitHub repository

make uninstall

Terminal Usage

In a terminal, run the translate command to transcribe audio and translate the resulting text. Several transcription modes are available:

  • aws_live transcription (best currently):

    translate --openai_key <openai_key> -i <input_language> -o <output_language> aws_live
    
    • <openai_key>: A valid OpenAI API key, required to call the GPT model that performs the translation.

    • <input_language>: Language of the audio to be transcribed.

    • <output_language>: Target language for the translation.

    • aws_live: This option uses the AWS live stream transcription model. The voice data stream is uploaded to AWS services through the AWS SDK while the voice is being recorded, so no temporary audio file or prerecorded file is needed.

  • aws_pre transcription with prerecorded media:

    translate --openai_key <openai_key> -i <input_language> -o <output_language> aws_pre --file <file_path> --pre_recorded
    
    • aws_pre: This option uses the AWS prerecorded transcription model. The prerecorded file is uploaded to AWS services through the AWS SDK.

    • <file_path>: Path for the prerecorded media.

    • --pre_recorded: Flag indicating that <file_path> points to prerecorded media.

  • aws_pre transcription without prerecorded media:

    translate --openai_key <openai_key> -i <input_language> -o <output_language> aws_pre --file <file_path>
    
    • <file_path>: Path where the temporary audio file is stored while translating.

  • xunfei transcription with prerecorded media:

    translate --openai_key <openai_key> -i <input_language> -o <output_language> xunfei --xunfei_appid <xf_appid> --xunfei_apikey <xf_apikey> --xunfei_apisecret <xf_apisecret> --file <file_path> --pre_recorded
    
    • xunfei: This option uses xunfei’s transcription model.

    • <xf_appid>, <xf_apikey>, <xf_apisecret>: Audio transcription API tokens issued by xunfei.

  • xunfei transcription without prerecorded media:

    translate --openai_key <openai_key> -i <input_language> -o <output_language> xunfei --xunfei_appid <xf_appid> --xunfei_apikey <xf_apikey> --xunfei_apisecret <xf_apisecret> --file <file_path>
    

Note

  • aws_live, aws_pre and xunfei are subcommands. Each subcommand determines whether --file, --pre_recorded and the xunfei tokens are required.

  • Ensure that openai_key, input_language and output_language are given before the subcommand on the command line, as otherwise the argument values might not be recognized correctly.

  • In terms of transcription accuracy, I would recommend using aws_live, because aws_pre and xunfei place high demands on the quality of the audio file (prominent human voice, no significant noise). The latter two do perform well under ideal conditions, and xunfei is a substitute for those who cannot access AWS.
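
The argument rules in the notes above can be sketched as a small Python helper that assembles the command line, placing the global options before the subcommand and adding only the options each subcommand takes. The helper name build_translate_command is hypothetical and not part of the package; the flags are the ones shown in the commands above.

```python
import shlex


def build_translate_command(openai_key, input_language, output_language, subcommand,
                            file_path=None, pre_recorded=False, xunfei_tokens=None):
    """Assemble the argv list for `translate` (hypothetical helper)."""
    if subcommand not in ("aws_live", "aws_pre", "xunfei"):
        raise ValueError(f"unknown subcommand: {subcommand}")

    # Global options come first, then the subcommand.
    argv = ["translate", "--openai_key", openai_key,
            "-i", input_language, "-o", output_language, subcommand]

    if subcommand == "aws_live":
        # Live streaming takes no file arguments.
        return argv

    if subcommand == "xunfei":
        if not xunfei_tokens or len(xunfei_tokens) != 3:
            raise ValueError("xunfei needs (appid, apikey, apisecret)")
        appid, apikey, apisecret = xunfei_tokens
        argv += ["--xunfei_appid", appid,
                 "--xunfei_apikey", apikey,
                 "--xunfei_apisecret", apisecret]

    # aws_pre and xunfei both need a path: the prerecorded media,
    # or the location of the temporary recording.
    if file_path is None:
        raise ValueError(f"{subcommand} needs a file path")
    argv += ["--file", file_path]
    if pre_recorded:
        argv.append("--pre_recorded")
    return argv


if __name__ == "__main__":
    # Placeholder values; substitute a real key, languages and path.
    cmd = build_translate_command("<openai_key>", "<input_language>", "<output_language>",
                                  "aws_pre", file_path="<file_path>", pre_recorded=True)
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run to launch the translator from another script.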

Script Usage

In a Python script, first import the library:

import openai
from game_translator import gameTranslator

Then set the OpenAI API key for global use:

openai.api_key = "<openai_key>"
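
To avoid hardcoding the key in a script, it can be read from the environment instead. This is a minimal sketch; the helper load_openai_key is hypothetical, and the variable name OPENAI_API_KEY is an assumed convention rather than something this package requires.

```python
import os


def load_openai_key(env_var="OPENAI_API_KEY", env=None):
    """Return the API key stored in an environment variable (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key
```

The assignment above then becomes openai.api_key = load_openai_key().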

You can initialize several types of translators, each using a different transcription technique:

  • Initialize a translation model with Amazon live transcription

    translator1 = gameTranslator("aws_live") # by default, input language is Chinese and output language is English
    
  • Initialize a translation model with Amazon prerecorded transcription and no prerecorded audio file:

    translator2 = gameTranslator("aws_pre", filepath="path_to_store", prerecorded=False)
    
  • Initialize a translation model with Amazon prerecorded transcription and a prerecorded audio file:

    translator3 = gameTranslator("aws_pre", filepath="path_to_prerecorded", prerecorded=True)
    
  • Initialize a translation model with xunfei speed transcription and no prerecorded audio file:

    translator4 = gameTranslator("xunfei", xunfei_appid="xunfei_appid", xunfei_apikey="xunfei_apikey", xunfei_apisecret="xunfei_apisecret", filepath="path_to_store", prerecorded=False)
    

  • Initialize a translation model with xunfei speed transcription and a prerecorded audio file:

    translator5 = gameTranslator("xunfei", xunfei_appid="xunfei_appid", xunfei_apikey="xunfei_apikey", xunfei_apisecret="xunfei_apisecret", filepath="path_to_prerecorded", prerecorded=True)

Note

  • xunfei generally transcribes only Chinese and English (for free).

Finally, call openai_translation() on any of the translators above to translate:

translator1.openai_translation()

This prints the transcription and the translation result to the terminal.

You can also run only the transcription step by calling the method that matches the translator type:

translator1.aws_live_transcription()
translator2.aws_prerecored_transcription()
translator3.aws_prerecored_transcription()
translator4.xunfei_transcription()
translator5.xunfei_transcription()
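
When the transcription type is only known at run time, the matching method can be looked up by name. The following is a sketch under the assumption that the method names are exactly those in the calls listed above; the mapping TRANSCRIPTION_METHODS and the helper transcribe_only are hypothetical and not part of the library.

```python
# Transcription type -> method name, copied from the calls listed above.
TRANSCRIPTION_METHODS = {
    "aws_live": "aws_live_transcription",
    "aws_pre": "aws_prerecored_transcription",  # spelling as it appears above
    "xunfei": "xunfei_transcription",
}


def transcribe_only(translator, transcription_type):
    """Call the transcription method matching the type (hypothetical helper)."""
    try:
        method_name = TRANSCRIPTION_METHODS[transcription_type]
    except KeyError:
        raise ValueError(f"unknown transcription type: {transcription_type}") from None
    # Pass through whatever the underlying method returns.
    return getattr(translator, method_name)()
```

For example, transcribe_only(translator4, "xunfei") is equivalent to translator4.xunfei_transcription().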