Document Translation

This content transformer uses an LLM to translate the document body into a default language. If the stage detects up front that the document is already in the default language, the document is left unchanged. Otherwise, the translated text is available afterwards in the body field “bodyDefault”.
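The behaviour described above can be pictured with a minimal Python sketch. It is not the Suite's implementation. The function `translate_stage`, the source field name `body`, and the two callables `detect_language` and `translate` are hypothetical names introduced only for illustration. Only the target field `bodyDefault` comes from this page.

```python
from typing import Callable, Dict


def translate_stage(
    document: Dict[str, str],
    default_language: str,
    detect_language: Callable[[str], str],
    translate: Callable[[str], str],
) -> Dict[str, str]:
    """Return a copy of the document, translated only if needed.

    detect_language: hypothetical heuristic returning a two-letter code.
    translate: hypothetical wrapper that sends prompt + body to the LLM.
    """
    result = dict(document)              # never mutate the caller's document
    body = document.get("body", "")      # "body" is an assumed field name

    # Up-front check: a document already in the default language is left as is.
    if detect_language(body) == default_language:
        return result

    # Otherwise store the translation in the separate field "bodyDefault".
    result["bodyDefault"] = translate(body)
    return result
```

In this sketch the original body is kept alongside the translation, so later stages could still read the untranslated text. Whether the Suite does the same is not stated here.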

image-20260206-094455.png

Configuration Parameters

  1. Transformer Stage Type: choose Translation

  2. Prompt: the prompt that is sent to the LLM together with the document body. Adjust the prompt so that it names the target language you need.

  3. Default language: enter the two-letter ISO 639 language code of the default language (for example en or de) so that the stage can (heuristically) determine whether the text needs to be translated.

  4. LLM Configuration

    1. Open Llama configuration

      1. Embedding model: the name of the embedding model you want to use, for example mxbai-embed-large.

      2. Use authentication: if enabled, the Suite uses basic authentication when communicating with the embedding endpoint. Provide the corresponding username and password.

      3. Public keys for SSL certificates: this setting is needed if you run the environment with self-signed certificates or with certificates that are not known to the Java key store.
        We use a straightforward approach to validate SSL certificates. To have a certificate accepted as valid, add the modulus of its public key to this text field. You can read the modulus by viewing the certificate in the browser.

      image-20240928-201320.png

    2. Azure OpenAI GPT configuration

      image-20241005-080407.png

      1. GPT Endpoint: enter the endpoint, for example https://<baseUrl>.openai.azure.com/openai/deployments/<deploymentName>/chat/embeddings?api-version=<version>

      2. Password: enter your API key, which is configured in the OpenAI configuration in the Azure portal.
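
To make two of the settings above more concrete, the following Python sketch shows how the endpoint placeholders could be filled in and what a basic authentication header looks like on the wire. It is an illustration under assumptions, not the Suite's code. The helper names `azure_endpoint` and `basic_auth_header` and all sample values are hypothetical. The URL path is copied verbatim from the endpoint pattern in this section.

```python
import base64
from typing import Dict
from urllib.parse import quote


def azure_endpoint(base_url: str, deployment_name: str, api_version: str) -> str:
    """Fill the <baseUrl>, <deploymentName> and <version> placeholders."""
    # The path mirrors the pattern given in this section; adjust it if your
    # deployment exposes a different route.
    return (
        f"https://{base_url}.openai.azure.com/openai/deployments/"
        f"{quote(deployment_name, safe='')}/chat/embeddings"
        f"?api-version={quote(api_version, safe='')}"
    )


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build the standard HTTP Basic authentication header."""
    # Basic auth is "username:password", Base64-encoded.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": f"Basic {token.decode('ascii')}"}


if __name__ == "__main__":
    # Sample values are placeholders, not real resources or credentials.
    print(azure_endpoint("my-resource", "my-deployment", "2024-02-01"))
    print(basic_auth_header("user", "pass"))
```

Because Base64 is an encoding and not encryption, basic authentication only protects the credentials when the endpoint is reached over HTTPS, which is why the certificate setting above matters for self-signed environments.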