Showing 4 changed files with 198 additions and 6 deletions.
78 changes: 78 additions & 0 deletions
globalization/localization/ai/ai-and-llms-for-translation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
--- | ||
title: Use AI and large language models for translation | ||
description: Discover how large language models (LLMs) are revolutionizing localization, offering near-human quality and versatility in multilingual applications. | ||
author: jowilco | ||
ms.author: jowilco | ||
ms.topic: conceptual #Required; leave this attribute/value as-is. | ||
ms.date: 08/15/2024 | ||
ms.custom: | ||
- ai-gen-docs-bap | ||
- ai-gen-desc | ||
- ai-seo-date:08/15/2024 | ||
--- | ||
|
||
# Using artificial intelligence and large language models for translation | ||
|
||
With recent advances in large language models (LLMs), many localizers are considering whether to use AI instead of existing machine translation (MT) systems or even as a replacement for human translators (HT). The latest LLMs are performing well, getting close to HT-level quality, especially for “high-resource” languages. However, LLM-based solutions: | ||
|
||
- Might not perform as well as existing technologies, such as neural machine translation (NMT), especially for fields with specialized terminology such as healthcare. | ||
- Take longer and are more expensive to train than NMT. | ||
- Are slower and require more processing power than NMT. | ||
|
||
LLMs are evolving rapidly, costs are decreasing, and speed is increasing year-over-year, so many of the current concerns might be less relevant in the future. | ||
|
||
## Artificial intelligence and translation technology | ||
|
||
### Machine translation | ||
|
||
Machine translation (MT) systems are applications or online services that use technology to translate text between any of their supported languages. Although the concepts behind machine translation technology and the interfaces to use it are relatively simple, the science behind it is complex and brings together several leading-edge technologies. | ||
|
||
There has been an evolution in approaches to machine translation, including: | ||
|
||
- Rule-based machine translation: machine translation based on the dictionaries and grammar rules of each language | ||
- Statistical machine translation: machine translation based on statistical analysis of bilingual text corpora | ||
- Neural machine translation (NMT): NMT also uses statistical analysis to predict the likelihood of word sequences. It relies on neural networks to model entire sentences. | ||
|
||
Advances in large language models (LLMs) are enabling new paradigms for natural language processing tasks, which include translation. LLMs have the potential to outperform NMT, while enabling [natural language processing features](globalizing-ai-based-features.md) in multilingual applications. | ||
|
||
### Large language models and globalization | ||
|
||
Generative AI is a type of artificial intelligence focused on the ability of computers to use models to create content like text, synthetic data, and images. Generative AI applications are built on top of generative AI models such as large language models (LLMs). | ||
|
||
LLMs are deep learning models that consume and train on massive datasets to excel in language processing tasks such as translation. After these models have completed their learning processes, they generate statistically probable outputs when prompted. The models create new combinations of text that mimic natural language based on their training data. | ||
|
||
The development of LLMs has been a gradual process. The first LLMs were relatively small and could only perform simple language tasks. However, with the advances in deep neural networks, larger and more powerful LLMs were created. The 2020 release of the Generative Pre-trained Transformer 3 (GPT-3) model marked a significant milestone in the development of LLMs. GPT-3 demonstrated the ability to generate coherent and convincing text that was difficult to distinguish from text written by humans. | ||
|
||
GPT-3, and subsequent models, have been trained on datasets in multiple languages; therefore, these models are able to generate output in multiple languages. However, the quality of the output in each language is related to the amount of training data in that language. Languages where the LLMs were trained with a large set of data are considered *high-resource* languages. Languages that were trained with smaller sets of data are considered *low-resource* languages. | ||
|
||
AI and LLMs have the potential to be transformative technologies for globalization. While LLMs weren’t trained specifically for translation, their broad applicability to natural language tasks means that they perform well for translation, especially for high-resource languages. In addition, LLM features in a product often perform well for languages other than the original product language. | ||
|
||
### Neural machine translation vs large language models | ||
|
||
Many of the current state-of-the-art translation applications, such as [Microsoft Translator](https://www.microsoft.com/translator/business/), are based on neural machine translation (NMT). NMT is an improvement on previous statistical machine translation (SMT)-based approaches because it uses far more *dimensions* to represent the tokens (words, morphemes, punctuation, and so on) of the source and target text. | ||
|
||
Unlike NMT, large language models (LLMs) weren't designed for translation. However, as LLMs are designed to excel at language processing tasks, they often perform well at translation, especially between high-resource language pairs. | ||
|
||
There are similarities between NMT and LLM: | ||
|
||
- Both are pretrained using bilingual (or multilingual) corpora | ||
- Both can be trained, or [fine-tuned](/ai/playbook/technology-guidance/generative-ai/working-with-llms/fine-tuning), to perform better for specific tasks | ||
|
||
However, there are also differences that mean either NMT or LLMs might be the more appropriate technology, depending on the task: | ||
|
||
- It’s easier and cheaper to fine-tune NMT for specific fields of translation, such as healthcare. | ||
- LLMs, in general, produce more fluent text, while NMT produces more accurate text. | ||
- NMT typically processes segment by segment, while LLMs can work on entire documents at once. So, LLMs perform better with explicit context. | ||
- It can be easier to integrate existing glossaries and term bases with NMT than LLMs. | ||
- NMT is faster than LLMs, although each new generation of LLMs tends to be faster than the last. Speed might be a significant concern when processing large volumes of text. | ||
- Processing translations using LLMs is more expensive than NMT. This is especially true for low-resource languages. | ||
- NMT can be optimized for language variants. LLMs might have trouble differentiating between and producing text for language variants such as Portuguese for Portugal and Brazilian Portuguese. | ||
- NMT is optimized specifically for translation while LLMs can be used for various language processing tasks. For example, an LLM could be used to create a business email in Japanese. | ||
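
One way to narrow the glossary gap noted in the list above is to pass approved terms to the LLM inside the prompt. The following Python snippet is a minimal sketch of that idea, not a prescribed method: the function name `build_translation_prompt`, the prompt wording, and the sample glossary are all illustrative assumptions, and the snippet only builds the prompt text without calling any model.

```python
def build_translation_prompt(source_text, target_language, glossary):
    """Build a translation prompt that carries only the relevant glossary entries.

    All names and the prompt wording here are hypothetical, for illustration only.
    """
    lowered = source_text.lower()
    # Keep only the terms that occur in the source text so the prompt stays short.
    relevant = {
        term: translation
        for term, translation in glossary.items()
        if term.lower() in lowered
    }

    lines = [f"Translate the following text into {target_language}."]
    if relevant:
        lines.append("Use these approved term translations:")
        for term, translation in sorted(relevant.items()):
            lines.append(f"- {term} -> {translation}")
    lines.append("Text:")
    lines.append(source_text)
    return "\n".join(lines)


# Hypothetical glossary and source string.
glossary = {"dashboard": "tableau de bord", "invoice": "facture"}
prompt = build_translation_prompt("Open the dashboard to begin.", "French", glossary)
print(prompt)
```

Filtering the glossary to terms that appear in the source is a design choice to limit prompt length. Whether the model actually honors the supplied terms still has to be checked in the output.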
|
||
## Using LLMs for localization tasks other than translation | ||
|
||
Because LLMs apply to a wide range of language processing tasks, consider using them for other tasks in your localization workflow. For example: | ||
|
||
- LLMs might be suitable for linguistic review of human translated or machine translated text. | ||
- LLMs can be used to generate test data in multiple languages. | ||
- LLMs might produce better output than other machine translation methods for responses to technical support requests if your team can't support a language natively. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
--- | ||
title: Artificial intelligence and localization | ||
description: Use AI for global strategy by ensuring responsible practices and inclusive design to avoid biases, enhance user experience, and drive product success worldwide. | ||
author: jowilco | ||
ms.author: jowilco | ||
ms.topic: conceptual #Required; leave this attribute/value as-is. | ||
ms.date: 08/15/2024 | ||
ms.custom: | ||
- ai-gen-docs-bap | ||
- ai-gen-desc | ||
- ai-seo-date:08/15/2024 | ||
--- | ||
|
||
# Artificial intelligence and localization | ||
|
||
Artificial intelligence (AI) is the capability of a computer system to mimic human-like cognitive functions such as learning and problem-solving. An artificially intelligent computer system makes predictions or takes actions based on patterns in existing data and can then learn from its errors to increase its accuracy. A mature AI processes new information quickly and accurately, which makes it useful for complex scenarios such as self-driving cars, image recognition programs, and virtual assistants. | ||
|
||
Businesses around the world already use AI in a wide variety of applications, and intelligent technology is a growing field. As AI becomes more ubiquitous, use of AI in and for your product development must be a key component of your globalization strategy. | ||
|
||
Two examples of AI in global product development are: | ||
|
||
- [using AI for translation](ai-and-llms-for-translation.md) | ||
- [ensuring that AI-based features work correctly for users in all target markets](localizing-ai-based-features.md) | ||
|
||
## Responsible AI | ||
|
||
As artificial intelligence (AI) plays a larger role in our daily lives, it's more important than ever that AI systems are built to provide a helpful, safe, and trustworthy experience for everyone around the world. Microsoft defines six principles as the foundation for Responsible AI practices. These practices are intended to keep people and their goals at the center of the design process and to consider the benefits and potential harms that AI systems can have on society. These principles are: | ||
|
||
- Fairness – AI systems should treat all people fairly. | ||
- Reliability and safety – AI systems should perform reliably and safely. | ||
- Privacy and security – AI systems should be secure and respect privacy. | ||
- Inclusiveness – AI systems should empower everyone and engage people. | ||
- Transparency – AI systems should be understandable. | ||
- Accountability – People should be accountable for AI systems. | ||
|
||
For more information about Microsoft’s approach to responsible AI, see [https://www.microsoft.com/ai/responsible-ai](https://www.microsoft.com/ai/responsible-ai). | ||
|
||
### Potential AI bias with international data | ||
|
||
Machine learning (ML) is the process of using mathematical models of data to help a computer learn without direct instruction. ML is considered a subset of artificial intelligence (AI). Machine learning uses algorithms to identify patterns within data, and those patterns are then used to create a data model that can make predictions. The adaptability of machine learning makes it a great choice in scenarios where the data is always changing, the nature of the task is always shifting, or coding a solution would be effectively impossible. | ||
|
||
The model that ML generates is defined by the data on which it was trained. The choice of training data can affect how the AI based on the model performs. If the training data contains historical prejudices and stereotypes, the AI might reproduce the same prejudices and stereotypes in its responses. This outcome conflicts with Responsible AI principles. | ||
|
||
For example, facial recognition systems trained predominantly on Western faces might perform poorly on individuals from other regions. An AI-generated response could use a word or phrase that’s acceptable in one culture but might be offensive in another. Or an AI-generated image could unintentionally be alienating to an entire segment of an audience, even if the image itself isn’t offensive. For example, displaying a snowscape to represent a time of year to users in the northern hemisphere of the globe wouldn’t be appropriate for users in the southern hemisphere where they're currently experiencing summer. | ||
|
||
Inclusive design aims to address AI bias by using diverse and representative datasets, and by involving stakeholders from different backgrounds in the design and evaluation process. Additionally, language diversity in the training data helps AI systems perform well across different linguistic groups. |
61 changes: 61 additions & 0 deletions
globalization/localization/ai/localizing-ai-based-features.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
--- | ||
title: Localize artificial intelligence-based features | ||
description: Optimize AI features for global markets by customizing LLM outputs to meet diverse language and cultural needs, ensuring alignment with business goals and user expectations. | ||
author: jowilco | ||
ms.author: jowilco | ||
ms.topic: conceptual #Required; leave this attribute/value as-is. | ||
ms.date: 08/15/2024 | ||
ms.custom: | ||
- ai-gen-docs-bap | ||
- ai-gen-desc | ||
- ai-seo-date:08/15/2024 | ||
--- | ||
|
||
# Localizing artificial intelligence-based products and features | ||
|
||
AI-based products and features have become more prevalent since the 2020 release of the Generative Pre-trained Transformer 3 (GPT-3) large language model (LLM). These features are usually designed to support the source-language market. While other language support might be easy to enable, you shouldn't assume that features will work without more extended customization. | ||
|
||
It's essential to ensure that the LLM's outputs align with business goals and user expectations. Consider an LLM generating marketing emails. Without evaluation, these emails might come across as too formal, too casual, or too generic, depending on the target language. By assessing a sample of outputs for each target language, you can optimize their impact and relevance, making sure they effectively meet the business's objectives and resonate with the target audience. | ||
|
||
## Generating output in non-source languages | ||
|
||
There are two general approaches when creating output from LLMs in languages other than the original language: | ||
|
||
1. Translate the prompt into the target language and have the LLM respond in that language. | ||
1. Use the source language prompt, but ask the LLM to return the output in the target language. | ||
|
||
Either approach might be the more appropriate one for your prompt, use case, and source/target language pair. Testing the output of both is therefore critical to giving your customers the best support in their language. | ||
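
The two approaches can be sketched in Python so that both candidate prompts are built side by side for comparison. This is an illustrative sketch under stated assumptions: the function names, the instruction wording, and the sample prompts are hypothetical, and sending the prompts to a model is left out because it depends on the service you use.

```python
def use_translated_prompt(prompts_by_language, language_code):
    """Approach 1: use a prompt that was already translated into the target language."""
    return prompts_by_language[language_code]


def use_source_prompt(source_prompt, target_language_name):
    """Approach 2: keep the source-language prompt and request target-language output."""
    return f"{source_prompt}\n\nRespond only in {target_language_name}."


# Hypothetical prompts; the German text is a translation of the English prompt.
prompts_by_language = {
    "en": "Write a short welcome message for a new customer.",
    "de": "Schreibe eine kurze Willkommensnachricht für einen neuen Kunden.",
}

# Build both candidates so their outputs can be evaluated against each other.
candidates = {
    "translated_prompt": use_translated_prompt(prompts_by_language, "de"),
    "source_prompt": use_source_prompt(prompts_by_language["en"], "German"),
}

for name, prompt in candidates.items():
    print(f"--- {name} ---")
    print(prompt)
```

Keeping both candidates in one dictionary makes it straightforward to run the same evaluation over each and pick the better approach per language pair.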
|
||
## LLM prompt engineering and testing the output | ||
|
||
Large language models (LLMs) can learn new tasks on the fly, without requiring any explicit training or parameter updates. This mode of using LLMs is called in-context learning. It relies on providing the model with a suitable input prompt that contains instructions and/or examples of the desired task. The input prompt serves as a form of conditioning that guides the model's output. | ||
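
As an illustration of in-context learning, the following Python sketch assembles a prompt from an instruction and a few worked examples (often called a few-shot prompt). The function name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the sample strings are assumptions made for this example, not a required format.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Combine an instruction, worked examples, and a new input into one prompt."""
    parts = [instruction, ""]
    for source, target in examples:
        parts.append(f"Input: {source}")
        parts.append(f"Output: {target}")
        parts.append("")
    # The prompt ends with an open "Output:" for the model to complete.
    parts.append(f"Input: {new_input}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical task: translating short UI strings from English to Spanish.
examples = [("Save", "Guardar"), ("Cancel", "Cancelar")]
few_shot_prompt = build_few_shot_prompt(
    "Translate each UI string from English to Spanish.",
    examples,
    "Delete",
)
print(few_shot_prompt)
```

No model parameters change here: the examples inside the prompt are the only conditioning the model receives for the task.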
|
||
The process of designing and tuning natural language prompts for specific tasks, with the goal of improving the performance of LLMs, is called *prompt engineering*. Effective prompt engineering provides instructions and contextual information that help guide the model's output. By carefully designing prompts, you can steer the LLM's attention toward the most relevant information for a given task, leading to more accurate and reliable outputs. | ||
|
||
If you intend to start with a source language prompt and then translate it for each target language, you shouldn't assume that just translating the prompt will be sufficient. Prompt translation is akin to translating marketing materials; in other words, it's a form of [transcreation](../transcreation.md). While the translated prompt might be a good starting point, you'll need to repeat the prompt engineering process for each target language. | ||
|
||
## Considerations for effective LLM prompts | ||
|
||
### Region availability | ||
|
||
Ensure that the AI technology that you're planning to use is available in all the regions and for all the locales that you want to support. Some AI technologies are subject to export restrictions. If these technologies are a critical component of your implementation, you might be limited to specific regions. | ||
|
||
### Handling regional formats | ||
|
||
Even when the same language is used for two or more locales (for example, English for US, UK, and Australia), if your LLM output contains data that need to be formatted according to [regional formats](../../locale/regional-settings.md), you might need to specify the locale of the output in the prompt. Evaluate the prompt response using the names of the country/region and language combination and [IETF BCP 47 locales](../../locale/standard-locale-names.md) to see which produces the most appropriate output for your use case. | ||
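
To compare the two ways of specifying a locale described above, you can generate both prompt variants from the same base prompt. The Python sketch below assumes a hypothetical function `locale_prompt_variants` and illustrative instruction wording. It only produces the prompt text, and evaluating the model's responses remains a separate step.

```python
def locale_prompt_variants(base_prompt, language, region, bcp47_tag):
    """Return two prompt variants: one using names, one using an IETF BCP 47 tag."""
    return {
        "names": (
            f"{base_prompt}\n"
            f"Format dates, numbers, and currency for {language} as used in {region}."
        ),
        "bcp47_tag": (
            f"{base_prompt}\n"
            f"Format dates, numbers, and currency for the locale {bcp47_tag}."
        ),
    }


# Hypothetical base prompt for an Australian English audience.
variants = locale_prompt_variants(
    "Summarize the customer's order, including the delivery date and total price.",
    language="English",
    region="Australia",
    bcp47_tag="en-AU",
)

for label, prompt in variants.items():
    print(f"--- {label} ---")
    print(prompt)
```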
|
||
### Customizing for different cultural contexts | ||
|
||
The appropriate level of formality for a business email to a recipient in the US differs from that for a recipient in Japan, or even the UK. Consider the audience when prompt engineering for multiple languages and locales. | ||
|
||
### Content moderation | ||
|
||
Your source language prompt might include *content moderation*, for example, specifying terms or words that shouldn't be used in the response. Content moderation might also attempt to use the prompt to make responses suitable for a specific age group or to meet national content requirements. You might need to adjust the prompt so that content moderation is still appropriate for the target language or locale. | ||
|
||
### Cross-border data flow and AI laws | ||
|
||
Cross-border data flow is about the transfer of data across national borders, which is pivotal for global software development projects. Compliance with regional data protection laws and ensuring adequate computing resources are critical aspects that require meticulous planning and execution. | ||
|
||
The computing resources that are hosting your LLMs might be in a different geography than other resources that you're using. Ensure that you're still compliant with regulations like the General Data Protection Regulation (GDPR) in Europe that can restrict data movement, aiming to protect privacy and national security. | ||
|
||
The EU Artificial Intelligence Act, which includes stringent requirements for high-risk AI systems, adds another layer of complexity by imposing strict standards on data usage, transparency, and accountability. These regulations can pose challenges for AI development, as they require careful navigation to balance innovation with privacy and security concerns. |