Additional features for translation

Zend_Translate supports several additional features, which are described in this section.

Options for adapters

Options can be used with all adapters, although the set of supported options differs between adapters. You can set options when you create the adapter. One option available to all adapters is 'clear', which decides whether new translation data is added to the existing data or replaces it. The default behaviour is to add new translation data to the existing data. When 'clear' is set, the translation data is cleared only for the selected language; other languages remain untouched.

You can set options temporarily by passing them to addTranslation(), or permanently by using the setOptions() method.

Example #1 Using translation options

    // define ':' as separator for the translation source files
    $translate = new Zend_Translate(
        array(
            'adapter'   => 'csv',
            'content'   => '/path/to/mytranslation.csv',
            'locale'    => 'de',
            'delimiter' => ':'
        )
    );

    // ...

    // clear the defined language and use new translation data
    $translate->addTranslation(
        array(
            'content' => '/path/to/new.csv',
            'locale'  => 'fr',
            'clear'   => true
        )
    );

The following table lists all available options for the different adapters, with a description of their usage:

Options for translation adapters

| Option | Adapter | Description | Default value |
|--------|---------|-------------|---------------|
| adapter | Zend_Translate only | Defines the adapter which will be used for the translation. This option can only be given when a new instance of Zend_Translate is created. If it is set afterwards, it is ignored. | Must be set, as it has no default value |
| clear | all | If set to TRUE, the translations already read are cleared. This can be used instead of creating a new instance when reading new translation data. | FALSE |
| cache | all | Sets a cache for the translation adapter. This must be an instance of Zend_Cache_Core. | No cache is set by default |
| content | all | Sets the content for the translation adapter. This can be an array, a filename or a directory. Which type of content is supported depends on the adapter used. | Depends on the adapter used |
| disableNotices | all | If set to TRUE, all notices about unavailable translations are disabled. You should set this option to TRUE in a production environment. | FALSE |
| ignore | all | All directories and files beginning with this prefix are ignored when searching for files. The default '.' means that all hidden files are ignored. Setting this value to 'tmp' would mean that directories and files like 'tmpImages' and 'tmpFiles' are ignored, as well as all directories below them. This option also accepts an array, which can be used to ignore more than one prefix. | '.' |
| log | all | An instance of Zend_Log to which untranslated messages and notices are written. | NULL |
| logMessage | all | The message which is written to the log. | Untranslated message within '%locale%': %message% |
| logPriority | all | The priority which is used to write the message to the log. | 5 |
| logUntranslated | all | When set to TRUE, all message IDs which can not be translated are written to the attached log. | FALSE |
| reload | all | When set to TRUE, files are reloaded into the cache. This option can be used to recreate the cache, or to add translations to cached data after the cache has been created. | FALSE |
| route | all | Allows rerouting from non-existing translations to other languages. See the section "Routing for translations" for details. | NULL |
| scan | all | If set to NULL, the directory structure is not scanned. If set to Zend_Translate::LOCALE_DIRECTORY, the locale is detected from the directory name. If set to Zend_Translate::LOCALE_FILENAME, the locale is detected from the filename. See the section "Automatic source detection" for details. | NULL |
| tag | all | Sets an individual tag which is used for the attached cache. This allows the cache to be used and cleared for single instances. When this option is not set, the attached cache is shared by all instances. | Zend_Translate |
| delimiter | Csv | Defines which character is used as delimiter between source and translation. | ; |
| enclosure | Csv | Defines the enclosure character to be used. | " |
| length | Csv | Defines the maximum length of a CSV line. When set to 0, it is detected automatically. | 0 |
| useId | Xliff and Tmx | If set to FALSE, the source string is used as message ID. If set to TRUE, the ID from the trans-unit element is used as message ID. | TRUE |

You can also use your own self-defined options with all adapters. Use the setOptions() method to define them. setOptions() expects an array with the options you want to set. If a given option already exists, it is overwritten. You can define as many options as needed, because the adapter does not check them. Just make sure not to overwrite any existing option which is used by an adapter.

To read options back, use the getOptions() method. When getOptions() is called without a parameter, it returns all options which are set. When the optional parameter is given, only the specified option is returned.

Handling languages

When working with different languages there are a few methods which will be useful.

The getLocale() method returns the currently set language. The return value is either an instance of Zend_Locale or the identifier of a locale.

The setLocale() method sets a new standard language for translation. This removes the need to pass the optional language parameter to every call of the translate() method. If the given language does not exist, or no translation data is available for it, setLocale() tries to downgrade to the language without the region, if a region was given. A language of en_US would be downgraded to en. If even the downgraded language can not be found, an exception is thrown.

The isAvailable() method checks whether a given language is already available. It returns TRUE if data for the given language exists.

Finally, the getList() method returns all languages currently set for an adapter as an array.

Example #2 Handling languages with adapters

    // returns the currently set language
    $actual = $translate->getLocale();

    // you can use the optional parameter while translating
    echo $translate->_("my_text", "fr");
    // or set a new language
    $translate->setLocale("fr");
    echo $translate->_("my_text");
    // refer to the base language
    // fr_CH will be downgraded to fr
    $translate->setLocale("fr_CH");
    echo $translate->_("my_text");

    // check if this language exists
    if ($translate->isAvailable("fr")) {
        // language exists
    }
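
The downgrade rule that setLocale() applies can be illustrated with a small standalone sketch. It is not the code used by Zend_Translate: resolveLocale() is a hypothetical helper, the list of available locales is passed in by hand, and the exception type is an assumption made for the illustration.

```php
<?php
// Illustrative only: resolveLocale() is a hypothetical helper,
// not part of Zend_Translate.
function resolveLocale($requested, array $available)
{
    // Use the requested locale when translation data exists for it.
    if (in_array($requested, $available, true)) {
        return $requested;
    }

    // Otherwise strip the region: 'fr_CH' becomes 'fr'.
    $parts    = explode('_', $requested);
    $language = $parts[0];
    if ($language !== $requested && in_array($language, $available, true)) {
        return $language;
    }

    // Neither the full locale nor the bare language is available.
    throw new InvalidArgumentException("No translation data for '$requested'");
}

echo resolveLocale('fr_CH', array('fr', 'de')), "\n"; // prints "fr"
```

The sketch checks the full locale first so that region-specific data, when present, always wins over the downgraded language.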

Automatic handling of languages

As long as you only add new translation sources with the addTranslation() method, Zend_Translate automatically selects the best fitting language for your environment, provided that you use one of the automatic locales 'auto' or 'browser'. Normally you will therefore not need to call setLocale(). This should only be used in conjunction with automatic source detection.

The algorithm searches for the best fitting locale depending on the user's browser and your environment. See the following example for details:

Example #3 Automatic language detection

    // Let's expect the browser returns these language settings:
    // HTTP_ACCEPT_LANGUAGE = "de_AT=1;fr=1;en_US=0.8";

    // Example 1:
    // When no fitting language is found, the message ID is returned
    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => 'my_it.mo',
            'locale'  => 'auto',
            'scan'    => Zend_Translate::LOCALE_FILENAME
        )
    );

    // Example 2:
    // Best found fitting language is 'fr'
    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => 'my_fr.mo',
            'locale'  => 'auto',
            'scan'    => Zend_Translate::LOCALE_FILENAME
        )
    );

    // Example 3:
    // Best found fitting language is 'de' ('de_AT' will be degraded)
    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => 'my_de.mo',
            'locale'  => 'auto',
            'scan'    => Zend_Translate::LOCALE_FILENAME
        )
    );

    // Example 4:
    // Returns 'it' as translation source and overrides the automatic settings
    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => 'my_it.mo',
            'locale'  => 'auto',
            'scan'    => Zend_Translate::LOCALE_FILENAME
        )
    );
    $translate->addTranslation(array('content' => 'my_ru.mo', 'locale' => 'ru'));
    $translate->setLocale('it_IT');
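
To make the matching more concrete, here is a rough standalone sketch of how a best fitting locale could be chosen from the browser settings. It is not the algorithm implemented by Zend_Translate or Zend_Locale: parseAcceptLanguage() and bestLocale() are hypothetical helpers, and the header is written in standard HTTP syntax rather than the shorthand used in the comments above.

```php
<?php
// Illustrative only: both functions are hypothetical helpers,
// not part of Zend_Translate or Zend_Locale.

// Turns "de-AT,fr;q=1.0,en-US;q=0.8" into locale identifiers
// ordered by preference.
function parseAcceptLanguage($header)
{
    $entries = array();
    foreach (explode(',', $header) as $position => $entry) {
        $parts  = explode(';', trim($entry));
        $locale = str_replace('-', '_', trim($parts[0]));
        if ($locale === '') {
            continue;
        }
        // A missing q parameter means quality 1.0.
        $quality = 1.0;
        if (isset($parts[1]) && preg_match('/^\s*q=([0-9.]+)\s*$/', $parts[1], $match)) {
            $quality = (float) $match[1];
        }
        $entries[] = array(
            'locale'   => $locale,
            'quality'  => $quality,
            'position' => $position,
        );
    }

    // Highest quality first; ties keep the order sent by the browser.
    usort($entries, function ($a, $b) {
        if ($a['quality'] == $b['quality']) {
            return $a['position'] - $b['position'];
        }
        return ($a['quality'] > $b['quality']) ? -1 : 1;
    });

    $locales = array();
    foreach ($entries as $entry) {
        $locales[] = $entry['locale'];
    }
    return $locales;
}

// Returns the best available locale, or null when nothing fits.
function bestLocale($header, array $available)
{
    foreach (parseAcceptLanguage($header) as $locale) {
        if (in_array($locale, $available, true)) {
            return $locale;
        }
        // Downgrade 'de_AT' to 'de' before giving up on this entry.
        $parts = explode('_', $locale);
        if (in_array($parts[0], $available, true)) {
            return $parts[0];
        }
    }
    return null;
}

var_dump(bestLocale('de-AT,fr;q=1.0,en-US;q=0.8', array('de', 'en')));
```

When the function returns null, a caller would fall back to returning the message ID, which corresponds to the first case in the example above.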

After a language has been set manually with the setLocale() method, the automatic detection is switched off and overridden.

If you want to use it again, set the language to 'auto' with setLocale(), which reactivates the automatic detection for Zend_Translate.

Since Zend Framework 1.7.0, Zend_Translate also recognises an application-wide locale. Simply store a Zend_Locale instance in the registry as shown below. With this notation you no longer need to set the locale manually for each instance when you want to use the same locale multiple times.

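    // in your bootstrap file
    $locale = new Zend_Locale();
    Zend_Registry::set('Zend_Locale', $locale);

    // default language when requested language is not available
    $defaultlanguage = 'en';

    // somewhere in your application
    $translate = new Zend_Translate(
        array('adapter' => 'gettext', 'content' => 'my_de.mo')
    );

    if (!$translate->isAvailable($locale->getLanguage())) {
        // not available languages are rerouted to another language
        $translate->setLocale($defaultlanguage);
    }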

Using a country as language

You can also use a country as the locale parameter. This can be useful when you present your users with flags representing the countries in which they live: a user who selects a flag automatically gets the default language for that country.

For example, when the user selects US, the locale en_US is used. This automatically leads to the language en, which is the default language for the country US.

    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => 'my_de.mo',
            'locale'  => 'US'
        )
    );

Note: Always uppercase countries
When using this syntax you should always uppercase the input when you know that it is a country. The reason is that some language codes consist of the same letters as a country code. Take om as an example. You could expect to get ar_OM when you mean the country "Oman", or you could expect the language "Oromo", which is spoken in Kenya, for example.
As Zend_Translate is related to languages, it will always choose the language in this case. Therefore always uppercase the locale when you want it to be recognised as a country.

Automatic source detection

Zend_Translate can detect translation sources automatically, so you do not have to declare each source file manually. You can let Zend_Translate do this job and scan the complete directory structure for source files.

Note: Automatic source detection is available since Zend Framework version 1.5.

The usage is almost the same as initiating a single translation source, with one difference: you must give a directory to be scanned instead of a file.

Example #4 Scanning a directory structure for sources

    // assuming we have the following structure
    //  /language/
    //  /language/login/login.tmx
    //  /language/logout/logout.tmx
    //  /language/error/loginerror.tmx
    //  /language/error/logouterror.tmx
    $translate = new Zend_Translate(
        array('adapter' => 'tmx', 'content' => '/language')
    );

Zend_Translate searches not only the given directory but also all of its subdirectories for translation source files. This makes the usage quite simple. However, Zend_Translate ignores all files which are not sources or which produce failures while the translation data is read. You therefore have to make sure that all of your translation sources are correct and readable, because no failure is reported when a file is bogus or can not be read.

Note: Depending on how deep your directory structure is and how many files it contains, the scan can take Zend_Translate a long time to complete.

Our example uses the TMX format, which includes the language within the source. Many of the other source formats are not able to include the language within the file. Even these sources can be used with automatic scanning if you meet the prerequisites described below:

Language through naming directories

One way to enable automatic language detection is to name each directory after the language of the sources it contains. This is the easiest way and is used, for example, by standard gettext implementations.

Zend_Translate needs the 'scan' option to know that it should search the names of all directories for languages. See the following example for details:

Example #5 Directory scanning for languages

    // assuming we have the following structure
    //  /language/
    //  /language/de/login/login.mo
    //  /language/de/error/loginerror.mo
    //  /language/en/login/login.mo
    //  /language/en/error/loginerror.mo
    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => '/language',
            'scan'    => Zend_Translate::LOCALE_DIRECTORY
        )
    );

Note: This works only for adapters which do not include the language within the source file. When this option is used with TMX, for example, it is ignored. Language definitions within the filename are also ignored when this option is used.

Note: Be careful when you have several subdirectories under the same structure. Assume a structure like /language/module/de/en/file.mo. In this case the path contains multiple strings which would be detected as a locale: it could be either de or en. The behaviour in such a case is undefined, and it is recommended to use file detection instead.

Language through filenames

Another way to detect the language automatically is to use special filenames. You can name either the complete file or parts of the file after the language used. To use this kind of detection you have to set the 'scan' option at initiation. There are several ways of naming the source files, which are described below:

Example #6 Filename scanning for languages

    // assuming we have the following structure
    //  /language/
    //  /language/login/login_en.mo
    //  /language/login/login_de.mo
    //  /language/error/loginerror_en.mo
    //  /language/error/loginerror_de.mo
    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => '/language',
            'scan'    => Zend_Translate::LOCALE_FILENAME
        )
    );

Complete filename

Having the whole file named after the language is the simplest way but only viable if you have only one file per language.

    /languages/
    /languages/en.mo
    /languages/de.mo
    /languages/es.mo

Extension of the file

Another simple way is to use the extension of the file for language detection. This may be confusing, however, since you can no longer tell which extension the file originally had.

    /languages/
    /languages/view.en
    /languages/view.de
    /languages/view.es

Filename tokens

Zend_Translate is also capable of detecting the language if it is included within the filename. But if you go this way you will have to separate the language with a token. There are three supported tokens which can be used: a dot '.', an underscore '_', or a hyphen '-'.

    /languages/
    /languages/view_en.mo -> detects english
    /languages/view_de.mo -> detects german
    /languages/view_it.mo -> detects italian

The first found string delimited by a token which can be interpreted as a locale will be used. See the following example for details.

    /languages/
    /languages/view_en_de.mo -> detects english
    /languages/view_en_es.mo -> detects english and overwrites the first file
    /languages/view_it_it.mo -> detects italian

All three tokens are used to detect the locale. When the filename contains more than one kind of token, the result depends on the order in which the tokens are evaluated: as the following example shows, '.' is used before '_', and '_' is used before '-'.

    /languages/
    /languages/view_en-it.mo -> detects english because '_' will be used before '-'
    /languages/view-en_it.mo -> detects italian because '_' will be used before '-'
    /languages/view_en.it.mo -> detects italian because '.' will be used before '_'
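
The token precedence shown in these listings can be reproduced with a short standalone sketch. It only mirrors the documented behaviour and is not taken from Zend_Translate: detectLocale() is a hypothetical helper, and the list of known locales is an assumption supplied by the caller (the real component validates locales itself).

```php
<?php
// Illustrative only: detectLocale() is a hypothetical helper,
// not part of Zend_Translate.
function detectLocale($filename, array $knownLocales)
{
    $pieces = array($filename);

    // Split on '.', then '_', then '-', checking after each step so that
    // a locale delimited by an earlier token is found first.
    foreach (array('.', '_', '-') as $token) {
        $next = array();
        foreach ($pieces as $piece) {
            $next = array_merge($next, explode($token, $piece));
        }
        $pieces = $next;

        foreach ($pieces as $piece) {
            if (in_array($piece, $knownLocales, true)) {
                return $piece;
            }
        }
    }
    return null;
}

$known = array('en', 'de', 'it', 'es');
echo detectLocale('view_en-it.mo', $known), "\n"; // en
echo detectLocale('view-en_it.mo', $known), "\n"; // it
echo detectLocale('view_en.it.mo', $known), "\n"; // it
```

Because the pieces are checked from left to right after each split, the sketch also returns the first matching locale for a name such as view_en_de.mo.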

Ignoring special files and directories

Sometimes it is useful to exclude files or even directories from being added automatically. For this purpose you can use the ignore option, which can be used in three ways.

Ignore a special directory or file

By default Zend_Translate is set to ignore all files and directories beginning with '/.'. This means that all SVN files are ignored.

You can set your own prefix by giving a string for the ignore option. The directory separator is attached automatically and has to be omitted.

    $translate = new Zend_Translate(
        array(
            'adapter' => $adapter,
            'content' => $content,
            'locale'  => $locale,
            'ignore'  => 'test'
        )
    );

The above example ignores all files and directories beginning with test, for example /test/en.mo, /testing/en.mo and /dir/test_en.mo. It still adds /mytest/en.mo and /dir/atest.mo.

Note: Prevent SVN files from being searched
When you set this option, the default '/.' is erased. This means that Zend_Translate will then add all files from the hidden SVN directories. When you are working with SVN, you should use the array syntax described in the next section.

Ignore several directories or files

You can also ignore several files and directories. Instead of a string, simply give an array with all prefixes which should be ignored.

    $translate = new Zend_Translate(
        array(
            'adapter' => $adapter,
            'content' => $content,
            'locale'  => $locale,
            'ignore'  => array('.', 'test', 'old')
        )
    );

In the above case all three prefixes are ignored. A file or directory name still has to begin with one of them to be detected and ignored.

Ignore specific names

To ignore files and directories which do not begin with a defined prefix but contain a certain pattern anywhere within their name, you can use a regular expression.

To use a regular expression, the array key of the ignore option has to begin with regex.

    $translate = new Zend_Translate(
        array(
            'adapter' => $adapter,
            'content' => $content,
            'locale'  => $locale,
            'ignore'  => array('regex' => '/test/u', 'regex_2' => '/deleted$/u')
        )
    );

In the above case we defined two regular expressions. Files and directories are always checked against all given regular expressions. In our example this means that any file which contains test anywhere in its name is ignored. Additionally, all files and directories which end with deleted are not added as translations.
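
The three forms of the ignore option can be summarised in one standalone sketch. It only models the behaviour described in this section and is not the code used by Zend_Translate: shouldIgnore() is a hypothetical helper, and paths are assumed to use '/' as the directory separator.

```php
<?php
// Illustrative only: shouldIgnore() is a hypothetical helper,
// not part of Zend_Translate.
// Paths are assumed to use '/' as directory separator.
function shouldIgnore($path, $ignore)
{
    // A single string is treated like an array with one prefix.
    foreach ((array) $ignore as $key => $rule) {
        if (is_string($key) && strpos($key, 'regex') === 0) {
            // Keys beginning with 'regex' hold regular expressions.
            if (preg_match($rule, $path) === 1) {
                return true;
            }
        } elseif (strpos($path, '/' . $rule) !== false) {
            // Any file or directory name beginning with the prefix matches.
            return true;
        }
    }
    return false;
}

var_dump(shouldIgnore('/testing/en.mo', 'test')); // bool(true)
var_dump(shouldIgnore('/mytest/en.mo', 'test'));  // bool(false)
var_dump(shouldIgnore('/old/deleted', array('regex' => '/deleted$/u'))); // bool(true)
```

Prepending the separator to each prefix is what keeps /mytest/en.mo from matching the prefix test while /dir/test_en.mo still does.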

Routing for translations

Not every message ID can be translated. Sometimes it can be useful to output the translation from another language instead of returning the message ID itself. You can achieve this by using the route option.

You can add one route for every language. See the following example:

    $translate = new Zend_Translate(
        array(
            'adapter' => $adapter,
            'content' => $content,
            'locale'  => $locale,
            'route'   => array('fr' => 'en', 'de' => 'fr')
        )
    );

The above returns an English translation for all messages which can not be translated to French, and a French translation for all messages which can not be translated to German. It even returns an English translation for all messages which can be translated neither to German nor to French. In this way you can define a complete translation chain.

This feature may look attractive to everyone. Be aware, however, that returning translations in another language can be problematic when the user does not understand that language. You should therefore use this feature sparingly.
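
The chaining behaviour can be pictured with a short standalone sketch. It is not the implementation used by Zend_Translate: translateWithRoute() is a hypothetical helper, the translation data is made up for the illustration, and the guard against circular routes is an addition of the sketch.

```php
<?php
// Illustrative only: translateWithRoute() is a hypothetical helper,
// not part of Zend_Translate.
function translateWithRoute($messageId, $locale, array $translations, array $routes)
{
    $visited = array();
    while ($locale !== null && !isset($visited[$locale])) {
        $visited[$locale] = true; // guards against circular routes

        if (isset($translations[$locale][$messageId])) {
            return $translations[$locale][$messageId];
        }
        // No translation here: follow the route for this locale, if any.
        $locale = isset($routes[$locale]) ? $routes[$locale] : null;
    }
    // Nothing found along the chain: fall back to the message ID itself.
    return $messageId;
}

// Made-up sample data for the illustration.
$translations = array(
    'en' => array('hello' => 'Hello', 'bye' => 'Goodbye'),
    'fr' => array('hello' => 'Bonjour'),
    'de' => array(),
);
$routes = array('fr' => 'en', 'de' => 'fr');

echo translateWithRoute('hello', 'de', $translations, $routes), "\n"; // Bonjour
echo translateWithRoute('bye', 'de', $translations, $routes), "\n";   // Goodbye
echo translateWithRoute('other', 'de', $translations, $routes), "\n"; // other
```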

Combining multiple translation sources

When you are working with multiple translations, you may come into a situation where you want to use different source types, for example the resource files provided by the framework together with your own translations loaded through the gettext adapter.

By combining multiple translation adapters you can use them within one instance. See the following example:

    $translate = new Zend_Translate(array(
        'adapter' => 'gettext',
        'content' => '\path\to\translation.mo',
        'locale'  => 'en'
    ));
    $translate_second = new Zend_Translate(array(
        'adapter' => 'array',
        'content' => '\resources\languages\en\Zend_Validate.php',
        'locale'  => 'en'
    ));
    $translate->addTranslation(array('content' => $translate_second));

Now the first instance holds all translations from the second instance and you can use it within the application even if you used different source types.

Note: Memory savings
As you may have noticed, the second instance is no longer used once it has been added to the first instance. To save some memory you may want to unset it.

When you are scanning directories you may additionally want to use only one defined language. The predefined resources, for example, are available in more than 10 languages, but your application may not be available in all of those languages. You can therefore also add only one language from the second adapter.

    // $translate_second holds the second translation instance
    $translate->addTranslation(
        array(
            'content' => $translate_second,
            'locale'  => 'en'
        )
    );

This still allows you to scan through the directories while adding only those languages which are relevant for your application.

Checking for translations

Normally text is translated without any further checks. Sometimes, however, it is necessary to know whether a text is translated or not. The isTranslated() method can be used for this purpose.

isTranslated($messageId, $original = false, $locale = null) takes the text you want to check as its first parameter and, as optional third parameter, the locale for which you want to do the check. The optional second parameter declares whether the check is fixed to the declared language or whether a lower set of translations may be used. If you have a text which can be returned for 'en' but not for 'en_US', you will normally get the translation returned, but when $original is set to TRUE, isTranslated() returns FALSE.

Example #7 Checking if a text is translatable

    $english = array(
        'message1' => 'Nachricht 1',
        'message2' => 'Nachricht 2',
        'message3' => 'Nachricht 3');

    $translate = new Zend_Translate(
        array(
            'adapter' => 'array',
            'content' => $english,
            'locale'  => 'de_AT'
        )
    );

    if ($translate->isTranslated('message1')) {
        print "'message1' can be translated";
    }
    if (!($translate->isTranslated('message1', true, 'de'))) {
        print "'message1' can not be translated to 'de'"
            . " as it's available only in 'de_AT'";
    }
    if ($translate->isTranslated('message1', false, 'de')) {
        print "'message1' can be translated in 'de_AT' as it falls back to 'de'";
    }

How to log not found translations

When you have a bigger site, or when you create the translation files manually, you will often find that some messages are not translated. Zend_Translate offers an easy solution for this problem.

You only have to follow a few simple steps. First, create an instance of Zend_Log. Then attach this instance to Zend_Translate. See the following example:

Example #8 Log translations

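    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => $path,
            'locale'  => 'de'
        )
    );

    // Create a log instance
    $writer = new Zend_Log_Writer_Stream('/path/to/file.log');
    $log    = new Zend_Log($writer);

    // Attach it to the translation instance
    $translate->setOptions(array('log' => $log, 'logUntranslated' => true));

    $translate->translate('unknown string');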

Now you will have a new notice in the log: Untranslated message within 'de': unknown string.

Note: Any translation which can not be found is logged. This includes all translations requested by a user whose language is not supported, as well as every request for a message which can not be translated. Be aware that 100 people requesting the same translation will result in 100 logged notices.

This feature can be used not only to log messages but also to write the untranslated messages into an empty translation file. To do so, you have to write your own log writer which writes the format you want and strips the leading "Untranslated message".

You can also set the 'logMessage' option when you want to have your own log message. Use the '%message%' token to place the message ID within your log message, and the '%locale%' token for the requested locale. See the following example for a self-defined log message:

Example #9 Self defined log messages

    $translate = new Zend_Translate(
        array(
            'adapter' => 'gettext',
            'content' => $path,
            'locale'  => 'de'
        )
    );

    // Create a log instance
    $writer = new Zend_Log_Writer_Stream('/path/to/file.log');
    $log    = new Zend_Log($writer);

    // Attach it to the translation instance
    $translate->setOptions(
        array(
            'log'             => $log,
            'logMessage'      => "Missing '%message%' within locale '%locale%'",
            'logUntranslated' => true
        )
    );

    $translate->translate('unknown string');
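
How the two tokens end up in the final log line can be shown with a minimal standalone sketch. It is not the code used by Zend_Translate: formatLogMessage() is a hypothetical helper that only demonstrates the substitution.

```php
<?php
// Illustrative only: formatLogMessage() is a hypothetical helper,
// not part of Zend_Translate.
function formatLogMessage($template, $messageId, $locale)
{
    // Replace both placeholders in a single pass.
    return str_replace(
        array('%message%', '%locale%'),
        array($messageId, $locale),
        $template
    );
}

echo formatLogMessage(
    "Missing '%message%' within locale '%locale%'",
    'unknown string',
    'de'
), "\n";
// prints: Missing 'unknown string' within locale 'de'
```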

You can also change the priority which is used to write the message to the log. By default the priority Zend_Log::NOTICE is used, which equals 5. When you want to change the priority, you can use any of Zend_Log's priorities. See the following example:

Example #10 Self defined log priority

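    // Create a log instance
    $writer = new Zend_Log_Writer_Stream('/path/to/file.log');
    $log    = new Zend_Log($writer);

    $translate = new Zend_Translate(
        array(
            'adapter'         => 'gettext',
            'content'         => $path,
            'locale'          => 'de',
            'log'             => $log,
            'logMessage'      => "Missing '%message%' within locale '%locale%'",
            // any Zend_Log priority can be given; ALERT is used here as an illustration
            'logPriority'     => Zend_Log::ALERT,
            'logUntranslated' => true
        )
    );

    $translate->translate('unknown string');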

Accessing source data

Sometimes it is useful to have access to the translation source data. The following methods are provided for this purpose.

The getMessageIds($locale = null) method returns all known message IDs as an array.

When you want to know the message ID for a given translation, you can use the getMessageId() method.

The getMessages($locale = null) method returns the complete translation source as an array. The message ID is used as key and the translation data as value.

Both getMessageIds() and getMessages() accept an optional parameter $locale which, if set, returns the translation data for the specified language. If this parameter is not given, the currently set language is used. Keep in mind that normally all translations should be available in all languages, so in a normal situation you will not have to set this parameter.

Additionally, the getMessages() method can be used to return the complete translation dictionary by using the pseudo-locale 'all'. This returns all available translation data for each added locale.

Note: The returned array can be very big, depending on the number of added locales and the amount of translation data.

Example #11 Accessing source data

    // returns all known message IDs
    $messageIds = $translate->getMessageIds();
    // or just for the specified language
    $messageIds = $translate->getMessageIds('en_US');
    // returns the complete translation data
    $source = $translate->getMessages();
