Unicode Support


FamilyTreeFactory uses the Unicode character set internally. This allows characters from various languages to be used, regardless of the installed locale. For example, the original spelling of a Greek birthplace can be included in an entry even though an English locale is installed: Athens - Αθήνα.

Limitations:

Unicode is not supported by Windows 98 or ME. FamilyTreeFactory works normally under these operating systems, but can only use the ANSI character set of the installed locale. Use of the MSLU (Microsoft (1) Layer for Unicode) has not been tested.
Only languages that are written left to right, with lines running top to bottom, can be used.
For a language to be used, the fonts in use must contain the Unicode characters of that language. This is not necessarily the case for older Windows versions or for less common fonts. To check the content of a font, you can use the Windows program Character Map: Start -> Programs -> Accessories -> System Tools -> Character Map.
Unicode is not supported for genealogical symbols.
Avoid file names containing Unicode characters that are not in the code page of the installed locale. FamilyTreeFactory has no problem with such names, but other programs, such as file backup software, may not be able to recognize the characters.
The names of document files that are attached automatically to archive PDF files should contain only characters of the ASCII character set. Details can be found in the section Creating Archive PDF Files.
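
The two file name recommendations above can be checked before a file is saved or attached. The following is a minimal Python sketch and not part of FamilyTreeFactory; the function names and the use of cp1252 (Western European) as the example code page are assumptions made for illustration.

```python
def is_ascii_name(name: str) -> bool:
    """Return True if the file name contains only ASCII characters."""
    return name.isascii()


def fits_code_page(name: str, code_page: str = "cp1252") -> bool:
    """Return True if every character of the name exists in the code page.

    The default cp1252 is only an example; substitute the code page
    of the locale that is actually installed.
    """
    try:
        name.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # Print both checks for a few sample names.
    for name in ["birth_record.pdf", "Müller.pdf", "Αθήνα.pdf"]:
        print(name, is_ascii_name(name), fits_code_page(name))
```

A name that passes `is_ascii_name` is safe for automatic attachment to archive PDF files; a name that only passes `fits_code_page` is safe for other programs running under that locale.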

 

Unicode in PDF files:

Terms used here:

Foreign language refers to a language whose characters are not completely contained in the installed locale.
Foreign character refers to a character not contained in the installed locale.

In PDF files, text is normally saved with a single code page. Only characters from a single code page can be output within a string (usually a single line of text). This means that foreign characters from multiple foreign languages cannot be combined within a string; for example, 'Germany Россия Ελλάδα' cannot be output, as the Cyrillic and Greek foreign characters are not contained in a single code page. The output of 'Germany Россия' or 'Germany Ελλάδα' is possible, as the Cyrillic and Greek code pages also include the Latin characters.
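
The single code page restriction can be illustrated by testing which code pages are able to encode a given string. This is a minimal Python sketch, not the logic of the PDF generator; the function names are hypothetical, and the Windows code pages cp1252 (Western European), cp1251 (Cyrillic) and cp1253 (Greek) are chosen as examples.

```python
def encodable_in(text: str, code_page: str) -> bool:
    """Return True if the whole string can be encoded in one code page."""
    try:
        text.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


def common_code_pages(text: str,
                      candidates=("cp1252", "cp1251", "cp1253")) -> list:
    """List the candidate code pages that contain every character of text."""
    return [cp for cp in candidates if encodable_in(text, cp)]


if __name__ == "__main__":
    for sample in ["Germany Россия", "Germany Ελλάδα", "Germany Россия Ελλάδα"]:
        pages = common_code_pages(sample)
        # An empty list means no single code page covers the string.
        print(sample, "->", pages if pages else "no single code page")
```

A string for which no single code page is found is the case that requires the CID Unicode option.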

FamilyTreeFactory uses an especially powerful PDF generator that can output any combination of foreign characters from multiple foreign languages when the PDF option CID Unicode is used. CID Unicode allows fonts to be embedded independently of a character set. If you have combined foreign characters from more than one foreign language in single lines of your text, activate the PDF option CID Unicode when exporting to PDF. The PDF version of this manual was also created with CID Unicode, so that the example 'Germany Россия Ελλάδα' can be displayed correctly.

 

Alphabetic sorting:

Alphabetic sorting is generally based on the character codes in the character set. In Unicode, the Greek and Cyrillic letters have higher codes than the Latin letters, so Cyrillic and Greek letters are sorted after Latin letters.
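
The effect of sorting by character code can be reproduced in a few lines of Python, whose default string comparison uses Unicode code points. This is only an illustration of that ordering, under the assumption that it matches the general behavior described above; the function name and the sample place names are hypothetical.

```python
def sort_by_code_point(names):
    """Sort strings by Unicode code point (Python's default for str)."""
    return sorted(names)


if __name__ == "__main__":
    places = ["Москва", "Berlin", "Αθήνα", "Zürich", "Athens"]
    for place in sort_by_code_point(places):
        # Show the code point of the first letter next to each name.
        print(f"U+{ord(place[0]):04X}  {place}")
```

All Latin entries come first, including 'Zürich', because the Latin letters have lower code points than the Greek block (from U+0370) and the Cyrillic block (from U+0400).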

 

Details on the various Unicode files:

Except for the image/graphic and PDF files, all files used by FamilyTreeFactory are text files.

FamilyTreeFactory reads Unicode text files (see below for Unicode Gedcom files) with the following Unicode character sets:

UTF-8
UTF-16 Little Endian
UTF-16 Big Endian
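
A common way to tell these three encodings apart is the byte order mark (BOM) at the start of the file. The sketch below shows that technique in Python. It is an assumption that a BOM is present; how FamilyTreeFactory itself recognizes the encoding is not documented here. The function names and the UTF-8 fallback for files without a BOM are choices made for this example.

```python
import codecs

# BOM signatures for the three encodings listed above.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def detect_encoding(data: bytes, default: str = "utf-8"):
    """Return (encoding, bom_length) based on the leading bytes."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # No BOM found: fall back to the default (an assumption of this sketch).
    return default, 0


def decode_text(data: bytes) -> str:
    """Decode file content after stripping a recognized BOM."""
    encoding, skip = detect_encoding(data)
    return data[skip:].decode(encoding)


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_BE + "Αθήνα".encode("utf-16-be")
    print(detect_encoding(sample)[0], decode_text(sample))
```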

 

The program writes Unicode text files (see below for Unicode Gedcom files) with the following Unicode character set:

UTF-16 Little Endian
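
For comparison, a text file in UTF-16 Little Endian can be produced with other tools as follows. This is a minimal Python sketch; the function names are hypothetical, and writing a leading byte order mark is an assumption of this example, not a documented property of the files FamilyTreeFactory writes.

```python
import codecs
from pathlib import Path


def encode_utf16_le(text: str) -> bytes:
    """Encode text as UTF-16 Little Endian with a leading BOM (FF FE)."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def write_utf16_le(path, text: str) -> None:
    """Write the encoded bytes to a file without newline translation."""
    Path(path).write_bytes(encode_utf16_le(text))


if __name__ == "__main__":
    data = encode_utf16_le("Athens - Αθήνα")
    # Show the first bytes: the BOM followed by 'A' as 41 00.
    print(data[:4].hex(" "))
```

The explicit BOM is prepended because Python's `utf-16-le` codec does not write one on its own.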

 

FamilyTreeFactory reads Unicode Gedcom files with the following Unicode character sets:

UTF-8
UNICODE (UTF-16 Little Endian)
UNICODE (UTF-16 Big Endian)

 

FamilyTreeFactory can export Unicode Gedcom files with the following Unicode character sets:

UTF-8
UNICODE (UTF-16 Little Endian)

 

Note: UTF-8 is also Unicode, but is encoded differently from UTF-16. Gedcom files usually distinguish between 'UTF-8' and 'UNICODE', where 'UNICODE' stands for UTF-16 Little Endian or UTF-16 Big Endian.
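
In a Gedcom file, this distinction appears in the header line `1 CHAR`, whose value is 'UTF-8' or 'UNICODE'. The following Python sketch reads that value from already decoded header lines and maps it to a codec name. The function names and the sample header are hypothetical, and the sketch does not describe how FamilyTreeFactory parses Gedcom files.

```python
# Mapping from the Gedcom CHAR value to a Python codec name.
# "utf-16" lets Python take the byte order from the BOM.
CODEC_FOR_CHAR = {"UTF-8": "utf-8", "UNICODE": "utf-16"}


def gedcom_char_value(lines):
    """Return the value of the level-1 CHAR line, or None if absent."""
    for line in lines:
        parts = line.strip().split(maxsplit=2)
        if len(parts) == 3 and parts[0] == "1" and parts[1] == "CHAR":
            return parts[2].strip().upper()
    return None


def python_codec_for(lines):
    """Return the codec name for the declared character set, or None."""
    return CODEC_FOR_CHAR.get(gedcom_char_value(lines))


if __name__ == "__main__":
    header = ["0 HEAD", "1 SOUR EXAMPLE", "1 CHAR UNICODE", "0 TRLR"]
    print(gedcom_char_value(header), "->", python_codec_for(header))
```

Other CHAR values, such as ANSEL or ASCII, are not Unicode encodings and are deliberately left out of the mapping.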



(1) Microsoft Corporation