Module tongue.transliteration

Tongue language packs are always stored internally as UTF-8, but users may need their messages in a different encoding.

Since users may specify the desired character encoding for their messages in many different ways, Tongue provides a mechanism for deriving the target character encoding and then transliterating to and from that encoding.

get (encoding) Retrieve a tongue encoding converter.
guess (env) Retrieve a tongue encoding converter based on the environment.

Class converter

converter:touser (input) Convert a string to the user character set.
converter:fromuser (input) Convert a string from the user character set.


get (encoding)
Retrieve a tongue encoding converter.

Construct and return an encoder that can convert between the provided encoding and UTF-8 in either direction. The encoder is configured to transliterate where possible and to replace bad or unknown codepoints, so that its output is always valid.

If the desired encoding is UTF-8, the returned encoder is effectively a passthrough, except that invalid or malformed codepoints are “cleaned up” by the encoder object.

Parameters:

  • encoding string The desired encoding to be used

Returns:

    encoder The bidirectional character encoder
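
To illustrate what “cleaning up” a UTF-8 passthrough might involve, here is a minimal pure-Lua sketch. It is not tongue's implementation (the converter is documented as wrapping iconv descriptors); the names clean_utf8 and sequence_length and the choice of "?" as the substitute character are assumptions made for this example only.

```lua
-- Illustrative only: tongue itself delegates this work to iconv.
-- REPLACEMENT is an assumed substitute character, not taken from tongue.
local REPLACEMENT = "?"

-- Return the length (1-4) of the well-formed UTF-8 sequence starting at
-- byte index i of s, or nil if the bytes there are malformed.
local function sequence_length(s, i)
  local b1 = s:byte(i)
  if b1 < 0x80 then return 1 end
  local len, min, mask
  if b1 >= 0xC2 and b1 <= 0xDF then
    len, min, mask = 2, 0x80, 32
  elseif b1 >= 0xE0 and b1 <= 0xEF then
    len, min, mask = 3, 0x800, 16
  elseif b1 >= 0xF0 and b1 <= 0xF4 then
    len, min, mask = 4, 0x10000, 8
  else
    return nil -- stray continuation byte or invalid lead byte
  end
  local cp = b1 % mask -- payload bits of the lead byte
  for k = 1, len - 1 do
    local b = s:byte(i + k)
    if not b or b < 0x80 or b > 0xBF then return nil end
    cp = cp * 64 + (b % 64)
  end
  -- Reject overlong forms, UTF-16 surrogates and values past U+10FFFF.
  if cp < min or cp > 0x10FFFF or (cp >= 0xD800 and cp <= 0xDFFF) then
    return nil
  end
  return len
end

-- Copy well-formed sequences through unchanged and substitute
-- REPLACEMENT for every byte that does not start a valid sequence.
local function clean_utf8(s)
  local out, i = {}, 1
  while i <= #s do
    local len = sequence_length(s, i)
    if len then
      out[#out + 1] = s:sub(i, i + len - 1)
      i = i + len
    else
      out[#out + 1] = REPLACEMENT
      i = i + 1
    end
  end
  return table.concat(out)
end
```

Valid input passes through byte for byte, so only damaged strings are altered. The real encoder may substitute differently, depending on how iconv is configured.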

guess (env)
Retrieve a tongue encoding converter based on the environment.

This function first attempts to determine the encoding desired by the “client” by examining the provided environment table (or the process environment if none was given). Once an encoding has been determined, tongue returns an encoder by calling through to the get function.

If no encoding can be determined from the provided table, tongue will assume that UTF-8 is appropriate.

Parameters:

  • env optional table The environment to use (or nil to use the process env)

Returns:

    encoder The bidirectional character encoder
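
The lookup described for guess could be sketched as follows. This page does not say which environment variables tongue inspects, so the list (LC_ALL, LC_MESSAGES, LANG) and its order are assumptions based on the usual POSIX locale conventions, and guess_encoding is a hypothetical helper rather than part of the module.

```lua
-- Assumed lookup order, following common POSIX precedence; the variables
-- tongue actually inspects are not listed on this page.
local LOCALE_VARS = { "LC_ALL", "LC_MESSAGES", "LANG" }

-- Return the name of the encoding requested by env, a table of
-- environment variables. With no table, the process environment is used.
local function guess_encoding(env)
  local function lookup(name)
    if env then return env[name] end
    return os.getenv(name)
  end
  for _, name in ipairs(LOCALE_VARS) do
    local value = lookup(name)
    if value and value ~= "" then
      -- Locale names look like language[_territory][.codeset][@modifier].
      local codeset = value:match("%.([^@]+)")
      if codeset then return codeset end
    end
  end
  return "UTF-8" -- documented fallback when nothing can be determined
end
```

For example, guess_encoding({ LANG = "en_GB.ISO-8859-1" }) yields "ISO-8859-1", while an empty table yields "UTF-8". As a simplification, the sketch moves on to the next variable when one is set but carries no codeset.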

Class converter

Tongue character-set converter

Tongue deals internally in UTF-8 but may have to handle input and output in any character set a user may choose. The converter object wraps a pair of iconv descriptors which manage that conversion.

converter:touser (input)
Convert a string to the user character set.

Parameters:

  • input string The input (UTF-8) string

Returns:

    string The output (user charset) string

converter:fromuser (input)
Convert a string from the user character set.

Parameters:

  • input string The input (user charset) string

Returns:

    string The output (UTF-8) string
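
Putting the pieces together, a round trip through a converter might look like the sketch below. It uses only the functions documented on this page (get, touser, fromuser). The roundtrip helper is hypothetical, and the pcall guard is an addition so that the snippet still loads on a system where tongue is not installed.

```lua
-- Load the module defensively: require raises an error if tongue is
-- missing, so pcall turns that into a plain false/true result.
local ok, translit = pcall(require, "tongue.transliteration")

-- Hypothetical helper: convert UTF-8 text to the named user encoding
-- and back again, returning the resulting UTF-8 string.
local function roundtrip(encoding, text)
  if not ok then
    return nil, "tongue.transliteration is not available"
  end
  local conv = translit.get(encoding)       -- bidirectional encoder
  local user_bytes = conv:touser(text)      -- UTF-8 -> user charset
  return conv:fromuser(user_bytes)          -- user charset -> UTF-8
end

local result, err = roundtrip("ISO-8859-1", "abc")
print(result or err)
```

Because the encoder transliterates and replaces characters that the target encoding cannot represent, a round trip is only expected to return the original text when every character exists in the user charset.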
generated by LDoc 1.4.6 Last updated 2022-01-28 11:46:49