feat: add configurable connection charset (lc_ctype) #164
+21
−8
Problem
Legacy Firebird databases commonly use charset `NONE` on text columns. In these databases, text is stored as raw bytes in the application's encoding (typically WIN1252) without any charset metadata on the columns.

The driver currently hardcodes `utf8` as the connection charset (`lc_ctype` in the DPB). When connecting to a database with charset `NONE` columns, Firebird does not transliterate the data: it sends the raw bytes as-is. The driver then incorrectly decodes these WIN1252 bytes as UTF-8, corrupting accented characters:

- `Tournée` → `Tourn�e`
- `Café` → `Caf�`

This affects a large number of production Firebird databases where charset `NONE` was the default.

Solution
Add a `charset` option to `ConnectOptions` that sets `lc_ctype` to the specified charset instead of the hardcoded `utf8`.

Usage
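A minimal sketch of passing the new option, assuming the option shape described above. `ConnectOptionsSketch` is a hypothetical, trimmed-down stand-in for the driver's `ConnectOptions`, the credentials are placeholders, and the `connect()` call is left as a comment because it needs a live Firebird server. The second half only illustrates how the same raw bytes decode under `utf8` and `latin1` in Node.js.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical stand-in for the driver's ConnectOptions (assumption, not the real type).
interface ConnectOptionsSketch {
  username?: string;
  password?: string;
  charset?: string; // Firebird charset name, sent as lc_ctype
}

// Placeholder credentials; only `charset` is the subject of this PR.
const options: ConnectOptionsSketch = {
  username: 'sysdba',
  password: 'masterkey',
  charset: 'WIN1252',
};

// With the real driver, this object would be passed to connect(), for example:
//   const attachment = await client.connect(uri, options);

// Raw bytes as a charset NONE column would deliver them: "Café" in WIN1252.
const rawBytes = Buffer.from([0x43, 0x61, 0x66, 0xe9]);

// Previous behaviour: bytes decoded as UTF-8; 0xE9 alone is invalid and becomes U+FFFD.
const asUtf8 = rawBytes.toString('utf8');

// With a single-byte charset: bytes decoded as latin1, one code point per byte.
const asLatin1 = rawBytes.toString('latin1');

console.log(options.charset, JSON.stringify(asUtf8), JSON.stringify(asLatin1));
```

The `latin1` result is `Café`, while the `utf8` result ends in the replacement character, which matches the corruption described under Problem.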
How it works
- `mapCharsetToEncoding()` maps Firebird charset names to Node.js `BufferEncoding` values (`utf8` for UTF8, `latin1` for all single-byte charsets).
- The `latin1` encoding in Node.js provides a 1:1 byte-to-codepoint mapping, so the bytes of any single-byte Firebird charset (WIN1252, ISO8859_1, WIN1250, etc.) pass through the driver without loss.
- The encoding is stored on `AbstractAttachment` and passed through to `createDataReader()` and `createDataWriter()` via `StatementImpl.prepare()`.

Changes
node-firebird-driver:

- `ConnectOptions`: add optional `charset` property
- `createDpb()`: use `options.charset` instead of hardcoded `'utf8'`
- `mapCharsetToEncoding()`: new helper to map Firebird charset → Node.js encoding
- `AbstractAttachment`: add `encoding` property
- `createDataReader()` / `createDataWriter()`: accept `encoding` parameter

node-firebird-driver-native:

- `AttachmentImpl.connect()`: set `encoding` from `mapCharsetToEncoding(options.charset)`
- `StatementImpl.prepare()`: pass `attachment.encoding` to reader/writer

Backward compatible
When `charset` is not specified, the behavior is identical to before (defaults to `utf8`).
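The charset-to-encoding mapping from How it works, together with this default, could look roughly like the sketch below. `mapCharsetToEncodingSketch` is a hypothetical reimplementation written for illustration, not the helper from the PR; the case-insensitive comparison and the treatment of every non-UTF8 name as `latin1` are assumptions.

```typescript
import { Buffer } from 'node:buffer';

// Only the two encodings named in the PR description.
type SketchEncoding = 'utf8' | 'latin1';

// Hypothetical sketch of the mapping; the real helper may differ.
function mapCharsetToEncodingSketch(charset?: string): SketchEncoding {
  // No charset given: keep the previous behaviour (utf8).
  if (charset === undefined) {
    return 'utf8';
  }
  // Assumption: names are compared case-insensitively.
  return charset.toUpperCase() === 'UTF8' ? 'utf8' : 'latin1';
}

const defaultEncoding = mapCharsetToEncodingSketch();
const win1252Encoding = mapCharsetToEncodingSketch('WIN1252');

// Writer side: encode with the mapped encoding, one byte per character.
const written = Buffer.from('Tournée', win1252Encoding);

// Reader side: decoding with the same encoding restores the original text.
const readBack = written.toString(win1252Encoding);

console.log(defaultEncoding, win1252Encoding, written.length, readBack);
```

Because the reader and writer use the same encoding, the accented character occupies a single byte (`0xE9`) on the wire and comes back unchanged.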