feat: Add comprehensive encoding utilities (Base64, Hex, URI) with ES5 polyfill support #523
Conversation
Pull request overview
This PR adds comprehensive encoding and decoding utilities for Base64, Hexadecimal, and URI encoding with ES5-compatible polyfills for environments without native btoa/atob support. The implementation follows the library's established patterns for cross-environment compatibility and tree-shaking optimization.
Changes:
- Adds 8 new encoding/decoding functions:
  `encodeAsBase64`, `decodeBase64`, `encodeAsBase64Url`, `decodeBase64Url`, `encodeAsHex`, `decodeHex`, `encodeAsUri`, and `decodeUri`
- Implements Base64 polyfills for ES5 environments without native support, with automatic fallback detection
- Updates README documentation to describe the new encoding capabilities and adds links to all new functions in the utilities table
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| lib/src/helpers/encode.ts | Adds 8 new encoding/decoding functions plus internal Base64 polyfills with lazy-initialized caching for native btoa/atob detection |
| lib/test/src/common/helpers/encode.test.ts | Comprehensive test suite covering all new functions, edge cases, padding scenarios, and polyfill verification against native implementations |
| lib/src/index.ts | Exports the new encoding functions using wrapped format consistent with existing export patterns |
| README.md | Updates "String Manipulation" section to "String Manipulation & Encoding" and adds new functions to the Conversion & Encoding utilities table |
feat: Add comprehensive encoding utilities (Base64, Hex, URI) with ES5 polyfill support

Add complete encoding/decoding functions with cross-environment compatibility:

Core Functions:
- encodeAsBase64/decodeBase64: Standard Base64 encoding with native btoa/atob fallback
- encodeAsBase64Url/decodeBase64Url: URL-safe Base64 (+ → -, / → _, no padding)
- encodeAsHex/decodeHex: Hexadecimal character encoding
- encodeAsUri/decodeUri: URI component encoding with encodeURIComponent fallback

Documentation:
- Updated README with new "String Manipulation & Encoding" section
- Added documentation links to all 8 new functions in utilities table
- Updated lib/src/index.ts exports with wrapped format (140 char limit)
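The URL-safe Base64 mapping described above (`+` → `-`, `/` → `_`, padding stripped) can be sketched as a pair of helpers. These names are illustrative only, not the library's exported API:

```typescript
// Hypothetical sketch of the Base64 <-> Base64Url mapping described in the
// commit message: '+' <-> '-', '/' <-> '_', and '=' padding stripped/restored.
function toBase64Url(b64: string): string {
    return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function fromBase64Url(b64url: string): string {
    let b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
    // Restore '=' padding so the length is a multiple of 4 again
    while (b64.length % 4 !== 0) {
        b64 += "=";
    }
    return b64;
}
```

Because `-` and `_` are safe in URLs and filenames, the round trip only touches the three characters above; the underlying Base64 payload is unchanged.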
Codecov Report ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #523      +/-  ##
==========================================
+ Coverage   98.71%   98.78%   +0.07%
==========================================
  Files         111      111
  Lines        3197     3384     +187
  Branches      673      719      +46
==========================================
+ Hits         3156     3343     +187
  Misses         41       41
```
… redefining it (#524)

The encode test file had a locally defined `strRepeat` helper that duplicated the already-exported `strRepeat` from `src/string/repeat`.

## Changes
- Removed local `strRepeat` function definition from `encode.test.ts`
- Added import of `strRepeat` from `../../../../src/string/repeat`

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nev21 <82737406+nev21@users.noreply.github.com>
```ts
let hex = [];
for (let idx = 0; idx < result.length; idx++) {
    let code = result.charCodeAt(idx);

    hex.push(HEX_CHARS[(code >> 4) & 0xf]);
    hex.push(HEX_CHARS[code & 0xf]);
}
```
`encodeAsHex()` only encodes the low 8 bits of each UTF-16 code unit (two hex chars per code unit). For any character with `charCodeAt()` > 0xFF, this truncates data and `decodeHex(encodeAsHex(x))` will not round-trip. Either document/enforce a Latin-1/byte-string constraint (and fail fast) or change the encoding to preserve full code units / UTF-8 bytes.
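A small repro of the truncation, alongside a UTF-8 based alternative of the kind the comment suggests. Function names here are hypothetical, not part of the library:

```typescript
// Per-code-unit hex encoding (the pattern flagged above): only the low 8 bits
// of each UTF-16 code unit survive, so '€' (U+20AC) collapses to 0xAC.
function hexPerCodeUnit(value: string): string {
    let hex = "";
    for (let idx = 0; idx < value.length; idx++) {
        hex += (value.charCodeAt(idx) & 0xff).toString(16).padStart(2, "0");
    }
    return hex;
}

// UTF-8 based alternative: encode the UTF-8 bytes instead, so code points
// above 0xFF are preserved and the encoding round-trips.
function hexFromUtf8(value: string): string {
    let hex = "";
    for (const byte of new TextEncoder().encode(value)) {
        hex += byte.toString(16).padStart(2, "0");
    }
    return hex;
}
```

For ASCII input the two agree (`"AB"` → `"4142"` either way); they diverge exactly on the characters the comment calls out.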
```ts
if (value || !isStrictNullOrUndefined(value)) {
    let theValue = asString(value);

    for (let idx = 0; idx < theValue.length; idx += 2) {
        result.push(String.fromCharCode(parseInt(strSubstr(theValue, idx, 2), 16)));
    }
```
`decodeHex()` does not validate input length or characters. For odd-length strings or non-hex characters, `parseInt(..., 16)` returns NaN and `String.fromCharCode(NaN)` produces "\u0000", silently corrupting the output. Please add input validation (even-length check + hex char check) and decide on a consistent failure mode (e.g. return `value`/EMPTY or throw).
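A minimal sketch of a validated decoder along the lines suggested, assuming a throw-on-invalid failure mode (the name `decodeHexStrict` is illustrative, not the library's API):

```typescript
// Validated hex decoder sketch: reject odd lengths and non-hex characters
// up front instead of letting parseInt(..., 16) produce NaN -> "\u0000".
function decodeHexStrict(value: string): string {
    if (value.length % 2 !== 0) {
        throw new Error("Hex string must have an even length");
    }
    if (!/^[0-9a-fA-F]*$/.test(value)) {
        throw new Error("Hex string contains non-hex characters");
    }
    let result = "";
    for (let idx = 0; idx < value.length; idx += 2) {
        result += String.fromCharCode(parseInt(value.slice(idx, idx + 2), 16));
    }
    return result;
}
```

Returning `value` or EMPTY instead of throwing would also satisfy the comment; the key point is that the failure is explicit rather than silent corruption.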
```ts
let hasB = lp < len;
let b = hasB ? str.charCodeAt(lp++) : 0;

let hasC = lp < len;
let c = hasC ? str.charCodeAt(lp++) : 0;
```
`_encodeBase64Polyfill()` (and the fallback path in `encodeAsBase64()`) does not preserve non-Latin1 characters: values with `charCodeAt()` > 255 will be truncated during encoding/decoding, so round-trips can silently corrupt data. Either enforce/validate the same Latin-1 constraint as native `btoa` (fail fast) or implement a UTF-8 based Base64 encoding so general JS strings are supported.
Suggested change:

```diff
-let hasB = lp < len;
-let b = hasB ? str.charCodeAt(lp++) : 0;
-let hasC = lp < len;
-let c = hasC ? str.charCodeAt(lp++) : 0;
+// Match native btoa() behavior: fail for characters outside the Latin1 range
+if (a > 255) {
+    throw new Error("Failed to execute 'btoa': The string to be encoded contains characters outside of the Latin1 range.");
+}
+let hasB = lp < len;
+let b = hasB ? str.charCodeAt(lp++) : 0;
+if (hasB && b > 255) {
+    throw new Error("Failed to execute 'btoa': The string to be encoded contains characters outside of the Latin1 range.");
+}
+let hasC = lp < len;
+let c = hasC ? str.charCodeAt(lp++) : 0;
+if (hasC && c > 255) {
+    throw new Error("Failed to execute 'btoa': The string to be encoded contains characters outside of the Latin1 range.");
+}
```
```ts
!_base64Cache && (_base64Cache = {});
if (!_base64Cache["A"]) {
    for (let i = 0; i < BASE64_CHARS.length; i++) {
        _base64Cache[BASE64_CHARS[i]] = i;
    }
}
```
In `_decodeBase64Polyfill`, the cache initialization guard `if (!_base64Cache["A"])` always evaluates truthy after initialization because the cached index for "A" is 0 (falsy). This causes the cache to be rebuilt on every decode call, which is unnecessary work. Use an explicit undefined check (e.g. `if (_base64Cache["A"] === undefined)`) or a separate boolean/sentinel to track initialization.
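A tiny standalone repro of why the falsy guard misfires (illustrative code, not the library's):

```typescript
// "A" maps to index 0 in the Base64 alphabet, so after the table is built,
// !cache["A"] is !0 === true and a falsy guard would rebuild it every call.
const BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
const cache: { [key: string]: number } = {};
for (let i = 0; i < BASE64_CHARS.length; i++) {
    cache[BASE64_CHARS[i]] = i;
}

const rebuildsWithFalsyGuard = !cache["A"];                   // true  -> rebuild triggered
const rebuildsWithUndefinedGuard = cache["A"] === undefined;  // false -> correctly skipped
```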
```ts
let idx = 0;
while (idx < len) {
    let a = _base64Cache[str[idx++]] || 0;
    let b = _base64Cache[str[idx++]] || 0;
    let c = _base64Cache[str[idx++]] || 0;
    let d = _base64Cache[str[idx++]] || 0;
```
`_decodeBase64Polyfill` currently maps any unknown character to 0 via `_base64Cache[str[idx++]] || 0`, which means invalid Base64 input will silently decode to incorrect bytes instead of behaving like native `atob` (throwing) or at least failing predictably. Consider validating characters/padding (and `length % 4`) and returning `value`/EMPTY (or throwing) when the input is not valid Base64.
Suggested change:

```diff
-let idx = 0;
-while (idx < len) {
-    let a = _base64Cache[str[idx++]] || 0;
-    let b = _base64Cache[str[idx++]] || 0;
-    let c = _base64Cache[str[idx++]] || 0;
-    let d = _base64Cache[str[idx++]] || 0;
+// Validate Base64 input: length, padding and character set
+if (len % 4 !== 0) {
+    return EMPTY;
+}
+let paddingIndex = str.indexOf("=");
+if (paddingIndex !== -1) {
+    // All characters after the first '=' must also be '='
+    for (let i = paddingIndex; i < len; i++) {
+        if (str.charAt(i) !== "=") {
+            return EMPTY;
+        }
+    }
+    // At most two padding characters are allowed
+    if (len - paddingIndex > 2) {
+        return EMPTY;
+    }
+}
+// Validate that all non-padding characters are in the Base64 alphabet
+for (let i = 0; i < len && str.charAt(i) !== "="; i++) {
+    let ch = str.charAt(i);
+    if (!_base64Cache.hasOwnProperty(ch)) {
+        return EMPTY;
+    }
+}
+let idx = 0;
+while (idx < len) {
+    let aChar = str.charAt(idx++);
+    let bChar = str.charAt(idx++);
+    let cChar = str.charAt(idx++);
+    let dChar = str.charAt(idx++);
+    let a = aChar === "=" ? 0 : _base64Cache[aChar];
+    let b = bChar === "=" ? 0 : _base64Cache[bChar];
+    let c = cChar === "=" ? 0 : _base64Cache[cChar];
+    let d = dChar === "=" ? 0 : _base64Cache[dChar];
+    if (isUndefined(a) || isUndefined(b) || isUndefined(c) || isUndefined(d)) {
+        return EMPTY;
+    }
```
nevware21-bot left a comment:

Approved by nevware21-bot