4 changes: 2 additions & 2 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ jobs:

strategy:
matrix:
node-version: [14.x, 16.x, 18.x]
node-version: [22.x, 24.x]

steps:
- uses: actions/checkout@v3
@@ -26,7 +26,7 @@ jobs:
run: npm install

- name: Check formatting
run: node_modules/.bin/prettier --check $(find src -type f)
run: npm run format:check

- name: Check lint
run: npm run lint
7 changes: 7 additions & 0 deletions .prettierignore
@@ -0,0 +1,7 @@
node_modules
coverage
dist
dist.browser
tmp
npm-debug.log*

2 changes: 1 addition & 1 deletion .travis.yml
@@ -1,4 +1,4 @@
language: node_js

node_js:
- "stable"
- 'stable'
10 changes: 3 additions & 7 deletions .vscode/launch.json
@@ -11,24 +11,20 @@
"internalConsoleOptions": "neverOpen",
"disableOptimisticBPs": true,
"windows": {
"program": "${workspaceFolder}/node_modules/jest/bin/jest",
"program": "${workspaceFolder}/node_modules/jest/bin/jest"
}
},
{
"type": "node",
"request": "launch",
"name": "Jest Current File",
"program": "${workspaceFolder}/node_modules/.bin/jest",
"args": [
"${fileBasenameNoExtension}",
"--config",
"jest.config.js"
],
"args": ["${fileBasenameNoExtension}", "--config", "jest.config.js"],
"console": "integratedTerminal",
"internalConsoleOptions": "neverOpen",
"disableOptimisticBPs": true,
"windows": {
"program": "${workspaceFolder}/node_modules/jest/bin/jest",
"program": "${workspaceFolder}/node_modules/jest/bin/jest"
}
}
]
40 changes: 20 additions & 20 deletions .vscode/settings.json
@@ -1,21 +1,21 @@
{
"cSpell.words": [
"Aditi",
"Bixby",
"Celine",
"Conchita",
"Giorgio",
"Mathieu",
"Mizuki",
"Raveena",
"Salli",
"Takumi",
"dedent",
"implicity",
"speechmarkdown",
"ssml",
"transpiled",
"tsify",
"uglifyjs"
]
}
"cSpell.words": [
"Aditi",
"Bixby",
"Celine",
"Conchita",
"Giorgio",
"Mathieu",
"Mizuki",
"Raveena",
"Salli",
"Takumi",
"dedent",
"implicity",
"speechmarkdown",
"ssml",
"transpiled",
"tsify",
"uglifyjs"
]
}
61 changes: 41 additions & 20 deletions CHANGELOG.md
@@ -1,67 +1,88 @@
# Change Log

All notable changes to the speechmarkdown-js project will be documented in this file.

## 2.1.0 - (December 22, 2022)

### Added

- Support for audio captions

## 2.0.0 - (October 28, 2021)

### Added

- Support for `voice` and `language` for `google-assistant`
- Formatters for `amazon-polly`, `amazon-polly-neural`, and `microsoft-azure`

## 0.8.0-beta.0 - (July 7, 2019)

### Added

- Support for sections with the `voice` and `lang` tags

## 0.7.0-alpha.0 - (July 6, 2019)

### Added

- Support for `audio` tag

## 0.6.0-alpha.0 - (July 6, 2019)

### Added

- Support for `voice` and `lang` tags

## 0.5.0-alpha.0 - (July 5, 2019)

### Fixed

- Issue #7 - Grammar - multiple modifiers for the same text

### Added

- Grammar and formatters for standard:
- volume / vol
- rate
- pitch
- sub
- ipa
- volume / vol
- rate
- pitch
- sub
- ipa

## 0.4.0-alpha.0 - (June 30, 2019)

### Added

- Update grammar and formatters for standard:
- emphasis
- address
- characters / chars
- date (skipped tests)
- expletive / bleep
- fraction (skipped tests)
- interjection
- number
- ordinal
- phone / telephone (skipped tests)
- time
- unit
- whisper

- emphasis
- address
- characters / chars
- date (skipped tests)
- expletive / bleep
- fraction (skipped tests)
- interjection
- number
- ordinal
- phone / telephone (skipped tests)
- time
- unit
- whisper

- Add tests to increase coverage

## 0.3.0-alpha.0 - (June 30, 2019)

### Added

- Update grammar and formatters for emphasis short format
- Change speechmarkdown.toString(markdown) to speechmarkdown.toText(markdown)


## 0.2.0-alpha.0 - (June 29, 2019)

### Added

- CHANGELOG.md

### Update
- Links in package.json

- Links in package.json
2 changes: 1 addition & 1 deletion CODE-OF-CONDUCT.md
@@ -20,4 +20,4 @@ This code of conduct provides guidance on participation in Speech Markdown-manag
- Other conduct which could reasonably be considered inappropriate in a professional setting;
- Advocating for or encouraging any of the above behaviors.

**Enforcement and Reporting Code of Conduct Issues.** Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting opensource-codeofconduct@speechmarkdown.org. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances.
**Enforcement and Reporting Code of Conduct Issues.** Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting opensource-codeofconduct@speechmarkdown.org. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances.
11 changes: 5 additions & 6 deletions CONTRIBUTING.md
@@ -10,11 +10,11 @@ When you submit a pull request, our team is notified and will respond as quickly

We look forward to receiving your pull requests for:

* New content you'd like to contribute (such as new code samples or tutorials)
* Inaccuracies in the content
* Information gaps in the content that need more detail to be complete
* Typos or grammatical errors
* Suggested rewrites that improve clarity and reduce confusion
- New content you'd like to contribute (such as new code samples or tutorials)
- Inaccuracies in the content
- Information gaps in the content that need more detail to be complete
- Typos or grammatical errors
- Suggested rewrites that improve clarity and reduce confusion

**Note:** We all write differently, and you might not like how we've written or organized something currently. We want that feedback. But please be sure that your request for a rewrite is supported by the previous criteria. If it isn't, we might decline to merge it.

@@ -45,7 +45,6 @@ In addition to written content, we really appreciate new examples and code sampl

This project has adopted the [Speech Markdown Open Source Code of Conduct](https://github.com/speechmarkdown/speechmarkdown-js/blob/master/CODE-OF-CONDUCT). Contact [opensource-codeofconduct@speechmarkdown.org](mailto:opensource-codeofconduct@speechmarkdown.org) with any additional questions or comments.


## Licensing

See the [LICENSE](https://github.com/speechmarkdown/speechmarkdown-js/blob/master/LICENSE) file for this project's licensing. We will ask you to confirm the licensing of your contribution. We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
22 changes: 20 additions & 2 deletions README.md
@@ -11,12 +11,19 @@ Supported platforms:
- amazon-alexa
- amazon-polly
- amazon-polly-neural
- apple-avspeechsynthesizer
- google-assistant
- ibm-watson
- microsoft-azure
- microsoft-sapi
- w3c
- samsung-bixby
- elevenlabs

Find the architecture [here](./docs/architecture.md)

Platform-specific SSML notes are tracked in [`docs/platforms`](./docs/platforms/README.md). Use `npm run docs:update-voices` to refresh the auto-generated voice maps in `src/formatters/data` when vendor credentials are available.

## Quick start

### SSML - Amazon Alexa
@@ -126,9 +133,14 @@ Available options are:
- "amazon-alexa"
- "amazon-polly"
- "amazon-polly-neural"
- "apple-avspeechsynthesizer"
- "google-assistant"
- "ibm-watson"
- "microsoft-azure"
- "microsoft-sapi"
- "w3c"
- "samsung-bixby"
- "elevenlabs"

- `includeFormatterComment` (boolean) - Adds an XML comment to the SSML output indicating the formatter used. Default is `false`.

@@ -179,8 +191,14 @@ The biggest place we need help right now is with the completion of the grammar a
- [x] emphasis - moderate
- [x] emphasis - none
- [x] emphasis - reduced
- [ ] ipa
- [ ] sub
- [x] ipa
- [x] sub

Short-form examples:

- `(pecan)/'pi.kæn/` → `<phoneme alphabet="ipa" ph="'pi.kæn">pecan</phoneme>`
- `(Al){aluminum}` → `<sub alias="aluminum">Al</sub>`
- `/ˈdeɪtə/` → `<phoneme alphabet="ipa" ph="ˈdeɪtə">ipa</phoneme>`
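
As a rough illustration of the three short-form mappings listed above, the following TypeScript sketch reproduces them with a single regular expression. It is not how the library works internally (the library parses a grammar into an AST). The names `SHORT_FORM` and `shortFormToSsml` are hypothetical, and the sketch does no XML escaping and would misread ordinary text that contains two slashes.

```typescript
// Illustrative sketch only; the real library uses a grammar and an AST.
// One combined pattern, so generated tags are never scanned a second time.
const SHORT_FORM =
  /\(([^)]+)\)\/([^\/]+)\/|\(([^)]+)\)\{([^}]+)\}|\/([^\/\s]+)\//g;

function shortFormToSsml(input: string): string {
  return input.replace(
    SHORT_FORM,
    (
      _match: string,
      ipaText: string | undefined,
      ipa: string | undefined,
      subText: string | undefined,
      alias: string | undefined,
      bareIpa: string | undefined,
    ) => {
      if (ipaText !== undefined) {
        // (text)/ipa/ : pronounce `text` using the given IPA string
        return `<phoneme alphabet="ipa" ph="${ipa}">${ipaText}</phoneme>`;
      }
      if (subText !== undefined) {
        // (text){alias} : speak `alias` in place of `text`
        return `<sub alias="${alias}">${subText}</sub>`;
      }
      // bare /ipa/ : the README example shows the literal text "ipa" as content
      return `<phoneme alphabet="ipa" ph="${bareIpa}">ipa</phoneme>`;
    },
  );
}
```

Using one alternation rather than three chained `replace` calls keeps the bare `/…/` rule from matching the slashes inside closing tags that an earlier rule produced.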

#### Standard Format

4 changes: 3 additions & 1 deletion docs/architecture.md
@@ -1,13 +1,15 @@
# Architecture

## Simple Parser

Instead of a simple parser architecture as shown here:

![](./assets/simple-parser-diagram.png)

## Parser-Formatter Architecture

Speech Markdown is first translated into an Abstract Syntax Tree (AST) and a formatter transforms that into the correct format:

![](./assets/parser-formatter-diagram.png)

This is more powerful as formatters have the ability to customize the output based on the differences of each platform.
This is more powerful as formatters have the ability to customize the output based on the differences of each platform.
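
To make the parser-formatter split concrete, here is a minimal TypeScript sketch. Everything in it is an assumption made for illustration: the `AstNode` shape, the function names, and the toy parser that recognises only `[3s]`-style breaks. The real AST and grammar are considerably richer.

```typescript
// Hypothetical, heavily simplified AST; not the library's actual node types.
type AstNode =
  | { kind: 'text'; value: string }
  | { kind: 'break'; time: string };

// Toy parser: splits plain text around `[250ms]` / `[3s]` break markers.
function parseSpeechMarkdown(markdown: string): AstNode[] {
  const nodes: AstNode[] = [];
  const breakPattern = /\[(\d+(?:ms|s))\]/g;
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = breakPattern.exec(markdown)) !== null) {
    if (match.index > last) {
      nodes.push({ kind: 'text', value: markdown.slice(last, match.index) });
    }
    nodes.push({ kind: 'break', time: match[1] });
    last = match.index + match[0].length;
  }
  if (last < markdown.length) {
    nodes.push({ kind: 'text', value: markdown.slice(last) });
  }
  return nodes;
}

// Each formatter walks the same AST and decides its own output.
type Formatter = (nodes: AstNode[]) => string;

const formatSsml: Formatter = nodes =>
  '<speak>' +
  nodes
    .map(n => (n.kind === 'text' ? n.value : `<break time="${n.time}"/>`))
    .join('') +
  '</speak>';

// A plain-text formatter simply drops nodes it cannot express.
const formatText: Formatter = nodes =>
  nodes
    .map(n => (n.kind === 'text' ? n.value : ''))
    .join('')
    .replace(/\s+/g, ' ')
    .trim();
```

Because both formatters consume the same node list, adding a platform means writing one more `Formatter` rather than touching the parser.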
22 changes: 22 additions & 0 deletions docs/platforms/README.md
@@ -0,0 +1,22 @@
# Speech platform reference

This directory contains reference notes about the SSML dialects that Speech Markdown targets. Each page describes:

- Links to the vendor documentation for the dialect.
- Highlights of the current Speech Markdown formatter behaviour.
- Known gaps that are not currently translated by the formatter.
- A generated voice catalogue summarising the voices that expose the dialect when the provider shares the data programmatically.

## Available platform notes

- [Amazon Polly](./amazon-polly.md)
- [Amazon Alexa](./amazon-alexa.md)
- [Apple AVSpeechSynthesizer](./apple-avspeechsynthesizer.md)
- [Google Cloud Text-to-Speech](./google-cloud-tts.md)
- [IBM Watson Text to Speech](./ibm-watson-tts.md)
- [ElevenLabs prompt controls](./elevenlabs.md)
- [Microsoft Azure Speech Service](./azure.md)
- [W3C SSML](./w3c.md)
- [Microsoft Speech API (SAPI)](./microsoft-sapi.md)

Voice catalogues are produced by the helper script `npm run docs:update-voices` which gathers voice metadata from the vendor APIs when credentials are available. The generated Markdown files live alongside the service documentation so that the catalogues can be versioned with the code base.
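
The following TypeScript sketch shows one way voice metadata could be turned into a Markdown table. It is not the helper script from this change: the `VoiceRow` fields, the column set, and the function name `toMarkdownTable` are assumptions chosen for illustration.

```typescript
// Hypothetical row shape; the real script's data model is not shown in this diff.
interface VoiceRow {
  id: string;
  language: string;
  gender: string;
}

// Renders rows as a Markdown table, sorted by voice id so that
// regenerating the file produces stable, reviewable diffs.
function toMarkdownTable(voices: VoiceRow[]): string {
  const header = ['| Voice | Language | Gender |', '| --- | --- | --- |'];
  const rows = [...voices]
    .sort((a, b) => a.id.localeCompare(b.id))
    .map(v => `| ${v.id} | ${v.language} | ${v.gender} |`);
  return [...header, ...rows].join('\n');
}
```

Sorting before rendering matters for a file that is versioned with the code base, since vendor APIs do not necessarily return voices in a fixed order.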
20 changes: 20 additions & 0 deletions docs/platforms/amazon-alexa.md
@@ -0,0 +1,20 @@
# Amazon Alexa SSML

## Official resources

- [Alexa Skills SSML reference](https://developer.amazon.com/en-US/docs/alexa/custom-skills/speech-synthesis-markup-language-ssml-reference.html)
- [Alexa voice catalogue](https://developer.amazon.com/en-US/docs/alexa/custom-skills/choose-the-voice-for-your-skill.html)
- [Designing with domains and emotions](https://developer.amazon.com/en-US/docs/alexa/custom-skills/speechcons-reference-interjections-for-alexa.html#expressive-ssml)

## Speech Markdown formatter coverage

- **Say-as rendering.** Inline modifiers such as `address`, `characters`, `date`, `interjection`, `number`, `ordinal`, `telephone`, `time`, and `unit` are mapped to `<say-as>` with sensible defaults for date and time formats so Alexa pronunciation fixes can stay in Speech Markdown.【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L76-L106】
- **Amazon-specific prosody.** Speech Markdown exposes `whisper`, `amazon:domain` (`dj` and `newscaster` modifiers), and `amazon:emotion` for `excited` and `disappointed`, emitting the appropriate tags and intensity attributes that Alexa recognises.【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L107-L145】【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L183-L201】
- **Voice fallback.** When a voice name is not present in the built-in whitelist, the formatter now falls back to emitting `<voice name="…">` so newly launched Alexa voices (for example Lupe or Aria) still render without code changes.【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L49-L51】【F:src/formatters/SsmlFormatterBase.ts†L44-L57】
- **Section-level wrappers.** `lang` and `voice` section modifiers wrap larger blocks, and Speech Markdown keeps Amazon-specific `music` and `news` domains available for long-form sections.【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L174-L205】
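
The voice fallback described above can be sketched roughly as follows. This is an illustration, not the code in `SsmlFormatterBase.ts`: the `KNOWN_VOICES` table, its two entries, the case-insensitive lookup, and the name `renderVoice` are all assumptions.

```typescript
// Hypothetical whitelist keyed by lower-cased name; the real table lives in the formatter.
const KNOWN_VOICES = new Map<string, string>([
  ['joanna', 'Joanna'],
  ['matthew', 'Matthew'],
]);

function renderVoice(requested: string, content: string): string {
  // Known voice: use the canonical spelling from the whitelist.
  // Unknown voice: pass the requested name through unchanged instead of dropping the tag.
  const name = KNOWN_VOICES.get(requested.toLowerCase()) ?? requested;
  return `<voice name="${name}">${content}</voice>`;
}
```

The trade-off is that an unknown name is no longer validated, so a typo reaches the platform as-is rather than being caught by the formatter.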

## Known gaps

- **Expressive extensions.** The formatter currently emits only `amazon:effect`, `amazon:domain`, and `amazon:emotion`, so features like `<amazon:auto-breaths>`, `<amazon:breath>`, `<alexa:name>`, and the long-form `<amazon:domain name="long-form">` still require manual SSML until new modifiers are defined.【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L40-L46】【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L183-L205】
- **Voice metadata.** The built-in whitelist predates the expanded Alexa voice line-up and lacks locale metadata for the neural voices, so Speech Markdown relies on the new fallback behaviour instead of providing locale validation for every published voice.【F:src/formatters/AmazonAlexaSsmlFormatter.ts†L5-L33】
- **No automated catalogue.** Unlike Azure, Google, Polly, and Watson, Alexa does not expose a public API for voice discovery, so the documentation cannot yet include a generated voice table and must be refreshed manually from the developer portal.
28 changes: 28 additions & 0 deletions docs/platforms/amazon-polly.md
@@ -0,0 +1,28 @@
# Amazon Polly SSML

## Official resources

- [Supported SSML tags](https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html)
- [Voice catalogue](https://docs.aws.amazon.com/polly/latest/dg/voicelist.html)

## Speech Markdown formatter coverage

Speech Markdown ships two formatters for Amazon Polly.

### `amazon-polly` (standard engine)

- **Say-as pronunciations.** Modifiers such as `address`, `cardinal`, `characters`, `digits`, `fraction`, `number`, `ordinal`, `telephone`, and `unit` render as `<say-as>` with sensible defaults for dates and times, mirroring Polly's SSML support.【F:src/formatters/AmazonPollySsmlFormatter.ts†L47-L72】
- **Pronunciation controls.** The formatter exposes `<sub>`, `<phoneme alphabet="ipa">`, and `<prosody>` so aliasing, IPA phonemes, and rate, pitch, or volume adjustments can be driven from Speech Markdown.【F:src/formatters/AmazonPollySsmlFormatter.ts†L78-L93】
- **Amazon-specific effects.** Polly-only modifiers such as `whisper`, `timbre`, and `drc` produce `amazon:effect` tags, while inline `lang` modifiers wrap content in `<lang xml:lang="…">` for mixed-language prompts.【F:src/formatters/AmazonPollySsmlFormatter.ts†L74-L105】
- **Known gaps.** Inline `voice`, `excited`, and `disappointed` modifiers are defined but intentionally left without SSML output, and section-level variants such as `newscaster` are also ignored, so these behaviours still require manual SSML.【F:src/formatters/AmazonPollySsmlFormatter.ts†L107-L151】

### `amazon-polly-neural`

- **Shared say-as handling.** The neural formatter mirrors the standard engine for `address`, `characters`, `digits`, `fraction`, `number`, `ordinal`, `telephone`, `unit`, `date`, and `time` modifiers so pronunciation fixes work across both engines.【F:src/formatters/AmazonPollyNeuralSsmlFormatter.ts†L41-L67】
- **Pronunciation helpers.** `sub`, `ipa`, and the rate or volume prosody controls are preserved, and `lang` plus `drc` continue to emit `<lang>` and `amazon:effect` tags respectively.【F:src/formatters/AmazonPollyNeuralSsmlFormatter.ts†L69-L91】
- **Neural-only domains.** Section-level `newscaster` modifiers wrap content in `<amazon:domain name="news">` to reach Polly's neural news style.【F:src/formatters/AmazonPollyNeuralSsmlFormatter.ts†L115-L134】
- **Known gaps.** Neural voices do not currently expose `emphasis`, `whisper`, `voice`, `excited`, or `disappointed` output because the formatter drops those modifiers, matching the limitations of Polly's neural styles.【F:src/formatters/AmazonPollyNeuralSsmlFormatter.ts†L93-L145】
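
The difference between the two Polly formatters can be pictured with a small TypeScript sketch covering two of the modifiers mentioned above. The function name and type names are hypothetical, and the real formatters are class-based and handle many more modifiers. The `name="whispered"` and `name="drc"` attribute values come from Polly's SSML documentation.

```typescript
type PollyFormatter = 'amazon-polly' | 'amazon-polly-neural';
type EffectModifier = 'whisper' | 'drc';

// Sketch: `drc` is emitted for both engines, while `whisper` is only
// emitted for the standard engine and silently dropped for neural.
function renderEffect(
  formatter: PollyFormatter,
  modifier: EffectModifier,
  text: string,
): string {
  if (modifier === 'drc') {
    return `<amazon:effect name="drc">${text}</amazon:effect>`;
  }
  return formatter === 'amazon-polly'
    ? `<amazon:effect name="whispered">${text}</amazon:effect>`
    : text; // neural: keep the words, drop the unsupported effect
}
```

Dropping the modifier while keeping the text means the same Speech Markdown source stays usable on both engines, at the cost of losing the effect on neural voices.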

## Voice catalogue

Run `npm run docs:update-voices` with either `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` plus `AWS_REGION` (or `AWS_DEFAULT_REGION`) or the `POLLY_AWS_KEY_ID`/`POLLY_AWS_ACCESS_KEY`/`POLLY_REGION` equivalents to regenerate `data/amazon-polly-voices.md`. The helper script calls Polly's `ListVoices` API (with additional language codes enabled) and writes a Markdown table of each voice's identifier, language, gender, and supported engines so formatter validations stay aligned with Amazon's inventory.
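
A possible way to resolve the two credential sets named above is sketched below in TypeScript. The environment variable names are taken from the paragraph. The precedence (standard `AWS_*` variables first, then the `POLLY_*` set), the requirement that a set be complete, and the function names are assumptions, since the helper script itself is not part of this diff.

```typescript
type Env = { [name: string]: string | undefined };

interface PollyCredentials {
  accessKeyId: string;
  secretAccessKey: string;
  region: string;
}

// Returns credentials only when all three parts of a set are present.
function completeSet(
  id?: string,
  secret?: string,
  region?: string,
): PollyCredentials | undefined {
  return id && secret && region
    ? { accessKeyId: id, secretAccessKey: secret, region }
    : undefined;
}

// Assumed precedence: AWS_* first, POLLY_* second.
function resolvePollyCredentials(env: Env): PollyCredentials | undefined {
  return (
    completeSet(
      env['AWS_ACCESS_KEY_ID'],
      env['AWS_SECRET_ACCESS_KEY'],
      env['AWS_REGION'] || env['AWS_DEFAULT_REGION'],
    ) ??
    completeSet(
      env['POLLY_AWS_KEY_ID'],
      env['POLLY_AWS_ACCESS_KEY'],
      env['POLLY_REGION'],
    )
  );
}
```

In a Node.js script this would typically be called as `resolvePollyCredentials(process.env)`, and an `undefined` result would be the cue to skip regenerating the Polly catalogue.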