Expand WordPress Intelligence support to other locales #25034
base: trunk
b4a20c9 to
7bcbb5b
Compare
| case .intelligence: | ||
| let languageCode = Locale.current.language.languageCode?.identifier | ||
| return (languageCode ?? "en").hasPrefix("en") | ||
| guard #available(iOS 26, *) else { |
It's technically redundant; just some more defensive code.
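For readers skimming the diff, the English-only check in the snippet above can be sketched as a small pure function. This is only an illustration: the name `isEnglishLanguageCode` is hypothetical and does not appear in the PR, and the actual feature-flag code may be structured differently.

```swift
/// Hypothetical helper mirroring the language check shown in the diff:
/// a missing language code is treated as English, and any code
/// starting with "en" is accepted.
func isEnglishLanguageCode(_ languageCode: String?) -> Bool {
    (languageCode ?? "en").hasPrefix("en")
}

// Print the result for a few sample codes, including a missing one.
let sampleCodes: [String?] = ["en", "es", "zh", nil]
for code in sampleCodes {
    print(code ?? "nil", isEnglishLanguageCode(code))
}
```

With a check like this in place, a device set to Spanish or Chinese would not pass the language test, which is the restriction the PR title says it is lifting.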
|
| App Name | Jetpack | |
| Configuration | Release-Alpha | |
| Build Number | 30202 | |
| Version | PR #25034 | |
| Bundle ID | com.jetpack.alpha | |
| Commit | a45969e | |
| Installation URL | 2uidh380oa8l8 |
|
| App Name | WordPress | |
| Configuration | Release-Alpha | |
| Build Number | 30202 | |
| Version | PR #25034 | |
| Bundle ID | org.wordpress.alpha | |
| Commit | a45969e | |
| Installation URL | 7d77ftbbpq2to |
| with fewer than 10 words. | ||
| The summary should be clear, informative, and written in a neutral tone. | ||
| You MUST generate the summary in the same language as the support request. |
Nitpick: I think the prompt can be more direct here: "You MUST generate the summary in the \(locale.identifier) locale.", and we could remove the makeLocaleInstructions() above?
|
I'd expect the generated summary to be in the post's language, rather than the device's language. What do you think? |
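The two comments above describe two possible sources for the output language: the device locale or the language of the content itself. A minimal sketch of a prompt instruction that works with either is shown below. The function name `makeLanguageInstruction` and the exact wording are assumptions for illustration, not the PR's implementation; the caller decides where `languageCode` comes from.

```swift
import Foundation

/// Hypothetical helper: builds the language instruction appended to a prompt.
/// `languageCode` may come from the device locale or from language detection
/// run on the content; this sketch does not choose between them.
func makeLanguageInstruction(languageCode: String) -> String {
    // Prefer a readable English language name; fall back to the raw code
    // when Foundation has no name for it.
    let name = Locale(identifier: "en")
        .localizedString(forLanguageCode: languageCode) ?? languageCode
    return "You MUST generate the summary in \(name) (\(languageCode))."
}

print(makeLanguageInstruction(languageCode: "es"))
```

Naming the language explicitly, rather than saying "the same language as the input", leaves less for the model to infer, which seems to be the point of the nitpick.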
|
Oh, sorry, I missed that there are updates to other prompts too. |
|
The post summary generation is not updated in this PR. Probably because that went to a separate code path: |
|
I used a news article in Chinese for testing. The post tags generation in the "Post Settings" page does not work for me. Here are some debug prints: |
7bcbb5b to
7b4abe2
Compare
Generated by 🚫 Danger |
42b0f77 to
2e0050f
Compare
980a3a6 to
2b2870b
Compare
|
I added a significant amount of automated tests, improved the prompts to get it to produce the correct language output at a much higher rate, and updated the PR description. I also have another upcoming PR with additional LLM-based evaluation for automation. |
7dc87fb to
6599afe
Compare
|
There are a couple of SwiftLint issues which I will be addressing before merge. |
196312a to
c3076f2
Compare
|
I've tested a few different options and, thanks to the unit tests, identified the one that seems to work best. I skipped other checks when testing it:
@available(iOS 26, *)
@Test(arguments: ExcerptTestCaseParameters.nonEnglishCases)
func excerptGenerationNonEnglish(parameters: ExcerptTestCaseParameters) async throws {
    _ = try await runExcerptTest(parameters: parameters, skip: [.skipWordCountCheck, .skipDiversityCheck])
}
The language tests are now all passing. Previously, they were still a bit inconsistent for some of the supported languages.
|
d02b66a to
933bedf
Compare
|
I updated tag and summary generation to follow the same example. I plan to continue iterating on the general quality of the responses in a separate PR and in the scope of #25059. The localization support now seems to be up to an acceptable quality level. |
|
I can still reproduce the issue in my comment above. |
305075b to
7437099
Compare
7437099 to
a45969e
Compare
|
|
The excerpt works for me now. But the suggested tags are still unrelated to the post content. Also, maybe we don't need to give the option to generate excerpts if the content is super short? It'd be pretty tricky for an LLM to generate accurate excerpts if the post content is short. |
| /// - siteTags: Existing tags from the site | ||
| /// - postTags: Tags already added to this post | ||
| /// - Returns: Formatted prompt string ready for the language model | ||
| public func makePrompt(post: String, siteTags: [String], postTags: [String]) async -> String { |
Is this function async because detectLanguage can be slow?
| with fewer than 10 words. | ||
| The summary should be clear, informative, and written in a neutral tone. | ||
| You MUST generate the summary in the same language as the support request. |
Does this prompt work if we give the LLM the detectLanguage result?
BTW, I think it's okay to generate a summary in the device language in this context. The summary is for the customer, not website visitors.
| /// - Returns: Formatted prompt string ready for the language model | ||
| public func makePrompt(content: String) async -> String { | ||
| let extractedContent = IntelligenceService.extractRelevantText(from: content, ratio: 0.8) | ||
| let language = IntelligenceService.detectLanguage(from: extractedContent) |
We probably want the generated summary to be in the device language. The summary is for the current user after all.
I guess the AI summary feature can be used as a quick way to translate and summarize if I don't understand the original post's language.
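The thread does not show how `IntelligenceService.detectLanguage(from:)` is implemented. As a rough sketch of the trade-off being discussed, the snippet below assumes detection is done with Apple's NaturalLanguage framework and falls back to a caller-supplied language (for example, the device language) when detection is unavailable or inconclusive. Both function names are hypothetical.

```swift
#if canImport(NaturalLanguage)
import NaturalLanguage
#endif

/// Hypothetical stand-in for `detectLanguage`: returns a language code such
/// as "es", or nil when NaturalLanguage is unavailable or finds no language.
func detectLanguageCode(from text: String) -> String? {
    #if canImport(NaturalLanguage)
    return NLLanguageRecognizer.dominantLanguage(for: text)?.rawValue
    #else
    return nil
    #endif
}

/// Picks the output language: the detected content language when there is
/// one, otherwise the fallback. "und" (undetermined) counts as no result.
func resolveOutputLanguage(detected: String?, fallback: String) -> String {
    guard let detected, !detected.isEmpty, detected != "und" else {
        return fallback
    }
    return detected
}

// The fallback here is a placeholder for the device language.
let postText = "El tiempo hoy es soleado y agradable en toda la región."
let outputLanguage = resolveOutputLanguage(
    detected: detectLanguageCode(from: postText),
    fallback: "en"
)
print(outputLanguage)
```

Swapping the arguments' priority (always using the device language and ignoring detection) would give the behavior the reviewer suggests for summaries aimed at the current user.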
|
For some reason, GitHub won't let me reply to my own PR comments. Here's what I posted so far:
I was just thinking about it – do we want to translate the post or not, and should it be the same thing as "summarize"? I would argue it should be two separate features. I do plan to add translation, starting with comments (CMM-744: On-device translation for comments) and then posts. It is not in the scope of this PR. I'd start with summarizing without translation, the way it does now.
Yes, it to make |
I did a bit more testing, and I can confirm this is a potential issue, but it's not directly related to localization. I opened CMM-1073: Improve quality of suggested tags to track it. The issue is that the prompt prioritizes current site tags, their language, and format – which is generally a good thing for real sites. If you have a test site with a bunch of tags unrelated to the posts, it will not work as well, which is also expected. However, I think it still needs improvement, as it should suggest tags only if there is a high level of confidence they match the text. Having said that, I'm not completely sure, because you may be tagging based on some other criteria, so showing some suggestions can be better than showing none. I think the app might need to do a little work in the background to try to find relationships between your posts and tags before making suggestions.
It is an existing ticket, CMM-763: If the post content is too short, the results are not helpful. I don't know how to define "short", so I haven't looked into it yet. It doesn't seem like an issue that should be addressed, as nothing bad happens if you do it and it's unlikely that a user would use generation in that scenario in the first place. For tags, I'll look into it in the scope of the previous issue I linked. |
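To make the "too short" question concrete, here is one possible gate, sketched under loudly stated assumptions: the threshold of 200 characters is arbitrary and hypothetical (the thread explicitly leaves "short" undefined), and the function name `hasEnoughContent` does not come from the PR. It counts characters rather than whitespace-separated words so that languages such as Chinese, which do not separate words with spaces, are not undercounted.

```swift
import Foundation

/// Hypothetical threshold; the thread does not define what "short" means.
let minimumCharacterCount = 200

/// Returns true when the trimmed content has at least `minimum` characters.
func hasEnoughContent(
    _ content: String,
    minimum: Int = minimumCharacterCount
) -> Bool {
    let trimmed = content.trimmingCharacters(in: .whitespacesAndNewlines)
    return trimmed.count >= minimum
}

// A one-line post falls below the hypothetical threshold.
print(hasEnoughContent("Hello, world!"))
```

A character count is still a crude proxy: the same threshold represents far more text in Chinese than in English, so a real implementation would likely need per-language tuning.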
What
Fixes CMM-762: Excerpts are created in the system language not the content language and CMM-798: Add support for other locales
I smoke-tested it by generating a post in Spanish and testing that the excerpts are also in Spanish. It should work for any other scenarios as well.
How
WordPressModule and use it as a namespace (rename some of the features) WordPressIntelligence.
It's a large PR, but the majority of the changes are in unit tests. The main changes to look for are in the prompts.