@stesee stesee commented Jan 12, 2026

No description provided.

lowellstewart and others added 6 commits January 6, 2026 13:22
Fix UnicodeMapper and OpenXmlRegex bugs regarding lastRenderedPageBreak
Fix/open xml regex bugs
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 6 to 7.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v6...v7)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
…ctions/download-artifact-7

Bump actions/download-artifact from 6 to 7
Copilot AI review requested due to automatic review settings January 12, 2026 18:58

Copilot AI left a comment

Pull request overview

This pull request enhances OpenXmlRegex and UnicodeMapper to properly ignore lastRenderedPageBreak elements, which are temporary layout markers that don't represent actual document content. This ensures that text matching and processing operations work correctly even when these invisible markers are present in Word documents.

Changes:

  • Modified UnicodeMapper.RunToString() to return an empty string for lastRenderedPageBreak elements
  • Updated OpenXmlRegex to filter out lastRenderedPageBreak elements when processing run content
  • Added comprehensive unit tests and integration tests to verify the fix
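
The idea behind the first two changes can be sketched with plain LINQ to XML. This is a minimal illustration, not the code from the pull request: the class `LastRenderedPageBreakSketch` and the helpers `RunChildToText` and `ParagraphText` are hypothetical names, and the real `UnicodeMapper.RunToString()` handles many more run children than `w:t`.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LastRenderedPageBreakSketch
{
    // WordprocessingML main namespace.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical stand-in for a run-to-string mapping: layout-only
    // markers contribute no text, w:t contributes its value.
    public static string RunChildToText(XElement child)
    {
        if (child.Name == W + "lastRenderedPageBreak") return string.Empty;
        if (child.Name == W + "t") return child.Value;
        return string.Empty; // other run children are omitted in this sketch
    }

    // Hypothetical stand-in for the regex pre-processing step: skip run
    // properties and lastRenderedPageBreak before collecting text.
    public static string ParagraphText(XElement paragraph)
    {
        return string.Concat(
            paragraph.Descendants(W + "r")
                .Elements()
                .Where(e => e.Name != W + "rPr"
                         && e.Name != W + "lastRenderedPageBreak")
                .Select(RunChildToText));
    }

    public static void Main()
    {
        XElement p = XElement.Parse(
            @"<w:p xmlns:w=""http://schemas.openxmlformats.org/wordprocessingml/2006/main"">
                <w:r><w:t>Hello </w:t></w:r>
                <w:r><w:lastRenderedPageBreak/><w:t>world</w:t></w:r>
              </w:p>");

        // The marker sits between the two text nodes but adds nothing.
        Console.WriteLine(ParagraphText(p));
    }
}
```

With the marker filtered out, a pattern such as `Hello world` can match across the two runs as if the page-break hint were not there.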

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| OpenXmlPowerTools/UnicodeMapper.cs | Added handling to ignore lastRenderedPageBreak elements by returning empty string |
| OpenXmlPowerTools/OpenXmlRegex.cs | Updated filtering logic to exclude lastRenderedPageBreak alongside rPr elements |
| OpenXmlPowerTools.Tests/UnicodeMapperTests.cs | Added test verifying that lastRenderedPageBreak is ignored in text extraction |
| OpenXmlPowerTools.Tests/OpenXmlRegexTests.cs | Added test verifying regex matching works despite lastRenderedPageBreak markers |
| OpenXmlPowerTools.Tests/DocumentAssemblerTests.cs | Added integration test using document with lastRenderedPageBreak |
| TestFiles/DA-lastRenderedPageBreak.xml | Added test data file for DocumentAssembler integration test |
| TestFiles/DA-lastRenderedPageBreak.docx | Added test document containing lastRenderedPageBreak elements |
| .github/workflows/dotnet.yml | Updated download-artifact action from v6 to v7 |

public void CanMatchDespiteInvisibleLayoutMarkers()
{
    XDocument partDocument = XDocument.Parse(LastRenderedPageBreakXmlString);
    XElement p = partDocument.Descendants(W.p).Last();

Copilot AI Jan 12, 2026

This assignment to p is useless, since its value is never read.

Suggested change
- XElement p = partDocument.Descendants(W.p).Last();
+ XElement p;

@stesee stesee merged commit b30d59c into release Jan 12, 2026
15 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jan 12, 2026

3 participants