Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ objc2-core-media = { version = "0.3" }
objc2-core-video = { version = "0.3" }
objc2-foundation = { version = "0.3" }
objc2-screen-capture-kit = { version = "0.3" }
objc2-vision = { version = "0.3" }
pollster = { version = "0.4" }
serde = { version = "1.0", features = ["derive"] }
thiserror = { version = "2.0" }
Expand Down
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ Pure-Rust menubar screenshot prototype (macOS-first).
- In Frozen mode, a dragged-region capture can be dragged from inside the bright selection area to reposition it without resizing.
- In Frozen mode, `Space` copies the current frozen PNG to the clipboard and exits.
- In Frozen mode, Cmd+S (macOS) / Ctrl+S saves the current PNG to disk and exits.
- On macOS, Frozen mode can recognize text from the current capture and copy the result to the clipboard from the toolbar.
- After a dragged region freeze, press `s` or use the frozen toolbar `Scroll Capture ↓` action to enter scroll capture.
- Scroll capture is currently implemented on macOS for dragged-region freezes and uses image-first downward stitching with a live side preview.
- Upward scrolling may be observed for rewind/reacquire, but it never appends stitched rows.
Expand Down Expand Up @@ -93,6 +94,7 @@ cargo run -p rsnap
### Output (save-to-disk)

- In Frozen mode, use Cmd+S (macOS) / Ctrl+S to save a PNG to disk and exit.
- On macOS, use the frozen toolbar `Recognize Text` action to copy recognized text from the current frozen capture and exit.
- After entering scroll capture from a dragged region on macOS, downward scrolling may append newly proven rows into the side preview.
Upward scrolling never appends. Returning to already-stitched content should not grow the export; only newly proven content may be added.
The scroll-capture commit path uses discrete region screenshots plus pairwise image registration; clipboard and save must match the committed preview the user sees.
Expand Down
6 changes: 6 additions & 0 deletions apps/rsnap/src/app/capture.rs
Original file line number Diff line number Diff line change
Expand Up @@ -248,6 +248,12 @@ impl App {
OverlayExit::PngBytes(png_bytes) => {
tracing::info!(bytes = png_bytes.len(), "Capture copied to clipboard.");
},
OverlayExit::TextCopied(character_count) => {
tracing::info!(
characters = character_count,
"Recognized text copied to clipboard."
);
},
OverlayExit::Saved(path) => {
tracing::info!(path = %path.display(), "Capture saved to file.");
},
Expand Down
2 changes: 2 additions & 0 deletions docs/spec/v0.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ cross-platform architecture.
- Hovering over a window in live mode shows a glowing border that tracks the target
window.
- `Space` copies the frozen PNG of the selected region/window/fullscreen to clipboard.
- On macOS, Frozen mode may recognize text from the current frozen capture and copy the recognized text to the clipboard.
- Cmd+S (macOS) / Ctrl+S saves the frozen PNG to disk.
- `Esc` cancels capture.
- In Frozen mode, a loupe and toolbar are part of the floating HUD set and can still
Expand All @@ -57,6 +58,7 @@ cross-platform architecture.
- Left click (without drag) -> hit-test window under the cursor on the same monitor and
freeze that window bounds; fallback to fullscreen of the current monitor if no window is hit
- `Space` -> copy the frozen cropped PNG (region/window/fullscreen) to the system clipboard, then exit
- On macOS, the frozen toolbar may expose `Recognize Text`, which runs Apple Vision OCR on the current frozen capture, copies the recognized text to the clipboard, and exits
- Cmd+S (macOS) / Ctrl+S -> save the frozen cropped PNG to disk, then exit
- Esc -> cancel and exit without copying
- After a dragged-region freeze enters Frozen mode, dragging inside the bright region
Expand Down
2 changes: 1 addition & 1 deletion packages/rsnap-overlay/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ objc2-core-media = { workspace = true }
objc2-core-video = { workspace = true }
objc2-foundation = { workspace = true }
objc2-screen-capture-kit = { workspace = true }
objc2-vision = "0.3.2"
objc2-vision = { workspace = true }
raw-window-handle = { workspace = true }

[dev-dependencies]
Expand Down
2 changes: 2 additions & 0 deletions packages/rsnap-overlay/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,8 @@ pub mod replay_support {
mod backend;
#[cfg(target_os = "macos")]
mod live_frame_stream_macos;
#[cfg(target_os = "macos")]
mod ocr_macos;
mod overlay;
mod png;
mod scroll_capture;
Expand Down
57 changes: 57 additions & 0 deletions packages/rsnap-overlay/src/ocr_macos.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
use color_eyre::eyre::{Result, WrapErr};
use image::RgbaImage;
use objc2::rc::{self, Retained};
use objc2::runtime::AnyObject;
use objc2::{AnyThread, ClassType};
use objc2_foundation::{NSArray, NSData, NSDictionary};
use objc2_vision::{
VNImageOption, VNImageRequestHandler, VNRecognizeTextRequest, VNRequest,
VNRequestTextRecognitionLevel,
};

use crate::png;

pub(crate) fn recognize_text_from_image(image: &RgbaImage) -> Result<String> {
rc::autoreleasepool(|_| {
let image_data = NSData::with_bytes(
&png::rgba_image_to_png_bytes(image).wrap_err("failed to encode OCR source image")?,
);
let options: Retained<NSDictionary<VNImageOption, AnyObject>> = NSDictionary::new();
let request_handler = VNImageRequestHandler::initWithData_options(
VNImageRequestHandler::alloc(),
&image_data,
&options,
);
let request = VNRecognizeTextRequest::new();

request.setRecognitionLevel(VNRequestTextRecognitionLevel::Accurate);
request.setUsesLanguageCorrection(true);
request.setAutomaticallyDetectsLanguage(true);

let requests: Retained<NSArray<VNRequest>> =
NSArray::from_slice(&[request.as_super().as_super()]);

request_handler
.performRequests_error(&requests)
.wrap_err("Vision text recognition request failed")?;

let mut lines = Vec::new();

if let Some(results) = request.results() {
for index in 0..results.count() {
let observation = results.objectAtIndex(index);
let candidates = observation.topCandidates(1);
let Some(candidate) = candidates.firstObject() else {
continue;
};
let line = candidate.string().to_string();

if !line.trim().is_empty() {
lines.push(line);
}
}
}

Ok(lines.join("\n"))
})
}
Loading
Loading