On my Linux system, I hooked up xfce4-screenshooter's "custom action" to a shell script (ocr.sh %f) with tesseract like so:

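    #!/bin/bash
    # pipefail is a bash option; a plain sh such as dash may reject it
    set -o pipefail
    lang=${2:-eng}   # optional second argument: tesseract language code
    # OCR the screenshot to stdout and pipe the text to the clipboard
    if tesseract "$1" - -l "$lang" | xclip -selection clipboard ; then
      notify-send "Text copied"
    else
      notify-send "Could not copy text"
    fi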
It works great most of the time, combined with xfce4-screenshooter's ability to select a rectangle.

When the text is especially difficult for tesseract, I can use Gemma3-4B via llama.cpp's llama-mtmd-cli, but that takes a minute.
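For anyone who wants to chain the two, here is a rough sketch of a fallback, not the exact script I run. It assumes llama-mtmd-cli is on PATH and accepts -m, --mmproj, --image and -p; the MODEL and MMPROJ paths and the prompt are placeholders, and the model's stdout may need trimming depending on your llama.cpp build.

```shell
# Hypothetical paths: point these at your own Gemma 3 4B GGUF and its mmproj file.
MODEL=${MODEL:-$HOME/models/gemma-3-4b-it.gguf}
MMPROJ=${MMPROJ:-$HOME/models/mmproj-gemma-3-4b-it.gguf}

# ocr_image IMAGE [LANG]
# Try tesseract first; if it prints nothing but whitespace, ask the vision model.
ocr_image() {
  image=$1
  lang=${2:-eng}
  text=$(tesseract "$image" - -l "$lang" 2>/dev/null)
  case $text in
    *[![:space:]]*)
      ;;  # tesseract produced some visible text, keep it
    *)
      # Slow path: multimodal model via llama.cpp (assumed flags, see above).
      text=$(llama-mtmd-cli -m "$MODEL" --mmproj "$MMPROJ" --image "$image" \
        -p "Transcribe the text in this image exactly." 2>/dev/null)
      ;;
  esac
  printf '%s\n' "$text"
}
```

In the screenshooter script you would then call something like ocr_image "$1" "$lang" | xclip -selection clipboard. Note that this only falls back when tesseract returns nothing at all; garbled but non-empty output still goes straight to the clipboard.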