FOR DEVELOPERS

How to Fix Reversed Hebrew in Code (Visual ↔ Logical)

A short, dependency-free function that reorders Hebrew/Arabic text from visual order back to logical order. Copy-paste the JavaScript below, or port it to your language.


The problem: visual order vs. logical order

Hebrew and Arabic are written and read right-to-left. In memory, Unicode text is stored in logical order, i.e. reading order, and the renderer works out the on-screen layout. The trouble starts when software (often PDF engines, legacy systems, or graphics output libraries) stores the characters in the order they are painted on screen, left to right, so the right-to-left run ends up as a left-to-right character sequence. That is "visual order". When another system reads that sequence and assumes it is logical, the Hebrew shows up reversed.
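To make the difference concrete, the sketch below prints the code points of the same word as stored in logical order and in visual order. The helper name codePoints is made up for this illustration, and the strings use \u escapes so the storage order is unambiguous regardless of how your editor displays RTL text.

```javascript
// List the code points of a string in memory (storage) order.
function codePoints(text) {
  return [...text].map(
    (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// "shalom" in logical order: shin, lamed, vav, final mem.
const logical = '\u05E9\u05DC\u05D5\u05DD';
// The same word as a visual-order extractor would store it: final mem first.
const visual = '\u05DD\u05D5\u05DC\u05E9';

console.log(codePoints(logical).join(' ')); // U+05E9 U+05DC U+05D5 U+05DD
console.log(codePoints(visual).join(' '));  // U+05DD U+05D5 U+05DC U+05E9
```

Both strings contain the same four characters; only the storage order differs, which is why a reader that assumes logical order displays the second one backwards.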

The fix for the common case is a controlled flip, applied line by line. It approximates the inverse of the Unicode bidi algorithm for a line of mostly RTL text:

  1. If a line contains no RTL character at all, leave it untouched (so a lone phone number is not "corrected" the wrong way).
  2. Reverse the whole line.
  3. Re-reverse each Latin/number run, so embedded English words and numbers stay readable left-to-right.

// input (visual order)
"World 123 םולש"
// output (logical order)
"שלום World 123"
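The three steps can be traced on that example with a small sketch. It is an illustration only: traceFix, SIMPLE_LTR_RUN, and HEBREW are names invented here, and the run pattern is a deliberately simplified assumption (ASCII letters, digits, and inner spaces) rather than a complete one.

```javascript
// Simplified stand-in for a Latin/number run: ASCII letters and digits,
// with spaces allowed inside the run. (Assumption for this sketch only.)
const SIMPLE_LTR_RUN = /[A-Za-z0-9](?:[A-Za-z0-9 ]*[A-Za-z0-9])?/g;
const HEBREW = /[\u0590-\u05FF]/;

// Reverse a string by code point.
const reverse = (s) => [...s].reverse().join('');

function traceFix(line) {
  // Step 1: no RTL character -> nothing to do.
  if (!HEBREW.test(line)) return { flipped: line, logical: line };
  // Step 2: reverse the whole line.
  const flipped = reverse(line);
  // Step 3: flip each Latin/number run back.
  const logical = flipped.replace(SIMPLE_LTR_RUN, (run) => reverse(run));
  return { flipped, logical };
}

// Visual order: "World 123 " followed by final mem, vav, lamed, shin.
const trace = traceFix('World 123 \u05DD\u05D5\u05DC\u05E9');
console.log(trace.flipped); // Hebrew word restored, but followed by "321 dlroW"
console.log(trace.logical); // Hebrew word followed by "World 123"
```

After step 2 the Hebrew word is correct but the Latin run reads "321 dlroW"; step 3 turns that single run back into "World 123".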

Reference implementation

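// Fix Hebrew/Arabic text stored in visual order -> logical order.
// Dependency-free. Works in the browser and Node (as an ES module).

// Hebrew and Arabic blocks, including the presentation-form blocks.
const RTL = /[\u0590-\u05FF\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF\uFB1D-\uFB4F\uFB50-\uFDFF\uFE70-\uFEFC]/;
// A Latin/number run: starts and ends with a letter or digit and may contain
// spaces and common punctuation in between. A single letter or digit also counts.
const LTR_RUN = /[A-Za-z0-9\u00C0-\u024F][A-Za-z0-9\u00C0-\u024F .,:;'"\/\-_+=&%#@!?]*[A-Za-z0-9\u00C0-\u024F]|[A-Za-z0-9\u00C0-\u024F]/g;

function fixLine(line) {
  if (!RTL.test(line)) return line;            // no RTL -> leave as-is
  // Spread splits by code point, so surrogate pairs stay intact.
  const flipped = [...line].reverse().join('');
  // Flip each Latin/number run back so it reads left-to-right again.
  return flipped.replace(LTR_RUN, (run) => [...run].reverse().join(''));
}

// Normalize CRLF/CR line endings to LF, then fix each line independently.
export function fixReversedHebrew(text) {
  return text.replace(/\r\n?/g, '\n').split('\n').map(fixLine).join('\n');
}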
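A possible usage sketch follows. The reference functions are copied here without the export keyword so that the snippet runs on its own; in a real project you would import fixReversedHebrew from wherever you saved it. The sample lines are invented for illustration.

```javascript
// Copy of the reference functions (no `export`) so this snippet is standalone.
const RTL = /[\u0590-\u05FF\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF\uFB1D-\uFB4F\uFB50-\uFDFF\uFE70-\uFEFC]/;
const LTR_RUN = /[A-Za-z0-9\u00C0-\u024F][A-Za-z0-9\u00C0-\u024F .,:;'"\/\-_+=&%#@!?]*[A-Za-z0-9\u00C0-\u024F]|[A-Za-z0-9\u00C0-\u024F]/g;
const fixLine = (line) =>
  !RTL.test(line)
    ? line
    : [...line].reverse().join('').replace(LTR_RUN, (run) => [...run].reverse().join(''));
const fixReversedHebrew = (text) =>
  text.replace(/\r\n?/g, '\n').split('\n').map(fixLine).join('\n');

// Three sample lines, joined with Windows line endings.
const visualText = [
  'World 123 \u05DD\u05D5\u05DC\u05E9', // visual-order Hebrew with Latin and digits
  '050-1234567',                        // no RTL character: left untouched
  ')abc( \u05DD\u05D5\u05DC\u05E9',     // brackets stored in visual order
].join('\r\n');

const logicalLines = fixReversedHebrew(visualText).split('\n');
console.log(logicalLines);
// Line 1: Hebrew word, then "World 123"
// Line 2: "050-1234567" unchanged
// Line 3: Hebrew word, then "(abc)"
```

For these particular lines, running the function a second time on the result gives back the visual-order text (with LF line endings), so the same flip works in both directions here. That symmetry is not guaranteed for more complex mixed-direction text.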

Notes & limitations

  • This handles the common case: a Hebrew/Arabic paragraph with embedded English words and numbers. It is NOT a full UAX#9 (Unicode bidi) implementation.
  • No glyph mirroring (brackets): the common extraction model preserves original codepoints in visual order, so a plain reversal already restores them. Mirroring on top would corrupt the result.
  • A line with no RTL character is left untouched - this avoids wrongly "fixing" a standalone Latin string or number.
  • The Latin-run detection covers extended Latin letters, digits, and common punctuation. For text with deeply nested directions, use a full bidi library (e.g. ICU, fribidi, or python-bidi).
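  • The RTL detection pattern covers the Hebrew and Arabic blocks of the Basic Multilingual Plane, which is where the letters of ordinary Hebrew and Arabic text live. The reversal itself uses the spread operator, which iterates by code point, so characters outside the BMP (such as emoji) are not split into broken surrogate halves.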
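The surrogate-pair point can be checked with a short sketch comparing two ways of reversing a string. The helper names are made up for this illustration; the sample contains one emoji (U+1F600), which JavaScript stores as two UTF-16 code units.

```javascript
// Reverse by code point: spread uses the string iterator, which keeps
// surrogate pairs together.
const reverseByCodePoint = (s) => [...s].reverse().join('');
// Reverse by UTF-16 code unit: split('') cuts surrogate pairs in half.
const reverseByCodeUnit = (s) => s.split('').reverse().join('');

const sample = 'a\u{1F600}b';   // 3 code points, 4 code units
const expected = 'b\u{1F600}a';

console.log(reverseByCodePoint(sample) === expected); // true
console.log(reverseByCodeUnit(sample) === expected);  // false: the two surrogates are swapped
```

If you port the function to another language, it is worth checking whether its string reversal works on code points or on code units.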