Title: Yudit BIDI Format
Version: 2000-04-09
Author: Gaspar Sinai 

Introduction
============
I am writing this document on the train, so it might look a bit messy.
This short document describes the bi-directional format Yudit will use.
The bi-directional format in the Unicode standard
(http://www.unicode.org/unicode/reports/tr9/)
has some faults that led me to write this document:
1. The proposed bi-di format supposes that each character has an attribute
that tells which directional group the character belongs to.
This is bad because updating the standard would make old documents
unreadable unless we supported every version of Unicode, which would mean
keeping a map file of all characters for each version.
2. The Unicode bi-di format is difficult to use. Most users do not
care about the principles and do not know which group each character
belongs to.
3. The format that changes the directionality is often ambiguous and one-way.
Once characters are rendered from the input buffer through the re-ordering
algorithm, it is impossible to get back the same buffer from the
rendering buffer.
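
To illustrate point 1, here is a minimal sketch of the per-character lookup
that the Unicode algorithm depends on. It uses Python's standard unicodedata
module purely for illustration; the helper name bidi_classes is hypothetical.
Note that the module is tied to one fixed version of the character database,
which is the versioning concern described above.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this interpreter.
print(unicodedata.unidata_version)

def bidi_classes(text):
    """Return the bidirectional class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Latin letter, Hebrew letter alef, European digit.
print(bidi_classes("a\u05d01"))  # ['L', 'R', 'EN']
```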

The goals of Yudit bi-di
========================
1. Be able to read fully compliant Unicode text.
2. Be able to produce a format that is fully compliant, yet still
user-friendly and Yudit-friendly.
3. If the input is in Yudit format, converting it back always produces the
same output, provided the text has not been changed.
4. User-friendly, easy to use, easy to program.

The Yudit format:
=================
The document can be broken into two types of lines. The first type is a
whole line enclosed in an override pair:
<RLO> line <PDF>
<LRO> line <PDF>

RLO = right-to-left override U+202E
LRO = left-to-right override U+202D
PDF = pop directional formatting U+202C

The beginning of the line is aligned to the right or to the left accordingly.
An LRO should not be used at the beginning of the text to specify LR
direction, because LR is the default.
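
A minimal sketch of the line-level rule. The function names are hypothetical
and are not Yudit's API. The sketch assumes that a line is right-to-left only
when it is wrapped in RLO ... PDF, and that left-to-right lines are written
without an LRO wrapper (the text above only states this for the beginning of
the text).

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

def line_direction(line):
    """Return 'RL' for a line wrapped in RLO ... PDF, otherwise 'LR'."""
    if line.startswith(RLO) and line.endswith(PDF):
        return "RL"
    # Plain lines and LRO-wrapped lines are both left-to-right.
    return "LR"

def wrap_line(body, direction):
    """Wrap body for the given direction; LR is the default and gets no LRO."""
    if direction == "RL":
        return RLO + body + PDF
    return body

print(line_direction(wrap_line("abc", "RL")))  # RL
print(line_direction("abc"))                   # LR
```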

The second type of line consists of
<RLE> text <PDF>
<LRE> text <PDF>
text
where
RLE = right-to-left embedding U+202B
LRE = left-to-right embedding U+202A
PDF = pop directional formatting U+202C

The pairs can be nested. The same pair should not appear twice in a row,
that is, nested directly inside itself (just to keep the text output
consistent), and every pair should always contain some string.

These are invalid:
Text<RLE><PDF> Text
Text<RLE>a<RLE>b<PDF>c<PDF> Text

They should be converted to:
Text Text
Text<RLE>abc<PDF> Text
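
The two rules can be checked mechanically. The following is a minimal sketch,
not Yudit's implementation. It works on the angle-bracket notation used in
this document rather than on the real control characters, and the function
name is hypothetical. It assumes that "should not appear twice" means a pair
opened directly inside a pair of the same kind, and that a pair counts as
non-empty if it contains text or another pair.

```python
import re

# Tokens in the notation used by this document.
TOKEN = re.compile(r"<RLE>|<LRE>|<PDF>")

def is_valid_embedding(text):
    """Return True if text has no empty pair and no directly repeated pair."""
    stack = []             # currently open embeddings, innermost last
    last_end = 0           # end offset of the previous token
    last_was_open = False  # was the previous token an opening marker?
    for m in TOKEN.finditer(text):
        has_text = m.start() > last_end  # characters since the previous token
        tok = m.group()
        if tok == "<PDF>":
            if not stack:
                return False             # PDF without a matching opener
            if last_was_open and not has_text:
                return False             # empty pair
            stack.pop()
            last_was_open = False
        else:
            if stack and stack[-1] == tok:
                return False             # same pair nested directly in itself
            stack.append(tok)
            last_was_open = True
        last_end = m.end()
    return not stack                     # every pair must be closed

print(is_valid_embedding("Text<RLE><PDF> Text"))               # False
print(is_valid_embedding("Text<RLE>a<RLE>b<PDF>c<PDF> Text"))  # False
print(is_valid_embedding("Text<RLE>a<LRE>b<PDF>c<PDF> Text"))  # True
```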

Yudit will always write these two types of lines into files. When reading a
file, it first checks whether the file is Yudit-compliant. If it is, Yudit
uses it as is; if not, it tries to run the Unicode bi-di algorithm to produce
this format.

Screen Input
============
1. The line directionality is inherited from the previous line.
2. Embedding pairs can be created. Only pairs can be added and
deleted.

this is an <LRE>CIBARA<RLE>meaning arabic</RLE></LRE> text.
             M          M                    m     m 

Each M enters into a new embedding mode and each m exits an embedding mode.
</LRE> = PDF
</RLE> = PDF
in Unicode terms.
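
As a sketch of how the notation above maps to Unicode, the following converts
the example line into real control characters and counts the deepest nesting
level. The function names are hypothetical and only illustrate the mapping;
they are not part of Yudit.

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Both closing markers become PDF.
MARKUP = {"<LRE>": LRE, "<RLE>": RLE, "</LRE>": PDF, "</RLE>": PDF}

def to_unicode(marked):
    """Replace the markers of this document with Unicode control characters."""
    for tag, char in MARKUP.items():
        marked = marked.replace(tag, char)
    return marked

def max_depth(text):
    """Return the deepest embedding level reached in text."""
    depth = deepest = 0
    for ch in text:
        if ch in (LRE, RLE):   # an "M": enter a new embedding mode
            depth += 1
            deepest = max(deepest, depth)
        elif ch == PDF:        # an "m": exit the current embedding mode
            depth -= 1
    return deepest

sample = "this is an <LRE>CIBARA<RLE>meaning arabic</RLE></LRE> text."
encoded = to_unicode(sample)
print(max_depth(encoded))  # 2
```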

