Urdu font not rendering properly in the editor

Discussion:

Uwe Stöhr

2014-03-30 02:31:37 UTC

On 29.03.2014 10:07, Jamil Haider wrote:

> With 2.1 RC1 I am able to set Urdu as document language.

Good.

> However, cursor
> doesn't switch to Right to Left order. Also, each character appears as a
> single separate character with no "contextual form".

We will have a look. Could you nevertheless please report this bug also to our bugtracker?

> The good news is that
> each character is rendered properly.

Fine.

Many thanks for testing LyX. We implemented support for many new languages in LyX:
http://wiki.lyx.org/LyX/NewInLyX21
but could not stress-test this feature yet. We need native speakers like you to give us feedback.

best regards
Uwe

Uwe Stöhr

2014-03-30 13:33:07 UTC

On 30.03.2014 04:31, Uwe Stöhr wrote:

>> However, cursor
>> doesn't switch to Right to Left order. Also, each character appears as a
>> single separate character with no "contextual form".
>
> We will have a look.

We did not take into account that Urdu is a right-to-left language. It should work for you when you
replace the file "languages" that is in LyX's install folder under "Resources" with the one that I
attached to this mail. Does Urdu then work for you?
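
For context, the fix amounts to one flag per language entry in the plain-text "languages" file. The sketch below is only an illustration: the entry layout (`Language ... End` blocks, an `RTL true` line, `#` for comments) and the sample values are assumptions written from memory, not the contents of the attached file, and `parse_languages` and `is_rtl` are hypothetical helpers, not part of LyX.

```python
SAMPLE = '''\
# Assumed layout of a LyX "languages" file entry (illustrative only).
Language urdu
	GuiName          "Urdu"
	PolyglossiaName  urdu
	RTL              true
	LangCode         ur_PK
End

# A commented-out entry is simply ignored by the parser below.
#Language syriac
#	GuiName          "Syriac"
#	RTL              true
#End
'''


def parse_languages(text):
    """Return {language: {key: value}} for every uncommented entry."""
    entries, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(" ")
        if key == "Language":
            current = entries.setdefault(value.strip(), {})
        elif key == "End":
            current = None
        elif current is not None:
            current[key] = value.strip().strip('"')
    return entries


def is_rtl(entries, name):
    """True if the entry exists and carries an 'RTL true' flag."""
    return entries.get(name, {}).get("RTL") == "true"


langs = parse_languages(SAMPLE)
print(is_rtl(langs, "urdu"))    # the entry carries the flag
print("syriac" in langs)        # commented out, so not loaded
```

Because the file is plain text, a flag can be added or an entry commented out without recompiling anything.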

thanks and regards
Uwe

Jamil Haider

2014-03-30 19:04:27 UTC

Hi Uwe Stöhr, your supplied languages file fixed the RTL issue. However,
characters still do not appear in contextual form. Here is the ticket:

http://www.lyx.org/trac/ticket/9066

Regards
Jamil Haider

On Sun, Mar 30, 2014 at 6:33 PM, Uwe Stöhr <***@web.de> wrote:

> On 30.03.2014 04:31, Uwe Stöhr wrote:
>
>
> However, cursor
>>> doesn't switch to Right to Left order. Also, each character appears as a
>>> single separate character with no "contextual form".
>>>
>>
>> We will have a look.
>>
>
> We did not take into account that Urdu is a right-to-left language. It
> should work for you when you replace the file "languages" that is in LyX's
> install folder under "Resources" with the one that I attached to this mail.
> Does Urdu then work for you?
>
> thanks and regards
> Uwe
>

Uwe Stöhr

2014-03-30 22:43:37 UTC

On 30.03.2014 21:04, Jamil Haider wrote:

> Hi Uwe Stöhr, your supplied languages file fixed the RTL issue.

Fine, so one bug less.

> However,
> characters still do not appear in contextual form. Here is the ticket:
>
> http://www.lyx.org/trac/ticket/9066

Thanks, we will have a look.

regards
Uwe

Jamil Haider

2014-07-28 19:07:41 UTC

Hello all

I don't know if this is the best place to ask, but where can I get my hands on
a Windows build of master?

Regards

On Mon, Mar 31, 2014 at 3:43 AM, Uwe Stöhr <***@web.de> wrote:

> On 30.03.2014 21:04, Jamil Haider wrote:
>
>
> Hi Uwe Stöhr, your supplied languages file fixed the RTL issue.
>>
>
> Fine, so one bug less.
>
>
> However,
>> characters still do not appear in contextual form. Here is the ticket:
>>
>> http://www.lyx.org/trac/ticket/9066
>>
>
> Thanks, we will have a look.
>
> regards
> Uwe
>

Jean-Marc Lasgouttes

2014-07-28 19:29:33 UTC

On 28/07/14 21:07, Jamil Haider wrote:
> Hello all
>
> Don't know if it is best place to ask, but where can I get my hands on
> Windows build of the master ?

A few months ago, I would have said that you have to do it yourself, but
now I have a proper answer, thanks to Peter Kümmel.

Here: http://syntheticpp.github.io/LyX-bleeding-edge/

Have fun,
JMarc

Uwe Stöhr

2014-08-24 21:57:38 UTC

On 28.07.2014 at 21:07, Jamil Haider wrote:

> Don't know if it is best place to ask, but where can I get my hands on
> Windows build of the master ?

Hi Jamil,

do you need a Windows build of master to test?
If so I can upload one for you.

regards Uwe

Jean-Marc Lasgouttes

2014-10-13 16:51:13 UTC

On 28/07/2014 21:07, Jamil Haider wrote:
> Hello all
>
> Don't know if it is best place to ask, but where can I get my hands on
> Windows build of the master ?

Hello Jamil,

Did you manage to try it out?

JMarc

Uwe Stöhr

2014-03-30 13:38:17 UTC

When implementing support for the new languages in LyX 2.1, I forgot that Urdu and Syriac are
RTL languages: http://en.wikipedia.org/wiki/Right-to-left

The attached simple patch fixes that. I would like to have that in master because otherwise Urdu
support will not work as advertised in
http://wiki.lyx.org/LyX/NewInLyX21

thanks and regards
Uwe

Uwe Stöhr

2014-03-30 22:44:50 UTC

On 30.03.2014 15:38, Uwe Stöhr wrote:
> When implementing support for the new languages in LyX 2.1, I forgot that Urdu and Syriac are
> RTL languages: http://en.wikipedia.org/wiki/Right-to-left
>
> The attached simple patch fixes that.

The Urdu user confirmed that this patch fixes the problem for him. So can it go in?

regards Uwe

Vincent van Ravesteijn

2014-03-31 07:55:17 UTC

On Mon, Mar 31, 2014 at 12:44 AM, Uwe Stöhr <***@web.de> wrote:

> On 30.03.2014 15:38, Uwe Stöhr wrote:
>
> When implementing support for the new languages in LyX 2.1, I forgot that
>> Urdu and Syriac are
>> RTL languages: http://en.wikipedia.org/wiki/Right-to-left
>>
>> The attached simple patch fixes that.
>>
>
> The Urdu user confirmed that this patch fixes the problem for him. So can
> it go in?
>
> regards Uwe
>

You say: "I would like to have that in master because otherwise Urdu
support will not work as advertised in
http://wiki.lyx.org/LyX/NewInLyX21"

However, Urdu is still unusable even with your patch. Maybe it's better to
drop the support instead of advertising support that is not there ?

Vincent

Georg Baum

2014-03-31 19:46:46 UTC

Vincent van Ravesteijn wrote:

> However, Urdu is still unusable even with your patch. Maybe it's better to
> drop the support instead of advertising support that is not there ?

+1

IMHO we should not add any new language without a test by a native speaker.
There are so many languages with properties that can't be imagined by those
not knowing the language that it is impossible for us to tell whether a
newly implemented language works correctly. If no volunteer is available to
do at least brief testing then nobody needs this language in LyX.

Georg

Uwe Stöhr

2014-03-31 21:26:35 UTC

On 31.03.2014 21:46, Georg Baum wrote:

> IMHO we should not add any new language without a test by a native speaker.

My experience is that it works the opposite way: people see that LyX supports their language and
give it a try. It is only a dream that someone comes to us and says, please implement e.g. Xhosa and
I will test and compile it.

However, please note that the newly supported languages were already supported. They use the existing
XeTeX/polyglossia framework via the languages file. So in fact the "new thing" is that I added these
languages to the languages file.
The only mistake I made was that I forgot to set the RTL flag for 2 languages.

regards Uwe

Uwe Stöhr

2014-03-31 22:11:58 UTC

On 31.03.2014 23:26, Uwe Stöhr wrote:

> On 31.03.2014 21:46, Georg Baum wrote:
>
>> IMHO we should not add any new language without a test by a native speaker.
>
> My experience is that it works the opposite way: people see that LyX supports their language and
> give it a try. It is only a dream that someone comes to us and say, please implement e.g. Xhosa and
> I will test and compile it.

I should have also said that personally I would not use a program that does not support my language.
I mean how should I use LyX when I cannot write German with it: no German hyphenation, no
spell-checker, English terms like "Figure", etc.

I therefore strongly disagree that when nobody shouts nobody needs this for LyX. Take for example
Serbian (a complex language because of its dualism of Cyrillic and Latin script). It took a while until
I could provide full language support for Serbian in LyX on Windows, but now I count 2 - 3
downloads per week of Serbian dictionaries via LyX. I also got some feedback that it works and that
people started to use LyX at universities. That is a success that would not have been possible if we
had not released support for Serbian even though it was not bug-free at the beginning.

In Pakistan and the other regions where Urdu is used most people have other problems than to develop
software. However, when some Pakistani students see that LyX offers now some basic support
(spell-checker and hyphenation), we perhaps have the chance to attract people to sort out the font
issues that are apparently in LyX.

For the non-RTL languages I think LyX works properly. I got feedback from former students of mine
regarding Hindi and Tamil at the time I added support for them. But clearly, they will only test
it thoroughly with real-life documents when LyX 2.1 is out. Nevertheless, the Tamil guy already wrote
his bachelor thesis using LyX in English. If he forwards the info to his colleagues that LyX now has
support for Tamil, we can reach more people. (I know I am sometimes too enthusiastic ;-) )

best regards
Uwe

Georg Baum

2014-04-01 19:12:06 UTC

Uwe Stöhr wrote:

> I should have also said that personally I would not use a program that
> does not support my language. I mean how should I use LyX when I cannot
> write German with it: no German hyphenation, no spell-checker, English
> terms like "Figure", etc.

I agree that this is important for many people. I also agree that missing
languages should be added rather sooner than later. However, I do not agree
to do this in a way which violates basic software engineering principles: If
anybody implements a new feature it needs to be tested by somebody who has
enough knowledge of that feature, and is not the person who implemented it.
Omitting these tests leads to crappy software, and I don't want LyX to
become known as crappy.

Georg

Uwe Stöhr

2014-04-01 23:23:28 UTC

On 01.04.2014 21:12, Georg Baum wrote:

>> I should have also said that personally I would not use a program that
>> does not support my language. I mean how should I use LyX when I cannot
>> write German with it: no German hyphenation, no spell-checker, English
>> terms like "Figure", etc.
>
> I agree that this is important for many people. I also agree that missing
> languages should be added rather sooner than later. However, I do not agree
> to do this in a way which violates basic software engineering principles: If
> anybody implements a new feature it needs to be tested by somebody who has
> enough knowledge of that feature, and is not the person who implemented it.
> Omitting these tests leads to crappy software, and I don't want LyX to
> become known as crappy.

I understand your point and I would fully agree for a language used in a region where computers are
widespread and, more importantly, people can rely on electricity the whole day. During my travels I
learned that outside the Western world one cannot apply our principles. For example in Bolivia, most
people don't have the money to buy software. They mainly (still) use a cracked Win XP and open
source software because it is free. Most use old laptops so that they can also work for a while when
the power is down. For Internet access they go to an Internet store for an hour to send and
receive mails and to download programs. The students are however as clever as we are and do a lot to
improve programs they need/want. In Bolivia they can benefit from many programs because of Spanish.
For Urdu they will most probably not. So what are their options: to use a cracked old MS Word with
basic or crappy Urdu support or to give LyX a try. LyX will already provide spell-checking,
translations of words like "Figure" and hyphenation. The output looks fine as Jamil stated, only the
ligatures within LyX are not yet working. Well, that can be improved but one can already use LyX to
write texts.

In the Western world we have a relatively high level of expectations - when something is not working
perfectly, people tend to give up quickly, while in other parts of the world people are more thankful
that they have something at all that basically works for free and that they can get feedback and can
help to improve it. (Try that e.g. with MS Word - even if you have a support contract, your feature
request or bug report will not be fixed soon if your company is too small or you are a private
person. And only very few in non-Western countries have the money for a software support contract.)

OK, I have never been in a country where Urdu is spoken; Jamil could say more precisely how it is there.
I only wanted to explain why I think our strict rules are not helpful in every case for every
region. (Which does not mean that I want to weaken LyX's development rules in principle! I only refer
to language support.)

best regards Uwe

Georg Baum

2014-04-03 20:12:28 UTC

Uwe Stöhr wrote:

[snip a lot about third world problems]

> OK, I have never been in a country where Urdu is spoken, Jamil could state
> more precisely about it, I only wanted to explain why I think our strict
> rules are not helpful in every case for every region. (Which does not mean
> that I want to weak LyX's development rules in principle! I only refer to
> language support.)

I understand your motivation to help people in developing countries (and I
appreciate it), but please do not assume that your proposed solution of a
particular problem is the only one which is feasible. Claiming that LyX
supports a new language although it was not tested enough is an absolute
no-go. We cannot lie to our users.

However, if a new language which is not tested enough is clearly marked as
experimental I don't see any problem: People who want a stable solution are
warned that they should not use it, and people in desperate need of a
document processor in their native language can try it out and contribute
improvements.

Georg

Uwe Stöhr

2014-04-03 23:18:53 UTC

On 03.04.2014 22:12, Georg Baum wrote:

> I understand your motivation to help people in developing countries (and I
> appreciate it), but please do not assume that your proposed solution of a
> particular problem is the only one which is feasible. Claiming that LyX
> supports a new language although it was not tested enough is an absolute no-
> go. We cannot lie to our users.

I know this is a difficult issue; I just wanted to point out my thinking. I don't claim that this is the
best way to go. I only think for some cases we need more flexibility. We are not lying to our users
if something is not working perfectly. Software can always have bugs. Users who don't use Urdu won't
be affected so that we still have a stable release for the vast majority.

> However, if a new language which is not tested enough is clearly marked as
> experimental I don't see any problem: People who want a stable solution are
> warned that they should not use it, and people in desperate need of a
> document processor in their native language can try it out and contribute
> improvements.

I could add a note to
http://wiki.lyx.org/LyX/NewInLyX21
that fonts within LyX don't yet have ligatures for Urdu.
Should I?

btw. Jamil will help us out testing and implementing Urdu screen fonts. Great!

regards Uwe

Georg Baum

2014-04-08 20:53:37 UTC

Uwe Stöhr wrote:

> On 03.04.2014 22:12, Georg Baum wrote:
>
>> I understand your motivation to help people in developing countries (and
>> I appreciate it), but please do not assume that your proposed solution of
>> a particular problem is the only one which is feasible. Claiming that LyX
>> supports a new language although it was not tested enough is an absolute
>> no- go. We cannot lie to our users.
>
> I know this is a difficult issue I just wanted to point out my thinking. I
> don't claim that this the best way to go. I only think for some cases we
> need more flexibility. We are not lying to our users if something is not
> working perfectly.

If we were not aware of problems (or only of minor problems) I'd agree, but
Jean-Marc clearly stated that there are fundamental problems. He wrote on
the first of April "So Uwe, there is no chance that Urdu can work right now.".
Therefore I still maintain that we are currently lying.

> Software can always have bugs. Users who don't use Urdu
> won't be affected so that we still have a stable release for the cast
> majority.

This does not contradict my request: If the Urdu language is marked as
experimental then everybody not using Urdu knows that he does not need to
care.

>> However, if a new language which is not tested enough is clearly marked
>> as experimental I don't see any problem: People who want a stable
>> solution are warned that they should not use it, and people in desperate
>> need of a document processor in their native language can try it out and
>> contribute improvements.
>
> I could add a note to
> http://wiki.lyx.org/LyX/NewInLyX21
> that fonts within LyX don't yet have ligatures for Urdu
> Should I?

Yes please, and please also write that there are very likely more
fundamental problems.

> btw. Jamil will help us out testing and implementing Urdu screen fonts.
> Great!

Yes, indeed!

Georg

Uwe Stöhr

2014-04-08 21:56:11 UTC

On 04.04.2014 08:25, Vincent van Ravesteijn wrote:

> Sorry, but there is nothing about Urdu that we support (except an
> entry in the languages file),

And with this we support hyphenation, automatic translations of words like "figure" etc. So as
Jamil stated, the output is correct.
What is not working yet are the special ligatures in the font within LyX. As I said, Jamil volunteered
to get that fixed soon. I am not the expert but JMarc said this could probably be done in the 2.1.x
cycle. So let's at least try this. I think that Jamil will also translate layouttranslations etc. in
the 2.1 cycle.

> and claiming that there is a bug that
> software can always have is an understatement if it appears that you
> didn't even had a clue of how Urdu is written.

I only forgot the RTL flag in the languages file. Yes, that was a stupid oversight but I don't
understand why I am not allowed to fix that for 2.1.0.

It is not necessary to know how a language works to support it. I added support for Latvian,
Vietnamese etc. and I don't know much about these languages. But I know from users that they use LyX
with these languages. And that was the goal!
Btw., for Hindi I asked a friend from India and this seems to work.

regards Uwe

Vincent van Ravesteijn

2014-04-09 08:49:35 UTC

On Tue, Apr 8, 2014 at 11:56 PM, Uwe Stöhr <***@web.de> wrote:
> On 04.04.2014 08:25, Vincent van Ravesteijn wrote:
>
>
>> Sorry, but there is nothing about Urdu that we support (except an
>> entry in the languages file),
>
>
> And with this we support hyphenations, automatic translations of words like
> "figure" etc. So as Jamil stated, the output is correct.
> What is not working yet are the special ligatures in the font within LyX. As
> said Jamil volunteered to get that fixed soon. I am not the expert but JMarc
> said this could probably be done in the 2.1.x cycle. So let's at least try
> this. I think that Jamil will also translate layouttranslations etc. in the
> 2.1 cycle.
>
>
>> and claiming that there is a bug that
>>
>> software can always have is an understatement if it appears that you
>> didn't even had a clue of how Urdu is written.
>
>
> I only forgot the RTL flag in the languages file. Yes, that was a stupid
> oversight but I don't understand why I am not allowed to fix that for 2.1.0.

I don't understand why you keep saying that only the RTL flag is
missing, while we have made it clear to you that there is so much more
missing. It's a bit shocking you didn't even know that Arabic
characters change shape depending on whether they are connected to the
character before, after, both, or to none. And even when you had the
two examples next to each other on screen, you still didn't notice
this difference.
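
The four shapes mentioned here can be inspected directly in the Unicode data. The following is a minimal Python sketch, assuming only the standard `unicodedata` module; `contextual_forms` is a hypothetical helper name and has nothing to do with LyX's own code. It collects the presentation-form code points that decompose to a single base letter.

```python
import unicodedata

FORM_TAGS = ("<isolated>", "<initial>", "<medial>", "<final>")


def contextual_forms(letter):
    """Map form name -> presentation-form character for one base letter."""
    forms = {}
    # Arabic Presentation Forms-A (U+FB50..U+FDFF) and -B (U+FE70..U+FEFF).
    for cp in list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00)):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-letter forms such as "<initial> 0628".
        if len(parts) == 2 and parts[0] in FORM_TAGS:
            if int(parts[1], 16) == ord(letter):
                forms[parts[0].strip("<>")] = chr(cp)
    return forms


beh = contextual_forms("\u0628")    # ARABIC LETTER BEH
alef = contextual_forms("\u0627")   # ARABIC LETTER ALEF
for form, ch in sorted(beh.items()):
    print(form, f"U+{ord(ch):04X}", unicodedata.name(ch))
print(sorted(alef))
```

BEH has all four forms, while ALEF only has an isolated and a final form because it never connects to the following letter.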

I can easily allow you to fix the flags, but we also agreed on not
advertising unusable languages. And still, I see on the NewInLyX21
page that we now support "Urdu" and "Syriac". So, as long as you don't
act on feedback from others and, even worse, you seem to forget what
others tell you, you cannot expect others to act on yours.

Vincent

Jürgen Spitzmüller

2014-04-09 08:56:08 UTC

2014-04-09 10:49 GMT+02:00 Vincent van Ravesteijn:

> I don't understand why you keep saying that only the RTL flag is
> missing, while we have made it clear to you that there is so much more
> missing. It's a bit shocking you didn't even know that arabic
> characters change shape depending on whether they are connected to the
> character before, after, both, or to none. And even when you had the
> two examples next to each other on screen, you still didn't notice
> this difference.
>
> I can easily allow you to fix the flags, but we also agreed on not
> advertizing unusable languages. And still, I see on the NewInLyX21
> page that we now support "Urdu" and "Syriac". So, as long as you don't
> act on feedback from others and, even worse, you seem to forget what
> others tell you, you cannot expect others to act on yours.
>

I think this is a case similar to the BibTeX errors: we are not sure that
this is ready for prime time (or even think that it is not).

I propose the following:

Disable this feature for the 2.1 release. If you just comment out the
language entry, Jamil can easily enable it again on his side (without
recompiling) and help us testing and finishing the support for this
language.

I think this is a reasonable compromise.

Jürgen

>
> Vincent
>

Vincent van Ravesteijn

2014-04-09 19:09:48 UTC

>
> I propose the following:
>
> Disable this feature for the 2.1 release. If you just comment out the
> language entry, Jamil can easily enable it again on his side (without
> recompiling) and help us testing and finishing the support for this
> language.
>
> I think this is a reasonable compromise.
>
>

I agree and I did so.

Vincent

Uwe Stöhr

2014-04-10 21:09:58 UTC

On 09.04.2014 10:56, Jürgen Spitzmüller wrote:

> I propose the following:
>
> Disable this feature for the 2.1 release. If you just comment out the
> language entry, Jamil can easily enable it again on his side (without
> recompiling) and help us testing and finishing the support for this
> language.

Fine with me.

regards Uwe

Uwe Stöhr

2014-04-10 21:09:14 UTC

On 09.04.2014 10:49, Vincent van Ravesteijn wrote:

> I don't understand why you keep saying that only the RTL flag is
> missing, while we have made it clear to you that there is so much more
> missing.

I said how I see it and also said that you all should decide. If we should not advertise Urdu as a new
language for LyX 2.1, then please remove it from our announcement.
The RTL flag can nevertheless be corrected and that was what I was referring to.

> It's a bit shocking you didn't even know that arabic
> characters change shape depending on whether they are connected to the
> character before, after, both, or to none. And even when you had the
> two examples next to each other on screen, you still didn't notice
> this difference.

I sent you a screenshot because I was unsure. I also said from the beginning that I am aware
that the ligatures don't work and that is why I invited Jamil, who kindly already volunteered. JMarc
will have a look at what we can do about this issue for 2.1.x. So everything should be fine.
(It might shock you again ;-): I once helped to implement Arabic for LyX while not knowing how this
language works.)

regards Uwe

Jean-Marc Lasgouttes

2014-04-09 08:57:02 UTC

08/04/2014 23:56, Uwe Stöhr:
> So as Jamil stated, the output is correct. What is not working yet
> are the special ligatures in the font within LyX. As said Jamil
> volunteered to get that fixed soon. I am not the expert but JMarc
> said this could probably be done in the 2.1.x cycle.

I have to retract that. From what I understand there are a lot of
ligatures in Urdu and we cannot make them work "by hand". This will have
to be done in the str-metrics branch (by leveraging Qt to do the work),
which is 2.2 material.

JMarc

Vincent van Ravesteijn

2014-04-09 09:06:18 UTC

On Wed, Apr 9, 2014 at 10:57 AM, Jean-Marc Lasgouttes
<***@lyx.org> wrote:
> 08/04/2014 23:56, Uwe Stöhr:
>
>> So as Jamil stated, the output is correct. What is not working yet
>> are the special ligatures in the font within LyX. As said Jamil
>> volunteered to get that fixed soon. I am not the expert but JMarc
>> said this could probably be done in the 2.1.x cycle.
>
>
> I have to retract that. From what I understand there are a lot of ligatures
> in Urdu and we cannot make them work "by hand". This will have to be done in
> the str-metrics branch (by leveraging Qt to do the work), which is 2.2
> material.
>

Am I misunderstanding or will we have a 90% solution if we just assume
that the script is the same as Arabic? Then there are only a few
exceptions that might need special attention?

Vincent

Jean-Marc Lasgouttes

2014-04-09 10:33:49 UTC

09/04/2014 11:06, Vincent van Ravesteijn:
> Am I misunderstanding or will we have a 90% solution if we just assume
> that the script is the same as arabic ? Then there are only a few
> exceptions that might need special attention ?

The problem is to have the equivalent of Encodings::transformChar (and
of the arabic_table[172] array) for Urdu. If my plans go well, we will
be able to get rid of all this stuff and let Qt handle it all. But of
course, this will require lots of testing from all the RtL users we can
find.

Note that this is not a requirement for the str-metrics branch, and that
I will probably work on it in a separate branch once str-metrics has landed.

JMarc

Vincent van Ravesteijn

2014-04-09 10:47:47 UTC

On Wed, Apr 9, 2014 at 12:33 PM, Jean-Marc Lasgouttes
<***@lyx.org> wrote:
> 09/04/2014 11:06, Vincent van Ravesteijn:
>
>> Am I misunderstanding or will we have a 90% solution if we just assume
>> that the script is the same as arabic ? Then there are only a few
>> exceptions that might need special attention ?
>
>
> The problem is to have the equivalent of Encodings::transformChar (and of
> the arabic_table[172] array) for Urdu.

Urdu uses the same script, so the same arabic_table can be reused,
just as the table is also used for Farsi.

Some specific Urdu letters are not implemented in this table, but it
is almost trivial to fix that.
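
To illustrate what a table-driven approach involves, here is a rough Python sketch. It is not LyX's `arabic_table` or `Encodings::transformChar`; it builds a comparable lookup from Unicode's own decomposition data (so Urdu letters such as TTEH come for free) and picks a form from the neighbouring letters. All function names are hypothetical, and diacritics and ligatures are deliberately ignored.

```python
import unicodedata


def build_form_table():
    """base code point -> {form name: presentation code point}."""
    table = {}
    for cp in list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00)):
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[0] in ("<isolated>", "<initial>",
                                            "<medial>", "<final>"):
            table.setdefault(int(parts[1], 16), {})[parts[0][1:-1]] = cp
    return table


TABLE = build_form_table()


def joins_next(ch):
    """A letter can connect to its successor if it has an initial form."""
    return "initial" in TABLE.get(ord(ch), {})


def joins_prev(ch):
    """A letter can connect to its predecessor if it has a final form."""
    return "final" in TABLE.get(ord(ch), {})


def shape(text):
    """Replace each letter by the form its neighbours call for."""
    out = []
    for i, ch in enumerate(text):
        forms = TABLE.get(ord(ch))
        if not forms:
            out.append(ch)          # not an Arabic-script letter we know
            continue
        prev_ok = i > 0 and joins_next(text[i - 1]) and joins_prev(ch)
        next_ok = (i + 1 < len(text) and joins_prev(text[i + 1])
                   and joins_next(ch))
        if prev_ok and next_ok:
            form = "medial"
        elif prev_ok:
            form = "final"
        elif next_ok:
            form = "initial"
        else:
            form = "isolated"
        out.append(chr(forms.get(form, ord(ch))))
    return "".join(out)


# TTEH (U+0679) is one of the Urdu letters beyond the basic Arabic set.
shaped = shape("\u0679\u0628\u0627")   # TTEH, BEH, ALEF in logical order
print([unicodedata.name(c) for c in shaped])
```

The sketch only covers letter joining; the many Urdu ligatures would still need separate handling.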

Vincent

Jean-Marc Lasgouttes

2014-04-09 11:40:51 UTC

Jean-Marc Lasgouttes

2014-04-11 10:11:06 UTC

09/04/2014 12:47, Vincent van Ravesteijn:
> Urdu uses the same script, so the same arabic_table can be reused,
> just as the table is also used for Farsi.
>
> Some specific Urdu letters are not implemented in this table, but it
> is almost trivial to fix that.

... or we could do like the following patch, (for the str-metrics
branch). This patch uses Qt for drawing RTL strings and _seems_ to work.
However, I did not commit it because
1/ I do not know what I am doing
and
2/ I have a "better" version (see the code), but it does not work.

The test for RTL characters (checking whether a character is in the Arabic or
Hebrew ranges) is probably a bit rough around the edges, but I
think it is possible to get it right. The trick is that we take into
account the layout direction of the characters themselves, not only of
the language.
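
As an illustration of the difference between a range check and a per-character direction check, here is a small Python sketch. It is not the patch discussed in this mail; it only uses the bidirectional class from the standard `unicodedata` module, and the range bounds in `in_rtl_ranges` are an assumption (basic Hebrew and Arabic blocks only).

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}   # Hebrew-like and Arabic-like strong RTL classes


def is_rtl_char(ch):
    """True if the character itself has a strong right-to-left direction."""
    return unicodedata.bidirectional(ch) in RTL_CLASSES


def in_rtl_ranges(ch):
    """Rough range test: basic Hebrew and Arabic blocks only."""
    return 0x0590 <= ord(ch) <= 0x06FF


# Latin a, Hebrew ALEF, Arabic BEH, Urdu TTEH, Syriac ALAPH, digit 1
for ch in ("a", "\u05D0", "\u0628", "\u0679", "\u0710", "1"):
    print(f"U+{ord(ch):04X}", is_rtl_char(ch), in_rtl_ranges(ch))
```

The Syriac letter shows why a pure range test is rough: it is right-to-left according to its character properties but lies outside the two blocks checked.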

The nice and scary property of this patch is that it will lead us to
kill our carefully hand-tuned Arabic and Hebrew support (for display).
I would appreciate testing or some comments before applying it to my
branch. I only tried it visually on one Hebrew and one Farsi document.

Note that at this point, I do not intend to implement perfect bidi
handling, but only to re-implement what we already have using
string-level metrics. So please do not try to lure me into
implementing the full specs. The goal is just (for now) to find what I'd
call a Gudinov algorithm, that is something that does not regress wrt
the current implementation.

Comments welcome.

JMarc

Jean-Marc Lasgouttes

2014-04-17 13:20:39 UTC

A cleaned-up version of this patch has been committed to
features/str-metrics now. I encourage every person who feels bored
enough to take a look at this branch and report bugs. At this point, it
is mostly feature-complete. I'd like to fix the most embarrassing bugs
that might be found before landing.

I would in particular be interested to know whether the weird languages
(that is, weirder than French is) cause problems.

The only problem that I know of is that the screen moves when selecting,
which is due to some ligature/kerning effects. We will have to discuss
whether we want to fix it (at the price of dropping Color_selectiontext,
which may annoy people with strange color schemes).

JMarc

11/04/2014 12:11, Jean-Marc Lasgouttes:
> ... or we could do like the following patch, (for the str-metrics
> branch). This patch uses Qt for drawing RTL strings and _seems_ to work.
> However, I did not commit it because
> 1/ I do not know what I am doing
> and
> 2/ I have a "better" version (see the code), but it does not work.
>
> The tests for RTL characters (check whether it is in the Arabic or
> Hebrew ranges) is probably a bit rough around the edges, but I
> think it is possible to get it right. The trick is that we take into
> account the layout direction of the characters themselves, not only of
> the language.
>
> The nice and scary property of this patch is that it will lead us to
> kill our carefully hand tuned Arabic and Hebrew support (for display).
> I would appreciate to testing or some comments before applying it to my
> branch. I only tried it visually on one Hebrew and one Farsi document.
>
> Note that at this point, I do not intend to implement perfect bidi
> handling, but only to re-implement what we already have using
> string-level metrics. So please do not try to lure me into implementing
> implementing the full specs. The goal is just (for now) to find what I'd
> call a Gudinov algorithm, that is something that does not regress wrt
> the current implementation.

Georg Baum

2014-04-01 19:01:32 UTC

Uwe Stöhr wrote:

> On 31.03.2014 21:46, Georg Baum wrote:
>
>> IMHO we should not add any new language without a test by a native
>> speaker.
>
> My experience is that it works the opposite way: people see that LyX
> supports their language and give it a try. It is only a dream that someone
> comes to us and say, please implement e.g. Xhosa and I will test and
> compile it.

Well, what I am trying to say is that it is almost impossible to support a
language which we do not currently support in a way which is actually
_usable_, unless somebody with very good knowledge of that language tests
what we implement. Without such a test I cannot claim with good conscience
that LyX supports the language. Copy-pasting from Wikipedia is not enough.

I do not say that adding new languages is bad. I only do not want that new
languages are implemented in a way that _almost always_ will disappoint
users.

> However, please note that the new supported languages were already
> supported.

No. A language which is not in the languages file is not supported by
definition, even if a lot of code is already prepared for it.

> They use the existing XeTeX/polyglossia framework via the
> languages file. So in fact the "new thing" is that I added these languages
> to the languages file.

This changed the languages from "unsupported" to "supported".

> The only mistake I made was that I forgot to set
> the RTL flag for 2 languages.

Which would have been found if these languages had actually been tested by
somebody who knows them sufficiently well.

Georg

Uwe Stöhr

2014-03-31 21:04:53 UTC

On 31.03.2014 09:55, Vincent van Ravesteijn wrote:

> You say: "I would like to have that in master because otherwise Urdu
> support will not work as advertised in
> http://wiki.lyx.org/LyX/NewInLyX21"
>
> However, Urdu is still unusable even with your patch.

How do you come to the conclusion?
My patch fixes an obvious oversight from my side and note that it also fixes Syriac. So I don't see
why this cannot go in. (It would embarrass me when LyX 2.1 comes with such a stupid mistake from my
side.)

The other problem the Urdu user has is about his special font. For me it works when I copy some Urdu
text from Wikipedia and use Windows 7 standard fonts. So the bug is about his special font -
this can be sorted out later (if LyX is really to blame here).

> Maybe it's better to
> drop the support instead of advertising support that is not there?

That was meant ironically, right? If not, why should I remove a feature when I can fix it with a 2-liner
patch of a text file?

regards Uwe

p.s. In German we have a saying "Ironie im Text zieht nie!" (irony in texts will not be recognized
as such)

Vincent van Ravesteijn

2014-03-31 21:25:00 UTC

Uwe Stöhr schreef op 31-3-2014 23:04:
> Am 31.03.2014 09:55, schrieb Vincent van Ravesteijn:
>
>> You say: "I would like to have that in master because otherwise Urdu
>> support will not work as advertised in
>> http://wiki.lyx.org/LyX/NewInLyX21"
>>
>> However, Urdu is still unusable even with your patch.
>
> How do you come to the conclusion?

I draw this conclusion because a user writing Urdu has made a bug report
that all characters appear as separate characters and they do not appear
in their contextual form. Jean-Marc confirmed that this is hardcoded for
Arabic and Hebrew. Also, when I type Arabic characters in a document
with the language Urdu, they do not appear correctly. Also, when opening
the test file the characters are wrong.

> My patch fixes an obvious oversight from my side and note that it also
> fixes Syriac. So I don't see why this cannot go in. (It would
> embarrass me when LyX 2.1 comes with such a stupid mistake from my side.)

Please read what I write. I believe that Urdu support is not fully
working, so I just asked the question: is it better to advertise
non-working Urdu support, or is it better not to advertise it at all?

>
> The other problem the Urdu user has is about his special font. For me
> it works when I copy some Urdu text from Wikipedia and use
> Windows 7 standard fonts. So the bug is about his special font - this
> can be sorted out later (if LyX is really to blame here).

Please show me that it works for you. You can make a screenshot or
whatever. At least, I don't see the correct characters whatever I try to.

>
>> Maybe it's better to
>> drop the support instead of advertising support that is not there?
>
> That was meant ironically, right? If not, why should I remove a feature
> when I can fix it with a 2-liner patch of a text file?

No, I was serious. You should not advertise a feature if it doesn't work.

>
> regards Uwe
>
> p.s. In German we have a saying "Ironie im Text zieht nie!" (irony in
> texts will not be recognized as such)

Please read what I write and don't read what I don't write.

Vincent

Uwe Stöhr

2014-03-31 21:34:57 UTC

Am 31.03.2014 23:25, schrieb Vincent van Ravesteijn:

>> How do you come to the conclusion?
>
> I draw this conclusion because a user writing Urdu has made a bug report that all characters appear
> as separate characters and they do not appear in their contextual form. Jean-Marc confirmed that
> this is hardcoded for Arabic and Hebrew. Also, when I type Arabic characters in a document with the
> language Urdu, they do not appear correctly. Also, when opening the test file the characters are wrong.

Then we should indeed remove Syriac and Urdu from languages until we support its RTL.

> Please show me that it works for you. You can make a screenshot or whatever. At least, I don't see
> the correct characters whatever I try to.

Attached is a screenshot. I am not an expert but for me it looks OK. Or what do I miss?

regards Uwe

Vincent van Ravesteijn

2014-03-31 21:43:23 UTC

Op 31 mrt. 2014 23:35 schreef "Uwe Stöhr" <***@web.de>:
>
> Am 31.03.2014 23:25, schrieb Vincent van Ravesteijn:
>
>
>>> How do you come to the conclusion?
>>
>>
>> I draw this conclusion because a user writing Urdu has made a bug report
>> that all characters appear as separate characters and they do not appear
>> in their contextual form. Jean-Marc confirmed that this is hardcoded for
>> Arabic and Hebrew. Also, when I type Arabic characters in a document with
>> the language Urdu, they do not appear correctly. Also, when opening the
>> test file the characters are wrong.
>
> Then we should indeed remove Syriac and Urdu from languages until we
> support its RTL.
>
>> Please show me that it works for you. You can make a screenshot or
>> whatever. At least, I don't see the correct characters whatever I try to.
>
> Attached is a screenshot. I am not an expert but for me it looks OK. Or
> what do I miss?
>
> regards Uwe

The image has too low resolution to see text.

Vincent

Uwe Stöhr

2014-03-31 21:47:45 UTC

Am 31.03.2014 23:43, schrieb Vincent van Ravesteijn:

> The image has too low resolution to see text.

It has 1980 x 1080 pixels at 96 dpi - that is the resolution of my monitor. I'll re-send you the
image in a private mail again. Maybe your newsreader mangled it.

regards Uwe

Vincent van Ravesteijn

2014-03-31 21:56:55 UTC

Op 31 mrt. 2014 23:35 schreef "Uwe Stöhr" <***@web.de>:
>
> Am 31.03.2014 23:25, schrieb Vincent van Ravesteijn:
>
>
>>> How do you come to the conclusion?
>>
>>
>> I draw this conclusion because a user writing Urdu has made a bug report
>> that all characters appear as separate characters and they do not appear
>> in their contextual form. Jean-Marc confirmed that this is hardcoded for
>> Arabic and Hebrew. Also, when I type Arabic characters in a document with
>> the language Urdu, they do not appear correctly. Also, when opening the
>> test file the characters are wrong.
>
> Then we should indeed remove Syriac and Urdu from languages until we
> support its RTL.
>

Rtl and contextual characters are separate issues.

Vincent

Uwe Stöhr

2014-03-31 22:19:40 UTC

Am 31.03.2014 23:56, schrieb Vincent van Ravesteijn:

> Rtl and contextual characters are separate issues.

And what should we do now?
Do you see the bug in my screenshot or is this OK for you?

regards Uwe

Vincent van Ravesteijn

2014-03-31 22:48:24 UTC

Op 1 apr. 2014 00:19 schreef "Uwe Stöhr" <***@web.de>:
>
> Am 31.03.2014 23:56, schrieb Vincent van Ravesteijn:
>
>
>> Rtl and contextual characters are separate issues.
>
>
> And what should we do now?
> Do you see the bug in my screenshot or is this OK for you?
>
> regards Uwe

As I said. I cannot see the text in your image, so I cannot tell.

Vincent

Vincent van Ravesteijn

2014-04-01 07:28:48 UTC

On Tue, Apr 1, 2014 at 12:19 AM, Uwe Stöhr <***@web.de> wrote:

> Am 31.03.2014 23:56, schrieb Vincent van Ravesteijn:
>
>
>> Rtl and contextual characters are separate issues.
>
> And what should we do now?
> Do you see the bug in my screenshot or is this OK for you?
>
> regards Uwe
>

I cut out two words from both the Wikipedia text and the text in LyX. Are
you sure that the words in the left column are the same as those in the
right column? Especially pay attention to the middle letter of the top
word, and the starting letter (the rightmost one) of the word at the
bottom. To me they look very different.

Vincent

Jean-Marc Lasgouttes

2014-04-01 08:10:57 UTC

01/04/2014 09:28, Vincent van Ravesteijn:
> I cut out two words, both in the wikipedia text, and the text in LyX.
> Are you sure that the words in the left column are the same as those in
> the right column ? Especially pay attention to the middle letter of the
> top word, and the starting letter (the rightmost one) of the word at the
> bottom. To me it looks very different.

Currently RtL text is painted character by character, which means that
all character ligatures are broken. For Arabic and Hebrew, this is
handled explicitly by functions like Encodings::transformChar.
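
To make the problem concrete: a letter in the Arabic script takes an isolated, initial, medial or final shape depending on its neighbours, so code that paints one character at a time has to pick that shape itself. The following is only a minimal sketch of that idea in Python, not LyX's `Encodings::transformChar`. It guesses the joining behaviour from the names of the Unicode presentation forms, treats combining marks as breaking a join, and misses letters that lack a complete set of presentation forms, so it should be read as an illustration rather than a shaping engine.

```python
import unicodedata

def form(ch, kind):
    """Return the presentation form of ch (kind: ISOLATED, INITIAL, MEDIAL, FINAL), or None."""
    name = unicodedata.name(ch, "")
    if not name:
        return None
    try:
        return unicodedata.lookup(f"{name} {kind} FORM")
    except KeyError:
        return None

def joins_to_previous(ch):
    # Assumption: a letter can connect to the letter before it iff it has a final form.
    return form(ch, "FINAL") is not None

def joins_to_next(ch):
    # Assumption: a letter can connect to the letter after it iff it has an initial form.
    return form(ch, "INITIAL") is not None

def shape(text):
    """Replace each letter by the presentation form that fits its neighbours."""
    out = []
    for i, ch in enumerate(text):
        before = i > 0 and joins_to_next(text[i - 1]) and joins_to_previous(ch)
        after = i + 1 < len(text) and joins_to_next(ch) and joins_to_previous(text[i + 1])
        if before and after:
            kind = "MEDIAL"
        elif before:
            kind = "FINAL"
        elif after:
            kind = "INITIAL"
        else:
            kind = "ISOLATED"
        out.append(form(ch, kind) or ch)  # leave the character alone if no form exists
    return "".join(out)

for shaped in shape("كتب"):
    print(unicodedata.name(shaped))
```

Painted without such a step, every letter shows up in its isolated shape, which matches the "single separate character" symptom reported at the start of the thread.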

As part of the str-metrics branch, I have good hope of being able to
draw RtL text word-wise and thus rely on Qt (or rather ICU if I
understand correctly) to do the right thing. But this is definitely 2.2
stuff.

Currently, we have in the code many "if (hebrew)" or "if (arabic)"
constructs that should be audited and removed if we want to have a
chance to make Urdu work.

What I would need is someone who understands all (some of) these
languages and can test the features/str-metrics branch to tell me what
works and what does not work (note that right now the display is not
correct for RtL languages).

So Uwe, there is no chance that Urdu can work right now.

JMarc

Vincent van Ravesteijn

2014-04-01 08:46:42 UTC

On Tue, Apr 1, 2014 at 10:10 AM, Jean-Marc Lasgouttes <***@lyx.org> wrote:

> 01/04/2014 09:28, Vincent van Ravesteijn:
>
>> I cut out two words, both in the wikipedia text, and the text in LyX.
>> Are you sure that the words in the left column are the same as those in
>> the right column ? Especially pay attention to the middle letter of the
>> top word, and the starting letter (the rightmost one) of the word at the
>> bottom. To me it looks very different.
>>
>
> Currently RtL text is painted character by character, which means that all
> character ligatures are broken. For Arabic and Hebrew, this is handled
> explicitly by functions like Encodings::transformChar.
>
> As part of the str-metrics branch, I have good hope of being able to draw
> RtL text word-wise and thus rely on Qt (or rather ICU if I understand
> correctly) to do the right thing. But this is definitely 2.2 stuff.
>
> Currently, we have in the code many "if (hebrew)" or "if (arabic)"
> constructs that should be audited and removed if we want to have a chance
> to make Urdu work.
>

I somehow remember that Farsi did work? Or do I remember incorrectly?
Basically they use the same script, so the "if (arabic)" clause might
handle all of Arabic, Farsi, Urdu.
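
One way to picture this idea is to key the special handling on the script of the text rather than on the language name, so that Arabic, Farsi and Urdu all take the same path. The sketch below is only an illustration in Python, not LyX code. The helper names `script_of` and `needs_arabic_shaping` are invented here, and taking the first word of a character's Unicode name is a crude stand-in for a real script lookup.

```python
import unicodedata

def script_of(text):
    """Guess the script from the Unicode name of the first letter in text."""
    for ch in text:
        if ch.isalpha():
            # e.g. "ARABIC LETTER FEH" -> "ARABIC"
            return unicodedata.name(ch, "UNKNOWN").split()[0]
    return "UNKNOWN"

def needs_arabic_shaping(text):
    """Hypothetical replacement for a per-language 'if (arabic)' test."""
    return script_of(text) == "ARABIC"

# Sample words: the language names written in their own script.
samples = {
    "arabic": "العربية",
    "farsi": "فارسی",
    "urdu": "اردو",
    "hebrew": "עברית",
}

for lang, sample in samples.items():
    print(lang, script_of(sample), needs_arabic_shaping(sample))
```

Whether the existing Arabic code already covers the extra letters that Farsi and Urdu use is a separate question that such a test does not answer.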

> What I would need is someone who understands all (some of) these languages
> and can test the features/str-metrics branch to tell me what works and what
> does not work (note that right now the display is not correct for RtL
> languages).
>

I know how to read and write the arabic script (though I don't know many
words).

Vincent

Jean-Marc Lasgouttes

2014-04-01 09:26:22 UTC

01/04/2014 10:46, Vincent van Ravesteijn:
>> Currently, we have in the code many "if (hebrew)" or "if (arabic)"
>> constructs that should be audited and removed if we want to have a
>> chance to make Urdu work.
>
> I somehow remember that Farsi did work? Or do I remember incorrectly?
> Basically they use the same script, so the "if (arabic)" clause might
> handle all of Arabic, Farsi, Urdu.

Indeed Farsi is handled. But what I do not know is what characters have
to be handled specially. But indeed it should not be too difficult to
add support for Urdu in 2.1.x timeframe by adding some code similar to
what is done in Arabic. But do we want to do that? And who would be our
local expert?

One problem is that, from what I understand, some of this Hebrew/Arabic
special casing has been added just to be sure to avoid breaking LtR
code. So some of it may be useless.

>
>> What I would need is someone who understands all (some of) these
>> languages and can test the features/str-metrics branch to tell me
>> what works and what does not work (note that right now the display
>> is not correct for RtL languages).
>
> I know how to read and write the Arabic script (though I don't know many
> words).

Very good, it will be useful (although I do not understand why you have
this peculiar piece of knowledge...). I am not sure when I will be able
to have hopefully correct RtL painting, but I'll try to do that ASAP.

And then implementing Urdu should be easy.

JMarc

Uwe Stöhr

2014-04-02 22:50:30 UTC

Am 02.04.2014 07:39, schrieb JeanMarc Lasgouttes:

> OK, except that I never stated that I would find time to work on it soon.

Sorry, then I misread your post:

>>> But indeed it should not be too difficult to add support for Urdu in
>>> 2.1.x timeframe by adding some code similar to what is done in Arabic.

However, I asked Jamil if he could help us out as our local tester. If not for 2.1.x, we will
hopefully get it to work with his help for 2.2.

regards Uwe

Uwe Stöhr

2014-04-01 22:44:54 UTC

Am 01.04.2014 11:26, schrieb Jean-Marc Lasgouttes:

> Indeed Farsi is handled. But what I do not know is what characters have to be handled specially.
> But indeed it should not be too difficult to add support for Urdu in 2.1.x timeframe by adding
> some code similar to what is done in Arabic. But do we want to do that? And who would be our
> local expert?

My proposal after reading today's posts: I commit my language file patch (to 2.1.-staging or master
as Vincent decides) to give Jamil a version with which he can start to work. I will now also
encourage Jamil to help us, to tell us which special ligatures exist in Urdu, and to test JMarc's work.

OK?

regards Uwe

JeanMarc Lasgouttes

2014-04-02 05:39:07 UTC

OK, except that I never stated that I would find time to work on it soon. Actually the reasonable thing to do is to try to get it right in 2.2.

JMarc

On 2 April 2014 00:44:54 UTC+02:00, "Uwe Stöhr" <***@web.de> wrote:
> Am 01.04.2014 11:26, schrieb Jean-Marc Lasgouttes:
>
>> Indeed Farsi is handled. But what I do not know is what characters have to be handled specially.
>> But indeed it should not be too difficult to add support for Urdu in 2.1.x timeframe by adding
>> some code similar to what is done in Arabic. But do we want to do that? And who would be our
>> local expert?
>
> My proposal after reading today's posts: I commit my language file patch (to 2.1.-staging or master
> as Vincent decides) to give Jamil a version with which he can start to work. I will now also
> encourage Jamil to help us and tell what special ligatures are in Urdu and to test JMarc's work?
>
> OK?
>
> regards Uwe