...making Linux just a little more fun!

Followup: Apertium in 'The Guardian'

Jimmy O'Regan [joregan at gmail.com]


Fri, 15 Aug 2008 18:41:30 +0100

There was an article about our Welsh translator in yesterday's Guardian: http://www.guardian.co.uk/technology/2008/aug/14/freeourdata.opensource

"Flummoxed by a document in Welsh? Now you can get a free translation at cymraeg.org.uk. The Apertium-cy software, described as the first free automatic translator from Welsh to English, is the fruit of a multilingual effort involving developers in Spain, Wales and Ireland pushing forward the possibilities of open-source software and, they hope, free public-sector data."

The focus of the article is on how we weren't able to use public data compiled by the Welsh Language Board:

'When we contacted the Welsh Language Board, however, it said the Apertium team couldn't be more wrong. "We welcome re-use," it said. Although the small print forbids unauthorised reproduction, the board says it would be delighted to consider requests. Where feasible, it will make products available under what it says would be "a suitable free non-commercial agreement".'

Well, if they had ever returned any of my phone calls, maybe we could have used their data. Maybe they'll give me an answer now :)



Jimmy O'Regan [joregan at gmail.com]


Sat, 16 Aug 2008 00:21:31 +0100

2008/8/15 Rick Moen <rick@linuxmafia.com>:

> Quoting Jimmy O'Regan (joregan@gmail.com):
>
>> The focus of the article is on how we weren't able to use public data
>> compiled by the Welsh Language Board:
>>
>> 'When we contacted the Welsh Language Board, however, it said the
>> Apertium team couldn't be more wrong. "We welcome re-use," it said.
>> Although the small print forbids unauthorised reproduction, the board
>> says it would be delighted to consider requests. Where feasible, it
>> will make products available under what it says would be "a suitable
>> free non-commercial agreement".'
>>
>> Well, if they had ever returned any of my phone calls, maybe we could
>> have used their data. Maybe they'll give me an answer now :)
>
> On available evidence, Apertium (which is GNU GPL) would have been able
> to make, at best, limited use of their data.  Note the Board's phrase:
> 'non-commercial agreement'.
>

The database is here: http://www.e-gymraeg.org/bwrdd-yr-iaith/termau/default.aspx?lang=en

The actual licence is:

The Welsh Language Board is the owner and/or manager of the copyright;
database rights and all other rights pertaining to this database of
terms.
 
Users are only allowed to download lists of terms to the memory of one
computer or to translation memories shared across one closed network
for their personal use or the sole use of their employers.
 
It is not permitted to reproduce, copy or publish these list in any
form whatsoever without the Board's prior permission.

Clearly not open source compatible.

> Maybe the distinction between 'free' and 'free' is clearer in Cymraeg.
> ;->

It seems to be :) 'am ddim' for 'no cost', 'rhydd' for 'at liberty'



Jimmy O'Regan [joregan at gmail.com]


Sat, 16 Aug 2008 13:50:42 +0100

2008/8/16 Jimmy O'Regan <joregan@gmail.com>:

> 2008/8/15 Rick Moen <rick@linuxmafia.com>:
>> Quoting Jimmy O'Regan (joregan@gmail.com):
>>
>>> The focus of the article is on how we weren't able to use public data
>>> compiled by the Welsh Language Board:
>>>
>>> 'When we contacted the Welsh Language Board, however, it said the
>>> Apertium team couldn't be more wrong. "We welcome re-use," it said.
>>> Although the small print forbids unauthorised reproduction, the board
>>> says it would be delighted to consider requests. Where feasible, it
>>> will make products available under what it says would be "a suitable
>>> free non-commercial agreement".'
>>>
>>> Well, if they had ever returned any of my phone calls, maybe we could
>>> have used their data. Maybe they'll give me an answer now :)
>>
>> On available evidence, Apertium (which is GNU GPL) would have been able
>> to make, at best, limited use of their data.  Note the Board's phrase:
>> 'non-commercial agreement'.
>>
>
> The database is here:
> http://www.e-gymraeg.org/bwrdd-yr-iaith/termau/default.aspx?lang=en
>
> The actual licence is:
>
> "
> The Welsh Language Board is the owner and/or manager of the copyright;
> database rights and all other rights pertaining to this database of
> terms.
>
> Users are only allowed to download lists of terms to the memory of one
> computer or to translation memories shared across one closed network
> for their personal use or the sole use of their employers.
>
> It is not permitted to reproduce, copy or publish these list in any
> form whatsoever without the Board's prior permission.
> "

I think the main issue that the Welsh Language Board is likely to have is of not wishing modified lists to be misrepresented as being 'official'. We have a mechanism for annotating each entry with its author individually. My assumptions are - and I hope that you'll correct me if I'm wrong - that copyright law forbids the misrepresentation of your work as someone else's just as much as the opposite (for our use, that we mark modified versions of their work as such), and that, if someone attempts to extract only the WLB's terms, that the WLB's collection copyright, and thus their licence, is in effect, rather than ours (or is it safer to try to avoid that completely?)



Rick Moen [rick at linuxmafia.com]


Mon, 18 Aug 2008 17:49:41 -0700

Quoting Jimmy O'Regan (joregan@gmail.com):

> I think the main issue that the Welsh Language Board is likely to have
> is of not wishing modified lists to be misrepresented as being
> 'official'. We have a mechanism for annotating each entry with its
> author individually. My assumptions are - and I hope that you'll
> correct me if I'm wrong - that copyright law forbids the
> misrepresentation of your work as someone else's just as much as the
> opposite (for our use, that we mark modified versions of their work as
> such), and that, if someone attempts to extract only the WLB's terms,
> that the WLB's collection copyright, and thus their licence, is in
> effect, rather than ours (or is it safer to try to avoid that
> completely?)

In the UK, the right to be secure against misrepresentation of your work as someone else's is part of what are called the 'moral rights' of authors, which are secured by Parliamentary statutes, EU directives, and provisions of the Berne Convention.

(I'm going to be lazy and not chase down specific legal citations for you, in part because it's not a very controversial point of law -- but you'd search for phrases like 'right of attribution' and 'moral rights'.)



Jimmy O'Regan [joregan at gmail.com]


Tue, 19 Aug 2008 02:02:58 +0100

2008/8/19 Rick Moen <rick@linuxmafia.com>:

> Quoting Jimmy O'Regan (joregan@gmail.com):
>
>> I think the main issue that the Welsh Language Board is likely to have
>> is of not wishing modified lists to be misrepresented as being
>> 'official'. We have a mechanism for annotating each entry with its
>> author individually. My assumptions are - and I hope that you'll
>> correct me if I'm wrong - that copyright law forbids the
>> misrepresentation of your work as someone else's just as much as the
>> opposite (for our use, that we mark modified versions of their work as
>> such), and that, if someone attempts to extract only the WLB's terms,
>> that the WLB's collection copyright, and thus their licence, is in
>> effect, rather than ours (or is it safer to try to avoid that
>> completely?)
>
> In the UK, the right to be secure against misrepresentation of your work
> as someone else's is part of what are called the 'moral rights' of
> authors, which are secured by Parliamentary statutes, EU directives, and
> provisions of the Berne Convention.
>
> (I'm going to be lazy and not going to chase down specific legal
> citations for you, in part because it's not a very controversial point
> of law -- but you'd search for phrases like 'right of attribution' and
> 'moral rights'.)

'Moral rights' was the phrase I needed - thanks Rick!

