> Is language a cultural heritage ? Should we preserve them ? How ?
In the era of data and computation, language becomes even more a part of
cultural heritage, in my opinion.
Language today gets dissected, analysed and generated through digital
tools: natural language processing techniques, AI, machine learning.
And these are not neutral processes.
First of all because of their own structure, origins and agendas. A
natural language analysis process using machine learning designed for
English does not necessarily work for Italian or French. Or, if it works,
it makes certain assumptions, based on its starting vocabularies and on
language structures, histories and cultures, which leave certain
expressions untranslatable or misinterpreted.
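
As a small illustration of this point, here is a minimal sketch in Python.
Both functions are hypothetical and written only for this example; they do
not come from any real NLP library. The first assumes, as an English-only
tool might, that words are made of the unaccented letters a-z. The second
accepts any Unicode letter and keeps apostrophes inside words.

```python
import re

def english_only_tokens(text):
    # Hypothetical tokenizer built with only English in mind:
    # it keeps runs of unaccented a-z and silently drops everything else.
    return re.findall(r"[a-z]+", text.lower())

def unicode_tokens(text):
    # Slightly more language-aware variant: any Unicode letter counts,
    # and an apostrophe inside a word is kept (Italian elision).
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())

sentence = "Perché l'università è chiusa?"
print(english_only_tokens(sentence))
print(unicode_tokens(sentence))
```

The first function cuts the accented letters out of the Italian words and
loses the verb "è" entirely, without raising any error. The second is still
far from a real Italian tokenizer, but it shows how much depends on an
assumption nobody wrote down.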
This also causes loops. For example: if an AI interprets language wrongly,
and its interpretation is used to control processes which have impacts on
the world, this creates a loop: I say "A", the AI interprets "B" and
feeds "B" back to me. If this is systematic (think of social networks or
search engines, where these things shape communication and information
for millions of people, multiple times a day), "B" will progressively
become "something", even if it was not before.
This is linguistic evolution, but the fact that it can be so directly (and
opaquely) influenced by so many different agendas may be something to
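
The loop described above ("A" said, "B" fed back) can be sketched as a toy
model in Python. Everything here is an assumption made for illustration:
the function name, the two rates and the update rule are invented, and they
are not measurements of any real platform. In each round a fraction of the
people who still mean "A" are shown "B" instead, and a fraction of those
adopt "B".

```python
def simulate_loop(error_rate, adoption_rate, rounds, initial_share=0.0):
    """Return the share of people using the "B" reading after each round."""
    share_b = initial_share
    history = [share_b]
    for _ in range(rounds):
        # People who said "A" but were shown "B" by the system (assumed rate).
        exposed = (1.0 - share_b) * error_rate
        # A fraction of those exposed start using "B" themselves (assumed rate).
        share_b += exposed * adoption_rate
        history.append(share_b)
    return history

history = simulate_loop(error_rate=0.1, adoption_rate=0.05, rounds=1000)
for step in (10, 100, 1000):
    print(f"after {step} rounds: {history[step]:.3f}")
```

In this toy model even small rates only ever move in one direction, so
repetition at scale is enough for "B" to take over. Real usage is of course
much messier than this.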
What can we do?
First of all: support and open source.
Support: this research is largely unfunded. Developers and researchers
often use software tools carelessly, applying software made for English
to parse Italian, French etc. And even if they use dictionaries in these
languages, the software itself is not neutral (and neither are the people
who created it, working in English etc). Code is law, Lessig once said:
it applies here too.
And then open source: there are multiple semantic databases etc, which
could be open sourced and preserved, studied, observed for how they
describe language as it evolves.
Instead, most of the time they are used to drive dictionary and thesaurus
websites for ad-driven revenue, or they are accessible only to
corporations, which use them to infer data from text content and build up
rich user profiles.
It is a pity, and strategies should be put in place to make them available,
accessible and usable.
A museum of language, in this sense, would be a wonderful thing to do.
Yasmin_discussions mailing list
Yasmin URL: http://www.media.uoa.gr/yasmin