Linux Adoption of Indian languages in Opensource from India

bosky101

Indic Script Encoding Standards

A Background

ISCII - Indian Script Code for Information Interchange. The current standard is IS 13194:1991. The ISCII code standard specifies a 7-bit code table which can be used in a 7 or 8-bit ISO compatible environment. It allows English and Indian script alphabets to be used simultaneously. It retains the ASCII character set in the lower half (0-127) of the 8-bit code table and provides Indian script characters in the upper half (160-255). ISCII caters to the following 10 Indian scripts: Devanagari, Gujarati, Punjabi, Bengali, Assamese, Oriya, Telugu, Tamil, Malayalam and Kannada.

The ISCII code table is a superset of all the characters required for the above mentioned scripts. The first version was released in 1983, and the standard was adopted by the Bureau of Indian Standards (BIS) in 1991 after revisions in 1986 and 1988.
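The split of the code table described above can be sketched in a few lines of Python. This is only an illustration of the two ranges named in the text (0-127 and 160-255); the function names and the label "other" for the bytes 128-159 are my own assumptions, and no actual ISCII character mapping is attempted here.

```python
def classify_iscii_byte(value):
    """Classify one byte of an ISCII stream by code-table region."""
    if not 0 <= value <= 255:
        raise ValueError("expected a byte value in 0-255")
    if value <= 127:
        return "ascii"   # lower half: the ASCII set, unchanged
    if value >= 160:
        return "indic"   # upper half: Indian script characters
    return "other"       # 128-159: outside both ranges named in the text


def split_runs(data):
    """Group consecutive bytes that fall in the same region."""
    runs = []
    for b in data:
        kind = classify_iscii_byte(b)
        if runs and runs[-1][0] == kind:
            # Same region as the previous byte: extend the current run.
            runs[-1] = (kind, runs[-1][1] + bytes([b]))
        else:
            runs.append((kind, bytes([b])))
    return runs
```

Because the lower half is plain ASCII, a mixed English and Indic byte stream can be separated into runs without knowing which of the ten scripts the upper half currently represents.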

Unicode - Unicode is a universal character encoding standard for multilingual text. It was originally designed around a 16-bit code space and has since been extended beyond it. It covers all the major scripts used for writing Indian languages. The Unicode Standard for Indic scripts is based on the ISCII-1988 revision and is a superset of the ISCII-1991 character encoding.

Texts encoded in ISCII-1991 may be automatically converted to Unicode values and back to their original encoding without loss of information. Unicode is the universal standard which is gradually gaining momentum in the ever increasing multilingual World Wide Web.

Complexity in the rendering and editing of Indic Scripts

The display of Indic scripts is non-linear in nature. Glyphs have variable widths and have positional attributes. Vowel signs can be attached to the top, bottom, left and right sides of the base consonant. Vowel signs may also combine with consonants to form independent glyphs.

Consonants frequently combine with each other to form complex conjunct glyphs. Although the encoding encapsulates only the basic alphabetic characters, the number of glyphs and their combinations required for the exhaustive rendering of these scripts can be quite large (2000 - 4000+ glyph combinations).

e.g.: [image: techba1.jpg]
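The gap between what is encoded and what is drawn can be seen by listing the code points behind two short Devanagari strings. This is a small sketch using Python's standard unicodedata module; the two sample strings are my own choice and are not taken from the image above.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Conjunct "ksha": stored as KA + VIRAMA + SSA, i.e. three basic
# characters, which a font typically draws as one combined glyph.
KSHA = "\u0915\u094D\u0937"

# Syllable "ki": the vowel sign I is stored after the consonant KA,
# although it is displayed to the left of it.
KI = "\u0915\u093F"

if __name__ == "__main__":
    for sample in (KSHA, KI):
        for code, name in describe(sample):
            print(code, name)
        print()
```

The encoding only holds the basic letters and signs; choosing the conjunct glyph and reordering the vowel sign is left to the rendering engine and the font, which is where the large glyph counts come from.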

Keyboard Layouts for Indian Languages.

There are 3 different keyboard layouts for Indian Languages:

a. Romanised Layout: In the Romanised layout, phonetic English mappings are used to compose the Hindi text. For example, the key sequence raamaa (or rAmA) can be used to type 'Rama'.

b. Typewriter Layout: This layout is similar to the Hindi typewriter layout and is useful for Hindi typists and other people familiar with it. See the Typewriter Layout and Key Sequence Charts.

c. DOE Phonetic: This layout is standardized by the Department Of Electronics (DOE), Govt. Of India. The advantage of this layout is that the layout remains identical for all Indian Languages. For example, the key 'k' is used to represent the letter 'ka' in all Indian Languages. The Keyboard Layout and the Key Sequence Charts can be used to find the correct key combinations.
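The idea that one key can stand for the same letter in every script can be illustrated with Unicode, whose Indic blocks keep the parallel arrangement inherited from ISCII: the letter 'ka' sits at the same offset (0x15) from the start of each block. The sketch below assumes that property; the one-entry key table and the function name are hypothetical and do not reproduce the real DOE layout charts.

```python
import unicodedata

# Starting code point of each Indic block in Unicode.
BLOCK_START = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}

# Hypothetical key table with a single entry: 'k' selects the letter
# at offset 0x15, which is 'ka' in every block listed above.
KEY_TO_OFFSET = {"k": 0x15}


def type_key(key, script):
    """Return the character a key produces in the chosen script."""
    return chr(BLOCK_START[script] + KEY_TO_OFFSET[key])


if __name__ == "__main__":
    for script in BLOCK_START:
        ch = type_key("k", script)
        print(script, ch, unicodedata.name(ch))
```

With this arrangement, switching language only changes the block start, while the key table itself stays the same.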

Current Status of Open Source in India

The Department of Information Technology initiated the TDIL (Technology Development for Indian Languages) programme with the objective of developing Information Processing Tools and Techniques to facilitate human-machine interaction without language barriers; creating and accessing multilingual knowledge resources; and integrating them to develop innovative user products and services. Among several other initiatives, there has been active and noted work in this direction among academic institutions, user groups and other research groups.

Indian Linux Project

The goal of this project is to create a Linux distribution that supports Indian Languages at all levels. This Indianisation project will strive to bring the benefits of Information Technology down to the Indian masses. We want to make technology accessible to the majority of India that does not speak English. The task of localization has several pieces that need domain expertise. Some examples are I/O modules, development of fonts, kernel enablement, word translation etc. The project is looking for experts and volunteers to champion the cause of Indian language computing. You may volunteer and participate here. The Indian Linux project is open source and completely free. It is licensed under the GNU General Public License. Here are the complete licensing terms and conditions.

BharateeyaOO

BharateeyaOO is a Unicode based office suite in Indian Languages which can be used across all major platforms. One of the main objectives of this project is to reach out to the masses, breaking language barriers and physical boundaries, through the support for not only Indian languages but for International languages as well.

IndiX project

The aim of the IndiX project is to design a localized "user friendly" interface at the system level in Linux, which will look more natural to the vernacular computer user. Suitable components within Linux OS (desktop environment etc.) are also being localized to enable existing applications to create, edit and print contents in Indian Languages.

OpenOffice.org

OpenOffice.org, the open source productivity and creativity suite, is gearing itself for wider use. It has a build for Hindi script localization that can be downloaded. For bilingual needs (OpenOffice.org 1.03, English and Devanagari), a workaround has been devised by smartxpark@vsnl.net, using the Sanskrit99 font and Devanagari and phonetic transliteration text through the freeware iTranslator99 from http://www.omkarananda-ashram.org/, to cover Hindi, Marathi and Sanskrit. CD-ROMs containing the builds are available through community distributors.

The Bangla Native-Language Team for OpenOffice.org http://www.openoffice.org/ is at http://bn.openoffice.org

A new magazine, tentatively to be called 'Linux For You' http://www.linuxforu.com/, is being planned. It will be devoted entirely to Free Software and Open Source. It is being published by the same group that brings out 'Electronics For You' and 'i.t.'

IT@School project

Free Software supporters are actively campaigning for the government to use Free Software in its IT@School project.

TDIL Data Centre CD

This free CD includes a full suite of Hindi office automation tools, web browser, email client, OCR tools and language interface facilities. The Ministry has also released a similar package of computer applications, fonts and tools for the Tamil language.

Penetration of Speech Technologies in Indian Languages

In line with promoting speech enabled research, the release of Indian language applications, fonts and tools is another important step in the realization of the dream to develop indigenous Indian technologies that can help to close the digital divide and build up the capabilities of the nation.

Matrubhasha - Speech Technology for Indian Languages

Matrubhasha is a Unicode and MBROLA™ based software solution for Text to Speech Synthesis (TTS), with a CMU Sphinx based Speech Recogniser for Indian languages. Matrubhasha is visualized with the objective of building a framework which can be used by any software developer to incorporate speech capabilities (in Indian languages) into her/his software, thus increasing its usability across different sections of society.

San-kshaepuk

San-kshaepuk is a Unicode compliant, open source, multilingual text summarizer, packaged as a COM component for the Windows platform. It is based on Open Text Summarizer. The add-in can accept text content in any language, uses internal intelligence to distinguish important sentences from the others, and outputs the summary either as simple text or as HTML with the important sentences highlighted.

MBROLA

The aim of the MBROLA project, initiated by the TCTS Lab of the Faculté Polytechnique de Mons (Belgium), is to obtain a set of speech synthesizers for as many languages as possible, and provide them free for non-commercial applications. The ultimate goal is to boost academic research on speech synthesis, and particularly on prosody generation, known as one of the biggest challenges taken up by Text-To-Speech synthesizers for the years to come.

CMU Sphinx

The Sphinx Group at Carnegie Mellon University is committed to releasing the long-time, DARPA-funded Sphinx projects widely, in order to stimulate the creation of speech-using tools and applications, and to advance the state of the art both directly in speech recognition, as well as in related areas including dialog systems and speech synthesis.

Organisations Working in Open Source

itmission.org - Kerala, India - the GNU/Linux Support Help and Resources...

http://www.itmission.org/ admin@itmission.org

LUGs / FSUGs

The GNU/Linux Users' Group of Bombay (Mumbai) - the biggest of all GLUGs -

http://www.ilug-bom.org.in/

The GNU/Linux Users' Group of South Maharashtra - http://groups.yahoo.com/groups/ilug_sm

The Bangalore Linux User Group - with several thousand members on its mailing lists - http://linux-bangalore.org/

The Chennai Linux Users Group - http://www.chennailug.org/

The Margao (Goa) GNU/Linux Users Group - http://www.ilug-margao.org/html/

The Trivandrum GNU/Linux Users Group - called Tri(G)LUG - http://triglug.linuxense.com/

The Kolkata Chapter of the Indian Linux User Group - http://www.ilug-cal.org/

The Burdwan Chapter of the Indian Linux User Group - http://yahoogroups.com/group/ilug-bwn

The Belgaum Chapter of the Indian Linux User Group - http://groups.yahoo.com/group/ilug-belgaum

The MVGR college chapter of the Indian Linux User Group - http://ilugmvgr.sourceforge.net/

The Trichy (Tamil Nadu) GNU/Linux users group - http://glugt.linuxisle.com/

Hyderabad Linux User Group - http://linuxindia.virtualave.net/

Linux India-Delhi - http://www.netshooter.com/linux/

Linux Users Group of Ahmedabad - http://www.luga.org/

Other Groups

A group for PHP users and developers in India - http://groups.yahoo.com/group/in-phpug

Linux Learning Centre - lots of Linux training courses - http://www.linuxlearningcentre.com/

The Indian TeX Users Group - TUGIndia - http://www.tug.org.in/

National OSS Portals / Websites

http://linuxinindia.pitas.com/

http://www.linux-india.org/

http://www.gnu.org.in/ - the Free Software Foundation now has an India branch!

http://www.bengalinux.org/ - The GNU/Linux Bengali localization project.

http://www.sarovar.org/ - Indian Portal for hosting free/open source software projects

Open Source Localization Projects

FSF India's long list of localization projects

Indix - Linux in Hindi

Dhvani - text to speech software for Indian languages.

GLUE - GNU/Linux Utilities for Education - a bootable CD that will install a GNU/Linux system with OpenOffice, several educational software packages and the Terminal Server Software "within 10 minutes", available for download at : ftp://ftp.seul.org/pub/glue/. Made by Indian software developer Ajith Kumar.

Vernacular SMS - Zi Corporation has introduced predictive text input for SMS in Hindi. Read the article here: http://www.zdnetindia.com/news/features/stories/77297.html

The Ankur Bangla Project http://www.bengalinux.org/ has released a LiveCD called AnkurBangla LiveDesktop Technology Preview v1.0 as a public preview of the Bangla Localisation efforts thus facilitating enablement

Indian GNU/Linux project - aims "to create a Linux distribution that supports Indian Languages" - http://www.indlinux.org/ - their 'Milan' software enables "Hindi users to use computers in their own language" - read more here

The Assamese Localisation project has opened up a portal for interaction with the community at http://luit.sourceforge.net/

Planet FLOSS India

A collection of blogs from Free/Libre and Open Source Software India community

Open Source Forum - National Informatics Centre

Open Source Forum is an Initiative of National Informatics Centre, to encourage participation by viewers to share their experience and important information about different Open Source areas.

Free Software Foundation of India - FSF India is a non-profit organisation committed to advocating, promoting and propagating the use and development of swatantra software in India. Their goal is to ensure the long term adoption of free software, and aim for the day when all software will be free. This includes educating people about software freedom and convincing them that it is the freedom that matters.

Linux Users Group - Goa (India)

Linux Users Group - Goa (India). All are welcome to join in this friendly network, to share software, ideas and concepts (like freedom). Volunteers welcome to support and promote this movement.

Indian Koha Interest Group

This list attempts to bring together the nascent Koha user community in India under the umbrella of the Indian Koha Interest Group. Koha is a library management system.

External Links

- Linux forum from India to world with lots of localized discussion as well as global discussion.

http://linux-bangalore.org/2001/schedules/ - all the slides from the 2001 Linux Bangalore conference. Lots of good material! See also Open source conferences.

http://linuxbazar.com/ - online Linux CD sales for anywhere in India.

http://lincds.com/ - cheap online Linux CD sales for India.

http://pusatlinux.com/ - from SouthEast Asia ships Linux CDs to India.

http://btbytes.com/ - wide range of Opensource software [linux,BSD,Openoffice.org,GNU winII] at very affordable prices.

http://magic-cauldron.com/ - some OSS information.

http://www.linux2003.net/ - Linux India-National User Expo 2003.

http://www.tug.org.in/tug2002/index.html -- Reports, slides of TUG 2002, the first annual TeX Users Group meeting to be held outside the US/Europe.

http://www.quickserver.org/ -- A free, open source Java library for quick creation of robust and multi-threaded, multi-client TCP server applications.

Linux Forum

Recent Events

LINUXASIA 2005, India Habitat Centre, New Delhi, 2005-02-09 to 2005-02-11

Expo, workshop, and conference on Linux

Also to note

Go For Open Source Code, Kalam Tells IT Industry

Why Indian companies should sponsor Open Source projects

The Linux OutReach Program

Richard Stallman believes India is key to the global free software movement - http://www.siliconindia.com/tech/tech_pgtwo.asp?newsno=18805&newscat=

Deepti - A Hindi chat bot similar to Alice which is in English

Mozilla in Hindi project - http://bttlindia.com/mozilla/
 
first para changed

I am not associated with the site TechArena, and hence do not permit any reprint of my articles courtesy of that portal. However, the same article has been published at a new location, which I will be glad to allow you to reprint provided its new source, 'TechEnclave', is given due credit. So please correct the source, wherever it appears, from 'TechArena Community Forums' to 'TechEnclave Forums'. The new address for the article, which can be reprinted and formatted as needed, is at:
 
Hello bosky101
I am wondering if you will permit me to publish this excellent forum post as an article, with some modification, on our website www.accb.net.

We will give you full credit but would like to publish the article under a Creative Commons licence.

jbsarma
 
Hi, I've PM'ed you regarding the request for publishing this article. Incidentally, the very same article has also been requested by the Computer Society of India for their monthly publication titled CSI Communications.
 