Language Facts - Languages of the World


Languages of the World

It is difficult to give an exact figure for the number of languages in the world, because it is not always easy to define what a language is. The difference between a language and a dialect is not always clear-cut, and it is not settled by similarity of vocabulary, grammar, or pronunciation alone. Sometimes the distinction rests purely on geographical, political, or religious grounds. Estimates of the number of languages in the world usually range from 3,000 to 8,000.

There is a list of the world's languages called "Ethnologue" (Grimes 1996). It lists 6,500 living languages. Of these, 6,000 have registered population figures. 52% of the 6,000 languages are spoken by fewer than 10,000 people, and 28% are spoken by fewer than 1,000 people. 83% of them are limited to single countries.
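To make these percentages more concrete, the short sketch below converts them into approximate language counts. It assumes the percentages apply to the 6,000 languages with registered population figures, as the text states; the function and variable names are only illustrative.

```python
# Figure quoted in the text above (Ethnologue, Grimes 1996).
LANGUAGES_WITH_POPULATION_FIGURES = 6000


def share_to_count(percent: int, total: int = LANGUAGES_WITH_POPULATION_FIGURES) -> int:
    """Convert a whole-number percentage of `total` into an approximate count."""
    return total * percent // 100


# Languages spoken by fewer than 10,000 people (52%).
under_10k = share_to_count(52)
# Languages spoken by fewer than 1,000 people (28%), a subset of the above.
under_1k = share_to_count(28)
# Languages with at least 1,000 but fewer than 10,000 speakers.
between_1k_and_10k = under_10k - under_1k

print(under_10k, under_1k, between_1k_and_10k)
```

In other words, roughly 3,100 of the listed languages have fewer than 10,000 speakers, and about 1,700 of those have fewer than 1,000.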

The ten largest languages in the world are the first languages for nearly half of the world's population.

Here is a list of the top 10 languages in February 1999 according to Ethnologue:

The figures refer to the number of people who have the language as their first language. If those speakers who have learnt the language as a foreign language were to be included, English might be at the top of the list.
Arabic would be among the 10 most widely spoken languages, if it were to be counted as one language. Ethnologue lists ten variants of spoken Arabic among its top 100. The biggest of these is Egyptian Arabic with 42.5 million speakers. If they were to be counted as one and the same language, Arabic would come out sixth with 175 million speakers, and Wu would drop out of the top ten.
These figures are from 1999, so some languages may have shifted positions on the list for demographic reasons, particularly in positions 4 through 7, where Arabic might also appear (see above).

The branch of linguistics called comparative philology has classified the world's languages into different families. Not all of the relationships within the families are clear yet, and therefore the classification must be seen as preliminary.

The languages within a family usually share a common ancestor language, from which they developed. Sometimes, however, languages are considered to be related just because they happen to be geographically close to one another.

You can look at Mark Rosenfelder's maps of the world's language families.

The Indo-European language family is the most researched of all the families. Languages that belong to this family are spoken in India, Pakistan, Iran, and nearly all of Europe. The Indo-European language family has been split into smaller language groups:

  • The Indo-Iranian language group has about 600 million speakers and includes languages such as Urdu, Hindi, Bengali, and Punjabi. These languages are spoken in northern India and in Pakistan. The ancient Indian language, Sanskrit, has had an enormous impact on historical language research. The systematic similarities between Sanskrit, Latin, and Greek were observed as early as the 18th century.

    Persian and Kurdish are also a part of the Indo-Iranian language group.

  • The Romance language group developed from Latin and has about 600 million speakers in Europe and Latin America. Spanish, Portuguese, French, Italian, and Romanian belong to this group.

  • The Germanic language group has about 500 million speakers in Europe and North America. The Scandinavian languages (Swedish, Danish, Norwegian, Icelandic, and Faroese) belong to this group along with English, German, Dutch, Flemish (which is spoken in a part of Belgium), and Afrikaans (which is related to Dutch and is spoken in South Africa).

  • The Slavic language group is mainly confined to Eastern Europe. It has 300 million speakers. The largest language in this group is Russian. Other Slavic languages are Belarusian, Ukrainian, Polish, Czech, Slovak, Bulgarian, Serbian, Croatian, and Bosnian.

    The remaining language groups within the Indo-European language family are considerably smaller.

  • The Baltic language group is represented by Latvian and Lithuanian.

  • The Greek language group is made up of Modern Greek together with various older forms of Greek.

  • The Celtic languages were once spoken all over Europe, but the group is now made up only of small languages such as Breton, Irish Gaelic, Welsh, and Scottish Gaelic.

    Besides the Indo-European languages, a few other language families are also represented in Europe. The two largest are the Turkic language group, with about 40 million speakers in Turkey, and the Finno-Ugric language group. Finnish, Estonian, Saami, and Hungarian belong to the Finno-Ugric language group. Another interesting language in Europe is Basque, which is spoken in the Basque region in northern Spain and in a small part of southwestern France. Basque has no known relatives.

    The languages in Africa are usually divided into four language families:

  • The Niger-Congo language family is usually divided into ten sub-groups. Each sub-group includes several hundred languages. Nearly half of the Niger-Congo languages are Bantu languages. Bantu languages are spoken by about 200 million people in sub-Saharan Africa. Swahili is the best known and most widespread of the Bantu languages.

  • The Khoisan language family is spoken by a couple of hundred thousand people in southern Africa, especially in the Kalahari Desert in Namibia and Botswana. The Khoisan languages are usually referred to as "click" languages, because of the distinctive click sounds their speakers use. The Khoisan family is divided into three groups: North, Central, and South. Earlier, the family was divided into only two main groups: Hottentot (cattle herders) and Bushman (nomadic hunter-gatherers).

  • The Afro-Asian language family is found in the northern and eastern parts of Africa, from Mauritania in the west to Somalia in the east. This family is usually divided into five sub-groups. The Semitic sub-group is the most widespread and most widely understood, largely thanks to the spread of Arabic, which is understood throughout North Africa. Arabic is understood by about 150 million people and is often the language of education. Other important Semitic languages are Amharic and Tigrinya, which are spoken by about 10 million people in Ethiopia. The long-extinct Egyptian language, which is known for its hieroglyphics, is considered to have belonged to the Afro-Asian language family.

  • The Nilo-Saharan language family comprises all the languages that were "left over" when Africa's language families were being established. The Nilo language group includes about 150 languages, spoken by approximately 8 million people in east Africa. The Saharan language group includes 10 languages with about 5 million speakers in Chad, Niger, and Libya.

    Besides these four language families, several Indo-European languages are spoken in Africa, such as English, French, Portuguese, German, and Afrikaans.

    The Indo-European languages are spoken by many people in Asia, especially in India, Pakistan, and the Middle East. The Afro-Asian languages are also well represented in the Middle East, especially Arabic.

    The Sino-Tibetan language family has the largest number of speakers, estimated at 1 billion. Mandarin is the largest language within this family. It is spoken by about 700 million people in northern China. Other large languages in this family are Hakka, Wu, and Yue (Cantonese). These languages are spoken in China. Sometimes they are all called Chinese, but the people who speak these different languages cannot understand one another. They are often lumped together as Chinese because they all share the same written language.

    Other languages within the Sino-Tibetan language family are Burmese, Tibetan, and Taiwanese. The relationships between the languages of this family are unclear and disputed.

    The Malayo-Polynesian language family is another large language family in Asia and Oceania. It has about 200 million speakers and covers a vast geographical area from Madagascar via Indonesia to Hawaii. After Indo-European, this is the most widespread language family in the world.

    The largest languages within the Malayo-Polynesian language family are Javanese, Indonesian, Tagalog (found in the Philippines), and Malay. These belong to the Indonesian (West) branch of the Malayo-Polynesian language family.

    The Polynesian (East) branch is usually divided into Micronesian, Polynesian, and Melanesian languages. Among these are Fijian and Maori (the latter spoken in New Zealand).

    The Dravidian language family is spoken by 160 million speakers in southern India. The largest languages in this family are Tamil and Telugu, which each have about 55 million speakers.

    The Australian language family is significantly smaller than the others. Its languages are spoken by the Australian aborigines.

    There are also a number of languages whose relationships have not been thoroughly investigated. The largest are Japanese (120 million speakers), Korean (60 million speakers), Vietnamese (50 million speakers), and Thai (40 million speakers). Thai and Vietnamese are considered distant relatives, but neither Japanese nor Korean has any known relatives.

    The languages of New Guinea, which number about 700, are usually grouped into a Papuan language family, but only because of their geographic position. The relationships among the languages in the family are unclear.

    Indo-European languages, especially English and Spanish, were the colonial languages in the Americas. Only a few of the languages that were spoken by the original inhabitants are still spoken. They are usually grouped together under the name American Indian languages. This term covers 20 different families with several languages in each. The largest languages are Quechua (spoken in Bolivia and Peru) and Guaraní (spoken in Paraguay). If you want to know more about where the different languages are spoken, see the World Map.

  • Ethnologue, 13th edition, published by Summer Institute of Linguistics in Dallas, Texas, USA.
    The Web version of this printed reference work gives a short description of some 6,700 languages.
  • European minority (or minoritized) languages from Sabhal Mór Ostaig, Scotland.
  • Foundation for Endangered Languages
  • The Universal Declaration of Human Rights in over 300 languages, from the United Nations High Commissioner for Human Rights. There is also an alphabetical listing by language name of all available translations.

    Indo-European languages

  • Piotr Gąsiorowski's Indo-European Page
  • Cyril Babaev's Linguistic Studies (CyBaLiSt) - The Indo-European Database