Safekipedia

Indo-European languages

Adapted from Wikipedia

Map showing the historical spread of Indo-European cultures and languages across Europe and Asia.

The Indo-European languages are a large group of languages spoken by many people around the world. They are native to places like the northern Indian subcontinent, most of Europe, and the Iranian plateau. You can also find them in parts of Central Asia, like Tajikistan and Afghanistan, as well as in Sri Lanka and the Maldives.

Today, some of the most widely spoken Indo-European languages include English, Spanish, Portuguese, Russian, Hindi, Bengali, Punjabi, French, German, and Persian.

All these languages come from one ancient language called Proto-Indo-European. We have written records of Indo-European languages from the Bronze Age, like Mycenaean Greek and the Anatolian languages, such as Hittite and Luwian.

History of Indo-European linguistics

See also: Indo-European studies § History

In the 1500s, people from Europe visiting the Indian subcontinent noticed that some words in Indo-Aryan, Iranian, and European languages sounded alike. In 1583, a missionary named Thomas Stephens wrote a letter about how North Indian languages were similar to Greek and Latin.

Franz Bopp was a pioneer in the field of comparative linguistic studies.

Later, a merchant named Filippo Sassetti who traveled to India also wrote about word similarities between Sanskrit and Italian. In the 1600s, a Dutch scholar named Marcus Zuerius van Boxhorn thought many Asian and European languages came from one old language he called Scythian.

In 1786, a man named Sir William Jones gave a talk showing clear links between Latin, Greek, and Sanskrit. He suggested these languages came from a shared ancient source.

The term "Indo-European" was first used in 1813 by Thomas Young to describe languages from Western Europe to North India. Studies of these languages grew into a major field of learning.

Classification

See also: List of Indo-European languages

Indo-European language family tree based on "Ancestry-constrained phylogenetic analysis of Indo-European languages" by Chang et al. 2015

Some significant isoglosses in Indo-European daughter languages at around 500 BC. Blue: centum languages. Red: satem languages. Orange: languages with augment. Green: languages with PIE *-tt- > -ss-. Tan: languages with PIE *-tt- > -st-. Pink: languages with instrumental, dative and ablative plural endings (and some others) in *-m- rather than *-bh-.

The Indo-European languages are a large family of languages. People speak them in many places, such as Europe, parts of Asia, and some islands. There are ten main groups:

  • Albanian: Known from written records since about the 1300s. It descends from an ancient language of the Balkans.
  • Anatolian: These languages are no longer spoken. They were used in places like Anatolia.
  • Armenian: Known from written records since the early 5th century AD. It descends from Proto-Armenian.
  • Balto-Slavic: This group includes Slavic languages like Russian, Polish, and Czech, and Baltic languages like Lithuanian and Latvian.
  • Celtic: Known from inscriptions since the 6th century BC. Modern Celtic languages include Welsh, Irish, and Scottish Gaelic.
  • Germanic: This group includes English, German, Dutch, and the Scandinavian languages.
  • Hellenic: This includes Greek.
  • Indo-Iranian: This group splits into Indo-Aryan languages (like Hindi) and Iranian languages.
  • Italic: This includes Latin and its descendants, such as Italian and French.
  • Tocharian: These languages were spoken in parts of China and are no longer used today.

There are also some older or less-known languages that might be related to these groups, but we are still learning more about them.

Proposed external relations

Since people started studying Indo-European languages, they have tried to connect them with other language families. But no single idea has been widely accepted, and many experts are still unsure.

Some proposals link Indo-European with one other language family. Others place it inside much broader groupings.

The largest of these groupings, such as Borean, suggested by Harold C. Fleming and Sergei Starostin, aim to include nearly all the world's languages except those from sub-Saharan Africa, New Guinea, Australia, and the Andaman Islands.

Evolution

Proto-Indo-European

Main article: Proto-Indo-European language

The proposed Proto-Indo-European language (PIE) is the reconstructed common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans. During the 1960s, knowledge of Anatolian became certain enough to establish its relationship to PIE. Using the method of internal reconstruction, an earlier stage, called Pre-Proto-Indo-European, has been proposed.

PIE was an inflected language, in which the grammatical relationships between words were signalled through inflectional morphemes, usually endings. The roots of PIE are basic morphemes carrying a lexical meaning. By addition of suffixes, they form stems, and by addition of endings, these form grammatically inflected words, such as nouns or verbs. The reconstructed Indo-European verb system is complex and, like the noun, exhibits a system of ablaut (regular vowel alternation).
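The root, stem, and ending layers described above can be sketched in a few lines of Python. This is only an illustration: the forms are the reconstructed present-tense forms of *bʰer- given in this article's conjugation table, but the split into root, thematic vowel (treated here as the stem-forming suffix), and ending is an assumption made for the example, and the names `ROOT`, `ENDINGS`, `build_stem`, and `inflect` are hypothetical.

```python
# Illustrative sketch: root + suffix -> stem, stem + ending -> inflected word.
# The segmentation below is an assumption made for this example.

ROOT = "bʰér"  # lexical root meaning "carry, bear"

# (person label, thematic vowel used as stem suffix, personal ending)
ENDINGS = [
    ("1st sg.", "o", "h₂"),
    ("2nd sg.", "e", "si"),
    ("3rd sg.", "e", "ti"),
    ("1st dual", "o", "wos"),
    ("2nd dual", "e", "th₁es"),
    ("3rd dual", "e", "tes"),
    ("1st pl.", "o", "mos"),
    ("2nd pl.", "e", "te"),
    ("3rd pl.", "o", "nti"),
]


def build_stem(root: str, suffix: str) -> str:
    """Attach a suffix to a root to form a stem."""
    return root + suffix


def inflect(root: str, suffix: str, ending: str) -> str:
    """Attach an ending to the stem; '*' marks a reconstructed form."""
    return "*" + build_stem(root, suffix) + ending


def paradigm(root: str = ROOT) -> dict:
    """Return the full present paradigm keyed by person label."""
    return {label: inflect(root, vowel, ending) for label, vowel, ending in ENDINGS}


if __name__ == "__main__":
    for label, form in paradigm().items():
        print(f"{label:10} {form}")
```

Keeping the three layers as separate steps mirrors the description: the same stem-building step could serve nouns or verbs, and only the set of endings would change.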

Diversification

See also: Indo-European migrations

The diversification of the parent language into the attested branches of daughter languages is historically unattested. By contrast, the timeline of the evolution of the various daughter languages is mostly undisputed.

Using a mathematical analysis borrowed from evolutionary biology, Donald Ringe and Tandy Warnow proposed the following evolutionary tree of Indo-European branches:

  • Pre-Anatolian before 3500 BC
  • Pre-Tocharian
  • Pre-Italic and Pre-Celtic before 2500 BC
  • Pre-Armenian and Pre-Greek after 2500 BC
  • Proto-Indo-Iranian c. 2000 BC
  • Pre-Germanic and Pre-Balto-Slavic; Proto-Germanic c. 500 BC

David Anthony proposes the following sequence:

  • Pre-Anatolian c. 4200 BC
  • Pre-Tocharian c. 3700 BC
  • Pre-Germanic c. 3300 BC
  • Pre-Italic and Pre-Celtic c. 3000 BC
  • Pre-Armenian c. 2800 BC
  • Pre-Balto-Slavic c. 2800 BC
  • Pre-Greek c. 2500 BC
  • Proto-Indo-Iranian c. 2200 BC; split into Iranian and Old Indic c. 1800 BC

From 1500 BC the following sequence may be given:

Key languages for reconstruction

In reconstructing the history of the Indo-European languages and the form of the Proto-Indo-European language, some languages have been of particular importance. These generally include the ancient Indo-European languages that are both well-attested and documented at an early date, although some languages from later periods are important if they are particularly linguistically conservative, most notably Lithuanian. Early poetry is of special significance because of the rigid poetic meter normally employed, which makes it possible to reconstruct a number of features, e.g. vowel length, that were either unwritten or corrupted in the process of transmission down to the earliest extant written manuscripts.

Most notably:

  • Vedic Sanskrit (c. 1500–500 BC). This language is unique in that its source documents were all composed orally, and were passed down through oral tradition (shakha schools) for c. 2,000 years before being written down. The oldest documents are all in poetic form; oldest and most important of all is the Rigveda (c. 1500 BC). The oldest inscriptions in the language of the Rigveda are found in northern Syria, where the Mitanni kingdom was located, though it is also likely that the language of the Mitanni kingdom was a related but older Indo-Aryan language.
  • Ancient Greek (c. 750–400 BC). Mycenaean Greek (c. 1450 BC) is the oldest recorded form, but its value is lessened by the limited material, restricted subject matter, and highly ambiguous writing system. More important is Ancient Greek, documented extensively beginning with the two Homeric poems (the Iliad and the Odyssey, c. 750 BC).
  • Hittite (c. 1700–1200 BC). This is the earliest recorded of all Indo-European languages, and highly divergent from the others due to the early separation of the Anatolian languages from the remainder. It possesses some highly archaic features found fragmentarily, if at all, in other languages. It appears to have undergone many early phonological and grammatical changes which, combined with the ambiguities of its writing system, hinder its usefulness somewhat.

Other primary sources:

  • Latin, attested in a large amount of poetic and prose material in the Classical period (c. 200 BC – AD 100) and limited Old Latin material from as early as c. 600 BC.
  • Gothic (the most archaic well-documented Germanic language, c. AD 350), along with the combined witness of the other old Germanic languages: most importantly, Old English (c. 800–1000), Old High German (c. 750–1000) and Old Norse (c. 1100–1300, with limited earlier sources dating to c. AD 200).
  • Old Avestan (c. 1700–1200 BC) and Younger Avestan (c. 900 BC). Documentation is sparse, but nonetheless quite important due to its highly archaic nature.
  • Modern Lithuanian, with limited records in Old Lithuanian (c. 1500–1700).
  • Old Church Slavonic (c. 900–1000).

Other secondary sources, due to poor attestation:

Other secondary sources, due to extensive phonological changes and relatively limited attestation:

  • Old Irish (c. AD 700–850).
  • Tocharian (c. AD 500–800). It underwent large phonetic shifts and mergers in the proto-language, and has an almost entirely reworked declension system.
  • Classical Armenian (c. AD 400–1000).
  • Albanian (c. 1284 – present).

Sound changes

Main article: Indo-European sound laws

As speakers of Proto-Indo-European (PIE) dispersed, the language's sound system diverged as well, changing according to various sound laws evidenced in the daughter languages.

PIE is normally reconstructed with a complex system of 15 stop consonants. This includes an unusual three-way phonation or voicing distinction between voiceless, voiced and "voiced aspirated" (i.e. breathy voiced) stops, and a three-way distinction among velar consonants (k-type sounds): palatal ḱ ǵ ǵh, plain velar k g gh, and labiovelar kʷ gʷ gʷh. The correctness of the terms palatal and plain velar is disputed. All daughter languages have reduced the number of distinctions among these sounds, often in divergent ways.
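The figure of 15 stops follows from crossing five places of articulation with three phonation types. The short Python sketch below enumerates that grid. It assumes the labial and dental series (*p, *b, *t, *d and their aspirates) that appear as rows in this article's consonant table, and it writes aspiration with a superscript ʰ; the names `PLACES`, `PHONATIONS`, `stop_symbol`, and `stop_inventory` are hypothetical.

```python
from itertools import product

# Voiceless and voiced symbol for each place of articulation.
# The labial and dental rows are assumed from the article's consonant table.
PLACES = {
    "labial": ("p", "b"),
    "dental": ("t", "d"),
    "palatal": ("ḱ", "ǵ"),
    "plain velar": ("k", "g"),
    "labiovelar": ("kʷ", "gʷ"),
}

PHONATIONS = ("voiceless", "voiced", "voiced aspirated")


def stop_symbol(place: str, phonation: str) -> str:
    """Return the conventional symbol for one stop in the grid."""
    voiceless, voiced = PLACES[place]
    if phonation == "voiceless":
        return voiceless
    if phonation == "voiced":
        return voiced
    # Voiced aspirates are written as the voiced stop plus superscript h.
    return voiced + "ʰ"


def stop_inventory() -> dict:
    """Cross every place with every phonation type."""
    return {
        (place, phonation): stop_symbol(place, phonation)
        for place, phonation in product(PLACES, PHONATIONS)
    }


if __name__ == "__main__":
    inventory = stop_inventory()
    for place in PLACES:
        row = [inventory[(place, ph)] for ph in PHONATIONS]
        print(f"{place:12} " + "  ".join(row))
    print(len(inventory), "stops in total")
```

Laying the inventory out as a grid makes it easy to see what each daughter language did: a merger removes a row or a column, while a chain shift moves every symbol in a column at once.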

As an example, in English, one of the Germanic languages, the following are some of the major changes that happened:

None of the daughter-language families, except possibly Anatolian, particularly Luwian, reflect the plain velar stops differently from the other two series, and there is even a certain amount of dispute whether this series existed in PIE. The major distinction between centum and satem languages corresponds to the outcome of the PIE plain velars:

  • The central satem languages—Indo-Iranian, Balto-Slavic, Albanian, and Armenian—reflect both plain velar and labiovelar stops as plain velars, often with secondary palatalization before a front vowel (e i ē ī). The palatal stops are palatalized and often appear as sibilants, usually distinct from the secondarily palatalized stops.
  • The peripheral centum languages—Germanic, Italic, Celtic, Greek, Anatolian and Tocharian—reflect both palatal and plain velar stops as plain velars, while the labiovelars continue unchanged, often with later reduction into plain labial or velar consonants.
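The two merger patterns just described can be written down as a small lookup table. The Python sketch below is a simplification for illustration: the outcome labels ("sibilant", "velar", "labiovelar") are coarse categories chosen for this example, and the names `OUTCOMES`, `BRANCHES`, and `merged_series` are hypothetical. The branch lists are taken from the two bullets above.

```python
from collections import defaultdict

# Coarse outcome of each PIE velar series in the two groups.
OUTCOMES = {
    "satem": {
        "palatal": "sibilant",      # palatals are palatalized, often sibilants
        "plain velar": "velar",
        "labiovelar": "velar",      # labiovelars fall together with plain velars
    },
    "centum": {
        "palatal": "velar",         # palatals fall together with plain velars
        "plain velar": "velar",
        "labiovelar": "labiovelar",  # labiovelars stay distinct
    },
}

# Branch membership as listed in the text.
BRANCHES = {
    "satem": ["Indo-Iranian", "Balto-Slavic", "Albanian", "Armenian"],
    "centum": ["Germanic", "Italic", "Celtic", "Greek", "Anatolian", "Tocharian"],
}


def merged_series(group: str) -> list:
    """Return the sets of PIE series that share one outcome in this group."""
    by_outcome = defaultdict(list)
    for series, outcome in OUTCOMES[group].items():
        by_outcome[outcome].append(series)
    return [sorted(names) for names in by_outcome.values() if len(names) > 1]


def group_of(branch: str) -> str:
    """Look up whether a branch is listed as centum or satem."""
    for group, members in BRANCHES.items():
        if branch in members:
            return group
    raise KeyError(branch)


if __name__ == "__main__":
    for group in OUTCOMES:
        print(group, "merges", merged_series(group))
```

The word for "hundred" in the article's consonant table shows the contrast: PIE *ḱm̥tóm begins with a sibilant in Sanskrit śatám but with a velar [k] in Latin centum.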

The three-way PIE distinction between voiceless, voiced and voiced aspirated stops is considered extremely unusual from the perspective of linguistic typology—particularly in the existence of voiced aspirated stops without a corresponding series of voiceless aspirated stops. None of the various daughter-language families continue it unchanged, with numerous resolutions to the unstable PIE situation:

  • The Indo-Aryan languages preserve the three series unchanged and have evolved a fourth series of voiceless aspirated consonants.
  • The Iranian languages probably passed through the same stage, subsequently changing the aspirated stops into fricatives.
  • Greek converted the voiced aspirates into voiceless aspirates.
  • Italic probably passed through the same stage, and reflects the voiced aspirates as f or h, or sometimes plain voiced stops in Latin.
  • Celtic, Balto-Slavic, Anatolian, and Albanian merge the voiced aspirated into plain voiced stops.
  • Germanic and Armenian change all three series in a chain shift, e.g. with bh b p becoming b p f, known as Grimm's law in Germanic.
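A chain shift such as Grimm's law has to be applied to all three series at once, otherwise a sound that has already shifted would shift again (bh to b, then b to p, then p to f). The Python sketch below shows one way to do that with a single left-to-right pass. It is a rough simplification: it uses plain ASCII spellings (`bh`, `dh`, `gh`), covers only the labial, dental, and velar series, and ignores Verner's law and other exceptions. The Germanic outcomes are the basic Gothic ones from the article's consonant table; the names `GRIMM` and `apply_grimm` are hypothetical.

```python
# Simplified Grimm's law: every input sound is mapped exactly once.
GRIMM = {
    # voiced aspirates -> plain voiced stops
    "bh": "b", "dh": "d", "gh": "g",
    # plain voiced stops -> voiceless stops
    "b": "p", "d": "t", "g": "k",
    # voiceless stops -> fricatives
    "p": "f", "t": "þ", "k": "h",
}


def apply_grimm(form: str) -> str:
    """Shift each stop once, reading two-letter aspirates before single letters."""
    out = []
    i = 0
    while i < len(form):
        pair = form[i:i + 2]
        if pair in GRIMM:            # e.g. "bh" must not be read as "b" + "h"
            out.append(GRIMM[pair])
            i += 2
        elif form[i] in GRIMM:
            out.append(GRIMM[form[i]])
            i += 1
        else:                        # vowels and other sounds pass through
            out.append(form[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for pie in ("bher", "ped", "dekm"):
        print(pie, "->", apply_grimm(pie))
```

The outputs line up with examples elsewhere in the article: simplified `ped` gives `fet` (compare English foot), `bher` gives `ber` (compare English bear), and `dekm` gives `tehm` (compare Gothic taíhun).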

Among the other changes affecting consonants are:

  • The Ruki sound law, in which s becomes /ʃ/ after r, u, k, i in the satem languages.
  • Loss of prevocalic p in Proto-Celtic.
  • Development of prevocalic s to h in Proto-Greek, with later loss of h between vowels.
  • Verner's law in Proto-Germanic.
  • Grassmann's law, the dissimilation of aspirates, independently in Proto-Greek and Proto-Indo-Iranian.
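As one concrete case from the list above, the Ruki sound law is a context-dependent change: s shifts only when it directly follows r, u, k, or i. A minimal Python sketch, assuming simplified one-letter spellings and using š to stand for /ʃ/ (the actual outcome differs by branch, and the real law has further conditions not modelled here). The names `RUKI_TRIGGERS` and `apply_ruki` are hypothetical.

```python
# Sounds that trigger the change when they come directly before s.
RUKI_TRIGGERS = set("ruki")


def apply_ruki(form: str) -> str:
    """Replace s with š wherever the preceding letter is r, u, k, or i."""
    out = []
    for index, letter in enumerate(form):
        follows_trigger = index > 0 and form[index - 1] in RUKI_TRIGGERS
        if letter == "s" and follows_trigger:
            out.append("š")
        else:
            out.append(letter)
    return "".join(out)


if __name__ == "__main__":
    for example in ("aus", "sept", "ksi"):
        print(example, "->", apply_ruki(example))
```

The first example echoes the article's "dawn" word, where PIE *h₂eusōs corresponds to Lithuanian aušra; word-initial s, as in the "seven" word, is left alone because nothing precedes it.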

The consonant table in this article gives the basic outcomes of PIE consonants in some of the daughter languages most important for reconstruction. It uses the following notation.

Notes:

  • C- At the beginning of a word.
  • -C- Between vowels.
  • -C At the end of a word.
  • `-C- Following an unstressed vowel (Verner's law).
  • -C-(rl) Between vowels, or between a vowel and *r, l (on either side).
  • CT Before a (PIE) stop (*p, t, k).
  • CT− After a (PIE) obstruent (*p, t, k, etc.; *s).
  • C(T) Before or after an obstruent (*p, t, k, etc.; *s).
  • CH Before an original laryngeal.
  • CE Before a (PIE) front vowel (*i, e).
  • CE' Before secondary (post-PIE) front-vowels.
  • Ce Before *e.
  • C(u) Before or after a (PIE) *u (boukólos rule).
  • C(O) Before or after a (PIE) *o, u (boukólos rule).
  • Cn− After *n.
  • CR Before a sonorant (*r, l, m, n).
  • C(R) Before or after a sonorant (*r, l, m, n).
  • C(r),l,u− Before *r, l or after *r, u.
  • Cruki− After *r, u, k, i (Ruki sound law).
  • C..Ch Before an aspirated consonant in the next syllable (Grassmann's law, also known as dissimilation of aspirates).
  • CE..Ch Before a (PIE) front vowel (*i, e) as well as before an aspirated consonant in the next syllable (Grassmann's law, also known as dissimilation of aspirates).
  • C(u)..Ch Before or after a (PIE) *u as well as before an aspirated consonant in the next syllable (Grassmann's law, also known as dissimilation of aspirates).

Comparison of conjugations

A comparison of the conjugations of the thematic present indicative of the verbal root *bʰer- (the source of the English verb to bear) and its reflexes in various early attested IE languages and their modern descendants or relatives shows that all of these languages had an inflectional verb system at an early stage.

Similarities are visible between the modern descendants and relatives of these ancient languages, and the differences have increased over time. Some IE languages have moved from synthetic verb systems to largely periphrastic systems. Some of these verbs have undergone a change in meaning as well.

  • In Modern Irish beir usually only carries the meaning to bear in the sense of bearing a child; its common meanings are to catch, grab. Apart from the first person, the comparative forms are dialectal or obsolete. The second and third person forms are typically instead conjugated periphrastically by adding a pronoun after the verb: beireann tú, beireann sé/sí, beireann sibh, beireann siad.
  • The Hindustani (Hindi and Urdu) verb bʰarnā, the continuation of the Sanskrit verb, can have a variety of meanings, but the most common is "to fill". The comparative forms are etymologically derived from the present indicative, and now have the meaning of future subjunctive. The loss of the present indicative in Hindustani is roughly compensated by the periphrastic habitual indicative construction, using the habitual participle (etymologically from the Sanskrit present participle bʰarant-) and an auxiliary: ma͠i bʰartā hū̃, tū bʰartā hai, vah bʰartā hai, ham bʰarte ha͠i, tum bʰarte ho, ve bʰarte ha͠i (masculine forms).
  • The Gothic forms are a close approximation of what the early West Germanic forms of c. 400 AD would have looked like. The descendant of Proto-Germanic *beraną (English bear) survives in German only in the compound gebären, meaning "bear (a child)".
  • The Latin verb ferre is irregular, and not representative of a normal thematic verb. In most Romance languages such as Portuguese, other verbs now mean "to carry" (e.g. Pt. portar).
  • In Modern Greek, the corresponding verb is pherno φέρνω (modern transliteration ferno), meaning "to bring". The perfective form of pherno, used for the subjunctive mood and also for the future tense, is phero.
  • The dual forms are archaic in standard Lithuanian, and are now used only in some dialects, e.g. Samogitian.
  • Among modern Slavic languages, only Slovene continues to have a dual number in the standard variety.

Proto-Indo-European consonants and their reflexes in selected Indo-European daughter languages

PIE | Skr. | O.C.S. | Lith. | Greek | Latin | Old Irish | Gothic | English
*p | p; phH | p | p | p | p | Ø; chT [x] | f; `-b- [β] | f; -v/f-
*t | t; thH | t | t | t | t | t; -th- [θ] | þ [θ]; `-d- [ð]; tT- | th; `-d-; tT-
*ḱ | ś [ɕ] | s | š [ʃ] | k | c [k] | c [k]; -ch- [x] | h; `-g- [ɣ] | h; -Ø-; `-y-
*k | k; cE [tʃ]; khH | k; čE [tʃ]; cE' [ts] | k | k | c [k] | c [k]; -ch- [x] | h; `-g- [ɣ] | h; -Ø-; `-y-
*kʷ | k; cE [tʃ]; khH | k; čE [tʃ]; cE' [ts] | k | p; tE; k(u) | qu [kʷ]; c(O) [k] | c [k]; -ch- [x] | ƕ [ʍ]; `-gw/w- | wh; `-w-
*b | b; bhH | b | b | b | b | b [b]; -[β]- | p | p
*d | d; dhH | d | d | d | d | d [d]; -[ð]- | t | t
*ǵ | j [dʒ]; hH [ɦ] | z | ž [ʒ] | g | g | g [ɡ]; -[ɣ]- | k | c / k; chE'
*g | g; jE [dʒ]; ghH; hH,E [ɦ] | g; žE [ʒ]; dzE' | g | g | g | g [ɡ]; -[ɣ]- | k | c / k; chE'
*gʷ | g; jE [dʒ]; ghH; hH,E [ɦ] | g; žE [ʒ]; dzE' | g | b; de; g(u) | u [w > v]; gun− [ɡʷ] | b [b]; -[β]- | q [kʷ] | qu
*bʰ | bh; b..Ch | b | b | ph; p..Ch | f-; b | b [b]; -[β]-; -f | b [b]; -[β]-; -f | b; -v/f-(rl)
*dʰ | dh; d..Ch | d | d | th; t..Ch | f-; d; b(r),l,u- | d [d]; -[ð]- | d [d]; -[ð]-; -þ | d
*ǵʰ | h [ɦ]; j..Ch | z | ž [ʒ] | kh; k..Ch | h; h/gR | g [ɡ]; -[ɣ]- | g; -g- [ɣ]; -g [x] | g; -y/w-(rl)
*gʰ | gh; hE [ɦ]; g..Ch; jE..Ch | g; žE [ʒ]; dzE' | g | kh; k..Ch | h; h/gR | g [ɡ]; -[ɣ]- | g; -g- [ɣ]; -g [x] | g; -y/w-(rl)
*gʷʰ | gh; hE [ɦ]; g..Ch; jE..Ch | g; žE [ʒ]; dzE' | g | ph; thE; kh(u); p..Ch; tE..Ch; k(u)..Ch | f-; g / -u- [w]; ngu [ɡʷ] | g [ɡ]; -[ɣ]- | g; b-; -w-; ngw | g; b-; -w-
*s | s; ṣruki- [ʂ] | s; xruki- [x] | s; šruki- [ʃ] | h-; -s; s(T); -Ø-; [¯](R) | s; -r- | s [s]; -[h]- | s; `-z- | s; `-r-
*m | m | m | m | m | m | m [m]; -[w̃]- | m | m
*-m | -m | -˛ [˜] | -n | -n | -m | -n | -Ø | -Ø
*n | n | n; -˛ [˜] | n | n | n | n | n | n
*l | r (dial. l) | l | l | l | l | l | l | l
*r | r | r | r | r | r | r | r | r
*i̯ | y [j] | j [j] | j [j] | z [dz > zd, z] / h; -Ø- | i [j]; -Ø- | Ø | j | y
*u̯ | v [ʋ] | v | v [ʋ] | w > h / Ø | u [w > v] | f; -Ø- | w | w

Examples

PIE sound | PIE | Eng. | Skr. | Gk. | Lat. | Lith. etc. | Prs.
*p | *pṓds ~ *ped- | foot | pád- | poús (podós) | pēs (pedis) | pãdas | Piáde
*t | *tréyes | three | tráyas | treĩs | trēs | trỹs | thri (Old Persian)
*ḱ | *ḱm̥tóm | hund(red) | śatám | he-katón | centum | šimtas | sad
*k | *kreuh₂ "raw meat" | OE hrēaw, raw | kravíṣ- | kréas | cruor | kraûjas | xoreš
*kʷ | *kʷid, kʷod | what | kím | – | quid, quod | kas, kad | ce, ci
*kʷ | *kʷekʷlom | wheel | cakrá- | kúklos | – | kãklas | carx
*d | *déḱm̥(t) | ten, Goth. taíhun | dáśa | déka | decem | dẽšimt | dah
*ǵ | *ǵénu, *ǵnéu- | OE cnēo, knee | jā́nu | gónu | genu | – | zánu
*g | *yugóm | yoke | yugám | zugón | iugum | jùngas | yugh
*gʷ | *gʷīw- | quick "alive" | jīvá- | bíos, bíotos | vīvus | gývas | ze-
*bʰ | *bʰéroh₂ | bear "carry" | bhar- | phérō | ferō | OCS berǫ | bar-
*dʰ | *dʰwer-, dʰur- | door | dvā́raḥ | thurā́ | forēs | dùrys | dar
*ǵʰ | *ǵʰans- | goose, OHG gans | haṁsáḥ | khḗn | (h)ānser | žąsìs | gház
*gʷʰ | *sneigʷʰ- | snow | sneha- | nípha | nivis | sniẽgas | barf
*gʷʰ | *gʷʰerm- | ??warm | gharmáḥ | thermós | formus | Latv. gar̂me | garm
*s | *septḿ̥ | seven | saptá | heptá | septem | septynì | haft
*s | *h₂eusōs "dawn" | east | uṣā́ḥ | āṓs | aurōra | aušra | báxtar
*m | *mūs | mouse | mū́ṣ- | mũs | mūs | OCS myšĭ | muš
*-m | *ḱm̥tóm | hund(red) | śatám | (he)katón | centum | OPrus simtan | sad
*n | *nokʷt- | night | nákt- | núkt- | noct- | naktis | náštá
*l | *leuk- | light | ruc- | leukós | lūx | laũkas | ruz
*r | *h₁reudʰ- | red | rudhirá- | eruthrós | ruber | raũdas | sorx
*i̯ | *yugóm | yoke | yugám | zugón | iugum | jùngas | yugh
*u̯ | *h₂weh₁n̥to- | wind | vā́taḥ | áenta | ventus | vėtra | bád

Proto-Indo-European (*bʰer- 'to carry, to bear')

Person | Form
I (1st sg.) | *bʰéroh₂
You (2nd sg.) | *bʰéresi
He/She/It (3rd sg.) | *bʰéreti
We two (1st dual) | *bʰérowos
You two (2nd dual) | *bʰéreth₁es
They two (3rd dual) | *bʰéretes
We (1st pl.) | *bʰéromos
You (2nd pl.) | *bʰérete
They (3rd pl.) | *bʰéronti

Ancient representatives

Major subgroup | Hellenic | Indo-Iranian (Indo-Aryan) | Indo-Iranian (Iranian) | Italic | Celtic | Armenian | Germanic | Balto-Slavic (Baltic) | Balto-Slavic (Slavic) | Albanian
Ancient representative | Ancient Greek | Vedic Sanskrit | Avestan | Latin | Old Irish | Classical Armenian | Gothic | Old Prussian | Old Church Sl. | Old Albanian
I (1st sg.) | phérō | bʰárāmi | barāmi | ferō | biru; berim | berem | baíra /bɛra/ | *bera | berǫ | *berja
You (2nd sg.) | phéreis | bʰárasi | barahi | fers | biri; berir | beres | baíris | *bera | bereši | *berje
He/She/It (3rd sg.) | phérei | bʰárati | baraiti | fert | berid | berē | baíriþ | *bera | beretъ | *berjet
We two (1st dual) | – | bʰárāvas | barāvahi | – | – | – | baíros | – | berevě | –
You two (2nd dual) | phéreton | bʰárathas | – | – | – | – | baírats | – | bereta | –
They two (3rd dual) | phéreton | bʰáratas | baratō | – | – | – | – | – | berete | –
We (1st pl.) | phéromen | bʰárāmas | barāmahi | ferimus | bermai | beremkʿ | baíram | *beramai | beremъ | *berjame
You (2nd pl.) | phérete | bʰáratha | baraθa | fertis | beirthe | berēkʿ | baíriþ | *beratei | berete | *berjeju
They (3rd pl.) | phérousi | bʰáranti | barəṇti | ferunt | berait | beren | baírand | *bera | berǫtъ | *berjanti

Modern representatives

Modern representative | Modern Greek | Hindustani | Persian | Portuguese | Irish | Armenian (Eastern; Western) | German | Lithuanian | Slovene | Albanian
I (1st sg.) | férno | (ma͠i) bʰarūm̥ | (man) {mi}baram | {con}firo | beirim | berum em; g'perem | (ich) {ge}bäre | beriu | bérem | (unë) bie
You (2nd sg.) | férnis | (tū) bʰarē | (tu) {mi}bari | {con}feres | beirir | berum es; g'peres | (du) {ge}bierst | beri | béreš | (ti) bie
He/She/It (3rd sg.) | férni | (ye/vo) bʰarē | (ān) {mi}barad | {con}fere | beiridh | berum ē; g'perē | (er/sie/es) {ge}biert | beria | bére | (ai/ajo) bie
We two (1st dual) | – | – | – | – | – | – | – | beriava | béreva | –
You two (2nd dual) | – | – | – | – | – | – | – | beriata | béreta | –
They two (3rd dual) | – | – | – | – | – | – | – | beria | béreta | –
We (1st pl.) | férnume | (ham) bʰarēm̥ | (mā) {mi}barim | {con}ferimos | beirimid; beiream | berum enkʿ; g'perenkʿ | (wir) {ge}bären | beriame | béremo | (ne) biem
You (2nd pl.) | férnete | (tum) bʰaro | (šomā) {mi}barid | {con}feris | beirthidh | berum ekʿ; g'perekʿ | (ihr) {ge}bärt | beriate | bérete | (ju) bini
They (3rd pl.) | férnun | (ye/vo) bʰarēm̥ | (ānān) {mi}barand | {con}ferem | beirid | berum en; g'peren | (sie) {ge}bären | beria | bérejo; berọ́ | (ata/ato) bien

Comparison of cognates

Main article: Indo-European vocabulary

See also: Proto-Indo-European numerals

Present distribution

See also: List of Indo-European languages

Map legend: an Indo-European language is the majority native language; an Indo-European language is a co-official and majority native language; an Indo-European language is an official but minority native language; an Indo-European language is a cultural or secondary language; an Indo-European language is not widely spoken.

Today, many people around the world speak Indo-European languages. Some of the most common ones are English, Hindustani, Spanish, Bengali, French, Russian, Portuguese, German, Persian, and Punjabi. Lots of people also learn these languages as a second language.

These languages spread far because of how people and cultures moved across Eurasia. Over time, many groups helped these languages become common in many places. Now, billions of people speak at least one Indo-European language, and these languages are important in schools, at work, and online. English is especially useful for talking to people all over the world.

Images

Animated map showing the spread of Indo-European languages across Eurasia over time.
A colorful world map showing where Romance languages like Spanish, French, Italian, and Portuguese are spoken.

This article is a child-friendly adaptation of the Wikipedia article on Indo-European languages, available under CC BY-SA 4.0.

Images from Wikimedia Commons.