Here you will find an electronic version of the most important parts of the
census data on ethnicity and language proficiency from 1939, 1959, 1970, 1979
and 1989. For the data from 1970, 1979 and 1989, you will find the columns that
are explained in the table below. In the file 1979 according to republic, you
will find the data for some Autonomous Republics and Areas in the north-western
part of the Soviet Union. The original listed all areas, of course, but I only
wrote down the areas directly relevant to my own research (I invite others to
carry on the work!). In the file 1989 data for the Peoples of the Northern
Areas, you will find the data for these 26 peoples only. The last file,
covering 1939-89, is explained below.
Since the first release, the following errors have been detected and corrected:
980310 (thanks to John Clifton):
The assimilation percentages for Dungan (dungansk) for 1989 were incorrect (data from another language had crept in instead); they are now corrected.
NMmax and NMmin were swapped in the Silver formula below, and an erroneous parenthesis was removed.
The formula for ABmax had R2 (instead of the correct R1) as its logical maximum.
| Språk | Name of language (in Norwegian) |
| gr. | Genetic affiliation of the language |
| område | Main administrative area where the language is spoken |
| # etn. ´89 | Number of individuals that defined themselves as belonging to the ethnic group in question |
| #mom=titspr | Number of individuals that defined themselves as speaking the language of their ethnic group as their mother tongue |
| #mom=russ | Number of individuals that defined themselves as speaking Russian as their mother tongue |
| #mom=3dje | Number of individuals that defined themselves as speaking some language other than Russian or their titular language as their mother tongue |
| #2spr=titspr | Number of individuals that defined themselves as speaking the language of their ethnic group as their second language |
| #2spr=russ | Number of individuals that defined themselves as speaking Russian as their second language |
| #2spr=3dje | Number of individuals that defined themselves as speaking some language other than Russian or their titular language as their second language |
The following 8 columns show what percentage of the minority population is
unassimilated, partly assimilated or totally assimilated into the majority
population. The columns refer to the so-called Silver formula (Silver 1975),
developed for estimating the degree of assimilation on the basis of Soviet
census data (the terms and the formula are explained below):
| NMmax | Maximal number of Native monolinguals |
| NMmin | Minimal number of Native monolinguals |
| UBmax | Maximal number of Unassimilated bilinguals |
| UBmin | Minimal number of Unassimilated bilinguals |
| ABmax | Maximal number of Assimilated bilinguals |
| ABmin | Minimal number of Assimilated bilinguals |
| AMmax | Maximal number of Assimilated monolinguals |
| AMmin | Minimal number of Assimilated monolinguals |
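For a thorough presentation of the formula, see Silver's original exposition, or
Lallukka's pedagogical exposition. The formula itself is simple, and this
explanation should be enough to grasp the point.
N, R and T stand for "Native (to the ethnic group)", "Russian" and "Third",
respectively. The digits 1 and 2 mark first language (mother tongue) and second
language, so that N1 is the number of people with the native language as their
first language, and R2 the number with Russian as their second language.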
Native monolinguals (NM) know only the language of their group.
Unassimilated bilinguals (UB) have the language of their group as their mother
tongue, but know the majority language as well (in most cases this language is
of course Russian).
Assimilated bilinguals (AB) have the majority language as their mother tongue,
but speak the language of their ethnic group as a second language. In language
shift processes, this group is typically very small.
Assimilated monolinguals (AM) are those who have reported themselves as
belonging to a minority group but do not speak its language.
Ignoring people knowing any language other than the language of the ethnic
group or Russian, the formulas for estimating these four groups are very
simple. Taking only people belonging to the ethnic group in question, we get
the following formulas:
| NM = N1-R2 | The ones with the native language as first language minus the ones with Russian as their second |
| UB = R2 | The ones with Russian as their second language |
| AB = N2 | The ones with the native language as their second language |
| AM = R1-N2 | The ones with Russian as their first language minus the ones with the native language as their second language |
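The four simple formulas above can be sketched in a few lines of Python. This is only an illustration, not part of the original files: the function name and the example figures are hypothetical, and the mapping of N1, R1, N2 and R2 to the columns #mom=titspr, #mom=russ, #2spr=titspr and #2spr=russ is my reading of the column legend.

```python
def simple_silver_groups(n1, r1, n2, r2):
    """Estimate the four groups while ignoring any third language.

    n1, r1: people with the native language / Russian as first language.
    n2, r2: people with the native language / Russian as second language.
    """
    return {
        "NM": n1 - r2,  # native monolinguals
        "UB": r2,       # unassimilated bilinguals
        "AB": n2,       # assimilated bilinguals
        "AM": r1 - n2,  # assimilated monolinguals
    }


# Hypothetical figures for an ethnic group of 1000 people (not census data).
groups = simple_silver_groups(n1=700, r1=300, n2=100, r2=500)
print(groups)  # {'NM': 200, 'UB': 500, 'AB': 100, 'AM': 200}
```

With these invented figures NM and UB add up to N1, and AB and AM add up to R1, which is what the simple formulas imply when no third language is involved.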
In real life we cannot exclude this third language, though, and each formula
must be corrected for the interference of the third language.
Logical limitations
| 1. Native monolinguals | NMmax = N1-R2+T1 | cannot exceed N1 |
| | NMmin = N1-R2 | cannot be less than 0 |
| 2. Unassimilated bilinguals | UBmax = R2 | cannot exceed N1 |
| | UBmin = R2-T1 | cannot be less than 0 |
| 3. Assimilated bilinguals | ABmax = N2 | cannot exceed R1 |
| | ABmin = N2-T1 | cannot be less than 0 |
| 4. Assimilated monolinguals | AMmax = R1-N2+T1 | cannot exceed R1 |
| | AMmin = R1-N2 | cannot be less than 0 |
Note that these logical limitations are not worked into the tables (I did not find a way of programming conditional clauses in Excel). So, whenever you find a negative percentage (for NMmin, UBmin, ABmin or AMmin), simply replace it by zero. Correspondingly, if you find numbers greater than N1 for NMmax or UBmax, or numbers greater than R1 for ABmax or AMmax, replace them with the corresponding N1 or R1 value.
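For readers who prefer to apply the logical limitations automatically, the following Python sketch combines the eight formulas with the replacement rule described in the note. It is not the method used to produce the files: the function name and the example figures are hypothetical, and the sketch works on absolute counts rather than on the percentages given in the tables.

```python
def silver_bounds(n1, r1, t1, n2, r2):
    """Return max/min estimates for the four groups, logical limits applied.

    n1, r1, t1: native, Russian, third language as first language.
    n2, r2:     native language, Russian as second language.
    """
    def cap(value, ceiling):
        # A maximum estimate cannot exceed its logical ceiling (N1 or R1).
        return min(value, ceiling)

    def floor(value):
        # A minimum estimate cannot be less than 0.
        return max(value, 0)

    return {
        "NMmax": cap(n1 - r2 + t1, n1),
        "NMmin": floor(n1 - r2),
        "UBmax": cap(r2, n1),
        "UBmin": floor(r2 - t1),
        "ABmax": cap(n2, r1),
        "ABmin": floor(n2 - t1),
        "AMmax": cap(r1 - n2 + t1, r1),
        "AMmin": floor(r1 - n2),
    }


# Hypothetical figures (not census data), chosen so that both rules apply:
# NMmin would be -20 and UBmax would be 720 without the limits.
bounds = silver_bounds(n1=700, r1=250, t1=50, n2=100, r2=720)
print(bounds["NMmin"], bounds["UBmax"])  # 0 700
```

To compare the result with the percentage columns in the files, each value would presumably have to be divided by the size of the ethnic group; that step is left out here.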
The file 1939-89
contains data on first language knowledge only. In addition to what can already
be read out of the other files, it contains the data from 1959 and 1939 (these
censuses did not contain information on second language proficiency), and it
contains a comparison of mother tongue retention for each time span.
Here is the legend to the columns:
| Språk | Language name (in Norwegian) |
| gr. | Genetic group of the language in question |
| # etn. ´39 | Number of people identifying themselves with the ethnic group in question in 1939 |
| # talar ´39 | Number of people claiming to have the language in question as mother tongue in 1939 |
| talar-% | Speakers in percent of members of ethnic group |
Similarly for the years 1959, 1970, 1979 and 1989.
| p:59-39 | Change in language proficiency percentage from 1939 to 1959 |
| p:70-59 | Change in language proficiency percentage from 1959 to 1970 |
| p:70-39 | Change in language proficiency percentage from 1939 to 1970 |
| 59av39 | Speakers in 1959 as percentage of speakers in 1939 |
| 70av59 | Speakers in 1970 as percentage of speakers in 1959 |
| 70av39 | Speakers in 1970 as percentage of speakers in 1939 |
| 89av39 | Speakers in 1989 as percentage of speakers in 1939 |
| Adm.stat. | Administrative status of the language in question |
| ssr | Language of a Soviet Socialist Republic |
| assr | Language of an Autonomous Soviet Socialist Republic |
| ao | Language of an Autonomous Area |
| ingen | No administrative status |
| nabo | Official language of one of the neighbouring countries of the Soviet Union |
| >´37 | Before 1937 there was... |
| ´37> | After 1937 there was... |
| u | ...a literary language developed |
| iu | ...no literary language developed (or, if developed earlier, not in use) |
| Kommentar | Short comments for my own use. They are not checked and I did not write down the source (in some cases Comrie is the source), so they should really be erased. Do not quote these comments. |
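The derived columns in the 1939-89 legend above (talar-%, p:59-39, 59av39 and their counterparts for other years) can be sketched as follows. This is an illustration under assumptions, not a description of how the file was built: the function names and figures are hypothetical, and I assume that the "change in language proficiency percentage" is the difference in percentage points between two talar-% values, which the legend does not state explicitly.

```python
def speaker_percentage(speakers, ethnic_total):
    """talar-%: speakers in percent of the members of the ethnic group."""
    return 100.0 * speakers / ethnic_total


def percentage_change(pct_later, pct_earlier):
    """p:59-39 and similar: assumed to be a difference in percentage points."""
    return pct_later - pct_earlier


def speakers_relative(speakers_later, speakers_earlier):
    """59av39 and similar: later speakers as percentage of earlier speakers."""
    return 100.0 * speakers_later / speakers_earlier


# Hypothetical figures (not census data).
etn_39, talar_39 = 10000, 9000
etn_59, talar_59 = 12000, 9900

pct_39 = speaker_percentage(talar_39, etn_39)
pct_59 = speaker_percentage(talar_59, etn_59)
print(pct_39, pct_59)                         # 90.0 82.5
print(percentage_change(pct_59, pct_39))      # -7.5
print(speakers_relative(talar_59, talar_39))  # 110.0
```

The invented example shows why both kinds of column are useful: the number of speakers grows (110 percent of the 1939 figure) while the share of speakers within the ethnic group falls by 7.5 percentage points.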
Among demographers and sociolinguists, the Soviet census data are generally held
to be a reliable source, with some known exceptions (e.g. the Nganasan, who are
reported with too high native language proficiency). In any case they are the
only data available, and probably the most extensive demographic database of
this size and complexity. Norway, to cite a country covered by the present
homepage, left out questions of language proficiency from its censuses during
the first half of this century, and even in earlier questionnaires, second
language proficiency was never asked for.
Another source of errors not to be ignored is the typist, i.e. myself. I typed
in these numbers from the published sources. After typing them in I went over
them and checked, but I still cannot guarantee that they are error-free. So, in
case of unexpected data, this electronic version should be checked against the
original. Needless to say, I would appreciate reports on any errors detected in
this material.
This material may freely be used for research on Soviet linguistic,
sociolinguistic and sociological matters. The reason I now make it available is
precisely to promote such research. The material may not be used for commercial
purposes. If you use it, I would appreciate it if you mention the source
and make reference to this site.
In addition to the data given here, more data are available both in the
published and in the unpublished sources. The data are also broken down by age
cohort, urban/rural, rayon by rayon, etc. Rather than aspiring to present
the whole material, I present what I have, and hope that this may inspire
researchers to go to the archives for more fine-grained material.
Vsesojuznaja perepis' naselenija 1939 goda. Osnovnye itogi. Rossijskaja Akademija Nauk. Moskva 1992. (1939)
Itogi vsesojuznoj perepisi naselenija 1959 goda (svodnyj tom). Tsentral'noe statistitsjeskoe upravlenie pri sovete ministrov SSSR. Moskva 1962. (1959)
Itogi Vsesojuznoj perepisi naselenija 1970 goda. Moskva. Statistika. (1970)
Tsjislennost' i sostav naselenija SSSR. Po dannym Vsesojuznoj perepisi naselenija 1979 goda. Moskva. Finansy i statistika. 1984. (1979)
Vestnik Statistiki 10/1990. Moskva. Finansy i statistika. (1989)
Lallukka, Seppo 1990: The East Finnic Minorities in the Soviet Union. Annales academiæ scientiarum Fennicæ ser. B tom. 252.
Silver, B. 1975: Methods of Deriving Data on Bilingualism from the 1970 Soviet Census. Soviet Studies 27:4.