Statistics

Russian Learner Corpus

Native language

Native language  Number of texts  Number of sentences  Number of words
unknown                      319                 5752            65524
Swedish                      182                 3821            37812
Estonian                       2                   13               99
Vietnamese                   127                 2369            28965
Romanian                      16                  248             3202
Azerbaijani                    6                   99             1187
Pashto                        53                  429             6530
Macedonian                    36                  598             7548
Hindi                         27                  354             4555
Dutch                         25                  544             5941
Korean                       263                 5959            78324
Khmer                         10                  143              972
Lao                            4                  106             1058
Hungarian                      9                  211             2373
Indonesian                    58                 1027            11457
Georgian                       4                   35              718
Turkish                       72                 1368            14353
French                       790                10334           110509
Abkhaz                         1                   32              456
Norwegian                     41                 1125            10046
Dari                           2                    4              302
Bengali                       27                  419             5350
Tajik                         85                 1021            12533
Thai                          24                  378             3811
Italian                      257                 4279            62590
Kazakh                       963                15971           177881
Turkmen                      158                 4713            72015
Kurdish                       18                  135             1312
Slovene                      111                 1800            22118
Nepali                         5                   34              448
Finnish                     1238                26254           314875
Uzbek                          5                   37              344
Albanian                       6                  114             1273
Bulgarian                     54                 1245            14491
Greek                          8                   87             1135
Serbian                      163                 3283            35210
Arabic                       211                 2203            33810
Croatian                      22                  693            12138
Portuguese                    32                  470             5285
Chinese                      981                13999           165282
Czech                        101                 2161            33790
Japanese                    1596                18111           139969
German                       307                 4256            49198
Slovak                         4                   86              939
Mongolian                     73                 1543            14332
Spanish                      385                 5933           105622
Urdu                          29                  386             4277
Farsi                         49                  525             6779
English                     3406                53536           699194

Raw counts

Number of texts 12365
Number of words 2387932
Number of sentences 198243
Number of annotations 123474
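
As a rough illustration of what the raw counts imply, the sketch below derives a few averages from them. The variable names are illustrative assumptions and not part of any corpus tooling; the four figures are copied from the list above, and the averages themselves are not reported by the corpus.

```python
# Raw counts copied from the list above.
n_texts = 12365
n_words = 2387932
n_sentences = 198243
n_annotations = 123474

# Derived averages (illustrative only; not published corpus figures).
words_per_text = n_words / n_texts
sentences_per_text = n_sentences / n_texts
words_per_sentence = n_words / n_sentences
annotations_per_text = n_annotations / n_texts

print(f"words per text:       {words_per_text:.1f}")
print(f"sentences per text:   {sentences_per_text:.1f}")
print(f"words per sentence:   {words_per_sentence:.1f}")
print(f"annotations per text: {annotations_per_text:.1f}")
```

These are corpus-wide means only; individual native-language groups differ widely in size, so per-group averages would need the per-language table instead.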

Language background

unknown 197
heritage 2994
foreign 9174

Gender counts

unknown 1827
male 3826
female 6712
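
The language background and gender breakdowns each cover every text, so their counts should add up to the total number of texts (12365). A minimal sketch of that consistency check follows; the dictionary and function names are hypothetical, and the numbers are copied from the lists above.

```python
# Total number of texts, copied from the raw counts.
n_texts = 12365

# Breakdowns copied from the lists above.
language_background = {"unknown": 197, "heritage": 2994, "foreign": 9174}
gender = {"unknown": 1827, "male": 3826, "female": 6712}


def breakdown_matches_total(counts, total):
    """Return True if the category counts add up to the expected total."""
    return sum(counts.values()) == total


def shares(counts):
    """Return each category's share of the breakdown as a percentage."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


for name, counts in [("language background", language_background),
                     ("gender", gender)]:
    ok = breakdown_matches_total(counts, n_texts)
    print(f"{name}: sums to total = {ok}")
    for label, pct in shares(counts).items():
        print(f"  {label}: {pct:.1f}%")
```

The same check can be applied to the "Number of texts" column of the native-language table.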