African language Wikimedia projects summary

A few months ago I wrote an article which summarises my history on the Malagasy Wiktionary, and more generally my history on Malagasy language Wikimedia projects.

I am back with a short summary of where African language WMF projects currently stand. In this article you’ll learn about the current state of these projects and their growth trends.

In terms of community size, the biggest African language community is the Afrikaans Wikipedia community, followed by the Egyptian Arabic and Swahili communities.

Looking closer at the statistics, the award goes to the Afrikaans Wikipedia community, which has 7 to 8 very active contributors (those making more than 100 edits per month).

The Egyptian Arabic Wikipedia community counts 2 to 3 very active contributors, which is big for an African language but very small compared to the Standard Arabic community, which has more than twenty times as many (83 very active users in June 2013), most of them Egyptian.

As for Swahili, the number of very active users is one to two; over a two-year period it averages out to 1. But the number of active users (i.e. those making more than 5 edits per month) is 9 on average, which is a fine result for a language spoken in countries where internet access is quite hard to get.

These numbers are averages over July 2011 to June 2013, which smooths out short-term variations.

In terms of raw article count, the biggest African language Wikimedia project is the Malagasy Wiktionary (which currently counts 2.5 million articles, smaller only than the English Wiktionary and bigger than the French one!), followed by the Malagasy Wikipedia (40,000+ articles) and the Yoruba Wikipedia (30,000+ articles), then the Afrikaans and Swahili Wikipedias (27,000+ and 25,000+ articles respectively).

The Malagasy Wiktionary became very big for reasons you can read here. The Malagasy Wikipedia is big thanks to geography articles (~20,000 articles) and articles about celestial objects (~8,000 articles); the Yoruba Wikipedia owes its size to articles about people and, again, celestial objects (~15,000 objects).

Wikimedians who consult the statistics should know that the number of content pages does not determine the quality or the comprehensiveness of an encyclopedia. Judging a wiki by its article count is like judging a book by its cover, and any reader or critic knows that looking at the cover is not enough to judge a novel. Here, by raw size, the Malagasy language dominates the two biggest projects (Wikipedia and Wiktionary), but that doesn’t mean it has a very active community.

To judge the quality, comprehensiveness and completeness of the articles of such wikis, it is better to dive into this kind of statistics, where scores are based on the presence or absence of vital articles and on the size (number of characters) of those that exist. Such statistics are better than article count and page depth, which can be inflated by the use of bots and the generation of tons of non-article pages (talk pages, subpages, redirects…).

According to the List of Wikipedias by sample of articles, the best scored African language Wikipedia is the Afrikaans Wikipedia, which ranks 58th, then the Swahili Wikipedia (79th), followed by the Egyptian Arabic, Yoruba and Somali Wikipedias. The Malagasy Wikipedia is quite far behind at 155th, higher only than the Lingala (161st), Wolof (175th) and Shona (187th) Wikipedias, which have fewer than 5,000 articles each. This means that article count is only the cover of the book, and that some effort is needed to make the Malagasy Wikipedia more comprehensive.
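To make the idea of a size-based score concrete, here is a minimal sketch in Python. It is not the official formula of the list: the size thresholds, the weights and the function names (`classify`, `sample_score`) are assumptions chosen for illustration, and the article sizes in the example are made up.

```python
# Hypothetical size classes and weights, chosen only for illustration.
STUB_MAX_CHARS = 10_000      # assumed upper bound for a "stub"
ARTICLE_MAX_CHARS = 30_000   # assumed upper bound for a regular "article"
WEIGHTS = {"absent": 0, "stub": 1, "article": 4, "long": 9}


def classify(char_count):
    """Return the size class of one vital article (None means it is missing)."""
    if char_count is None:
        return "absent"
    if char_count < STUB_MAX_CHARS:
        return "stub"
    if char_count < ARTICLE_MAX_CHARS:
        return "article"
    return "long"


def sample_score(sizes):
    """Score from 0 to 100 for a list of vital-article sizes in characters."""
    if not sizes:
        raise ValueError("at least one vital article is required")
    raw = sum(WEIGHTS[classify(size)] for size in sizes)
    # Normalise by the best possible result: every article present and long.
    return 100 * raw / (WEIGHTS["long"] * len(sizes))


# Made-up sizes for two imaginary wikis covering the same four vital articles.
many_short_stubs = [800, 900, 750, 1_200]
few_but_deep = [None, 2_500, 12_000, 45_000]

print(f"many short stubs: {sample_score(many_short_stubs):.1f}")
print(f"few but deep:     {sample_score(few_but_deep):.1f}")
```

Under these assumed weights, a wiki made only of very short pages scores lower than a wiki that is missing one article but has longer ones, which is the behaviour that makes this kind of score harder to inflate with bots than a raw article count.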

What about the trend?

Less than a year ago, some Wikipedias found a way to grow in article count thanks to species databases. The first ones I saw grow this way were the Winaray and Cebuano Wikipedias. The Winaray Wikipedia gained 100,000 articles, primarily from low-quality geography stubs (one or two sentences each) and secondarily from articles about animal and plant species, bringing it to 510,000 articles. The Cebuano Wikipedia has grown more than tenfold within the last 50 weeks, from 40,000 to more than 500,000 articles. This mania for creating articles about species has spread to the Swedish and Dutch Wikipedias. The Dutch Wikipedia recently surpassed the German Wikipedia, and in response the latter seems to have boycotted it by deleting the link to the Dutch Wikipedia from the German Wikipedia main page.

Now let’s write about the growth trend of African language Wikimedia projects. First off, let’s talk about Wikipedias, then Wiktionaries and finally other «minor» Wikimedia projects.

Wikipedia language edition  Current article count  Growth (in 300 days) (1)
Malagasy                    40,619                 +2,415
Yoruba                      30,624                 +582
Afrikaans                   27,801                 +3,928
Swahili                     25,368                 +1,232
Amharic                     12,722                 +1,015
Egyptian Arabic             10,764                 +1,939
Somali                      2,830                  +383
Lingala                     2,035                  +118
Kinyarwanda                 1,816                  +7
Kabyle                      1,517                  +778
Wolof                       1,172                  +49
Kongo                       826                    +135
Northern Sotho              688
Igbo                        739                    +44
Zulu                        586                    +22
Setswana                    496                    -1
Bambara                     392                    +6
Siswati                     368                    +6
Ewe                         302                    +12
Hausa                       291                    +17
Oromo                       276                    +36
Tigrinya                    259                    +2
Tsonga                      250                    +7
Sango                       204                    +17
Kirundi                     192                    +8
Sesotho                     189                    +44
Akan                        179                    +17
Fulfulde                    166                    +12
Luganda                     166                    -2
Twi                         157                    +12
Chamorro                    157                    +6
Xhosa                       151                    +10

(1) Calculated from this site; data retrieved on July 26th, 2013.

On Wikipedia, growth is slow compared to languages spoken in developed countries, where Internet access is easy and inexpensive for the ordinary citizen. The African language with the biggest community, Afrikaans, grows at approximately 5,000 articles per year, which is fairly high compared to Swahili, whose growth is roughly three times lower. If the current trend continues, the Afrikaans Wikipedia will surpass the Yoruba Wikipedia next year and the Malagasy Wikipedia within the next 2 years, as the two current biggest Wikipedias are stagnating in article growth.
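The arithmetic behind these projections can be sketched in a few lines of Python. This is only a linear extrapolation under the assumption that each wiki keeps the daily rate observed over the 300-day window; the counts and growth figures are copied from the table above, while the function names (`yearly_growth`, `days_to_overtake`) are mine.

```python
# Figures copied from the table above: (article count, growth over 300 days).
WINDOW_DAYS = 300
wikis = {
    "Malagasy": (40_619, 2_415),
    "Yoruba": (30_624, 582),
    "Afrikaans": (27_801, 3_928),
    "Swahili": (25_368, 1_232),
}


def yearly_growth(growth, window_days=WINDOW_DAYS):
    """Scale growth observed over `window_days` to a 365-day year."""
    return growth * 365 / window_days


def days_to_overtake(chaser, leader, window_days=WINDOW_DAYS):
    """Days until `chaser` passes `leader` if both keep their current daily rate.

    Returns None when the chaser is not growing faster than the leader.
    """
    chaser_count, chaser_growth = chaser
    leader_count, leader_growth = leader
    rate_gap = (chaser_growth - leader_growth) / window_days
    if rate_gap <= 0:
        return None
    return (leader_count - chaser_count) / rate_gap


for name, (count, growth) in wikis.items():
    print(f"{name}: about {yearly_growth(growth):,.0f} articles per year")

for leader in ("Yoruba", "Malagasy"):
    days = days_to_overtake(wikis["Afrikaans"], wikis[leader])
    print(f"Afrikaans passes {leader} in about {days:,.0f} days")
```

Keep in mind that a constant-rate extrapolation ignores any slowdown: if the Malagasy Wikipedia stagnates further, as I expect, the Afrikaans Wikipedia catches up sooner than this straight-line estimate suggests.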

On smaller Wikipedias, the trend is positive, though slow. All open Wikipedias have more than 100 articles.

Among Wiktionaries, the biggest is the Malagasy Wiktionary, whose growth is sustained by Bot-Jagwar. Bot-Jagwar, which I own, runs in the cloud, so it works regardless of the health of my computer and my internet connection. Thanks to it, the Malagasy Wiktionary gains 300 to 500 content pages daily. Automation eases many things in many ways, but automated processes can fail, so I have to keep an eye not only on the source code but also on the entries it generates.

African language Wikipedias are slowly but surely gaining articles as time passes. There seems to be a moratorium on closing African language Wikipedias, and this is fine because languages mainly spoken in developing countries need time to develop a community. Furthermore, the official languages of these countries, especially African ones, are very often not the local languages.

Kurzweil Curve showing growth of computing power. It shows that all human brains can be simulated by 2050. What about having billions of “virtual” contributors on Wikipedia in 2050? Source (kraxinglogic.com)

The increase in bot-made articles (which nowadays constitute 20% of the articles created on Wikipedia) may indicate that in the near future, perhaps in 25 or 30 years, a bot will be able to write articles like humans do. This is because Ray Kurzweil predicts that simulating the human brain will be possible in twelve years, and because the computing power of today’s computers was that of supercomputers in the 1990s.

What about me? Well, it’s been a while since my last big article on the Malagasy Wikipedia. According to the list of Wikipedias by sample of articles, several hundred of the articles needed in every Wikipedia are missing, so my first goal for Wikipedia is to fill these gaps, slowly but surely. I prefer contributing about geography, but as I am the only contributor to the wiki, I have to fill gaps a bit everywhere: biography, chemistry, sports, etc. I can barely create three or four articles per day, but at that pace I can complete the list of 1,000 articles that every Wikipedia should have within the year.

It’s been a while since I last blogged in Malagasy, so this article will be followed by a Malagasy language article: perhaps a translation of this one, perhaps a new one.

Useful resources

To read further about what’s mentioned here.

  1. The law of Accelerating Returns by Kurzweil
  2. http://www.wikistatistics.net for all statistics about Wikimedia projects