Wikipedia has 4,362,397 articles in English. But how many of those are seriously encyclopedic, and what are the most important articles?
We’ve been looking closely at Wikipedia for an upcoming app, and we wanted to know which articles are the most important. We calculated an importance score for every article based on five factors: how richly linked the article is within Wikipedia (the number and quality of links to the page), how many languages it has been translated into, the brevity of its title, how popular it is (web hits), and how many citations or references it has (scholarliness).
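To make the idea concrete, here is a minimal sketch of how five such factors could be combined into a single score. It is not our actual formula: the field names, the log scaling, and especially the weights in `WEIGHTS` are assumptions chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    link_score: float  # hypothetical: a PageRank-style measure of incoming links
    languages: int     # number of language editions that have the article
    page_views: int    # web hits over some fixed period
    references: int    # number of citations/references in the article


# Hypothetical weights (they sum to 1); the real weights are not published here.
WEIGHTS = {"links": 0.35, "languages": 0.25, "views": 0.20, "refs": 0.10, "brevity": 0.10}


def log_scale(value, max_value):
    """Map a non-negative value onto [0, 1], damping very large counts."""
    if max_value <= 0:
        return 0.0
    return math.log1p(value) / math.log1p(max_value)


def importance(article, maxima):
    """Weighted sum of normalized factors; a higher score means more important."""
    features = {
        "links": log_scale(article.link_score, maxima["links"]),
        "languages": log_scale(article.languages, maxima["languages"]),
        "views": log_scale(article.page_views, maxima["views"]),
        "refs": log_scale(article.references, maxima["refs"]),
        # Shorter titles score higher; an empty title scores zero.
        "brevity": 1.0 / len(article.title) if article.title else 0.0,
    }
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def rank(articles):
    """Return the articles sorted from most to least important."""
    maxima = {
        "links": max(a.link_score for a in articles),
        "languages": max(a.languages for a in articles),
        "views": max(a.page_views for a in articles),
        "refs": max(a.references for a in articles),
    }
    return sorted(articles, key=lambda a: importance(a, maxima), reverse=True)


# Made-up numbers, purely to exercise the functions.
sample = [
    Article("Some_obscure_town_in_Siberia", 0.001, 2, 100, 1),
    Article("France", 0.9, 250, 1_000_000, 400),
]
ranked = rank(sample)
print([a.title for a in ranked])
```

The log scaling is one way to keep a single runaway factor (say, a page with enormous traffic) from swamping the others; a different normalization would be just as defensible.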
The following are our results. This is an arbitrary but interesting ranking, so we wanted to share it:
Top 100 English Wikipedia articles:
France, Germany, Canada, Australia, England, United_States, China, Japan, Russia, London, Italy, India, Animal, Poland, Brazil, Iran, Spain, California, Romania, Europe, Mexico, Sweden, Scotland, Switzerland, Netherlands, Turkey, Israel, Paris, Philippines, Pakistan, Norway, United_Kingdom, Insect, Indonesia, Denmark, Greece, Arthropod, Belgium, Chicago, Syria, Texas, Argentina, Marriage, Singapore, Egypt, Malaysia, Austria, Ukraine, Taiwan, Virginia, Islam, Wales, Finland, Florida, Ireland, Philadelphia, Portugal, Rome, Azerbaijan, Afghanistan, Latin, Bird, Boston, Pennsylvania, YouTube, Hungary, Serbia, Vietnam, Berlin, Plant, Quebec, Buddhism, Croatia, Massachusetts, Christianity, Bulgaria, World_War_II, Thailand, Facebook, Protein, Earth, Africa, Chile, Village, Species, Iraq, Colombia, Burma, Slovenia, Toronto, Moscow, Cuba, Mathematics, BBC, Montreal, Fungus, Peru, Chordate, Estonia
101-200:
Jesus, Jews, Nigeria, Lepidoptera, Ontario, Slavery, Ohio, Sydney, Illinois, Napoleon, Basketball, Melbourne, Maryland, Internet, Human, Tokyo, Jazz, Lebanon, Mumbai, Nepal, Istanbul, Bangladesh, Agriculture, Google, Asia, Seattle, Hawaii, Beijing, Warsaw, Iceland, Athens, Philosophy, Venezuela, Atlanta, Michigan, Jerusalem, English_language, Detroit, Cyprus, Guitar, Ethiopia, Vienna, NASA, Kenya, Mollusca, Morocco, Minnesota, Cricket, Association_football, Hinduism, Slovakia, Oxygen, Amsterdam, Bacteria, Algeria, Enzyme, Manhattan, Microsoft, Prague, Alaska, Edinburgh, Television, Belarus, Judaism, Milan, Kerala, Latvia, Vancouver, Mammal, Census, Tennis, DNA, Madrid, Economics, New_York_City, Houston, Oregon, New_Zealand, Baseball, Cancer, Copenhagen, Moon, Barcelona, Dublin, NATO, Manchester, Armenia, Wisconsin, Lithuania, Liverpool, Protestantism, Gene, Madagascar, Indiana, Ecuador, Muhammad, Gold, Sun, Law, Alabama
201-300:
Hangul, Renaissance, Nazism, Physics, Linux, Bible, Budapest, Water, Hydrogen, Albania, Malta, Baltimore, City, Science, Louisiana, Colorado, Birmingham, Soviet_Union, Antarctica, Stockholm, Jordan, World_War_I, Uruguay, Evolution, HIV/AIDS, Jamaica, Singing, Communism, Somalia, Glasgow, Education, Tanzania, Bolivia, Film, Arizona, Pittsburgh, Kentucky, Libya, Luxembourg, Missouri, Wikipedia, Connecticut, Tuberculosis, Ghana, Euro, Kolkata, Sociology, Alberta, Psychology, Twitter, Novel, Sanskrit, Oklahoma, Zimbabwe, Socialism, Shanghai, Kazakhstan, Aristotle, Anime, UNESCO, Dallas, Religion, Dubai, Dog, Ottawa, Mars, Yemen, Venice, Hamburg, Sicily, South_Africa, Greenland, Delhi, Copper, Asteroid, Biology, Quran, Fish, Los_Angeles, Rice, Munich, Seoul, Catholic_Church, CBS, Watt, Chennai, Miami, Cambodia, Archaeology, Actor, Tennessee, Belgrade, Tunisia, New_York, Atheism, Pope, Christmas, Cameroon, Genus, Vermont
301-400:
Computer, Caribbean, Brooklyn, European_Union, Democracy, Oslo, Utah, DVD, Iron, Bangkok, Florence, Ecology, Aluminium, History, Frog, Music, Moldova, Chemistry, Horse, Language, God, Sudan, Mongolia, Iowa, Uganda, Denver, Austria-Hungary, Lisbon, Automobile, Qatar, Jakarta, Naples, Nevada, Maize, Panama, Fascism, Maine, Kuwait, Arkansas, Cat, Malaria, Haiti, Medicine, Augustus, Star, Kiev, Dinosaur, Hindi, Beetle, Mississippi, Newspaper, San_Francisco, Lutheranism, Sugar, Amphibian, Moth, Brussels, Damascus, Muslim, Album, Cleveland, Piano, Bahrain, Midfielder, Reptile, Eminem, Nicaragua, Cairo, Hong_Kong, Plato, Korea, Germans, Culture, Maharashtra, IBM, South_Korea, Bristol, Petroleum, Homosexuality, NBC, Minneapolis, Macau, Guatemala, Angola, Monaco, Uzbekistan, Manitoba, Manila, Bavaria, Karnataka, United_Nations, Astronomy, Tree, River, Namibia, Belfast, Kansas, Spanish_language, Poetry, Geneva
401-500:
University, Americas, Frankfurt, Laos, Charlemagne, Electron, Al-Qaeda, Population, Queensland, Virus, Bangalore, Brisbane, Engineering, Blues, Wheat, Submarine, Hollywood, Barack_Obama, Calgary, Cornwall, Sri_Lanka, IPhone, Poverty, Cologne, Blog, Chess, Atom, Steel, Scandinavia, Cardiff, Snake, Shiva, Helsinki, Carbon, Rock_music, Globalization, Zinc, Suicide, Prussia, Mali, Catholicism, Roman_Empire, Fruit, Linguistics, Manga, Fiji, Middle_Ages, Eukaryote, Radio, Brain, Tehran, Canberra, Edmonton, Milk, Coal, Perth, Alps, Liberia, Stroke, Kosovo, Coffee, Anthropology, Cincinnati, Theology, Municipality, Lion, Pneumonia, Crusades, Hertz, Government, Catalonia, Montenegro, Capitalism, Milwaukee, Cattle, Honduras, Wyoming, North_America, Mauritius, French_language, Oman, Food, Electricity, Bucharest, Volleyball, Vikings, Christian, Auckland, Sheep, Lawyer, Liberalism, Telecommunication, Tourism, Ethanol, Elephant, Gujarat, Winnipeg, Kyrgyzstan, Gibraltar, Earthquake
501-600:
Volcano, Paraguay, Feminism, Turin, Sculpture, MTV, Lake, Senegal, Freemasonry, Painting, Butterfly, Beirut, Saskatchewan, Jupiter, Bhutan, Boxing, Advertising, Silver, Marxism, HIV, Adelaide, Siberia, Marseille, Czechoslovakia, Ottoman_Empire, Brunei, Nebraska, Karachi, Gastropoda, Golf, Urdu, Idaho, Constantinople, Forest, Wine, Mesopotamia, Theatre, Endemism, Baghdad, Oxford, Technology, Nitrogen, Leeds, Anatolia, Delaware, War, Palestine, Belize, Sony, Bollywood, Statistics, Tasmania, Schizophrenia, Johannesburg, Art, Terrorism, Suriname, Stuttgart, Mozambique, Pregnancy, Lead, Racism, Intel, Wii, Toyota, Potato, Vietnam_War, Temperature, Geology, American_Civil_War, Thessaloniki, Greeks, Opera, Biodiversity, Guam, Bermuda, Zambia, Photography, Beer, Extinction, Czech_Republic, Spider, Saudi_Arabia, Balkans, American_football, Rihanna, Barbados, Sport, Desert, Ultraviolet, Cambridge, Anarchism, Email, Baptism, Antisemitism, Java, Kent, Indianapolis, German_language, Politics
601-700:
Mecca, Drama, Jainism, Sufism, Moses, Metallica, Tibet, Sheffield, Ecosystem, Taliban, Metabolism, Conservatism, Batman, Algorithm, Crete, Cocaine, Alcohol, New_Jersey, Planet, Celts, Zagreb, Honolulu, Coca-Cola, Lyon, Mountain, Venus, Vertebrate, Abortion, Bat, Violin, Romanticism, Maldives, Sofia, Yorkshire, Superman, Honda, Nintendo, Havana, Meat, Anglicanism, Republic, Inflation, Guyana, Ammonia, Jay-Z, Geography, Fossil, Copyright, Neolithic, Sulfur, Sharia, Energy, Helicopter, Mineral, Guangzhou, Genetics, Blood, Ship, Obesity, Diamond, Cold_War, Smallpox, Osaka, Bishop, Yahoo!, Yugoslavia, Chad, Library, Physician, Bratislava, Tajikistan, Andalusia, Asphalt, Ethics, Red, Methodism, HBO, Lima, Professor, Town, Prostitution, Apple, Writer, Puerto_Rico, Blue, Tax, Taoism, Liver, CNN, Time, Sardinia, HTML, Myspace, Architecture, Hydroelectricity, Taipei, Potassium, William_Shakespeare, George_Washington, Pinyin
701-800:
Uranium, Riga, Hypertension, Ljubljana, Cotton, Bihar, Wiki, Wellington, Calcium, X-ray, ITunes, Soil, Elizabeth_II, Quakers, Macintosh, Mayor, Honey, Flower, Alcoholism, Satire, Country, Assam, Lancashire, Walmart, Soybean, Himalayas, Concrete, Asthma, Mining, Antwerp, Lahore, Baku, Gospel, Montevideo, Feudalism, Castle, Allmusic, WWE, Genoa, Police, Calvinism, Yoga, Primate, Alexandria, Saturn, Eritrea, Saint_Petersburg, Krishna, Homer, Lesbian, Barley, Dresden, Antibacterial, Logic, Baptists, Turkmenistan, Ant, Mitochondrion, Rape, Strasbourg, Leipzig, Judo, Kidney, Bali, Tiger, Nationalism, Mythology, Heart, Disease, Botswana, Seville, Dhaka, Salt, Insurance, Algae, Michael_Jackson, Malayalam, BMW, Unicode, Sodium, Tobacco, Satellite, Oak, Patent, Metro-Goldwyn-Mayer, Banana, Harvard_University, Bank, Rapping, IPad, PHP, Byzantine_Empire, Organism, Vilnius, Mosque, Santiago, Sparta, Marketing, Mahabharata, Slavs
801-900:
Synthesizer, Transylvania, Talmud, Book, Nokia, Malawi, French_Revolution, Magnesium, Glacier, Rajasthan, Danube, Constitution, Cher, Hewlett-Packard, Cheese, Tea, Crustacean, Liechtenstein, Dorset, Software, Agnosticism, Photosynthesis, Northern_Ireland, Anatomy, Flowering_plant, Nile, Guinea, Infrared, Oceania, Helium, Gothenburg, Rotterdam, Sarajevo, Wi-Fi, North_Korea, Ronald_Reagan, Immigration, Friends, Easter, Apollo, Glass, Goa, Sex, Queens, Cholera, Geometry, Plastic, Ocean, Muscle, Reggae, Microsoft_Windows, FIFA, Andorra, Russians, Tallinn, Autism, EMI, Gravitation, Smartphone, Shark, Pornography, Olympic_Games, Tram, Tornado, York, Xinjiang, Website, Vegetarianism, Influenza, Ancient_Rome, UEFA, Limestone, Database, Sea, Leaf, Zoroastrianism, Universe, Motorcycle, Politician, Museum, Chromosome, Trinity, Samoa, Torah, Hezbollah, Bologna, Bill_Clinton, Death, Rhine, Deforestation, Nickel, Romanization, Vagina, Abraham_Lincoln, Metal, Eucharist, Burundi, Southampton, Akbar, Thermodynamics
901-1000:
Bordeaux, Zeus, Dam, Paleontology, Baroque, Assyria, Passerine, Tomato, Light, Greek_language, Rodent, Habitat, Surrey, Biochemistry, Airport, Hamlet, Saxophone, Murder, Galaxy, Unemployment, Somerset, Basel, RNA, Continent, Benin, Adolescence, Nairobi, Erosion, Cicero, Niger, Aberdeen, Titanium, Brittany, Andes, Family, Rain, Mauritania, Comet, Arabic_language, North_Carolina, Bicycle, Photon, Pop_music, Korean_War, Chicken, Metre, Ganges, EBay, Devon, Wood, Orchidaceae, Kabul, Jersey, Radar, Hamas, Synthpop, Monocotyledon, Odisha, Area, Life, JavaScript, Communication, Refugee, Inflammation, Herodotus, Gabon, Confucianism, PH, Pluto, Kilogram, Aesthetics, Spider-Man, Michelangelo, Nottingham, Amtrak, United_States_dollar, Mercedes-Benz, Flute, Islamabad, Penis, LGBT, Vanuatu, Teacher, Island, Population_density, Ankara, Unix, White, Tin, Chlorine, Zionism, Military, Latitude, Laser, Firefox, IOS, Tuscany, Phosphorus, Comedy, Science_fiction, Research
Other notes
No ranking is perfect, and importance is subjective. Some people will want more asteroids or car models; others will want more football players or music albums. However, the above listing is relatively stable: if we adjust the relative weights of the various factors, the articles reshuffle a little, but the list looks basically the same.
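One simple way to check this kind of stability is to compare the top-N lists produced by two different sets of weights and measure how much they overlap. The sketch below assumes each article already has normalized factor values between 0 and 1; the titles, numbers, and weights are made up for illustration and are not our data.

```python
def top_n(features, weights, n):
    """Return the n highest-scoring titles under the given weights."""
    def score(title):
        return sum(weights[name] * features[title][name] for name in weights)
    return sorted(features, key=score, reverse=True)[:n]


def overlap(features, weights_a, weights_b, n):
    """Fraction of the top-n list shared by two weightings (1.0 = identical sets)."""
    a = set(top_n(features, weights_a, n))
    b = set(top_n(features, weights_b, n))
    return len(a & b) / n


# Toy, already-normalized factor values (hypothetical).
features = {
    "A": {"links": 0.9, "views": 0.8},
    "B": {"links": 0.8, "views": 0.9},
    "C": {"links": 0.2, "views": 0.3},
    "D": {"links": 0.1, "views": 0.1},
}

balanced = {"links": 0.5, "views": 0.5}
link_heavy = {"links": 0.7, "views": 0.3}

print(overlap(features, balanced, link_heavy, n=2))
```

An overlap close to 1.0 across a range of reasonable weightings is what "relatively stable" means here; note that this measure ignores the order of articles within the top N.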
Another side effect of ranking Wikipedia articles is that we can evaluate the signal-to-noise ratio. Very loosely speaking, we believe that approximately half a million Wikipedia articles are solid encyclopedic topics. The remaining 3.8 million tend to cover geographical locations (e.g., a town in Siberia), popular culture artifacts (music albums, old TV shows), lesser-known companies, politicians, sports figures, and other people. Often the lowest-ranking articles were wikispam and had already been removed from Wikipedia by dutiful Wikipedia editors.