Safekipedia

Search engine

Adapted from Wikipedia

A search engine is a special kind of software system that helps people find information on the Web. When someone wants to look up something, they type a word or question, called a query, into a web browser or a mobile app. The search engine looks through many web pages and other information to give back a list of search results. These results usually include links and short descriptions to help the person find what they need.

A Google search result for the phrase "magic flute opera"

Behind the scenes, a search engine runs on many computers in different places around the world. This is called a distributed computing system. It answers quickly because it keeps an index, a kind of lookup list, which is updated all the time by special programs known as web crawlers. These crawlers visit many web servers to collect information.

Many search engines have existed since the Web began, but one called Google Search became very popular and is still used by most people today. Other search engines include Bing, Yandex, Yahoo!, DuckDuckGo, and Baidu. Because search engines are so widely used, many websites work to appear higher in search results, a practice called search engine marketing and search engine optimization.

History

Further information: Timeline of web search engines

Pre-1990s

In 1945, Vannevar Bush wrote about a system to help people find information. He called it a memex in his article "As We May Think" in The Atlantic Monthly. The memex was meant to make finding information easier as the amount of information grew. Bush imagined libraries with connected notes, much like the hyperlinks we use today.

Link analysis later became important for search engines through methods like Hyper Search and PageRank.

1990s: Birth of search engines

The first Internet search engines appeared before the Web made its debut in December 1990. WHOIS let people search for users in 1982, and the Knowbot Information Service began in 1989. The first search engine to look through files was Archie, which started on September 10, 1990.

Before September 1993, the World Wide Web was organized by hand. There was a list of webservers kept by Tim Berners-Lee at CERN. As more web servers appeared, this list could not keep up. On the NCSA site, new servers were listed under "What's New!".

The first tool to search the content of the Internet was Archie. It stood for "archive" without the "v". It was created by Alan Emtage, a student at McGill University in Montreal, Quebec, Canada. Archie downloaded lists of files from public FTP sites, creating a searchable database of file names.

The rise of Gopher in 1991 led to new search tools like Veronica and Jughead. Like Archie, they searched file names and titles in Gopher systems. Veronica allowed keyword searches of Gopher menu titles. Jughead got menu information from specific Gopher servers.

In the summer of 1993, no search engine existed for the web, but many catalogs were kept by hand. Oscar Nierstrasz at the University of Geneva created W3Catalog, the web's first simple search engine, released on September 2, 1993.

In June 1993, Matthew Gray at MIT created the World Wide Web Wanderer, an early web robot. It was used to measure the size of the web until late 1995. The web's second search engine, Aliweb, appeared in November 1993. It relied on website administrators to provide information about their sites.

JumpStation, created in December 1993 by Jonathon Fletcher, was the first tool to combine crawling, indexing, and searching. Because of limited resources, it only indexed titles and headings from web pages.

One of the first search engines to search all text was WebCrawler, released in 1994. It let users search for any word on any web page, which became the standard. Also in 1994, Lycos was launched and grew to be popular.

The first popular web search engine was Yahoo! Search. Started by Jerry Yang and David Filo in January 1994, it began as a Web directory called Yahoo! Directory. In 1995, a search function was added, making it a favorite way to find web pages.

Many search engines appeared after that, competing for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista. People could also browse directories instead of searching by keywords.

In 1996, Robin Li developed the RankDex algorithm for ranking search results. It was the first to use hyperlinks to judge website quality, before Google's similar PageRank in 1998. Li later used this technology for the Baidu search engine, launched in China in 2000.

In 1996, Netscape planned to feature one search engine but ended up making deals with five: Yahoo!, Magellan, Lycos, Infoseek, and Excite.

Google started selling search terms in 1998, changing the search engine business.

Many search engine companies grew quickly in the late 1990s but were affected by the dot-com bubble that ended in March 2000.

2000s–present: Post dot-com bubble

Around 2000, Google's search engine became very popular. It used an algorithm called PageRank, created by Sergey Brin and Larry Page, the founders of Google. This method ranks web pages based on how many other popular sites link to them.
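
The idea of ranking pages by their links can be sketched in a few lines of Python. This is only a simplified illustration, not Google's actual code. The function name pagerank, the damping value of 0.85, and the tiny pretend web are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages in a dict that maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    # Every page starts with an equal share of the total score.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Each page keeps a small base score (assumed damping of 0.85).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A made-up web of four pages, used only for illustration.
toy_web = {
    "home": ["news", "shop"],
    "news": ["home"],
    "shop": ["home"],
    "blog": ["home"],
}
scores = pagerank(toy_web)
best = max(scores, key=scores.get)
```

In this pretend web, three pages link to "home", so it ends up with the highest score, while "blog", which nobody links to, ends up with the lowest.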

Yahoo! at first used Inktomi's search technology. It bought Inktomi in 2002 and Overture in 2003. Yahoo! relied on Google's search engine until 2004, when it launched its own search engine built on the technology it had acquired.

Microsoft started MSN Search in 1998 using Inktomi's results. In 1999, it used Looksmart and AltaVista at times. In 2004, Microsoft began using its own technology with its web crawler called msnbot.

Microsoft launched its rebranded search engine, Bing, on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft agreed that Yahoo! Search would use Microsoft's Bing technology.

As of 2019, active search engine crawlers include those of Baidu, Bing, Brave, Google, DuckDuckGo, Gigablast, Mojeek, Sogou and Yandex.

Timeline (full list)
Year  Engine            Current status
1993  W3Catalog         Inactive
1993  ALIWEB            Inactive
1993  JumpStation       Inactive
1993  WWW Worm          Inactive
1994  WebCrawler        Active
1994  Go.com            Inactive, redirects to Disney
1994  Lycos             Active
1994  Infoseek          Inactive, redirects to Disney
1995  Yahoo! Search     Active, initially a search function for Yahoo! Directory
1995  Daum              Active
1995  Search.ch         Active
1995  Magellan          Inactive
1995  Excite            Active
1995  MetaCrawler       Active
1995  AltaVista         Inactive, acquired by Yahoo! in 2003, since 2013 redirects to Yahoo!
1995  SAPO              Active
1996  RankDex           Inactive, incorporated into Baidu in 2000
1996  Dogpile           Active
1996  HotBot            Inactive (used Inktomi search technology)
1996  Ask Jeeves        Inactive
1997  AOL NetFind       Active (rebranded AOL Search since 1999)
1997  goo.ne.jp         Active
1997  Northern Light    Inactive
1997  Yandex            Active
1998  Google            Active
1998  Ixquick           Active as Startpage.com
1998  MSN Search        Active as Bing
1998  empas             Inactive (merged with NATE)
1999  AlltheWeb         Inactive (URL redirected to Yahoo!)
1999  GenieKnows        Inactive, rebranded Yellowee (was redirecting to justlocalbusiness.com)
1999  Naver             Active
1999  Teoma             Inactive (redirect to Ask.com)
2000  Baidu             Active
2000  Exalead           Inactive
2000  Gigablast         Inactive
2001  Kartoo            Inactive
2003  Info.com          Active
2004  A9.com            Inactive
2004  Clusty            Active, Yippy, previously Clusty, now owns Togoda.com
2004  Mojeek            Active
2004  Sogou             Active
2005  SearchMe          Inactive
2005  KidzSearch        Active, Google Search
2006  Soso              Inactive, merged with Sogou
2006  Quaero            Inactive
2006  Search.com        Active
2006  ChaCha            Inactive
2006  Ask.com           Inactive
2006  Live Search       Active as Bing, rebranded MSN Search
2007  wikiseek          Inactive
2007  Sproose           Inactive
2007  Wikia Search      Inactive
2007  Blackle.com       Active, Google Search
2008  Powerset          Inactive (redirects to Bing)
2008  Picollator        Inactive
2008  Viewzi            Inactive
2008  LeapFish          Inactive
2008  Forestle          Inactive (redirects to Ecosia)
2008  DuckDuckGo        Active
2008  TinEye            Active
2009  Bing              Active, rebranded Live Search
2009  Yebol             Inactive
2009  Scout (Goby)      Active
2009  NATE              Active
2009  Ecosia            Active
2009  Startpage.com     Active, sister engine of Ixquick
2010  Blekko            Inactive, sold to IBM
2010  Cuil              Inactive
2010  Yandex (English)  Active
2010  Parsijoo          Active
2011  YaCy              Active, P2P
2012  Volunia           Inactive
2013  Qwant             Active
2014  Egerin            Active, Kurdish / Sorani
2014  Swisscows         Active
2014  Searx             Active
2015  Yooz              Inactive
2015  Cliqz             Inactive
2016  Kiddle            Active, Google Search
2017  Presearch         Active
2018  Kagi              Active
2020  Petal             Active
2021  Brave Search      Active
2021  You.com           Active
2022  Perplexity        Active

Approach

A search engine does three main things all the time:

  1. Web crawling
  2. Indexing
  3. Searching

Search engines collect information by moving from website to website and checking each one. They save important words and details from these pages in a big list. When you type a question into a search engine, it uses this list to find the best matches quickly.

When you search, you usually type just a few words. The search engine knows which websites have those words and shows them to you. It also lets you change your search to get better results. The goal is to show the most helpful pages first, and many search engines also show ads.
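
The "big list" described above is often called an inverted index. Here is a small Python sketch of how one might be built and searched. It is a simplified illustration under assumptions, not how any real search engine is written: the function names, the example addresses, and the rule that a page must contain every word of the query are all chosen just for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query, sorted by address."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        # Keep only pages that also contain this word (an assumed AND rule).
        matches &= index.get(word, set())
    return sorted(matches)


# Made-up pages, used only for illustration.
pages = {
    "example.org/opera": "The Magic Flute is an opera by Mozart",
    "example.org/flutes": "A flute is a woodwind instrument",
    "example.org/magic": "Magic tricks for beginners",
}
index = build_index(pages)
results = search(index, "magic flute")
```

The search never reads the pages again. It only looks up each word in the index, which is why the answer can come back quickly. Here, only "example.org/opera" contains both "magic" and "flute".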

Market share

As of January 2022, Google is the most used search engine in the world. Other popular search engines include Bing, Yandex, and Yahoo!. Many other search engines exist but are used by fewer people.

In Russia, Yandex is the leading search engine. In China, Baidu is the main search engine, and Google does not operate there. In Japan, Google is the most used, and Yahoo! Japan is also popular. In South Korea, Naver leads, but Google's use has grown. In Taiwan, Google is the most used search engine.

Search engine bias

Further information: Algorithmic bias

Search engines try to show the best and most popular websites when you search. But sometimes, they show information that is not fair or balanced. This can happen for different reasons.

For example, companies that pay to advertise might appear more often in the search results. Also, some countries have laws that make certain information illegal. Because of this, search engines might not show those websites in those places.

Sometimes, the way search engines are set up can leave out less popular ideas or focus more on websites from certain countries, like the United States. People have also tried to change search results for their own purposes, such as to influence what others think about important topics. Researchers have studied how search engines affect our understanding of subjects like terrorism in Ireland, climate change denial, and conspiracy theories.

Google Bombing is one way people have tried to change what shows up in search results for political, social, or business reasons.

Customized results and filter bubbles

Some people worry that search engines like Google and Bing change what you see based on what you do online. This can mean you mostly see things that match what you already think. Eli Pariser, who named this effect the "filter bubble", wrote about it in 2011.

In response, some search engines, such as DuckDuckGo, try not to change what you see based on your past searches. Some researchers say there isn’t strong proof that filter bubbles are a big problem. They found that most people still see many different ideas when they search online.

Religious search engines

Because the Internet has grown a lot in the Arab and Muslim world, some people made special search engines for them. These search engines help users find information that follows Islamic rules, called "halal", and avoid information that does not, called "haram". Examples include ImHalal, which started in 2011, and Halalgoogling, which began in 2013. These search engines use filters to keep out unwanted content.

Other religious search engines exist too, like Jewogle for Jewish users and SeekFind.org for Christian users. These also filter out websites that go against their beliefs.

Search engine submission

When someone creates a website and wants people to find it easily, they can tell a search engine about it. This is called submitting a website. But usually, you don't need to do this because search engines have special programs called web crawlers that find websites on their own.

You can tell a search engine about just one page, like the main page, or you can tell them about your whole website using something called a sitemap. There are a couple of reasons to tell a search engine about your website: if it's brand new and not found yet, or if you've changed it a lot and want it to show up faster in search results. Some tools can tell many search engines at once and also add links to your site, but this might not always be the best idea because it can affect how well your site shows up in searches.
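
A sitemap is usually a small XML file that lists the addresses of a site's pages. The Python sketch below shows one way such a file could be produced. It is a minimal example under assumptions: the function name build_sitemap and the example.com addresses are made up, and a real sitemap may carry extra details that are left out here.

```python
import xml.etree.ElementTree as ET

# Namespace used by the public sitemap format.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return sitemap XML text with one <url><loc> entry per page address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


# Made-up addresses, used only for illustration.
sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/about",
])
```

The resulting text can be saved as a file on the website, where a search engine's crawler can read it to learn which pages exist.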

Comparison to social bookmarking

See also: Social media optimization

Social bookmarking is different from search engines. In social bookmarking, people—not computers—pick tags to sort websites. This helps people understand what a website is about. People can also save websites that search engines might miss.

Social bookmarking can show how many people like a website by counting how many have saved it. This can be useful. But, like search engines, social bookmarking can be tricked, so it needs ways to stay safe.
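
Counting saves can be sketched in Python as below. This is an illustration under assumptions, not how any real bookmarking site works: the function name and the sample data are made up, and counting each user only once per website is just one simple guard against someone saving the same page over and over.

```python
from collections import defaultdict


def count_savers(bookmarks):
    """Count how many different users saved each address."""
    savers = defaultdict(set)
    for user, url in bookmarks:
        # A set ignores repeat saves of the same page by the same user.
        savers[url].add(user)
    return {url: len(users) for url, users in savers.items()}


# Made-up bookmarks as (user, address) pairs, used only for illustration.
bookmarks = [
    ("ana", "opera.example"),
    ("ben", "opera.example"),
    ("spammer", "ads.example"),
    ("spammer", "ads.example"),
    ("spammer", "ads.example"),
]
counts = count_savers(bookmarks)
```

Here "opera.example" counts two savers, while "ads.example" counts only one, even though it was saved three times.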

Technology

Archie

The first Internet search engine was Archie, which searched files rather than web pages. It was created in 1990 by Alan Emtage, a student at McGill University in Montreal.

Archie worked by collecting lists of files stored on File Transfer Protocol sites. FTP is a way for computers to share files online. Users could visit FTP sites to download files. Archie helped people find files by putting them in a list, so users did not need to know where to look.
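
A searchable list of file names like Archie's can be imitated with a short Python sketch. This is not Archie's real code. The function names, the site names, and the file paths are all made up for illustration, and the file lists are typed in by hand instead of being downloaded from real FTP sites.

```python
def build_file_database(listings):
    """Flatten per-site file listings into (file name, site, path) records."""
    database = []
    for site, paths in listings.items():
        for path in paths:
            # The file name is the part after the last slash.
            file_name = path.rsplit("/", 1)[-1]
            database.append((file_name, site, path))
    return database


def find_files(database, text):
    """Return (site, path) pairs whose file name contains the text, ignoring case."""
    needle = text.lower()
    return [(site, path) for file_name, site, path in database
            if needle in file_name.lower()]


# Made-up file listings, used only for illustration.
listings = {
    "ftp.example.edu": ["/pub/games/chess.tar", "/pub/docs/readme.txt"],
    "ftp.example.org": ["/software/chess-clock.zip"],
}
database = build_file_database(listings)
hits = find_files(database, "chess")
```

A search for "chess" finds a matching file on each site, so the user learns where to look without visiting every site first. Only file names are searched, not folder names or file contents.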

Veronica

In 1993, Veronica was made by the University of Nevada. It searched files stored on Gopher, another way to share information online, just like Archie searched FTP files. A similar tool called Jughead also appeared around that time.

The Lone Wanderer

The World Wide Web Wanderer was created in 1993 by Matthew Gray. It was the first robot to move around the web and count how many websites there were. It also wrote down the addresses of websites, creating the first web database called the Wandex.

Excite

Excite started as a project by six students at Stanford University in 1993. They wanted to make searching the Internet easier by studying how words were used together. Excite became a popular search engine in 1995.

Yahoo!

In 1994, David Filo and Jerry Yang, two students at Stanford University, created Yahoo!. It began as a list of web pages they liked. As more people used it, they sorted the pages into categories, making it easy to search. Yahoo! was not a typical search engine because it was first made by hand, but it later added search features.

Lycos

Lycos was made in 1994 by Michael Mauldin at Carnegie Mellon University.

Types of web search engines

Web search engines help people find information online. They do this in three main steps:

  1. Look for content that matches the words a person searches for.
  2. Keep a list of where that content can be found.
  3. Let users search through that list.

There are three main types of search engines. Some use robots, called crawlers, to go around the web and collect information. Others depend on people to add information. Some use both ways.

Crawler-based search engines send out robots to visit websites, read their content, and follow links to other sites. These robots bring back information to a central place where it is organized and stored. They visit websites often to see what’s new.
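
The way a crawler follows links can be sketched in Python. This is a simplified illustration under assumptions: the function name crawl and the pretend web are made up, and pages are looked up in a dictionary instead of being downloaded, so the example runs without an Internet connection.

```python
from collections import deque


def crawl(start_url, fetch_links, max_pages=100):
    """Visit pages breadth-first from start_url and return them in visit order."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            # Remember each address so that no page is queued twice.
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# A pretend web: each address maps to the links found on that page.
toy_web = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}
order = crawl("a.example", lambda url: toy_web.get(url, []))
```

Starting from "a.example", the crawler reaches all four pages. The set of seen addresses stops it from going around in circles when pages link back to each other.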

Human-powered search engines rely on people to add information.

When you search using a search engine, you are searching through a list, not the live web. This is why you might sometimes find links that no longer work — the list has not been updated.

Different search engines can give different results because they use different ways to decide which results are most helpful. They look at how often certain words appear on a webpage and how other pages link to it.
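
These two signals can be combined in a small Python sketch. The scoring rule here is invented for illustration and is not any real search engine's formula: a page gets one point each time a query word appears on it, plus a bonus (the assumed link_weight) for every other page that links to it. All names and pages are made up.

```python
import re


def score_pages(pages, links, query, link_weight=1.0):
    """Rank matching pages by query-word count plus a bonus per inbound link."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    # Count how many other pages link to each page.
    inbound = {url: 0 for url in pages}
    for source, targets in links.items():
        for target in set(targets):
            if target in inbound and target != source:
                inbound[target] += 1
    scores = {}
    for url, text in pages.items():
        page_words = re.findall(r"[a-z0-9]+", text.lower())
        matches = sum(page_words.count(word) for word in words)
        if matches:
            # Pages without any query word are left out entirely.
            scores[url] = matches + link_weight * inbound[url]
    return sorted(scores, key=lambda url: (-scores[url], url))


# Made-up pages and links, used only for illustration.
pages = {
    "opera.example": "the magic flute opera",
    "music.example": "flute flute lessons",
    "toys.example": "magic set",
    "news.example": "weather report",
}
links = {
    "music.example": ["opera.example"],
    "toys.example": ["opera.example"],
    "news.example": ["opera.example"],
    "opera.example": ["music.example"],
}
ranking = score_pages(pages, links, "magic flute")
```

Changing link_weight changes how much links matter compared with words, which is one reason two search engines can order the same pages differently.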

Modern search engines are very complex and use many computers to handle the huge amount of information on the web. Some search engines, like Google Scholar, focus on finding scientific research. Researchers are working to make search engines better at understanding the meaning behind words in articles.

Type                Example                   Description
Conventional        library catalog           Search by keyword, title, author, etc.
Text-based          Google, Bing, Yahoo!      Search by keywords. Limited search using queries in natural language.
Voice-based         Google, Bing, Yahoo!      Search by keywords. Limited search using queries in natural language.
Multimedia search   QBIC, WebSeek, SaFe       Search by visual appearance (shapes, colors, ...)
Q/A                 Stack Exchange, NSIR      Search in (restricted) natural language
Clustering Systems  Vivisimo, Clusty, Togoda
Research Systems    Lemur, Nutch

This article is a child-friendly adaptation of the Wikipedia article on Search engine, available under CC BY-SA 4.0.