An Internet search engine, or web search engine, is software designed to search the World Wide Web systematically for specific information. The results are presented in sequence on what are referred to as search engine results pages (SERPs). The information returned by a web search is a combination of web pages, infographics, images, videos, articles, and research papers, among others. Search engine data is held in databases or in open directories. Search engines keep this information up to date by running algorithms over the data gathered by a web crawler.
Search engines are developed in a number of configurations that reflect the applications for which they are designed. Search engines such as Yahoo! and Google are able to crawl, or capture, large volumes of data and then deliver sub-second response times to the queries submitted daily across the globe. Desktop search engines, such as the Microsoft Vista search feature, are able to rapidly incorporate documents, email, and web pages, as well as provide an intuitive interface.
Open source search engines are another important class of systems, with different designs and goals than commercial search engines. Three systems are of particular interest: Lemur, Lucene, and Galago. Lucene is a Java-based search engine that has been used for a wide range of commercial applications. The information retrieval techniques it uses are relatively simple. Lemur is an open source toolkit that includes Indri, a C++-based search engine. It is used primarily by information retrieval researchers to compare advanced search methods. Galago is a Java-based search engine based on the Lemur and Indri projects.
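To give a sense of what a relatively simple information retrieval technique looks like, the following is a minimal sketch of an inverted index with term-count scoring. It is an illustration only and does not reproduce the API or internals of Lucene, Indri, or Galago; the function names (build_index, search) and the sample documents are assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to {doc_id: term count} (a basic inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Rank documents by the summed counts of the query terms they contain."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by doc_id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical sample documents, for illustration only.
DOCS = {
    "d1": "lucene is a java search engine",
    "d2": "indri is a search engine written in c++",
    "d3": "galago is a java search engine based on lemur and indri",
}
INDEX = build_index(DOCS)
```

Calling search(INDEX, "java search") ranks the two documents that contain both terms ahead of the one that contains only "search". Production engines typically replace the raw counts with weighted scoring, but the lookup structure is the same idea.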
Web crawlers are also known as robots, worms, wanderers, spiders, knowbots, and walkers. The first crawler, Wanderer, was developed in 1993. The Google search engine uses multiple machines for crawling, and its crawler consists of five primary components that run in different processes. A URL server process reads URLs out of a file and forwards them to multiple crawler processes. Each crawler process runs on a different machine and is single-threaded, using asynchronous I/O to fetch data from up to 300 web servers at a time.
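The single-threaded, asynchronous fetching described above can be sketched roughly as follows. This is not Google's crawler: it is a minimal illustration using Python's asyncio, in which fetch is a hypothetical stand-in that performs no real network access, the URL list is passed in directly rather than read from a file by a URL server, and the names crawl and run_crawl are assumptions made for this example. The limit of 300 concurrent fetches is taken from the text.

```python
import asyncio

MAX_CONNECTIONS = 300  # upper bound on concurrent fetches, per the text


async def fetch(url):
    """Stand-in for a real HTTP fetch (assumption: no network access here)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return f"<html>content of {url}</html>"


async def crawl(urls, max_connections=MAX_CONNECTIONS):
    """Fetch all URLs in one thread, with at most max_connections in flight."""
    semaphore = asyncio.Semaphore(max_connections)
    in_flight = 0
    peak = 0

    async def worker(url):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)  # record the highest concurrency seen
            try:
                return url, await fetch(url)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(u) for u in urls))
    return dict(results), peak


def run_crawl(urls, max_connections=MAX_CONNECTIONS):
    """Synchronous entry point, playing the role of one crawler process."""
    return asyncio.run(crawl(urls, max_connections))
```

A call such as run_crawl(["http://example.com/a", "http://example.com/b"]) returns the fetched pages keyed by URL, together with the peak number of simultaneous fetches. The semaphore is what keeps a single thread from opening more connections than the configured limit while it waits on slow servers.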
Google: According to Statista, Google dominated the search engine market, maintaining an 88.47% revenue share as of April 2019. The majority of Google’s revenue is generated through advertising. Google also offers services such as mail, enterprise products, productivity tools, and mobile devices, among others.
Bing: Microsoft’s search engine, Bing, ranks websites based on webpage content, the number and quality of the websites that link to a page, and the relevance of the website’s content to keywords.
Yahoo!: The Yahoo! search engine uses link analysis tools to determine page relevancy; however, content is also important in deciding search relevance.
AOL: AOL, commonly known as America Online, is an American web portal and online service provider, and a brand of Verizon Media.
Baidu: Baidu is considered the most convenient search engine for users in China, as Google has been blocked by the ‘Great Firewall of China’. Baidu accounts for over 75% of the Chinese search engine market.
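The link-based ranking mentioned in the Bing and Yahoo! entries above can be illustrated with a simplified PageRank-style iteration, in which a page scores higher when more pages, and higher-scoring pages, link to it. This is a generic textbook sketch and not the actual ranking algorithm of any engine listed here; the function name link_scores, the damping value, and the sample link graph are assumptions made for this example.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score pages by incoming links using a PageRank-style power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for p in pages:
                    new[p] += share
        scores = new
    return scores


# Hypothetical link graph, for illustration only.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
SCORES = link_scores(LINKS)
```

In this sample graph, page "c" receives links from three pages and ends up with the highest score, while "a" ranks second because its single incoming link comes from the highly scored "c".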
According to Statista, in 2017 approximately 46.8% of the global population accessed the Internet, and this figure is expected to increase to 53.7% by 2021.
According to Statista, in 2018 approximately 52.2% of website traffic worldwide was generated through mobile phones, up from 50.3% the previous year.