How will China’s search engine market change in the future?

Q. Now that many innovative search engines have entered China’s search engine market, will they have an impact on the established search engines? What will the future trend of China’s search engine market be?

As shown in the figure, the data mainly comes from Bing

A. No, it will not.

As shown in the picture above, “Industrial Quick Search” is only a meta search engine, and its data comes from Bing Search. A meta search engine accepts the user’s query, searches multiple search engines at the same time, and returns the combined results to the user. Setting aside whether its use of the API is legal or whether there are search limits, if Bing blocks the meta search engine’s IP address or domain name, or if its search volume exceeds the limit, it will not get any data. In fact, many meta search engines have died after being blocked by regular search engines. Even Aol and Yahoo, which hold a certain market share, actually get their data through cooperation with the Bing search engine and do not have their own crawlers.
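The fan-out-and-merge behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any particular meta search engine is built: the backends `fake_engine_a`, `fake_engine_b`, and `blocked_engine` and the `BackendBlocked` exception are hypothetical stand-ins, and no real search API is called.

```python
from typing import Callable, Dict, List

# A backend takes a query string and returns a ranked list of result URLs.
Backend = Callable[[str], List[str]]


class BackendBlocked(Exception):
    """Hypothetical error: the upstream engine refuses to serve us."""


def meta_search(query: str, backends: Dict[str, Backend]) -> List[str]:
    """Send the query to every backend and merge results, dropping duplicates."""
    merged: List[str] = []
    seen = set()
    for name, search in backends.items():
        try:
            results = search(query)
        except BackendBlocked:
            # A blocked or rate-limited backend contributes nothing.
            continue
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical stand-ins for upstream engines; no network access is involved.
def fake_engine_a(query: str) -> List[str]:
    return [f"https://a.example/{query}", f"https://shared.example/{query}"]


def fake_engine_b(query: str) -> List[str]:
    return [f"https://shared.example/{query}", f"https://b.example/{query}"]


def blocked_engine(query: str) -> List[str]:
    raise BackendBlocked("IP or domain blocked, or quota exceeded")
```

If the only backend behaves like `blocked_engine`, `meta_search` returns an empty list, which is the situation described above: once the upstream engine cuts it off, the meta search engine has nothing of its own to show.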

Aol · Data comes from Bing
Yahoo · Data comes from Bing
DuckDuckGo · Data comes from Bing, Yandex, Wolfram Alpha, DuckDuckBot
DuckDuckGo obtains data from more than 400 sources, mainly Bing. It now also has its own web crawler, DuckDuckBot.

As the captions above show, without Baidu and Bing these search engines would not exist at all, let alone affect or replace the established ones.

Keys to Search Engine Success

Obviously, a new search engine cannot compete with, let alone surpass, existing search engines just by relying on excellent program and algorithm design. For example, Bing may surpass Baidu in aesthetics, algorithms, technical strength, capital, and ease of use, yet few people in China even know it. Another example is 360 Search: its design and algorithms are poor, yet it competes directly with Baidu for market share and has taken a small number of users away from it.
PS: Baidu has the largest market share of any search engine in China, at 76.91%.

Why? Because 360 can rely on its 360 browser, which has a large market share, and on its widely installed 360 security software. It only needs to forcibly set users’ browser homepages to 360 Search.

Hence the saying: most people don’t actually know whether they are using 360, Baidu or Bing (if the logo is hidden). They only know to type in the search box and then click “Search”.

Difficulties Facing a New Search Engine

A new search engine needs a lot of money to invest in servers. However, even with ample funding, its bot (web crawler) will be blocked outright by a large number of websites because it is not well known, so it cannot crawl their data. And because many existing search engines are also content providers, their own websites block new search engines too.

Take Toutiao Search, a new search engine in the Chinese market. Baidu Q&A and Baidu’s other websites have all banned Toutiaospider (the Toutiao Search crawler) from crawling, and even Zhihu.com (China’s largest Q&A site, similar to Quora), in which Baidu has invested, bans Toutiaospider as well. In other words, Toutiao Search cannot actually get the content of many well-known websites under Baidu, and being blocked by large websites is fatal to a search engine.

Toutiao search

Explanation of Robots.txt:

User-agent: names the crawler’s user agent (UA). For example, Bytespider is the crawler UA of Toutiao Search.
Disallow: / prohibits the named web crawler from crawling all pages.
Allow: / allows the named web crawler to crawl all pages.

Robots.txt is a text file webmasters create to instruct web robots (typically search engine robots) how to crawl pages on their website. The robots.txt file is part of the robots exclusion protocol (REP), a group of web standards that regulate how robots crawl the web, access and index content, and serve that content up to users. The REP also includes directives like meta robots, as well as page-, subdirectory-, or site-wide instructions for how search engines should treat links (such as “follow” or “nofollow”).

In practice, robots.txt files indicate whether certain user agents (web-crawling software) can or cannot crawl parts of a website. These crawl instructions are specified by “disallowing” or “allowing” the behavior of certain (or all) user agents.
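To see how such rules play out, here is a small sketch using Python’s standard `urllib.robotparser` module. The `ROBOTS_TXT` text is a made-up sample that blocks Bytespider and allows everyone else; it is an assumption for illustration, not the actual file served by Baidu or Zhihu, and `example.com` and `can_crawl` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Bytespider everywhere, allow all other crawlers.
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/question/1"
    print("Bytespider:", can_crawl(ROBOTS_TXT, "Bytespider", page))
    print("SomeOtherBot:", can_crawl(ROBOTS_TXT, "SomeOtherBot", page))
```

Note that robots.txt only states the site’s wishes; a well-behaved crawler checks it before fetching, which is why a blanket `Disallow: /` effectively cuts that crawler off from the whole site.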

Baidu Q&A website blocked the crawler "Toutiao"
Zhihu.com blocks the web crawlers of the Toutiao search engine

What benefits can be obtained?

Obviously, a new search engine cannot surpass the existing search engines through advantages in design, aesthetics, or ease of use alone.

If you want to surpass the existing search engines, you may have to invest several times more than they have. However, the income the search engine earns afterwards is likely to be far less than the money invested.

So the saying still holds: “At present, only the emergence of technologies that can completely surpass search engines can defeat the existing search engines. A better search engine cannot defeat the existing search engines.”

Marugu Fuyeor