A search algorithm is a step-by-step procedure that takes a problem as input and returns a solution, usually after evaluating a number of possible solutions. A search engine algorithm takes keywords as the input problem and returns relevant search results as the solution, matching those keywords against the pages stored in its database. How relevant each page is to those keywords is determined by search engine spiders that analyze web page content and keyword relevancy according to a formula that varies from one search engine to the next.
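As a rough illustration of keyword matching, here is a minimal Python sketch that ranks a tiny in-memory collection of pages by how many query keywords each one contains. The document set, tokenizer, and scoring are assumptions made only for this example and do not reflect how any real search engine is built.

    def tokenize(text):
        """Lowercase and split text into simple word tokens."""
        return text.lower().split()

    def search(query, documents):
        """Return page URLs ranked by how many query keywords they contain."""
        keywords = set(tokenize(query))
        scored = []
        for url, text in documents.items():
            matches = keywords & set(tokenize(text))
            if matches:
                scored.append((len(matches), url))
        # More keyword matches first -- a crude stand-in for "relevance".
        return [url for score, url in sorted(scored, reverse=True)]

    documents = {
        "example.com/a": "search engines match keywords to indexed pages",
        "example.com/b": "a page about cooking pasta at home",
    }
    print(search("search keywords", documents))  # ['example.com/a']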
Types of Information that Factor into Algorithms
Some services collect information on the queries individual users submit to search services, the pages they look at subsequently, and the time spent on each page. This information is used to return results pages that most users visit after initiating the query. For this technique to succeed, large amounts of data need to be collected for each query. Unfortunately, the potential set of queries to which this technique applies is small, and this method is open to spamming.
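A hypothetical sketch of this idea: aggregate the clicks recorded for a query and rank URLs by how many visitors stayed on the page for a meaningful amount of time. The log format, field names, and dwell-time threshold below are invented for illustration.

    from collections import Counter

    # (query, clicked_url, seconds_on_page) -- an invented log format
    click_log = [
        ("best laptop", "example.com/reviews", 120),
        ("best laptop", "example.com/reviews", 95),
        ("best laptop", "example.com/ads", 5),
    ]

    def rank_by_clicks(query, log, min_dwell_seconds=30):
        """Rank URLs for a query by 'satisfied' clicks, ignoring short visits."""
        counts = Counter(
            url for q, url, dwell in log
            if q == query and dwell >= min_dwell_seconds
        )
        return [url for url, _ in counts.most_common()]

    print(rank_by_clicks("best laptop", click_log))  # ['example.com/reviews']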
Another approach involves analyzing the links between pages on the web, on the assumption that pages on the same topic tend to link to each other and that authoritative pages tend to point to other authoritative pages. By analyzing how pages link to one another, an engine can determine both what a page is about and whether that page is considered relevant. Similarly, some search engine algorithms factor internal link navigation into the picture. Search engine spiders follow internal links to weigh how each page relates to the others and to gauge the ease of navigation. If a spider runs into a dead-end page with no way out, this can be factored into the algorithm as a penalty.
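The sketch below is a toy version of link analysis in the spirit of PageRank: pages linked to by well-scored pages accumulate a higher score, and a dead-end page with no outgoing links passes nothing on. The damping factor and iteration count are conventional illustrative choices, not any engine's actual settings.

    def link_scores(links, damping=0.85, iterations=20):
        """links maps each page to the list of pages it links to."""
        pages = list(links)
        n = len(pages)
        score = {p: 1.0 / n for p in pages}
        for _ in range(iterations):
            new = {p: (1 - damping) / n for p in pages}
            for page, outgoing in links.items():
                if not outgoing:
                    continue  # dead-end page: its score is not passed on
                share = damping * score[page] / len(outgoing)
                for target in outgoing:
                    new[target] += share
            score = new
        return score

    links = {
        "home": ["about", "products"],
        "about": ["home"],
        "products": ["home", "about"],
        "orphan": [],  # dead end with no outgoing links
    }
    print(link_scores(links))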
The earliest search engine databases were made up entirely of human-classified data. That approach is fairly archaic now, but some directories that feed search engine databases, such as the Open Directory, are still classified entirely by people. Some search engine data are still curated by humans, but only after the algorithmic spiders have collected the information.
One of the elements a search engine algorithm scans for is the frequency and location of keywords on a web page. Pages in which the keywords appear more frequently are typically considered more relevant; this is referred to as keyword density. Where the keywords are located on a page also figures into some search engine algorithms.
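A small sketch of keyword density and location weighting, assuming a page split into title, headings, and body fields; the field weights are made-up numbers used only to show the idea that where a keyword appears can count as much as how often it appears.

    def keyword_density(keyword, text):
        """Fraction of words in the text that match the keyword."""
        words = text.lower().split()
        if not words:
            return 0.0
        return words.count(keyword.lower()) / len(words)

    def location_weighted_score(keyword, page, weights=None):
        """Combine per-field densities, counting title and heading matches more heavily."""
        weights = weights or {"title": 3.0, "headings": 2.0, "body": 1.0}
        return sum(
            weights.get(field, 1.0) * keyword_density(keyword, text)
            for field, text in page.items()
        )

    page = {
        "title": "chocolate cake recipe",
        "headings": "ingredients directions",
        "body": "this chocolate cake recipe uses simple ingredients and cocoa",
    }
    print(location_weighted_score("chocolate", page))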
Like keywords and usage information, Meta tag information has been abused. Because of web spam, many search engines no longer factor in Meta tags at all, but some still do, and most still look at the Title and Description. Many other factors figure into a search engine algorithm's calculation of relevant results: some engines use information such as how long the website has been on the Internet, and others may weigh structural issues, errors encountered while crawling, and more.
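Since most engines still look at the Title and Description, here is a sketch of how a crawler might extract them using Python's standard-library HTMLParser; the sample HTML is invented for the example.

    from html.parser import HTMLParser

    class TitleAndDescription(HTMLParser):
        """Collects the <title> text and the meta description of a page."""
        def __init__(self):
            super().__init__()
            self.title = ""
            self.description = ""
            self._in_title = False

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "title":
                self._in_title = True
            elif tag == "meta" and attrs.get("name", "").lower() == "description":
                self.description = attrs.get("content", "")

        def handle_endtag(self, tag):
            if tag == "title":
                self._in_title = False

        def handle_data(self, data):
            if self._in_title:
                self.title += data

    html = ('<html><head><title>Pasta Recipes</title>'
            '<meta name="description" content="Easy weeknight pasta dishes.">'
            '</head></html>')
    parser = TitleAndDescription()
    parser.feed(html)
    print(parser.title, "|", parser.description)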
Why Are Search Engines So Different?
Search engine algorithms are highly secret, competitive things. No one knows exactly what each search engine weighs or how much importance it attaches to each factor in its formula, which leads to a great deal of assumption, speculation, and guesswork. Each search engine employs its own filters to remove spam, and each even has its own differing guidelines for determining what web spam is!
Search engines generally implement two or three major updates every year, and one simply has to follow the patent filings to know this. Even if you are not interested in a patent itself, it may give you a heads-up about changes that will follow in a search engine's algorithm.
Another reason that search engines are so diverse is the widespread use of filters to sort out web spam. Some search engines change their algorithms to include certain filters, while others leave the basic algorithm alone and implement filters on top of the basic calculations. In programming terms, a filter is essentially a higher-order function that takes a predicate and a list and returns those elements of the list for which the predicate is true. A simpler way to think of a search engine filter is as a water purifier: the water passes through a device made of porous material that removes unwanted impurities. A search engine filter likewise seeks to remove unwanted "impurities" from its results.
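In code, the filter idea looks like the sketch below: a predicate decides what counts as spam, and the higher-order filter() keeps only the results for which the predicate says "not spam". The spam test here is deliberately crude and purely illustrative, not a real engine's spam detection.

    def looks_like_spam(result):
        """Toy predicate: flag results whose title is mostly one repeated word."""
        words = result["title"].lower().split()
        if not words:
            return False
        most_common = max(set(words), key=words.count)
        return words.count(most_common) / len(words) > 0.5

    results = [
        {"url": "example.com/guide", "title": "A practical guide to home brewing"},
        {"url": "spam.example", "title": "cheap cheap cheap cheap deals cheap"},
    ]

    # filter() takes a predicate and an iterable and returns the elements
    # for which the predicate is true -- here, everything that is not spam.
    clean = list(filter(lambda r: not looks_like_spam(r), results))
    print([r["url"] for r in clean])  # ['example.com/guide']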
Source: Seochat