Find out exactly how a search engine works so you can improve your rankings and drive more traffic to the page. Read on to learn more.
Contents
Interested in a specific aspect of how search engines work? Use the links below to skip to a specific section within the article. If you want to know specifically about keywords, check out this article on how search engines use keywords.
How a search engine like Google finds content
Indexing
Ranking algorithms
Understanding the Google algorithm
-Relevance
-Authority
-Trust
-Usability
Results type and personalisation
How a search engine like Google finds content
In this video, Matt Cutts from Google explains the basics of how Google works. We're going to go into a bit more detail than this video does, but it's a great primer to the content.
As mentioned in the video, Google crawls the web using a bit of code called a 'spider'. This is a small program that follows links from one page to the next; each page it lands on is copied and passed on to the servers. The web (hence spider) is huge, and if Google were to keep a record of all the content it found, it would be unmanageable. This is why Google only records the page code and will dump pages it doesn't think are useful (duplicates, low value, etc).
Spiders work in a very specific way, hopping from link to link discovering new pages. This is why if your content is not linked to it won't get indexed. When a new domain is encountered the spider will first look for this page:
domain.com/robots.txt
Any messages you have for the spider, such as what content you want to be indexed or where to find your sitemap, can be left on this page. The spider should then follow these instructions. However, it doesn't have to. Google's spiders are generally well behaved though and will respect the commands left here.
You can find out more about how robots.txt works here, where we cover some of the more technical aspects of SEO.
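To make the robots.txt idea concrete, here is a minimal sketch of how a well-behaved spider might read those instructions, using Python's standard `urllib.robotparser` module. The file contents, the `ExampleBot` user agent, and the URLs are made up for illustration; this is not how Google's own crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for an example site (an assumption, not a real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://domain.com/sitemap.xml
"""

def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# A polite spider checks each URL against the rules before fetching it.
print(parser.can_fetch("ExampleBot", "https://domain.com/recipes/chocolate-cake"))
print(parser.can_fetch("ExampleBot", "https://domain.com/private/drafts"))

# The sitemap location left for the spider (site_maps() needs Python 3.8+).
print(parser.site_maps())
```

Nothing forces a spider to run this check, which is why the instructions are a request rather than a guarantee.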
The spider itself is a small, simple program. There are lots of open source versions which you can download and let loose on the web yourself for free. As vital as it is to Google, finding the content is not the clever bit. That comes next.
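As a rough illustration of how simple the link-following part can be, the sketch below crawls a tiny made-up web held in a dictionary rather than fetching real pages. The page names and the `crawl` function are hypothetical. It also shows why a page that nothing links to is never discovered.

```python
from collections import deque

# Hypothetical miniature web: each page maps to the pages it links to.
TOY_WEB = {
    "home": ["recipes", "about"],
    "recipes": ["chocolate-cake", "home"],
    "about": [],
    "chocolate-cake": ["recipes"],
    "orphan-page": ["home"],  # links out, but no page links to it
}

def crawl(web, start):
    """Follow links breadth-first from `start`; return pages in discovery order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in web.get(page, []):
            if link not in seen:  # avoid visiting the same page twice
                seen.add(link)
                queue.append(link)
    return order

discovered = crawl(TOY_WEB, "home")
print(discovered)
print("orphan-page" in discovered)
```

Starting from `home`, the spider reaches every linked page but never finds `orphan-page`, because no link points to it.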
Indexing
When you have a large amount of content you need a way to shortcut to that content. Google can't just have one big database containing all the pages, which they sort through every time a query is entered. It would be way too slow. Instead, they create an index which essentially shortcuts this process. Search engines use technology such as Hadoop to manage and query large amounts of data very quickly. Searching the index is far quicker than searching the entire database each time.
Common words such as 'and', 'the', 'if' are not stored. These are known as stop words. They don’t generally add to the search engine's interpretation of the content (although there are exceptions: “To be or not to be” is made up of stop words) so they are removed to save space. It might be a very small amount of space per page, but when dealing with billions of pages it becomes an important consideration. This kind of thinking is worth bearing in mind when trying to understand Google and the decisions it makes. A small per page change can be very different at scale.
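The indexing idea described above can be sketched as a small inverted index: a lookup from each word to the pages that contain it, with stop words left out. The pages, the stop word list, and the function names here are assumptions made for the example, and a real search index is far more elaborate.

```python
import re
from collections import defaultdict

# Illustrative stop word list; not the list any real search engine uses.
STOP_WORDS = {"and", "the", "if", "a", "of", "to", "for"}

# Hypothetical pages standing in for crawled content.
PAGES = {
    "page1": "The best chocolate cake recipes",
    "page2": "Chocolate and vanilla icing for the cake",
    "page3": "A guide to sourdough bread",
}

def tokenize(text):
    """Lower-case the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def build_index(pages):
    """Map each remaining word to the set of pages that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the pages that contain every non-stop word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
print(sorted(search(index, "chocolate cake")))
print("the" in index)
```

A query is answered by looking up a few words rather than scanning every page. The trade-off is also visible: a query made up entirely of stop words returns nothing in this sketch.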
Ranking algorithms
The content has now been indexed, so Google has taken a copy of it and placed a shortcut to the page in the index. Great, it can now be found and displayed when it matches a relevant search query. Each search you make in Google will likely have thousands of results, so now Google needs to decide what order it's going to display the results in. This is really at the heart of SEO: adjusting factors to manipulate the order of results.
Google decides which result goes where through the algorithm. An algorithm is a generic term which means a process or rule-set that's followed in order to solve a problem. In reference to Google, this is the set of weighted metrics which determines the order in which they rank the pages.
Understanding the Google algorithm
The Google algorithm is not the mystery it once was and the individual factors and metrics which it is made up of are fairly well documented. We know what all the major on-page and off-page metrics are. The tricky bit is in understanding the weighting or correlation between them.
If you search for 'chocolate cake recipes', the algorithm will weight the pages against that search term.
Let's take a simplified look at two metrics and how they might influence each other.
Metric 1 is the URL. The keywords might appear in the URL, such as: www.recipes.com/chocolate-cake
Google can see the keywords 'chocolate cake' and 'recipes' in the URL so it can apply a weighting accordingly.
Now on to Metric 2, the backlinks for the page. Lots of these might have the keywords 'chocolate cake' and 'recipes' in them. However, Google would then down-weight this metric because if the keywords appear in the URL you would expect them to appear in the backlinks, relevant or not. Conversely, Google might choose to apply more weight to Metric 2 if the keywords didn't appear anywhere in the URL.
All the different factors Google looks at affect each other. Each one may be worth more or less (in the weighting) and the relationship between them is constantly shifting. Google issues hundreds of updates every year, constantly tweaking this. It is usually this relationship and weighting that changes, rather than the metrics themselves. When the metrics themselves do change, it is usually in a more major update, such as Penguin or Panda.
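The interplay between the two metrics can be sketched in a few lines. Everything here is invented for illustration: the weights, the `discount` value, and the function names are assumptions, and the real weighting is not public. The point is only that the weight given to backlink anchor text drops as the URL itself matches more of the query.

```python
def keyword_fraction(keywords, text):
    """Fraction of the query keywords that appear somewhere in the text."""
    if not keywords:
        return 0.0
    text = text.lower()
    return sum(1 for word in keywords if word in text) / len(keywords)

def anchor_weight_for(url_match, base_weight=2.0, discount=0.5):
    """Weight for Metric 2, reduced when the URL already contains the keywords."""
    return base_weight * (1.0 - discount * url_match)

def score_page(query, url, anchor_texts, url_weight=1.0):
    """Combine Metric 1 (URL) and Metric 2 (backlink anchor text) into one score."""
    keywords = query.lower().split()
    url_match = keyword_fraction(keywords, url)
    if anchor_texts:
        anchor_match = sum(
            keyword_fraction(keywords, anchor) for anchor in anchor_texts
        ) / len(anchor_texts)
    else:
        anchor_match = 0.0
    return url_weight * url_match + anchor_weight_for(url_match) * anchor_match

QUERY = "chocolate cake recipes"
ANCHORS = ["chocolate cake recipes", "great chocolate cake"]  # made-up backlinks

# Same backlinks, different URLs: the anchor text counts for less on the first page.
print(anchor_weight_for(1.0), anchor_weight_for(0.0))
print(score_page(QUERY, "www.recipes.com/chocolate-cake", ANCHORS))
print(score_page(QUERY, "www.example.com/page-42", ANCHORS))
```

Changing `discount` or the base weights shifts the relationship between the metrics without changing the metrics themselves, which is the kind of tweak the paragraph above describes.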
The different metrics can be broken down into four key sections:
Relevance
How relevant is the content to the query? The indexer is the first test of this, determining if a page should appear in the results at all. However, this is taken a step further in order to rank the results. It makes sense that when searching for something, you want to see the most relevant results possible.
Relevance is determined by a mix of on-page and off-page factors. Both of these focus on the placement of keywords, such as in page titles and anchor text. Some metrics are a combination of these. For example, if the domain as a whole is seen to be relevant to the search term, this is going to boost the relevancy score of the individual page being scored. If you want to find out more about this I recommend reading my article 'How search engines use keywords'.
Authority
Authority has its roots in PageRank, invented by Larry Page (hence the name). It’s the backbone of how Google ranks content. Understanding PageRank is part of the key to understanding how Google works, but it’s worth remembering that there are hundreds of additional factors which can also affect ranking, and PageRank is less important than it was in the past.
PageRank is often explained in terms of votes. Each link to a page is a vote; the more votes a page has, the better it should rank. If a page with a lot of votes links to another page, then some of that voting power is also passed on. So even if a page only has one link, if that link is from a page which has a lot of votes, it may still rank well, and pages it links to will also benefit from that. The value passed from page to page via links is known as link juice or page juice.
Relevance is also important in the context of authority. A link from a relevant site with relevant anchor text may pass on more weight than a link which has neither; Google is more likely to disregard the latter in the context of that search result.
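The voting idea can be sketched with the textbook iterative form of PageRank. This is a simplified version under stated assumptions: the link graph is made up, the damping value of 0.85 is the commonly quoted one, and pages with no outgoing links share their score evenly across all pages. It is not Google's production ranking.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively pass each page's score along its outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its voting power equally between its links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its score over every page.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: three pages vote for "popular", one votes for "obscure".
LINKS = {
    "fan1": ["popular"],
    "fan2": ["popular"],
    "fan3": ["popular"],
    "fan4": ["obscure"],
    "popular": ["recommended"],
    "recommended": [],
    "obscure": [],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

Both `recommended` and `obscure` have exactly one inbound link, but `recommended` ends up with the higher score because its single link comes from a page that itself has many votes.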
Trust
This is an anti-spam algorithm, focused on making it harder to artificially manipulate the search results. Google has a love-hate relationship with SEO and the trust mechanism is part of it. On the one hand, lots of SEO is about creating great content and user experience. On the other, it's also about trying to artificially manipulate what Google has determined as the natural order of the results.
Trust metrics are very hard to manipulate and they give Google greater confidence in the other metrics. Things like the age of the content or of the domain are trust metrics. If you have lots of links from 'bad neighbourhoods' (think red light district), these links are not only going to be worthless but will also make Google think twice about ranking your site for that 'chocolate cake recipe' search. In the same way, if the page or domain links out to bad neighbourhoods, it's going to damage those trust metrics.
Google is actually a domain registrar, meaning they can see all the whois data for different domains. This allows them to incorporate information, such as how often a domain has changed hands or how long until the registration expires, into those trust metrics. These are much more difficult to manipulate.
Trust is also determined by the type of domain or page and what type links to you. With the opposite effect to a bad neighbourhood, academic sites such as .edu domains carry high trust. Other domain types may also have a high trust score, making links from them more valuable.
Usability
Google wants the content it displays in its search results to be attractive to humans as well as search engine robots. There is a set of metrics which is dedicated just to these factors. Having great content but then, for instance, covering it in ads is not going to make for a great user experience. This is why Google will down-weight a page where the ad placement is overly prominent.
Page speed is another important factor; pages that load too slowly are an annoyance to searchers, causing people to click back to the search results and pick another page. Google wants people to keep using Google and so it's in their interest that the results they show load quickly. They measure page speed from the HTML but may also use Chrome user data.
Results type and personalisation
If you're searching on a mobile phone, you will see a different set of results than if you are searching on a desktop computer. The actual results returned from the indexer (so at a low level) will be different. It's not just device type which affects the results you see, though. Google may choose to show results in an entirely different format depending on the search terms you use.
Localised searches are weighted differently and show in a different results page format to, for instance, product searches. You also have mixed media searches where Google may return results including videos and images. Some searches have dedicated results pages for a very narrow set of terms. These are commonly related to current events such as sports games or elections.
Another factor is personalisation. What you have previously searched for will influence the results that Google returns. There is a degree of machine learning at play here. So where someone searches for one type of result consistently Google will assume that future similar searches will be of the same nature. This is especially prominent for ambiguous searches, where one word has multiple meanings.
The rest of the Keyword Basics Series
Keyword Basics Part 2: Finding keywords
Keyword Basics Part 3: Understanding a keyword's structure
Keyword Basics Part 4: Targeting your primary and secondary keywords
Keyword Basics Part 5: How to narrow down your keyword list
Keyword Basics Part 6: Keyword mapping
Keyword Basics Part 7: Using keyword modifiers
Keyword Basics Part 8: Building keyword rich inbound links
FAQs
What type of algorithm does Google use? ›
PageRank (PR) is an algorithm used by Google Search to rank web pages in their search engine results.
Does Google have an algorithm? ›Google's algorithms are a complex system used to retrieve data from its search index and instantly deliver the best possible results for a query. The search engine uses a combination of algorithms and numerous ranking factors to deliver webpages ranked by relevance on its search engine results pages (SERPs).
How many algorithms does Google use? ›You might already know that Google uses over 200 ranking factors in their algorithm… But what are they, exactly?
Which coding is used in Google? ›Go is an open source, strongly typed, compiled language written to build concurrent and scalable software. The language was invented at Google by Rob Pike, Ken Thompson, and Robert Griesemer.
What is Google's latest algorithm? ›Being an extension of Google's product review algorithm first launched in March 2021, this algorithm update is expected to reward sites that publish product reviews based on in-depth research rather than the ones that just rephrase the information that appears on the manufacturer's website.
What are the 4 types of algorithm? ›Introduction To Types of Algorithms
Brute Force algorithm. Greedy algorithm. Recursive algorithm. Backtracking algorithm.
Is the Google algorithm an AI? ›But Google's "algorithm" isn't just one thing. It's a complex web of AI-powered algorithms that determine what search results appear and how they appear. Nobody on the outside has a complete understanding of how these algorithms work, but often try to learn elements of these algorithms in order to do good SEO.
How do I get into Google algorithm? ›- Optimize for Mobile. ...
- Audit Your Inbound Links. ...
- Boost User Engagement. ...
- Decrease Site Load Time. ...
- Avoid Duplicate Content. ...
- Create Informative Content. ...
- Avoid Keyword Stuffing. ...
- Don't Over-optimize.
These factors are: Keywords in Meta Titles, User Engagement, Trustworthiness, Mobile-Friendliness, and Page Speed.
Is the Google algorithm secret? ›
And though Google provides SEO insiders with frequent updates, the company's Search algorithms are a black box (a trade secret that it doesn't want to give to competitors or to spammers who will use it to manipulate the product), which means that knowing what kind of information Google will privilege takes a lot of ...
How does Google know everything? ›It's not just a theory, it's reality. The tech giant tracks your every move online and uses this knowledge to provide you with targeted ads. If you use Gmail, Google has even more of your personal data.
How does Google decide what comes up first? ›Google's Algorithm Ranks Pages Based On:
Relevance. Webpage authority. Content quality. Number of backlinks.
For example, Google has 2 billion lines of code, MacOS has 85 million lines of code, and Facebook has 60 million lines of code.
Is Google written in C++? ›Google search was primarily written in Java and Python. However, they have made some changes and now Google is a mix of those and also C and C++. They also use their own languages and frameworks. Google Chrome browser is also written in C++, Assembly and Python.
Is Google written in Python? ›Python is one of Google's official languages, alongside Java and C++. Python has significantly supported Google, and in turn Google actively promotes and supports the language. Python also runs in many of Google's internal systems and appears in many Google APIs.
How long does it take for Google algorithm to work? ›Most SEO experts agree that it takes 2 – 4 months to see the first results from SEO. However, factors like the website's quality, age, and authority can affect this period. In general, 94.6% of experts surveyed say that websites with good links and content can expect SEO results within 6 months.
How many times does Google change algorithm? ›Most experts estimate that Google changes its search algorithm around 500 to 600 times each year. That's somewhere between once and twice each day. While most of these changes don't significantly change the SEO landscape, some updates are significant and may change the way we go about writing for SEO.
What are the 7 algorithms? ›...
7 Algorithms Every Programmer Should Know
- Dijkstra's Algorithm. ...
- Merge Sort. ...
- Quicksort. ...
- Depth First Search. ...
- Breadth-First Search. ...
- Binary Search. ...
- Minimum Spanning Tree Algorithms.
- Tying Your Shoes.
- Following a Recipe.
- Classifying Objects.
- Bedtime Routines.
- Finding a Library Book in the Library.
- Driving to or from Somewhere.
- Deciding What to Eat.
What are the top 10 algorithms? ›
- 1950: Krylov Subspace Method.
- 1951: The Decompositional Approach to Matrix Computations.
- 1957: The Fortran Optimizing Compiler.
- 1959: QR Algorithm.
- 1962: Quicksort.
- 1965: Fast Fourier Transform.
- 1977: Integer Relation Detection.
- 1987: Fast Multipole Method.
The Merge Sort algorithm is one of the most important algorithms we have today. It is a comparison-based sorting algorithm that uses the divide-and-conquer approach to sort in O(n log n) time, where simpler methods take O(n^2). It was invented by the mathematician John von Neumann in 1945.
Is Google's algorithm public? ›Google's Response
Google has clearly stated in the past that it won't reveal its algorithm for two primary reasons: The algorithm is a business secret.
It's thanks to Persian mathematician Muhammad al-Khwarizmi who was born way back in around AD780.
What language is Google's AI written in? ›TensorFlow. TensorFlow is a Python library developed and open-sourced by Google to be used for AI programs. The library is used to write AI programs that utilize machine learning. It also has support for implementing neural networks and makes up most of Google's production AI services.
What machine learning algorithm does Google use? ›RankBrain is basically a deep neural network that is helpful in providing the required search results. It is one of the factors in the Google Search algorithm that determines which search pages are displayed.
How does Google work step by step? ›- Step 1 Crawling. Before anyone types anything into the search bar, Google first finds out what pages exist on the web. ...
- Step 2 Indexing. ...
- Step 3 Ranking.
As an Internet marketing strategy, SEO considers how search engines work, the computer-programmed algorithms that dictate search engine behavior, what people search for, the actual search terms or keywords typed into search engines, and which search engines are preferred by their targeted audience.
What is Google Base salary? ›Salaries at Google, Inc. range from an average of $72,476 to $176,297 a year. Google, Inc. employees with the job title Staff Software Engineer make the most with an average annual salary of $160,993, while employees with the title Data Center Technician make the least with an average annual salary of $58,868.
How does TikTok algorithm work? ›The watch time, relative to video length is what propels the TikTok algorithm to push the video and enable it to appear in users' FOR YOU feeds. The more users watch the video in its entirety, the greater chance it has of going viral.
How do Facebook algorithms work? ›
The Facebook algorithm determines which posts people see every time they check their Facebook feed, and in what order those posts show up. Essentially, the Facebook algorithm evaluates every post. It scores posts and then arranges them in descending, non-chronological order of interest for each individual user.
How does Instagram algorithm work? ›The Instagram algorithm is a set of rules that rank content on the platform. It decides what content shows up, and in what order, on all Instagram users' feeds, the Explore Page, the Reels feed, hashtag pages, etc. The Instagram algorithm analyzes every piece of content posted to the platform.
What are main types of SEO? ›- White-Hat SEO. When you hear someone say white-hat SEO, that means the SEO practices that are in-line with the terms and conditions of the major search engines, including Google. ...
- Black-Hat SEO. ...
- Gray-Hat SEO. ...
- On-Page SEO. ...
- Off-Page SEO. ...
- Technical SEO. ...
- International SEO. ...
- Local SEO.
Keyword Research
As you nail down your audience and industry norms for SEO, keyword research is necessary to pinpoint the best possible user intent to go after and find what your audience is searching for. But, not only that, what your audience searches for is just as important as how they search for it.
A Google penalty is a punishment against a website whose content conflicts with the marketing practices enforced by Google. This penalty can come as a result of an update to Google's ranking algorithm, or a manual review that suggests a web page used "black hat" SEO tactics.
What is the secret formula of Google? ›But the secret sauce is Google's patented formula for following and scoring every link on a page to learn how different sites connect, which means a site is deemed reliable based largely on the quality of the sites that link to it.
What is the Golden Triangle Google? ›What is Google's Golden Triangle? The Golden Triangle is a distinct area of intense eye scan activity that is shown in the diagram below. It's important to understand that the Golden Triangle pattern is seen in first time visits to a results page.
What are 10 things Google believes? ›- Focus on the user and all else will follow. ...
- It's best to do one thing really, really well. ...
- Fast is better than slow. ...
- Democracy on the web works. ...
- You don't need to be at your desk to need an answer. ...
- You can make money without doing evil. ...
- There's always more information out there.
If you have a certain setting enabled on your Android phone, saying "OK Google" or "Hey Google" will cause it to listen for a command. Before you say this wake phrase, your phone is listening for the keywords, but is not recording everything you say and uploading it to Google.
Does Google know what I am thinking? ›From the instant autocomplete suggestions to answering questions directly in the search results, Google knows exactly what I'm thinking most of the time.
How many Google algorithms are there? ›
There are 9 types of Google algorithms. Panda: Google Panda is a major change to Google's search results ranking algorithm that was first released on February 24, 2011. Google Panda checks the quality of content.
What was the first Googled? ›But when it started out in 1998, it was reportedly serving 10,000 search queries per day. Google was conceived in a dorm room at Stanford University in the mid-1990s. The first search query on the engine was the name Gerhard Casper, then president of Stanford University.
What are the main 4 elements of content that Google wants? ›...
Good content has four elements:
- Relevant. People want content relevant to their interests. ...
- Intellectual. ...
- Sensorial. ...
- Emotional.
MUM stands for Multitask Unified Model, and is a type of AI that Google uses as a much more powerful version of BERT. MUM uses more powerful AI techniques to better understand context around searches, search intent, and searches in different languages.
What is Google's algorithm AI called? ›MUM, Multitask Unified Model, is Google's most recent AI in search. MUM was introduced in 2021 and then expanded again at the end of 2021 for more applications, with a lot of promising uses for it in the future.
What methodology is used at Google? ›OKR is a goal-setting methodology that has helped companies like Google, Intel, LinkedIn, and other Silicon Valley giants succeed.
What is the IQ of Google AI? ›Researchers Feng Liu, Yong Shi and Yin Liu carried out tests throughout 2016, which ranked Google's AI IQ at 47.28, just shy of the average IQ they found for a human 6-year-old: 55.5.
Is Google's algorithm AI? ›Since 2015, Google has been utilizing RankBrain, a machine learning algorithm. It is how it facilitates processing search results and delivering more relevant answers to users. Google uses AI every time a user enters a search query, and the technology is constantly learning and improving.
How Google is using deep learning? ›Its Google Assistant speech recognition AI uses deep neural networks to learn how to better understand spoken commands and questions. Techniques developed by Google Brain were rolled into this project. More recently, Google's translation service was also put under the umbrella of Google Brain.
Who wrote Google Search algorithm? ›Incidentally, Bharat was the person behind Google News. Singhal famously rewrote the original Google algorithm that was created earlier by Larry Page.
Is Google translate an algorithm? ›
It doesn't translate individual words anymore. Instead, it uses an AI-powered neural machine translation algorithm that fetches the meaning from a broader context. And instead of mirroring the source text's word sequence, it tries to mimic the target language's grammar and syntax rules.
Is Panda a Google algorithm? ›Google Panda is a major change to Google's search results ranking algorithm that was first released in February 2011. The change aimed to lower the rank of "low-quality sites" or "thin sites", in particular "content farms", and return higher-quality sites near the top of the search results.
Which AI algorithm is best? ›- Linear regression. ...
- Logistic regression. ...
- Decision trees. ...
- Support vector machines (SVMs) ...
- Naive Bayes algorithm. ...
- KNN classification algorithm. ...
- K-Means. ...
- Random forest algorithm.
Google's mission is to organize the world's information and make it universally accessible and useful. That's why Search makes it easy to discover a broad range of information from a wide variety of sources. Some information is simple, like the height of the Eiffel Tower.
What are the coding skills required for Google? ›Preferred qualifications:
Experience with one or more general purpose programming languages including but not limited to: Java, C/C++, C#, Objective C, Python, JavaScript, or Go. Ability to learn other coding languages as needed.
Developers at Google use Python for a variety of system building, code evaluation, and system administration tools. Python can also be found in several Google APIs. Its use has been growing, and it is especially heavily used in their data analysis, machine learning, artificial intelligence and robotics projects.