
Thursday, August 2, 2012

SharePoint Search (Index Server, Query Server and Crawl Server)

Updated on Oct 17, 2013 - P.S. When I wrote this article, my references were based on SharePoint 2007 (MOSS 2007). To a certain extent, this still holds true for SharePoint 2010, and the SharePoint Search concepts remain the same.
                 
I spent some time understanding the difference between the SharePoint Index Server, Query Server, and Crawl Server and how they are configured. In the end I learned that we can configure them in different ways, depending on how we want our farm to perform. Based on the usage of SharePoint, the number of users, and how extensively search is being used, we can have multiple configurations for the same purpose, i.e. "Searching Content in SharePoint".

When we are planning to add a new server (either query or index server) to a farm we get 2 checkboxes:

"Use this server to index content" and
"Use the server for serving search queries"



What is the difference? Do we check both to make it a dedicated Index Server/Query Server?

This is a very important distinction, and the decision depends on your preferred architecture and performance.  The index checkbox is definitely a requirement for making it a dedicated index server. This gives it the role of building and storing the index.  The query role does not have to be on your index server.  You can instead use your web front ends (1 or more) as query servers.  This tells the index server to propagate its index to the WFEs that are set as query servers so that they have a local copy of the index. When someone runs a search (which happens on a WFE), that WFE searches its own local copy instead of going across the network to query the index server.  This increases speed at the time of query, but of course it introduces additional overhead: multiple full copies of the index live on the network, and propagating those copies creates constant network demand.

If you select the query role on your index server, then the index will not get propagated and all searches will query the index server across the network. To set WFEs as query servers, you have to start the Office SharePoint Server Search service on each of them, select only the query checkbox, and then tell it where to store the index.
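To make the rule above concrete, here is a minimal Python sketch of how the two checkboxes decide where the index ends up. It is only a toy model of the behaviour described in this post, not SharePoint code: the `Server` class, the function names, and the server names (IDX, WFE1, WFE2) are all made up for illustration, and it assumes a farm with exactly one index server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical stand-in for a farm server and its two search checkboxes."""
    name: str
    index: bool = False  # models "Use this server to index content"
    query: bool = False  # models "Use the server for serving search queries"


def propagation_targets(farm):
    """Return the names of servers that receive a propagated index copy."""
    index_servers = [s for s in farm if s.index]
    if len(index_servers) != 1:
        raise ValueError("this sketch assumes exactly one index server")
    if index_servers[0].query:
        # Query role on the index server: nothing is propagated and
        # every search goes to the index server over the network.
        return []
    # Otherwise every server with the query role gets a local copy.
    return [s.name for s in farm if s.query]


def searches_locally(farm, wfe_name):
    """True if a search issued on wfe_name is answered from a local index copy."""
    return wfe_name in propagation_targets(farm)


if __name__ == "__main__":
    wfe_query_farm = [
        Server("IDX", index=True),
        Server("WFE1", query=True),
        Server("WFE2", query=True),
    ]
    combined_farm = [Server("IDX", index=True, query=True), Server("WFE1")]
    print(propagation_targets(wfe_query_farm))
    print(propagation_targets(combined_farm))
```

The first farm models WFEs acting as query servers, and the second models the query role sitting on the index server itself.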



Another important role is that of crawling, which is defined in the SSP settings.  The crawl server (or servers) is the WFE that the indexer uses for crawling content.  A newer approach is to make your index server a WFE that isn't part of your normal web browsing rotation (not load balanced) and then set it as the dedicated crawler. This allows the indexer to crawl itself, which does two things: it avoids the network traffic of building the index across the network, and it eliminates the crawling load on the WFEs.  Since your index server becomes an out-of-rotation WFE for regular browsing, you can actually use it to host your Central Admin and SSP web apps, which again reduces load/overhead on the content WFEs.

A query server can't be a query server unless it has the index.  Whenever you add the query role to a server, it asks for a file location, and the index gets propagated to that location on that server.  It's good to have the WFE as a query server so that the searches are fast (each WFE queries its own local copy) and for some redundancy, since index servers cannot be made redundant.  If the index server goes down, the WFEs still have a copy of the index, so searches keep working against the last propagated content - the copies just don't get refreshed until the index server comes back online.
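The redundancy point can be sketched in a few lines of Python. Again this is only a toy model under my own assumptions: the classes, the integer "version" standing in for index content, and the `propagate` function are hypothetical and do not correspond to any SharePoint API.

```python
class IndexServer:
    """Hypothetical index server; 'version' stands in for the index content."""

    def __init__(self):
        self.version = 0
        self.online = True

    def crawl(self):
        """Build a newer index; only possible while the server is up."""
        if not self.online:
            raise RuntimeError("index server is offline")
        self.version += 1


class QueryServer:
    """Hypothetical query server holding a propagated copy of the index."""

    def __init__(self, name):
        self.name = name
        self.index_version = None  # no copy until the first propagation

    def can_search(self):
        return self.index_version is not None


def propagate(indexer, query_servers):
    """Copy the current index to every query server; skipped if the indexer is down."""
    if not indexer.online:
        return False
    for server in query_servers:
        server.index_version = indexer.version
    return True


if __name__ == "__main__":
    indexer = IndexServer()
    wfes = [QueryServer("WFE1"), QueryServer("WFE2")]
    indexer.crawl()
    propagate(indexer, wfes)
    indexer.online = False  # simulate the index server going down
    refreshed = propagate(indexer, wfes)
    print(refreshed, [w.can_search() for w in wfes], [w.index_version for w in wfes])
```

While the indexer is offline, the query servers keep answering from the copy they already hold; they simply stop receiving newer versions.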

If we put query on the index server, then queries have to go from the WFE to the index server and back, which can cause a performance hit, but it's still doable.  We have to decide what is best for our situation.  Just remember that acting as a query server will compete with the very intense indexing process if they're on the same box.

One of the farms I worked on had the following architecture:


5 WFEs (4 were load balanced; the 5th had the index and crawl roles and was excluded from the load-balanced rotation)
2 App Servers
2 DB Servers (mirrored)
2 Query Servers
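
For anyone who likes to see a layout as data, the farm above could be written down roughly like this in Python. The server names (WFE1, APP1, QRY1 and so on) and the role labels are my own placeholders, and the checks only encode the design choices discussed in this post; they are not rules enforced by SharePoint.

```python
# Hypothetical names and role labels for the farm described above.
FARM = {
    "WFE1": {"web", "load_balanced"},
    "WFE2": {"web", "load_balanced"},
    "WFE3": {"web", "load_balanced"},
    "WFE4": {"web", "load_balanced"},
    "WFE5": {"web", "index", "crawl_target"},  # out of rotation
    "APP1": {"app"},
    "APP2": {"app"},
    "DB1": {"database"},  # mirrored with DB2
    "DB2": {"database"},
    "QRY1": {"query"},
    "QRY2": {"query"},
}


def servers_with(farm, role):
    """Return the sorted names of servers that carry the given role."""
    return sorted(name for name, roles in farm.items() if role in roles)


def check_farm(farm):
    """Return a list of warnings for this particular design; empty means it matches."""
    warnings = []
    indexers = servers_with(farm, "index")
    if len(indexers) != 1:
        warnings.append("expected exactly one index server")
    for name in indexers:
        if "load_balanced" in farm[name]:
            warnings.append(f"{name}: index server is in the load-balanced rotation")
        if "query" in farm[name] and len(servers_with(farm, "query")) > 1:
            warnings.append(f"{name}: query role on the index server stops propagation")
    for name in servers_with(farm, "crawl_target"):
        if "web" not in farm[name]:
            warnings.append(f"{name}: crawl target must also be a WFE")
        if "load_balanced" in farm[name]:
            warnings.append(f"{name}: crawl target is serving regular browsing traffic")
    return warnings


if __name__ == "__main__":
    print(servers_with(FARM, "load_balanced"))
    print(check_farm(FARM))
```

Adding `"load_balanced"` to WFE5 in this sketch produces warnings, which mirrors the point that the index/crawl box should stay out of the browsing rotation.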

