Optimized Way To Get Row Count From A Query Contains Large Amount Of Data

May 30, 2023 Post a Comment

i am using the below query to return rowcount for paging, it works fine but take very long to return, because all of the table have millions of records. currently its taking 7 sec

Solution 1:

I'd say to look at the indexes, but it probably won't help much, because a) you probably did it already, and b) you can get no seeks with this kind of a query, only scans.

The idea is to get rid of these ORs and allow the optimizer to produce a sound plan.

There are two options.

Don't know which version of SQL Server is in question, but if it's SQL 2008 SP1 CU5 (10.0.2746) or later, or SQL 2008 R2 CU1 (10.50.1702) or later, or anything newer than that, add an option (recompile) to the query. This should produce much better plan, using seeks on relevant indexes.

This will, however, add some recompile overhead to every execution, so maybe the second option is better.

You can rewite the query into dynamic one, and elliminate the NULL parameters before optimizer even see the query. I tried to rewrite your query, don't have your data so can't test it, and there may be some errors in it, but you'll get my intention nevertheless. And I had to guess the datatypes. (BTW, is there a specific reason for SELECT p.policyid?)

Here it is:

declare@qry nvarchar(4000), @prms nvarchar(4000);
set@qry= N'
SELECT count(*)
         FROM   resident (nolock) r 
                INNER JOIN resident_policy (nolock) rp 
                        ON r.residentid = rp.residentid 
                INNER JOIN policy (nolock) p 
                        ON p.policyid = rp.policyid 
                INNER JOIN policy_locations (nolock) pl 
                        ON pl.policyid = p.policyid 
                INNER JOIN location (nolock) l 
                        ON pl.locationid = l.locationid 
                INNER JOIN property (nolock) pr 
                        ON pr.propertyid = l.propertyid 
         WHERE  r.primary_resident = 0x1 '
if @ResidentFirstNameISNOTNULLset@qry=@qry+' AND R.firstname LIKE @ResidentFirstName + ''%'''  
if @ResidentLastNameISNOTNULLset@qry=@qry+' AND R.firstname LIKE @ResidentLastName + ''%'''
if @PropertyAddressISNOTNULLset@qry=@qry+' AND pr.address LIKE @PropertyAddress + ''%''' 
if @PolicynumberISNOTNULLset@qry=@qry+' AND p.policynumber LIKE @Policynumber + ''%''' 
if @LocationAddressISNOTNULLset@qry=@qry+' AND l.address2 LIKE @LocationAddress + ''%''' 
if @CityISNOTNULLset@qry=@qry+' AND pr.city LIKE @City + ''%''' 
if @ZipCodeISNOTNULLset@qry=@qry+' AND pr.zipcode = @ZipCode'
if @StateIdISNOTNULLset@qry=@qry+' AND pr.stateid = @StateId'
if @PolicyStatusIdISNOTNULLset@qry=@qry+' AND p.policystatusid = @PolicyStatusId'set@prms= N'@PolicyStatusId int, @StateId int, @ZipCode int,
@City varchar(50), @LocationAddress varchar(50), @Policynumber varchar(50), 
@PropertyAddress varchar(50), @ResidentLastName varchar(50), @ResidentFirstName varchar(50)'exec sp_executesql 
@qry, 
@prms,
@PolicyStatusId=@PolicyStatusId, @StateId=@StateId, @ZipCode=@ZipCode,
@City=@City, @LocationAddress=@LocationAddress, 
@Policynumber=@Policynumber, @PropertyAddress=@PropertyAddress, 
@ResidentLastName=@ResidentLastName, @ResidentFirstName=@ResidentFirstName

If you chect the execution plan you'll see the index seeks, provided you have nonclustered indexes on WHERE and JOIN columns.

Moreover, the plan will be cached, one for each combination of parameters.

Solution 2:

This is hard to answer because with huge bulk of data many things could happen.

In term of join, this should perform well. If this query is just here to perform a count, then I can just suggest you to do it directly SELECT count('x') without CTE and without (nolock).

SELECT@rowcount=count('x') as rc
FROM   
    resident r 
    INNERJOIN resident_policy rp 
        ON r.residentid = rp.residentid 
    INNERJOIN policy p 
        ON p.policyid = rp.policyid 
    INNERJOIN policy_locations pl 
        ON pl.policyid = p.policyid 
    INNERJOIN location l 
        ON pl.locationid = l.locationid 
    INNERJOIN property pr 
        ON pr.propertyid = l.propertyid 
WHERE  
    r.primary_resident =0x1AND ( ( @ResidentFirstNameISNULL ) 
        OR R.firstname LIKE@ResidentFirstName+'%' ) 
    AND ( ( @ResidentLastNameISNULL ) 
        OR R.firstname LIKE@ResidentLastName+'%' ) 
    AND ( @PropertyAddressISNULLOR pr.address LIKE@PropertyAddress+'%' ) 
    AND ( @PolicynumberISNULLOR p.policynumber LIKE@Policynumber+'%' ) 
    AND ( @LocationAddressISNULLOR l.address2 LIKE@LocationAddress+'%' ) 
    AND ( @CityISNULLOR pr.city LIKE@City+'%' ) 
    AND ( @ZipCodeISNULLOR pr.zipcode =@ZipCode ) 
    AND ( @StateIdISNULLOR pr.stateid =@StateId ) 
    AND ( @PolicyStatusIdISNULLOR p.policystatusid =@PolicyStatusId )

If this CTE is used for both rowcount and retrieve data from CTE be sure that you are retrieve only data for the page in question (only 20 elements with a ROWCOUNT() as RC and RC > 0 AND RC <= 20)

In database side, you can check if you have indexes for all of your join clause. It looks like there is only PK so they already have indexes. Be sure, you have index on joined columns.

If you continue to have trouble, use "execution plan in real time" fonction to see what the hell is going on.

LIKE condition can be a performance killer depending on the text size and database content. You can think about COLLECTION to store your texts and have some gain on text comparison.

Solution 3:

There are some general instructions:

Create Non clustered Index on All of your foreign-key columns
Create Non clustered Index on primary_resident columns
Include Actual Execution Plan when you run your query and see which part is wasting time
Put Statements that are more likely to be false at first
When you run your query SQL server will suggest you some hints, try them too

Spark Sql Blog

Optimized Way To Get Row Count From A Query Contains Large Amount Of Data

Solution 1:

Solution 2:

Solution 3:

Post a Comment for "Optimized Way To Get Row Count From A Query Contains Large Amount Of Data"