{"id":97,"date":"2013-03-22T11:36:24","date_gmt":"2013-03-22T11:36:24","guid":{"rendered":"http:\/\/ruirib.net\/blog\/?p=97"},"modified":"2013-03-23T00:43:31","modified_gmt":"2013-03-23T00:43:31","slug":"set_thinking","status":"publish","type":"post","link":"https:\/\/ruirib.net\/index.php\/2013\/03\/set_thinking\/","title":{"rendered":"Set thinking"},"content":{"rendered":"<style type=\"text\/css\"><\/style>It&#8217;s quite frequent to read about the efficiency of relational database systems when executing set based operations. Relational DBMSs are, indeed, efficient when executing set based operations especially when compared to cursor based operations. Of course, the debate can become quite heated, leading some people to\u00a0excommunicate\u00a0cursor based operations as the devil incarnate of\u00a0inefficiency. As in everything, such views regarding cursors are extreme, as there are situations where cursors may be a valid choice.\r\n\r\nWhile the focus is usually on avoiding cursors, there are other ways in which we can take advantage of the efficiency of set operations. A situation I faced recently involved a stored procedure that was being executed multiple times, with a single changing value each time. The situation is\u00a0illustrated\u00a0by the stored procedure shown below (Initial Stored Proc). The stored proc performs a few queries similar to one included.\r\n<pre class=\"height-set:true width-set:true width:90 width-unit:1 left-set:true right-set:true float-enable:true toolbar:1 show-lang:1 wrap:true scroll:true whitespace-before:1 whitespace-after:1 lang:tsql mark:7-10,26 decode:true\" title=\"Initial Stored Proc\">CREATE PROC myProc(@year int, @month int)\r\nAs\r\n\r\n-- Temporary table to hold just the relevant cost centers\r\n\r\nCREATE TABLE #CC (\r\n\tCCID int,\r\n\t CCName varchar(1000)\r\n\t);\r\n\r\n-- Add the cost centers to the temp table\r\nINSERT INTO #CC\r\nSELECT DISTINCT C.Id, C.Name\r\n\tFROM AccountsInCostCenters AI INNER JOIN CostCenters C ON AI.CostCenterId=C.Id\r\n\t WHERE AI.Year=@year or AI.Year=@year-1\r\n\r\n-- calculate the results and add them to the results table\r\n\r\nINSERT INTO DemoViewFinalResults(GroupID,AccountLevel,RowType,Value,[Year],[Month],(...),CostCenterID,CostCenterName)\r\nSELECT D.GroupID, D.GroupOrder,D.RowOrder,D.AccountID,A.[Description],D.AccountSign, D.AccountLevel,4,\r\n(SELECT SUM(F2.Value) FROM DemoViewFinalResults F2 \r\n\t\t\t\t\t\tWHERE F2.AccountID IN (SELECT C.AccountID FROM CalcRowAssociatedAccountID C WHERE C.RefRowID=D.RowID)\r\n\t\t\t\t\t\tAND F2.GroupID = D.GroupID AND F2.Month=@month and F2.Year=@year AND F2.CostCenterID=C.CCID) As Value,\r\n\t\t@year,@month,NULL,D.OriginalRow, C.CCID,C.CCName\r\nFROM DemoViewAccountsAssociatedRowModel D INNER JOIN Accounts A ON A.Id = D.AccountId, #CC C\r\nWHERE D.RowType=2 AND D.IncludeAccount=1;\r\n\r\n(...)<\/pre>\r\nThe stored procedure was being executed multiples times, one for each month needed, with a simple loop, as illustrated below.\r\n<pre class=\"lang:tsql decode:true\" title=\"Executing the initial stored proc\">DECLARE @month int\r\nDECLARE @year int\r\n\r\nSET @month = 1\r\n\r\nWHILE @month &lt; 8\r\nBEGIN\r\n\texec myProc @year,@month\r\nEND<\/pre>\r\nSome investment had been done making sure each query running in the stored procedure was optimized. The queries are used to implement a datasheet recursive model, so it was rather hard to avoid correlated subqueries and the individual query performance was good enough. The problem was that, in some cases, we needed to execute the stored procedure for multiple months and the overall execution time for was unsatisfactory, so we need to do much better in such situations.\r\n\r\nThe solution for the performance issue, was of course, in taking advantage of set operations. Instead of looping for each month, we could just execute the code once, if we had a way to have all the required year \/ month pairs in a table, just as we had done for the cost centers (you can, in line #30 see that the temporary table <strong>#CC<\/strong> is cross-joined into the query).\r\n\r\nThe approach used in the final stored procedure is shown below. Lines #23-34 show the new temporary table and the code used to add all the needed records, which are the ones resulting from a cross-join between the &#8220;previous version&#8221; table and a new temporary table containing all the month \/ year pairs.\u00a0<b>\r\n<\/b>\r\n<pre class=\"height:900 lang:tsql mark:23-33,40-40 decode:true\" title=\"Final stored procedure\">CREATE PROC myProcForMultipleMonths(@year int, @startMonth int)\r\nAs\r\n\r\nCREATE TABLE #Months_Years (\r\n\t[Month] int,\r\n\t[Year] int\r\n\t);\r\n\r\nDECLARE @tMonth int,\r\n\t\t@tYear int\r\n\r\nSET @tMonth = 1\r\n\r\nWHILE @tMonth &lt;= @startMonth \r\nBEGIN\r\n\tINSERT INTO #Months_Years\r\n\t\tSELECT @tMonth,@year\r\n\r\n\tSET @tMonth = @tMonth + 1\r\nEND\r\n\r\nCREATE TABLE #CC (\r\n\t CCID int,\r\n\t CCName varchar(1000),\r\n\t [Month] int,\r\n\t [Year] int\r\n\t );\r\n\r\nINSERT INTO #CC\r\nSELECT DISTINCT C.Id, C.Name, MY.[Month],MY.[Year]\r\n\tFROM AccountsInCostCenters AI INNER JOIN CostCenters C ON AI.CostCenterId=C.Id, #Months_Years MY\r\n\t WHERE AI.Year=@year or AI.Year=@year-1\r\n\r\nINSERT INTO DemoViewFinalResults \r\n\t(GroupID,GroupOrder,RowOrder,AccountID,AccountDescription,AccountSign,AccountLevel,RowType,Value,[Year],[Month],Reference,OriginalRow,CostCenterID,CostCenterName)\r\n\tSELECT D.GroupID, D.GroupOrder,D.RowOrder,D.AccountID,A.[Description],D.AccountSign, D.AccountLevel,4,\r\n\t(SELECT SUM(IsNULL(F2.Value,0)) FROM DemoViewFinalResults F2 \r\n\t\t\t\t\t\t  WHERE F2.AccountID IN (SELECT C.AccountID FROM CalcRowAssociatedAccountID C WHERE C.RefRowID=D.RowID)\r\n\t\t\t\t\t\t  AND F2.GroupID = D.GroupID AND F2.Month=C.month and F2.Year=C.year AND F2.CostCenterID=C.CCID) As Value,\r\n\t\t   C.year,C.month,NULL,D.OriginalRow, C.CCID,C.CCName\r\n\tFROM DemoViewAccountsAssociatedRowModel D INNER JOIN Accounts A ON A.Id = D.AccountId, #CC C\r\n\tWHERE D.RowType=2 AND D.IncludeAccount=1;\r\n\r\n(...)<\/pre>\r\nThis approach resulted in a very obvious performance improvement, since execution time was reduced to less than a third of the time needed to execute the first version of the procedure for the same number of months. The execution time is obviously reduced because operations that were being performed as many times as there were months, now are done just once, regardless of the number of months. The join including the year and month ensures that.\r\n\r\nThis major improvement was essential to validate this approach as a feasible solution to to initial performance problem we set to solve.\u00a0This overall result showed, yet again, the advantages of adopting a set based &#8220;thinking&#8221; and it made clear that we can avoid cursors and still fall into a similar performance trap. \u00a0The solution here \u00a0came from a temporary table, used in a cross-join. A simple solution with a remarkable impact on performance.","protected":false},"excerpt":{"rendered":"It&#8217;s quite frequent to read about the efficiency of relational database systems when executing set based operations. Relational DBMSs are, indeed, efficient when executing set based operations especially when compared to cursor based operations. Of course, the debate can become quite heated, leading some people to\u00a0excommunicate\u00a0cursor based operations as the devil incarnate of\u00a0inefficiency. As in&#8230;","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[58,57,55,47,54,253,56],"class_list":["post-97","post","type-post","status-publish","format-standard","hentry","category-sql-server","tag-cross-join","tag-cursor","tag-optimization","tag-relational-databases","tag-set-operations","tag-sql-server","tag-temporary-table"],"_links":{"self":[{"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/posts\/97","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/comments?post=97"}],"version-history":[{"count":64,"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/posts\/97\/revisions"}],"predecessor-version":[{"id":105,"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/posts\/97\/revisions\/105"}],"wp:attachment":[{"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/media?parent=97"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/categories?post=97"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ruirib.net\/index.php\/wp-json\/wp\/v2\/tags?post=97"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}