The Force.com Query Optimizer
The Force.com query optimizer is an engine that sits between your SOQL, reports, and list views and the database itself. Because of salesforce.com’s multitenancy , the optimizer gathers its own statistics instead of relying on the underlying database statistics. Using both these statistics and pre-queries, the optimizer generates the most optimized SQL to fetch your data. It looks at each filter in your WHERE clause to determine which index, if any, should drive your query.
To determine if an index should be used to drive a query, the Force.com query optimizer checks the number of records targeted by the filter against selectivity thresholds. So if you had 2.5 million accounts, and your SOQL contained a filter on a standard index, that index would drive your query if the filter targeted fewer than 525,000 accounts.
(30% of 1 to 1 million targeted records) + (15% of 1 million to 2.5 million targeted records) = 300,000 + 225,000 = 525,000
For a custom index, the selectivity threshold is 10 percent of the first million targeted records and 5 percent all records after that first million. In addition, the selectivity threshold for a custom index maxes out at 333,333 targeted records, which you could reach only if you had more than 5.6 million records.
If the previous query were changed so that it used a filter on a field with a custom index, the threshold for 2.5 million accounts would be 175,000.
(10% of 1 to 1 million targeted records) + (5% of 1 million to 2.5 million targeted records) = 100,000 + 75,000 = 175,000
In these standard index and custom index examples, the Force.com query optimizer does use the standard and custom indexes, as each number of targeted records falls below the appropriate selectivity threshold. If, on the other hand, the number of targeted records exceeds an index’s selectivity threshold, the Force.com query optimizer does not use that index to drive the query.
Common Causes of Non-Selective SOQL Queries
There are several factors that can prevent your SOQL queries from being selective.
Having Too Much Data
Whether you’re displaying a list of records through a Visualforce page or through a list view, it’s important to consider the user experience. You might not have this much data in your current implementation, but if you don’t have enough selective filters, these long lists can easily become an issue as your data grows. Design your SOQL, reports, and list views with large data volumes in mind.
Performing Large Data Loads
Large data loads and deletions can affect query performance. The Force.com query optimizer uses the total number of records as part of the calculation for its selectivity threshold.
This number takes into account your recently deleted records. A deleted record remains in the Recycle Bin for 15 days—or even less time if you exceed your storage limit, and the record has been in the Recycle Bin for at least two hours—and then that record is actually removed from the Recycle Bin or flagged for a physical delete. When the Force.com query optimizer judges returned records against its thresholds, all of the records that appear in the Recycle Bin or are marked for physical delete do still count against your total number of records.
From our earlier example of accounts and a custom indexed field, the selectivity threshold was 175,000, and the total number of records was 2.5 million.
If your data loads cause the records targeted by your filter to exceed the selectivity threshold, you might need to include additional filters to make your queries selective again.
Using Leading % Wildcards
A LIKE condition with a leading % wildcard does not use an index.
This is the type of query that would normally work better with SOSL. However, if you need real-time results, an alternative is to create a custom search page, which restricts leading % wildcards and adds governance on the search string(s).
Note: Within a report/list view, the CONTAINS clause translates into ‘%string%’.
Using NOT and !=
When your filter uses != or NOT—which includes using NOT EQUALS/CONTAINS for reports, even if the field is indexed—the Force.com query optimizer can’t use the index to drive the query. For better performance, filter using = or IN, and the reciprocal values.
Consider this example. You have 1 million cases and a custom index on the Status field, which has the following values and distribution of returned records.
This query won’t use the index on the Status field because of the !=. Use the reciprocal values instead. Also note that if you take the four Status values from the updated query, the record count (50,000 + 20,000 + 30,000 + 20,000) meets the selectivity threshold.
The following query won’t use the index on the Status field because of the NOT. The problem with this query is that even if you change it to use the reciprocal values, the index on the Status field won’t meet the selectivity threshold because of the Closed value. In this case, you do want to add additional filters to reduce the number of records retrieved.
Note: Using a filter on an indexed field such as CreatedDate is always recommended, but this field was not included in the original query so that we could make a point about the selectivity threshold.
Using Complex Joins
Complex AND/OR conditions and sub-queries require the Force.com query optimizer to produce a query that is optimized for the join, but might not perform as well as multiple issued queries would. This is especially true with the OR condition. For Force.com to use an index for an OR condition, all of the fields in the condition must be indexed and meet the selectivity threshold. If the fields in the OR condition are in multiple objects, and one or more of those fields does not meet a selectivity threshold, the query can be expensive.
Filters on formula fields that are non-deterministic can’t be indexed and result in additional joins. Common formula field practices include transforming a numeric value in a related object into a text string or using a complex transformation involving multiple related objects. In both cases, if you filter on this formula field, the Force.com query optimizer must join the related objects.
If you have large data volumes and are planning to use this formula field in several queries, creating a separate field to hold the value will perform better than following either of the previous common practices. You’ll need to create a workflow rule or trigger to update this second field, have this new field indexed, and use it in your queries.
“Explore – Techila Global Services, A Salesforce development company”