SOQL's Seven Deadly Sins
This article is not about how to be a better SOQL programmer. It is about pitfalls in SOQL that are obscure and often difficult to diagnose and understand when encountered.- The SOQL WHERE clause cannot be trusted.
- The SOQL LIMIT parameter is not always optional.
- Timestamps returned by SOQL believe in time travel.
- SOQL can return field data that is invalid based on the field
definition.
- SOQL has a love/hate relationship with tables containing lots
of custom fields.
- SOQL does not treat all table equally – not knowing the
exceptions can take down your org.
- SOQL chokes on tables with lots of functional columns.
Sin 1: The SOQL WHERE clause cannot be trusted.
Here is a trick question. Will the following SOQL statement return any records that were modified today?
SELECT id, name FROM Account WHERE
systemModStamp < LAST_YEAR
The answer is......usually not. However it will return a record modified today when:
- The record fit the WHERE criteria when the record set to
return was determined.
- The record is modified before it was fetched from Salesforce.
- In the web browser open the Salesforce GUI on the record that
will be the most recent one to match the LAST_YEAR criteria.
- Start a Python script that fetches records using:
SELECT
id,name,systemModStamp
FROM Account
WHERE systemModStamp < LAST_YEAR
ORDER BY systemModStamp ASC
FROM Account
WHERE systemModStamp < LAST_YEAR
ORDER BY systemModStamp ASC
- Modify the record in the Salesforce GUI after the Python
script starts fetching records.
- One of the last records you read from Salesforce will have a
modification date of today
Sin 2: The SOQL LIMIT parameter is not always optional.
Though the SOQL documentation says that the LIMIT parameter is optional there are a handful of Salesforce tables where it is nearly impossible to run a large query without the dreaded Query Timeout error. If you are not a believer then try running these queries on a mature Salesforce instance.- SELECT id, accountId, description FROM Task ORDER BY
SystemModStamp ASC
- SELECT application,browser,loginTime FROM LoginHistory ORDER
BY loginTime ASC
Table Name (or name pattern) | Good LIMIT Value |
.*History | 5000 |
.*History__c | 5000 |
.*Feed | 5000 |
.*Share | 5000 |
FeedComment | 5000 |
Task | 5000 |
EmailMessage | 5000 |
CaseComment | 50 |
FieldPermissions | 2000 |
LoginHistory | 2000 |
Sin 3: Timestamps returned by SOQL believe in time travel.
Here's another SOQL quiz. In the following code snippet, is it possible for the Exception to be thrown? For the quiz assume that all Account records have unique timestamp values and no records are deleted.
Account accts1[] = [SELECT id,
systemmodstamp FROM Account ORDER BY systemmodstamp ASC;
Account maxAcct = accts1[acct1.size() -
1]
Account accts2[] = [SELECT id,
systemmodstamp FROM Account WHERE systemmodstamp
<=:maxAcct.systemmodstamp ORDER BY systemmodstamp ASC;
if( accts2.size() > accts2.size()) {
throw new Exception('Timewarp – new
account created in the past!');
}
The answer....usually no. However, Salesforce does not guarantee
that records are committed in timestamp order.It is possible for a record with a SystemModStamp of 2014-12-31T00:12:12.000Z to be committed to Salesforce after a record in the same table with a SystemModStamp of 2014-12-31T00:13:12.000Z.
This is a hard “sin” to reproduce because it depends on the current load of the Salesforce servers. We discovered the problem by monitoring a high volume production instance for three weeks – the problem occurred just a couple of times during this period. It was sort of like searching for a Higgs particle.
The moral: Do not assume that the MAX(SystemModStamp) or MAX(LastModifiedDate) value is a good constraint to find all values that are older.
The solution: Do not rely on timestamps that are fresh. Give Salesforce a little while to commit all recent transactions. In practice this odd state seems to settle out on Salesforce servers after 10-120 seconds.
Sin 4: SOQL can return invalid field data.
This quiz should be easy for experienced SOQL users.- Is it possible for an invalid email address to be stored and
returned in a Salesforce Email field?
- Is is possible to store a 132 character string in a 128
character Salesforce field?
The answer to #2 is also “possible.” In fact Salesforce creates an email template with an invalid length subject when Sites is enabled.
There is other “invalid” data (related to currency values) lurking in the developer force instance – you will have to find this one yourself.
The morals:
- Just because Salesforce meta data says a field is 80
characters long does not mean it may have a longer value.
- Just because Salesforce meta data says a number is
DECIMAL(5,2) do not trust it. An API program may have inserted data
that does not match the spec.
- Do not be surprised if you read data from one Salesforce
instance and it fails to upsert into another instance – the source
data may be bad.
Sin 5: SOQL has a love/hate relationship with custom fields.
Though Salesforce allows many custom fields to be created on a table, using too many columns in a SOQL statement will cause a QUERY_TOO_COMPLICATED error. At least this “sin” is one Salesforce documents but after 10 years one would assume they would have a work-around in the back-end.If you really need to do the equivalent of “SELECT * FROM Account” and SOQL will not allow the read in a single statement what needs to be done? There are two problem to solve.
- Figure out the minimum number of SOQL queries to run. If none
of the columns are formulas then this can usually be done by using
500 columns for each query. If formula columns are involved the
problem is much harder – in this case see Sin #7.[
- Join the results of multiple queries together.
Sin 6: SOQL does not treat all tables equally.
The transaction cost of querying a table with SOQL varies a lot. The first time you query a high cost table and cause your production org to run out of transactions will probably be the last time.Here a SOQL quiz. How many Salesforce transactions does it take to run the following SOQL statement?
SELECT id, name FROM TableName LIMIT
2000
The “1 transaction per record” rule for Attachment records is documented. The same rule applies to QuoteDocument and ContentVersion but we have learned this from experience rather than documentation. In general 2000 and 1000 are the common number of records per transaction but the only way to know for certain is the write a bit of code and see how many come back in a bunch.
Sin 7: SOQL chokes on tables with many functional columns.
When Salesforce runs a SOQL query it turns it into SQL first. Once in SQL there is a 64K character query limit. If the 64K SQL query limit is passed then SOQL will return a QUERY_TOO_COMPLICATED exception.Here a SOQL quiz: Given a SOQL statement how can you accurately (or inaccurately even) guess that it will generated a SQL statement less than 64 characters?
The answer is......if table has a lot of formula columns you probably cannot guess accurately.
In practice, the addition of formula columns in a SOQL statement can cause the size of the resultant SQL to explode – sometimes. It all depends on the complexity and number of tables referenced by the formula.
- The complexity of the formula seems to matter. My guess is
that Saleforce is in-lining the formula calculation into the
generated SQL.
- A formula that references other tables likely causes the
table expression and predicate portion of the generated SQL to grow
rapidly.
- Sort column in ascending order by the length of their name or
length of their corresponding formula.
- Set maxColumn = total # of columns
- Run a query run using columns 1 thru maxColumn with a LIMIT
1.
- If the query succeeds then save the query, remove the query's
columns from the list of available columns.
- If the query fails then set maxColumn = maxColumn – 1 and
start step 3 again.
- Set maxColumn = # remaining columns
- If maxColumn > 0 goto Step #3, otherwise all queries have
been determined.
Credits
The “SOQL Sins” described in this article were accumulated over a number years while developing:- CopyStorm – keep a hot backup of your Salesforce
instance in SQL/Server, Oracle, MySQL, PostgreSQL, and H2
- CopyStorm/Restore – restore a backup to any
production, development, or sandbox instance. This is a restore tool
even your manager could use.
- SQLForce – a standard Python module for Salesforce.
Works with Python 2.8. 3.3, and 3.4. Use SQLForce when Apex is just
too hard.
No comments:
Post a Comment