Monday, May 26, 2014

Populate a Sandbox in 5 minutes and learn some graph theory.

Every time I populate a Developer Sandbox with a snapshot of production data, I smile. Today, while tracking down an interesting bug, I populated a sandbox with all won Opportunities since 1-Dec-2013 plus their associated:

  • Accounts
  • Contacts
  • Opportunity Line Items
  • Quotes & Quote Line Items
  • Quote PDF Documents 
  • plus a handful of other tables. 
How long did this take? 4 minutes, 27 seconds.

I restored:

  • all the records
  • all the links between the records
  • the attachments & notes
The tool I used was CopyStorm/Restore -- this is not a product promo but a discussion of the graph theory problem faced in restoring Salesforce data.

Anyone who has used data loaders (Salesforce or Jitterbit) knows that the routine for importing Account and Contact records is:

  • Import Accounts
  • Take the generated Ids from importing the Accounts and patch the Contact import file. 
  • Import the Contacts 
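The three steps above can be sketched roughly as follows. This is only an in-memory illustration, not a real data loader: `fake_insert` is a hypothetical stand-in for the load step, and the generated Id format is made up.

```python
from itertools import count

_next_id = count(1)


def fake_insert(records, prefix):
    """Hypothetical stand-in for a loader call: returns {old_id: new_id}."""
    return {rec["Id"]: f"{prefix}{next(_next_id):04d}" for rec in records}


def import_accounts_then_contacts(accounts, contacts):
    # Step 1: import the Accounts and capture the generated Ids.
    account_id_map = fake_insert(accounts, "NEWACC")
    # Step 2: patch each Contact so AccountId points at the new Account Id.
    patched = [dict(c, AccountId=account_id_map[c["AccountId"]]) for c in contacts]
    # Step 3: import the patched Contacts.
    contact_id_map = fake_insert(patched, "NEWCON")
    return account_id_map, patched, contact_id_map


accounts = [{"Id": "A1", "Name": "Acme"}, {"Id": "A2", "Name": "Globex"}]
contacts = [{"Id": "C1", "AccountId": "A2", "LastName": "Smith"}]
account_id_map, patched, contact_id_map = import_accounts_then_contacts(accounts, contacts)
```

The important piece is the old-Id to new-Id map: every reference in a child file has to be rewritten through it before the child can be loaded.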
Of course, if there are Account->Account relationships or Contact->ReportsTo relationships then the problem is quite a bit more difficult.
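One common way to cope with self-references is a two-pass load: insert every record with the self-referencing field left out, then patch that field once all the new Ids are known. The sketch below assumes that approach (I am not claiming it is what any particular tool does); `insert` and `update` are hypothetical callbacks and `target` is just a dict standing in for the sandbox.

```python
def restore_with_self_references(records, ref_field, insert, update):
    """Two-pass restore for a field that points at records of the same type."""
    # Pass 1: insert every record without its old Id or the self-reference.
    id_map = {}
    for rec in records:
        stripped = {k: v for k, v in rec.items() if k not in ("Id", ref_field)}
        id_map[rec["Id"]] = insert(stripped)
    # Pass 2: now that every new Id exists, patch the references.
    for rec in records:
        old_ref = rec.get(ref_field)
        if old_ref in id_map:  # skip blanks and references outside the restored set
            update(id_map[rec["Id"]], {ref_field: id_map[old_ref]})
    return id_map


target = {}  # hypothetical in-memory "sandbox"


def insert(fields):
    new_id = f"NEW{len(target) + 1}"
    target[new_id] = dict(fields)
    return new_id


def update(record_id, fields):
    target[record_id].update(fields)


# The report is listed before the manager, so a single pass could not resolve it.
contacts = [
    {"Id": "C1", "LastName": "Report", "ReportsToId": "C2"},
    {"Id": "C2", "LastName": "Manager", "ReportsToId": None},
]
id_map = restore_with_self_references(contacts, "ReportsToId", insert, update)
```

The cost is an extra update per linked record, which is part of why the problem gets harder as the number of relationships grows.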

The problem is inherently difficult because Salesforce supports almost arbitrary relationships between records.
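To illustrate why arbitrary relationships hurt, here is a small sketch that treats objects as graph nodes and "refers to" as edges, then asks for a load order in which every referenced object comes first. It uses Python's standard `graphlib` module (Python 3.9+). The dependency maps are simplified assumptions, and the Account-to-Contact lookup in the second map is a hypothetical custom field.

```python
from graphlib import CycleError, TopologicalSorter


def load_order(depends_on):
    """Return a load order with referenced objects first, or None if there is a cycle."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


# Each key maps an object to the set of objects it refers to.
simple = {
    "Account": set(),
    "Contact": {"Account"},
    "Opportunity": {"Account"},
    "OpportunityLineItem": {"Opportunity"},
}
order = load_order(simple)

# Add a hypothetical lookup from Account back to Contact and no simple order exists.
cyclic = {
    "Account": {"Contact"},
    "Contact": {"Account"},
}
no_order = load_order(cyclic)  # None: the cycle has to be broken some other way
```

When the relationships form a tree, a simple ordering is enough. Once cycles appear, something like a deferred update is needed, and choosing what to defer is where the difficulty starts.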

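Finding and restoring all of the relationships is fundamentally a well-known graph theory problem: finding a minimal subtree that spans a chosen set of records (the Steiner tree problem, as opposed to the ordinary minimum spanning tree, which can be solved in polynomial time). It has the unfortunate property of being NP-complete.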

Roughly, NP-complete means that no polynomial-time algorithm is known for the problem: in the worst case, the running time of every known exact method grows faster than any polynomial as more records and relationships are added.

An exhaustive search is rarely possible for this class of problem if the number of records and relationships is very large -- an algorithm has to "know" something about the problem domain and efficiently "prune" the search space.

Oddly enough, the "Salesforce Restore" problem is similar to a problem in computational chemistry. The problem: given part of a molecular structure, find all molecules that match the pattern. Ring systems make this a difficult problem in chemistry -- knowledge about possible ring systems makes it doable.