We have a client that uses multi-tenant database where each database holds data for each of their customers. Whenever a new customer is added, a service dynamically creates a new database. In order to seed this new database we were tasked to implement a feature to copy data from existing “demo” database.
The “demo” database is actually a live client where sales team does demo. This ensures that the data that is copied is fresh and not stale.
We implemented a solution where we simply listed all the tables in namespace and used activerecord-import
to copy the table data.
activerecord-import gem to keep code agnostic of underlying database as we used different databases in development from production.
Production is “SQL Server” and development database is “PostgreSQL”.
Why this project ended up having different database in development and in production
is worthy of a separate blog.
When we started using the above mentioned strategy then we quickly ran into a problem. Inserts for some tables were failing.
The issue was we had foreign key constraints on some tables and “dependent” table was being processed before the “main” table.
So initially we thought of simply hard coding the sequence in which to process the tables. It means if any new table is added then we will have to update the service to include the newly added table. So we needed a way to identify the foreign key dependencies and determine the sequence to copy the tables at runtime. To resolve this issue, we thought of using Topological Sorting.
To get started we need the list of dependencies of “main” and “dependent” tables. In Postgresql, this sql query fetches the table dependencies.
The above query fetches all the dependencies for only the tables have namespace or the tables we are interested in.
The output of above query was
[[dependent_table1, main_table1], [dependent_table2, main_table2]].
Ruby has a
TSort module that for implementing topological sorts.
So we needed to run the topological sort on the dependencies. So we inserted the dependencies into a hash and included the
TSort functionality into the hash. Following is the way to include the
TSort module into hash by subclassing the
Then we simply added all the tables to dependency hash, as below
The output above, is the dependency resolved sequence of tables.
Topological sorting is pretty useful in situations where we need to resolve dependencies and Ruby provides a really helpful tool
TSort to implement it without going into implementation details. Although I did spend time in understanding the underlying algorithm for fun.