December 12, 2014

SchemaCrawler [1]: Notes & Lint by Sualeh Fatehi

I'm very proud to present my blog's 1st guest.

You can reach Sualeh Fatehi through mail at:

 Sualeh Fatehi 

Sualeh is a fine developer and today he will
briefly introduce you to an open source brainchild of his.

In my opinion one of the best ever written as well as documented
pieces of open source software
that I personally use
on a daily basis.


A well-deserved stage.

So Sualeh, it is yours now..

SchemaCrawler - Notes

SchemaCrawler is intended to be a database schema exploration tool, serving many types of users.

For programmers, there is a rich Java API for metadata about database objects, and integration with Maven repositories. SchemaCrawler is also packaged as an OSI bundle. You can also script with a number of scripting languages, such as JavaScript, Python, Ruby, Scala, Groovy and so on. For continuous builds, there is ant and Maven build integration, along with reports for Maven-generated sites. Programmers can also build plugins for lint, and extend support for database specific metadata.

QA engineers can automate database tests by outputting expected results for metadata as well as data in SchemaCrawler’s easy to diff format, and compare against their test runs. This technique can be used to compare schemas between development and production servers as well, and also help keep track of schema changes over time.

Database architects and DBAs can quickly explore the structure of a database, using powerful grep functionality based on regular expressions. Then, they can use the results of their search to create ERDs. SchemaCrawler’s ERDs are very useful when schemas keep changing. You do not need to know what the changes are, but you can simply rerun the same SchemaCrawler command to find out. For example, if a team member added a new customer related column to a table when you were on vacation, you could add it to your database diagram simply by rerunning a SchemaCrawler command that searches for all tables with columns called CUSTOMER_ID. Of course, database architects would find the lint capability really useful to keep track of bad schema design practices, and even add some of their own custom linters.

SchemaCrawler tools are available using a command-line console application, modeled on regular Unix applications.

(Each one of the notes above can be expanded into its own blog post. An expansion on SchemaCrawler lint is below.)

SchemaCrawler Lint

A lot of database schema designers do not follow good design practices, simply because these are not documented. Everyone knows about normalization, and people generally take care to either normalize their tables, or deliberately denormalize them for performance reasons. However, what about “design smells”, such as tables that have spaces in table or column names, or names that are reserved words in the ANSI SQL standard, such as a column called COUNT? Or, foreign key and primary key columns that have different data types? SchemaCrawler can detect and report these kinds of design flaws, and many others. These are listed  HERE  . In addition, organizations may have their own design practices, for example that names of all tables relating to customers (that is, all tables with CUSTOMER_ID column) are prefixed with CUST_. SchemaCrawler allows you to write your own lint plugins to detect these. SchemaCrawler can be run in automated builds, and you can write tests that fail your build if a developer violates your rules.

thank you so much
4 being my 1st guest,
& wishing all of you
a nice weekend

PS: remarks & suggestions - as always - are most welcome