February 17, 2015

SchemaCrawler [2]: Database Metadata Discovery

An Exclusive Guest Post by Sualeh Fatehi

Follow-up on a promise in my previous post on Friday, February 13th.

I am happy to have SchemaCrawler author/developer Sualeh Fatehi as a guest blogger for the 2nd time.

Q: Where does the word schema come from?

A: Its origin is the Greek word σχήμα (skhēma) which means shape, or more generally, plan. The plural is σχήματα (skhēmata). In English, both schemas and schemata are used as plural forms.

Explore Sualeh's fine piece of OSS to its many deeper levels.
Whenever you have some spare time left.
It'll be worth your investment.
Enjoy reading 4 now.

How do you explore a new database? Chances are that you have started a new job, and are quickly trying to get up to speed on how the data is laid out. Or, you are a consultant, trying to look worth the money, and speak intelligently about a client’s schema design? Maybe you just returned from vacation, and want to see what changes other developers have made to the database.

SchemaCrawler allows you to get to the heart of what you need very quickly. Let us say that you know that there is a piece of the application that deals with customers. You would expect that there would be columns called customer or something like that. You can ask SchemaCrawler to connect to the database, and use a regular expression such as .*CUSTOMER.*, to find all tables that have a column with the word “customer” in it. Then, you can find related tables, based on foreign key relationships. Finally, you can graph the result in a database diagram. Using this technique, you will find that you are very quickly able to get schema information that you are interested in – usually in a matter of minutes. SchemaCrawler is unique in the features that it offers for this kind of drill-down.

In a large database, you can have hundreds of tables. Most people try to print out this entire schema in the form of a diagram, and paste it on a blank wall for reference. This is not a productive way of getting to information, particularly in the modern age where people are used to Google searches. SchemaCrawler is a sort of “Google search” for database metadata. Then, instead of having one giant diagram on the wall, you would have many smaller diagrams, created on demand, for the particular research task at hand.

Consider the case when you came back from vacation, and the developers had been very active. You would not be able to find any of the new tables and columns that they would have created on the diagram on your wall. However, if you had saved your SchemaCrawler command-line from before your vacation, all you would have to do would be to rerun SchemaCrawler, and you would get a brand new diagram. SchemaCrawler would have found and accounted for all of the new tables and columns that you were not going to know about.

Here are some concrete examples, using Microsoft’s AdventureWorks database:
  1. Search for tables that have a column called VendorID (use -grepcolumns=.*\.VendorID)

    sc -host=**** -database=AdventureWorks -schemas=AdventureWorks\..* -user=**** -password=**** -infolevel=standard -routines= -tabletypes=TABLE -grepcolumns=.*\.VendorID -command=schema

  2. Plot a database diagram of tables that have a column called VendorID, with related tables (use -parents=1 -children=1)

    sc -host=**** -database=AdventureWorks -schemas=AdventureWorks\..* -user=**** -password=**** -infolevel=standard -routines= -infolevel=maximum -tabletypes=TABLE -grepcolumns=.*\.VendorID -parents=1 -children=1 -only-matching -command=graph -outputformat=pdf

Thank you so much for sharing, Sualeh!


1 comment: