February 17, 2015

SchemaCrawler [2]: Database Metadata Discovery

An Exclusive Guest Post by Sualeh Fatehi

Following up on a promise made in my previous post of Friday, February 13th.

I am happy to have SchemaCrawler author/developer Sualeh Fatehi as a guest blogger for the second time.

Q: Where does the word schema come from?

A: Its origin is the Greek word σχήμα (skhēma) which means shape, or more generally, plan. The plural is σχήματα (skhēmata). In English, both schemas and schemata are used as plural forms.

Explore Sualeh's fine piece of open source software at its many deeper levels,
whenever you have some spare time left.
It'll be worth your investment.
Enjoy the read for now.

How do you explore a new database? Chances are that you have started a new job, and are quickly trying to get up to speed on how the data is laid out. Or you are a consultant, trying to look worth the money and speak intelligently about a client’s schema design. Maybe you just returned from vacation, and want to see what changes other developers have made to the database.

SchemaCrawler allows you to get to the heart of what you need very quickly. Let us say that you know that there is a piece of the application that deals with customers. You would expect that there would be columns called customer or something like that. You can ask SchemaCrawler to connect to the database, and use a regular expression such as .*CUSTOMER.*, to find all tables that have a column with the word “customer” in it. Then, you can find related tables, based on foreign key relationships. Finally, you can graph the result in a database diagram. Using this technique, you will find that you are very quickly able to get schema information that you are interested in – usually in a matter of minutes. SchemaCrawler is unique in the features that it offers for this kind of drill-down.

In a large database, you can have hundreds of tables. Most people try to print out this entire schema in the form of a diagram, and paste it on a blank wall for reference. This is not a productive way of getting to information, particularly in the modern age where people are used to Google searches. SchemaCrawler is a sort of “Google search” for database metadata. Then, instead of having one giant diagram on the wall, you would have many smaller diagrams, created on demand, for the particular research task at hand.

Consider the case when you came back from vacation, and the developers had been very active. You would not be able to find any of the new tables and columns they had created on the diagram on your wall. However, if you had saved your SchemaCrawler command line from before your vacation, all you would have to do is rerun SchemaCrawler, and you would get a brand-new diagram. SchemaCrawler would find and account for all of the new tables and columns that you did not yet know about.

Here are some concrete examples, using Microsoft’s AdventureWorks database:
  1. Search for tables that have a column called VendorID (use -grepcolumns=.*\.VendorID)

    sc -host=**** -database=AdventureWorks -schemas=AdventureWorks\..* -user=**** -password=**** -infolevel=standard -routines= -tabletypes=TABLE -grepcolumns=.*\.VendorID -command=schema

  2. Plot a database diagram of tables that have a column called VendorID, with related tables (use -parents=1 -children=1)

    sc -host=**** -database=AdventureWorks -schemas=AdventureWorks\..* -user=**** -password=**** -infolevel=maximum -routines= -tabletypes=TABLE -grepcolumns=.*\.VendorID -parents=1 -children=1 -only-matching -command=graph -outputformat=pdf

Thank you so much for sharing, Sualeh!


February 13, 2015

A Happy Bloggy 2015 - Late? Nah! ;=O)

Hi to you all again,

I've been away from posting for two months, primarily due to a happy private life, great peer networking and fine business, all while least expecting Murphy's Law and the whole enchilada.

I won't bother you with all that for too long, as once in a while we all receive our fair share of the famous pie.

First, referring to this post's title, check out this tweet as a quickie.

Second, I want to point you to three fine Oracle bloggers whom I've added to my blog list in the right margin:

  1. Ann Sjökvist on Oracle DB Standard Edition. She's doing a fabulous DBA job from Turku, Finland, promoting this often overlooked edition. See you next month at #OUGN15 in Oslo, Ann!
  2. Matthijs Bruggink, a solid Dutch DBA posting on High Availability, migration strategies, (Logical) Data Guard, RAC and RMAN.
  3. Ronald Vargas, a fine blog, as in: "A Hispanic-American blog on Oracle technologies, San José, Costa Rica. The acquisition of any knowledge is always useful to the intellect, which will know how to discard the bad and keep the good."
Follow these inspiring bloggers.

You won't be disappointed.

This post is short, as the next one is waiting to be published...

Yep: SchemaCrawler, part 2.

By Sualeh Fatehi himself.

Bye for now


December 12, 2014

SchemaCrawler [1]: Notes & Lint by Sualeh Fatehi

I'm very proud to present my blog's 1st guest.

You can reach Sualeh Fatehi through mail at: sualeh.fatehi@gmail.com


Sualeh is a fine developer, and today he will briefly introduce you to an open source brainchild of his.

In my opinion it is one of the best-written, as well as best-documented, pieces of open source software, and one that I personally use on a daily basis.

A well-deserved stage.

So Sualeh, it is yours now..

SchemaCrawler - Notes

SchemaCrawler is intended to be a database schema exploration tool, serving many types of users.

For programmers, there is a rich Java API for metadata about database objects, and integration with Maven repositories. SchemaCrawler is also packaged as an OSGi bundle. You can also script against it with a number of scripting languages, such as JavaScript, Python, Ruby, Scala, Groovy and so on. For continuous builds, there is Ant and Maven build integration, along with reports for Maven-generated sites. Programmers can also build plugins for lint, and extend support for database-specific metadata.

QA engineers can automate database tests by outputting expected results for metadata as well as data in SchemaCrawler’s easy-to-diff format, and comparing against their test runs. This technique can be used to compare schemas between development and production servers as well, and also helps keep track of schema changes over time.
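The diff workflow described above can be sketched in a few shell commands. The two tiny text files below are toy stand-ins for saved SchemaCrawler text outputs, so the comparison step can be seen in isolation; the commented lines show where the real, credential-elided SchemaCrawler invocations would go:

```shell
# Real runs would produce the two files, along the lines of (credentials elided):
#   sc -host=**** -database=**** -user=**** -password=**** \
#      -infolevel=standard -command=schema > schema_baseline.txt
# ...and again after schema changes:
#   sc -host=**** -database=**** -user=**** -password=**** \
#      -infolevel=standard -command=schema > schema_current.txt

# Toy stand-ins for two SchemaCrawler text outputs:
printf 'CUSTOMERS\n  CUSTOMER_ID\n' > schema_baseline.txt
printf 'CUSTOMERS\n  CUSTOMER_ID\n  EMAIL\n' > schema_current.txt

# Any added or dropped tables and columns show up as plain diff lines
diff schema_baseline.txt schema_current.txt || true
```

The same two-file diff works for comparing a development schema against production, or for failing a build when the schema drifts from an approved baseline.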

Database architects and DBAs can quickly explore the structure of a database, using powerful grep functionality based on regular expressions. Then, they can use the results of their search to create ERDs. SchemaCrawler’s ERDs are very useful when schemas keep changing. You do not need to know what the changes are, but you can simply rerun the same SchemaCrawler command to find out. For example, if a team member added a new customer related column to a table when you were on vacation, you could add it to your database diagram simply by rerunning a SchemaCrawler command that searches for all tables with columns called CUSTOMER_ID. Of course, database architects would find the lint capability really useful to keep track of bad schema design practices, and even add some of their own custom linters.
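That vacation scenario can be sketched as a single saved command line (a hedged sketch: flags follow the era's SchemaCrawler command line, with host and credentials elided):

    sc -host=**** -database=**** -schemas=.* -user=**** -password=**** -infolevel=maximum -tabletypes=TABLE -grepcolumns=.*\.CUSTOMER_ID -parents=1 -children=1 -only-matching -command=graph -outputformat=pdf

Rerunning this one saved command regenerates the diagram, with any newly added customer-related tables picked up automatically.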

SchemaCrawler tools are available through a command-line console application, modeled on regular Unix applications.

(Each one of the notes above can be expanded into its own blog post. An expansion on SchemaCrawler lint is below.)

SchemaCrawler Lint

A lot of database schema designers do not follow good design practices, simply because these are not documented. Everyone knows about normalization, and people generally take care to either normalize their tables, or deliberately denormalize them for performance reasons. However, what about “design smells”, such as tables that have spaces in table or column names, or names that are reserved words in the ANSI SQL standard, such as a column called COUNT? Or foreign key and primary key columns that have different data types? SchemaCrawler can detect and report these kinds of design flaws, and many others; they are listed in the online SchemaCrawler documentation. In addition, organizations may have their own design practices, for example, that the names of all tables relating to customers (that is, all tables with a CUSTOMER_ID column) are prefixed with CUST_. SchemaCrawler allows you to write your own lint plugins to detect these. SchemaCrawler can be run in automated builds, and you can write tests that fail your build if a developer violates your rules.
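As a hedged sketch of running the linter from the command line (flags mirror the AdventureWorks examples earlier on this page, with `-command=lint` assumed to invoke the lint report; credentials elided):

    sc -host=**** -database=AdventureWorks -schemas=AdventureWorks\..* -user=**** -password=**** -infolevel=standard -command=lint

In an automated build, a wrapper script could grep this report for specific lint messages and exit non-zero when any are found.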

Thank you so much
for being my first guest,
and wishing all of you
a nice weekend.

PS: remarks & suggestions - as always - are most welcome

November 23, 2014

Oracle (Undoc'd) Parameters - Through Bash?

Short answer: "Yash!"

.. which you could already have interpreted as "Yuck! Why Bash?"
.. Shean Connery aka 007 shaying "Yesh" to lovely Missh Moneypenny
.. or perhaps even as "Bafh!", pronounfed with a mouffful

Anyway, none of the aforementioned interpretations do apply.

I simply felt like wrapping SQL*Plus in a flexible bash script.

Play with it or learn from the code, which can obviously be improved.

I'm curious to see whether this thingy finds its way into your toolbox.
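For the curious, here is a minimal sketch of the idea: a bash function builds the SQL*Plus script, so it can be inspected (or tested) before anything touches a database. Assumptions are loudly flagged: the `build_sql` helper and the `$ORA_CONN` connection string are hypothetical names for this sketch, and the `x$ksppi`/`x$ksppcv` fixed tables are where Oracle keeps both documented and undocumented (underscore-prefixed) parameters, readable only with SYSDBA-level access:

```shell
# Hedged sketch: wrap SQL*Plus in bash to list (hidden) init parameters.
# build_sql and ORA_CONN are hypothetical names; adapt to your environment.

build_sql() {
  # Parameter-name pattern; default: hidden parameters (leading underscore)
  local pattern="${1:-\_%}"
  cat <<SQL
set pagesize 0 linesize 200 feedback off
SELECT i.ksppinm || ' = ' || v.ksppstvl
  FROM x\$ksppi i JOIN x\$ksppcv v ON i.indx = v.indx
 WHERE i.ksppinm LIKE '${pattern}' ESCAPE '\\';
exit
SQL
}

# Actual invocation (requires an Oracle client and SYSDBA privileges):
# build_sql "$1" | sqlplus -s "$ORA_CONN" as sysdba
```

Keeping the SQL generation in a pure function like this is what makes the wrapper "flexible": the pattern argument turns one script into a general parameter browser.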

PS: remarks & suggestions are most welcome

November 21, 2014

Topics I Expect to Publish

Lectori Salutem,

1. Welcome to my first blog.
2. Thanks for visiting it.

Enjoying an early afternoon in NL now, I want to list some topics I have had on my mind lately. For starters, I will publish an article every two weeks for a few months to come.

Please bear with me if I cannot... I'll at least try to do "the full monty".


  • Solr/Lucene, an open source enterprise search platform
  • ...

Well, That's All Folks! Happy reading, blogging, or whatever you feel comfortable with.

Have a nice weekend.

November 18, 2014