The Data Warrior


Oracle OpenWorld 2012: Day 4

It was another beautiful, sunny day in San Francisco. I started the day, again, with some morning Chi Gung, then I enjoyed the morning keynote watching the big screen in Yerba Buena Gardens. Quite a pleasant way to listen to these talks.

It was a light day session-wise for me, but it did set off a few light bulbs.

The best session of the day (and for me the whole conference) was Gwen Shapira's (from Pythian) talk on building an integrated data warehouse with Hadoop. Gwen did a superb job of explaining what Big Data is and what it isn't.

Her simple, and straightforward, definition:

Big Data Defined

The “cheaply” part seems to be the key. Oracle, and other databases, can handle really HUGE amounts of data. Petabytes in fact. But putting all that data into an RDBMS can cost a lot more money than having it stored in a less sophisticated file system on commodity drives (like HDFS).

So just having lots of data in your warehouse does not mean you have Big Data; it just means you have a Very Large Data Warehouse (VLDW).

She went on to expand the definition:

Big Data Defined 2

This part shed even more light on Big Data for me. This really helped clarify even more when you might be dealing with Big Data.

The talk was filled with lots of technical details, limitations, and tools (Sqoop, Flume, Fuse-DFS) you can look at for integrating Hadoop with Oracle. Of course there are Oracle's offerings as well, like Oracle Loader for Hadoop and Oracle Direct Connector for HDFS.

Gwen also gave us several use-case examples that illustrated when to use Hadoop. Bottom line: learn to use Hadoop appropriately, not just because it is cool. With tech we can:

Make the impossible, possible. That might not make the possible easy.

If you went to OOW, find and download Gwen's slides. And follow her on Twitter (@gwenshap).

It was Big Data day for me. My other session was Ian Abramson's talk on Agile & Data, two of my favorite topics.

Ian discussed the Agile Manifesto and Big Data and how he has been able to use agile techniques to make his projects successful.

To start, here is his simple definition of Big Data:

What is Big Data?

Ian had a nice picture of the overall architecture as well:

Another Big Data Picture

To be successful in applying agile to data projects, Ian has determined the projects must be driven by data value: that is, sprint priorities are set based on the data that can best help the customer achieve their goals. To stay on track and keep velocity, it is also important to have daily touch-points with the team members. Ian does a daily 15-minute stand-up.

Ian shared lots of details and answered a lot of my annoying questions too. He came up with a great tree graphic to illustrate the important factors in a high-performance project:

Agile Tree

Again, find and download the slides once Oracle uploads them. In the meantime, follow Ian on Twitter (@iabramson). A data-centric agilist is hard to find. For more on agile and data warehousing, check out my classic white paper on the subject.

After Ian’s session I got to go to my first Oracle blogger meet up. It was nice to put more faces to names. Thanks to Pythian and OTN for sponsoring it.

Blogger Meetup

Then back to the hotel to pack, and then stand in line (for an hour!) to get into the appreciation event and see Pearl Jam live. It was a good concert. Hard to beat live music outdoors!

Huge crowd for Pearl Jam

Pearl Jam Live!

Well that’s it for me on OOW2012. I am back home in Houston now and heading into the office tomorrow. Then I need to write another abstract or two for KScope13 and RMOUG TD2013. Then it will be time to plan for OOW2013 and The America’s Cup finals…

Nap time.

Kent

Oracle OpenWorld Day 3

Another one for the books…

As before, you can see a bunch of my activities on Twitter (@kentgraziano and #oow). I posted a few pictures throughout the day.

For the first time I got to watch two keynotes live without being in the cavern of a hall for four hours. Actually sat outside in the shade in Yerba Buena Gardens and watched it all on a really big screen.

Lots of "cloud" talk from Oracle and its partners. CEO Larry E did a cool presentation on real-time analytics over 4.9 billion tweets gathered in the 5 days following the summer Olympic games. It was quite interesting to see the analysis of structured and unstructured data together in a drillable dashboard environment.

And it was…you guessed it…all in the cloud.

Big Data meets Big Iron

Pretty good slogan…guess for doing big data analysis you will need some pretty hefty hardware too…

Also attended two of Jeff Smith’s (@thatJeffSmith) presentations. One on collaborative model development with SQL Developer Data Modeler (my personal favorite) and one on SQL Developer Tips and Tricks (mostly tips).

Source control in SDDM

I also caught the tail end of a session on Pluggable Databases (PDB). Got some really nice summary slides (be sure to click on them and zoom so you can read them). Lots of great details on how PDB works and what it can and cannot do.

All about pluggable database


The details on PDB

After the second keynote it was over to the OTN Lounge for the first ever Tweet Meet. It was designed to let people meet the person behind the twitter handle. I think it was a good success with a great turnout. We even got one of the OTN guys to create his own twitter account so he could follow the Oracle Aces and Ace Directors more easily.

Finally dinner in North Beach with some friends at the Stinking Rose. This has become an annual tradition.

After dinner I did manage to catch the last set from Joss Stone performing in Union Square. That pretty much rocked. Glad I went.

Tomorrow night the BIG appreciation event: Pearl Jam (if I remember to pick up my wristband to get in).

Kent


Five Days Only – Get it Free: A Check List for Doing Data Model Design Reviews

Later this week I travel to Oracle HQ for my first product briefing as an Oracle ACE Director. In celebration of this momentous event, I have decided to give all my readers and followers a gift:

For the next five days (Sept 24 – 28, 2012), my first solo Kindle book will be ON SALE for the low, low price of FREE!

Don’t delay. You can get it here: A Check List for Doing Data Model Design Reviews: Kent Graziano: Kindle Store.

In case you missed my earlier post about the book, here is a brief description:

Tired of crappy data models and whiney data modelers? Need to deliver a high quality design in a short period of time? Need a better way to enforce standards? As part of trying to be more “agile” in my approach to developing databases, I have adopted a concept from the agile world: peer reviews. Before any data model moves from analysis (logical model) into development (physical model), the development team needs to gather to review what the modeler has done. If the model passes the review (almost never on the first round), the physical model is constructed. The physical model is then subjected to a rigorous review as well (including metadata). Only then can DDL be produced and deployed. This guide book will discuss the actual modeling and design process I follow and give you a check list of questions to ask in any model review session. This is a “take no prisoners” approach that has left many a would-be data modeler in a withering heap, but in the end you will have solid models and designs that deliver value.

The book has been doing pretty well (it sells for $2.99 normally), but it could do better. 😉 Currently it is #32 if you search for Data Modeling under Kindle ebooks.

Will you help me get it into the top 10?

[ Update: as of Sept 24, 2012 at 12:45 PM CDT the book is now #2 in the Kindle store for Databases! Thanks everyone. Let’s keep it rollin’]

[ Update #2: as of Sept 25, 2012 at 12:45 PM CDT the book is now #1 in the Kindle store for Databases! How long can we keep it there?]

Head on over to Amazon and get it today: A Check List for Doing Data Model Design Reviews.

Thanks a bunch. Hope you can put the information to good use.

Oracle ACE Director

Kent

P.S. Do me another favor? After you get the book (for FREE), please log back into Amazon and leave a review so other data modelers know if it is a worthwhile book for them to read.

P.P.S. Don’t forget to like this post! And click the Follow button (upper right) if you want to get my posts sent to your email directly.

Five ways to make Data Modeling Fun

While on my recent family vacation, I happened to mention I needed ideas for a blog post.

My son, all of nine years old, suggested the above title.

Hmmm…I said…not bad. That might work.

After all, most people think data modeling is booooorrring, right?

But for a few of us, it is kind of fun.

So then I asked him if he had any ideas how we could make it fun.

My son does not actually know how to do any data modeling (yet), but he has looked over my shoulder a few times and knows I draw pictures with boxes and connecting lines and words in the boxes.

With that bit of knowledge, he did come up with a few good ideas that really could make data model review sessions a bit more fun, and maybe more effective.

Here they are:

Word Search

Put up a large version of a data model on the wall. Give the reviewers a list of words to find on the model diagram (you produce the list from your data dictionary). Have them go to the diagram and highlight or circle the words on their list.

This will help get everyone familiar with the model and the layout of the diagram.

For more fun – form teams and keep score! Maybe even add a time limit per word.
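If you want to automate building that word list, a tiny script can pull it straight from the data dictionary. Here is a minimal Python sketch; the entity and attribute names are just made-up examples, not from any real model:

```python
# Hypothetical data dictionary: entity name -> list of attribute names.
data_dictionary = {
    "CUSTOMER": ["customer_id", "first_name", "last_name"],
    "ADDRESS": ["address_id", "street", "city", "postal_code"],
}

def build_word_list(dictionary, max_words=10):
    """Flatten entity and attribute names into a word-search list."""
    words = list(dictionary)  # entity names first
    for attributes in dictionary.values():
        words.extend(attributes)
    return words[:max_words]

print(build_word_list(data_dictionary))
```

Print the list, hand it out, and start the clock.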

Silly Sentences

If you don’t know how this works, you start with sentences with blanks in strategic areas. So the sentences may be missing nouns, verbs, adverbs, etc. You have someone fill in the blanks out of context – you ask for a noun but they have no idea what the sentence looks like until after you fill in all the blanks. (This game is in my son’s Nat Geo magazine) It can be quite funny.

One of the hardest parts of a logical model is naming the relationships. Use this game to figure out the right sentences.

Start by writing the relationships with completely silly or even wrong verbs:

Each Customer must be found squatting at one or more Addresses.

Use your creativity to come up with goofy verbs for the relationships. Then get the users to “validate” the sentences.

I am sure they will be more than willing to correct your errors. 😉
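If your model has a long list of relationships, you can even script the first drafts. Here is a minimal Python sketch; the verbs and entity names are my own silly examples, not from any real model:

```python
import random

# A few deliberately wrong verb phrases to seed the game.
SILLY_VERBS = ["found squatting at", "serenaded by", "hiding under"]

def silly_sentence(entity, cardinality, other_entity, verb=None):
    """Draft one relationship sentence with a silly verb for users to fix."""
    verb = verb or random.choice(SILLY_VERBS)
    return f"Each {entity} must be {verb} {cardinality} {other_entity}."

print(silly_sentence("Customer", "one or more", "Addresses",
                     verb="found squatting at"))
# -> Each Customer must be found squatting at one or more Addresses.
```

Run it over every relationship in the model and bring the output to the review session.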

Jeopardy

You all know how this game works – you get the answer and have to come up with the questions.

This is an interesting way to validate your entity and attribute definitions. Use entity definitions as the answers. Users have to guess the entity name.

For example: What is a customer?

Of course it will be really interesting to see if they can link definitions you got from them with the entity names in the model. You might get some clarifications in the process.

Data Model Haiku

You can do this with definitions or maybe relationship sentences. Trying to put the words in a specific form will make you really think about your understanding of the concepts (and force you to be succinct).

Each customer may

Be contacted by one or

More customer reps

Note for my friends in the UK: feel free to do sonnets in iambic pentameter.

Data Model Telephone

This is pretty much what happens anyway – you attend a meeting with the customer, they give you requirements, you take notes then try to build a model from those notes. You write out definitions and get them to review those. Chances are good you did not get it quite right.

So for fun, and to make a point about recording details carefully, get your team in a room and start at one end whispering a definition to the first person and have them pass it on. Write down the end result to compare to the definition in the model.

If the result is really funny, tell the customer at the next review meeting.

So what do you think? Can we make data modeling more fun?

Let me know your thoughts in the comments below.

If you have any fun ideas, please share those too!

Game on!

Kent

P.S. If you would like some other ideas on how to get better data models, check out my recent Kindle book on best practices for data model design reviews.

Best Practice: How to Create the Best Data Model Ever

A good data model, done right the first time, can save you time and money.

We have all seen the charts on the increasing cost of finding a mistake/bug/error late in a software development cycle.

Would you like to reduce, or even eliminate, your risk of finding one of those errors late in the game?

Of course you would! Who wouldn't? Nobody plans to miss a requirement or make a bad design decision (well, nobody sane anyway).

No modeler worth their salt wants to leave a model incomplete or incorrect.

So what can you do to minimize the risk?

Well, if you are designing relational database or data warehouse systems, you can do your part by implementing a best practice approach to developing your data models.

What you need is a simple, repeatable process for reviewing your models.

Conceptual. Logical. Physical.

Years ago, a client asked me to help them develop a review process for their new data architecture committee. One that even a non-modeler could follow.

It had to be easy to follow and repeatable.

A checklist of what to look for and what to ask the modeler to make sure they got the best possible model.

It worked like a charm.

I have been using and refining that check list ever since.

It is amazing how many issues I have found over the years using this approach.

And I usually found them in early stages. They were also usually pretty small issues that were easy to fix at that stage.

A missing attribute definition.

A missing business key.

Incorrect cardinality or optionality on a relationship.

Small, but they would have been costly to fix if we had built the database with the original design and started coding the application, then found the mistake.
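The first two of those checks are easy enough to automate before the review session even starts. Here is a minimal Python sketch; the model structure and entity names are invented for illustration, not taken from my actual checklist:

```python
# Hypothetical in-memory model: entity name -> metadata.
model = {
    "CUSTOMER": {"definition": "A person or organization that buys from us.",
                 "business_key": ["customer_number"]},
    "ADDRESS": {"definition": "", "business_key": []},
}

def review_findings(entities):
    """Flag entities missing a definition or a business key."""
    findings = []
    for name, meta in entities.items():
        if not meta.get("definition"):
            findings.append((name, "missing definition"))
        if not meta.get("business_key"):
            findings.append((name, "missing business key"))
    return findings

for entity, issue in review_findings(model):
    print(f"{entity}: {issue}")
```

Cardinality and optionality mistakes still need a human eye; a script like this only catches the mechanical gaps.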

I imagine that you could probably benefit from using my process and having this checklist handy to set up your very own data model design review process. Am I right?

So I decided to publish it and make it available to all my loyal readers and followers (even you lurkers out there!). 😉

As of today you can get your very own copy of the process details, pre-review questions, and the review checklist for both logical and physical models in the convenient Kindle format for a crazy low price.

This is way less than you would pay for me or any other data model consultant to build one for you.

Even better, if you have Amazon Prime you can get it for free via the lending library. So try before you buy (you really do want your own copy to keep, honest).

So head on over to Amazon and check it out.

Will you do me a favor?

If you like it and think it can help your friends and colleagues at other companies, then please post a review and be sure to tell them about it via email, LinkedIn, or Twitter.

BTW – You don't have to own a Kindle to get my book. You can download a FREE Kindle reader for your PC, Mac, iPhone, or Android device. So don't worry…just get the book and tell your friends.

Happy Modeling!

Kent

P.S. If you have any ideas for other little reports I could provide, leave me a comment on the blog. Thanks!
