The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “Data Warehouse”

Oracle OpenWorld 2012: Day 4

It was another beautiful, sunny day in San Francisco. I started the day, again, with some morning Chi Gung, then I enjoyed the morning keynote watching the big screen in Yerba Buena Gardens. Quite a pleasant way to listen to these talks.

It was a light day session-wise for me, but it did set off a few light bulbs.

The best session of the day (and, for me, of the whole conference) was the talk by Gwen Shapira (from Pythian) on building an integrated data warehouse with Hadoop. Gwen did a superb job of explaining what Big Data is and what it isn’t.

Her simple, and straightforward, definition:

Big Data Defined

The “cheaply” part seems to be the key. Oracle, and other databases, can handle really HUGE amounts of data. Petabytes in fact. But putting all that data into an RDBMS can cost a lot more money than having it stored in a less sophisticated file system on commodity drives (like HDFS).

So just having lots of data in your warehouse does not mean you have Big Data; you just have a Very Large Data Warehouse (VLDW).

She went on to expand the definition:

Big Data Defined 2

This part shed even more light on Big Data for me and helped clarify when you might actually be dealing with it.

The talk was filled with lots of technical details, limitations, and tools (Sqoop, Flume, Fuse-DFS) you can look at for integrating Hadoop with Oracle. Of course there are Oracle’s offerings as well, like Oracle Loader for Hadoop and Oracle Direct Connector for HDFS.

Gwen also gave us several use case examples that illustrated when to use Hadoop. Bottom line – learn to use Hadoop appropriately, not just because it is cool. With tech we can:

Make the impossible possible. That might not make the possible easy.

If you went to OOW, find and download Gwen’s slides. And follow her on twitter (@gwenshap).

It was Big Data day for me. My other session was Ian Abramson’s talk on Agile & Data, two of my favorite topics.

Ian discussed the Agile Manifesto and Big Data and how he has been able to use agile techniques to make his projects successful.

To start, here is his simple definition of Big Data:

What is Big Data?

Ian had a nice picture of the overall architecture as well:

Another Big Data Picture

To be successful in applying agile to data projects, Ian has determined that the projects must be driven by data value; that is, the sprint priorities are set based on the data that can best help the customer achieve their goals. To stay on track and keep velocity, it is important to have daily touch-points with the team members as well. Ian does a daily stand-up for 15 minutes.

Ian shared lots of details and answered a lot of my annoying questions too. He came up with a great tree graphic to illustrate important factors in having a high performance project:

Agile Tree

Again, find and download the slides once Oracle uploads them. In the meantime, follow Ian on twitter (@iabramson). A data-centric agilist is hard to find. For more on agile and data warehousing check out my classic white paper on the subject.

After Ian’s session I got to go to my first Oracle blogger meet up. It was nice to put more faces to names. Thanks to Pythian and OTN for sponsoring it.

Blogger Meetup

Then back to the hotel to pack and then stand in line (for an hour!) to get to the appreciation event and see Pearl Jam live. It was a good concert. Hard to beat live music outdoors!

Huge crowd for Pearl Jam

Pearl Jam Live!

Well that’s it for me on OOW2012. I am back home in Houston now and heading into the office tomorrow. Then I need to write another abstract or two for KScope13 and RMOUG TD2013. Then it will be time to plan for OOW2013 and The America’s Cup finals…

Nap time.

Kent

Oracle OpenWorld 2012: User Group Sunday

Yes, today was the first day for #OOW 2012. Affectionately known to many of us as User Group Sunday. Along with a ton of other activities, this is the day the various Oracle user groups get to “own” the agenda and put together the sessions they think Oracle customers, and their members, might want to see.

By users; for users.

For the 2nd year, ODTUG asked me to curate their agenda. I was fortunate enough to “recruit” some great track leads who invited and vetted speakers and sessions to fill five rooms for most of the day. It was quite successful. (Thanks for the hard work guys.)

I attended quite a few myself and captured a few photos and thoughts. I was tweeting all day so you can also go to Twitter and search on @Kentgraziano to see my twitter stream.

After checking in at the User Group kiosk, I went to my first session, given by Gwen Shapira and Robyn Sands, who spoke about Flexible Design and Data Modeling. Great topic. They gave some very practical advice on do’s and don’ts if you want to be more agile.

“Just good enough” does not scale.

Plan for Change

Worst Practices for Database Design

If you want some more modeling best practices, check out my ebook on Amazon: http://www.amazon.com/Check-Doing-Design-Reviews-ebook/dp/B008RG9L5E/.

Next I went on to see Kellyn Pot’vin and Stewart Bryson do a DBA vs Developers show down with No Surprises Development.

Release Planning Questions

Best advice – practice your deployments several times before going live…

Next: Guy Harrison talked about Hadoop, Big Data, and Exadata. This was a very helpful intro talk about the space. I have been trying to wrap my mind around Hadoop, NoSQL, unstructured data, etc. and how we deal with it. Lots of great diagrams and examples to help explain.

Google’s Software Architecture

The Hadoop Ecosystem

Sigh…more to learn.

Next was a very interesting session by Mark Rittman about the Oracle Endeca software, how it can be used in a BI environment, and how it complements OBIEE.

This gives a quick view of what is involved with the Oracle Endeca Platform.

Oracle Endeca Information Discovery Platform

It looks like a very interesting platform that uses key value pairs to store the data. This enables search and analytics on some relatively unstructured data stores (i.e., not relational tables).

Final talk of the day (for me) was Jon Mead telling us about how they helped a customer develop event driven analytics using ODI and OBIEE and the Oracle Reference Architecture for data warehousing.

After all this, a little break and networking, then on to the opening keynote.

It started with the Corporate Sr VP of Fujitsu, who talked about some cloud applications they have deployed in Japan. They have the Agricultural Cloud project to help farmers be more efficient and bring more and better crops to market. They also have developed a Healthcare Cloud Service for optimizing patient care and early diagnosis.

Very cool cloud applications.

Last up was CEO Larry Ellison, who announced Oracle 12c and Pluggable Databases (to support cloud deployments). I had heard about these (under NDA) at the ACE Directors meeting, so I can share a few pictures related to those now that it is public information.

Oracle 12c

Bigger, badder, faster…

Oracle Cloud Ecosystem

Pluggable Database Architecture

With PDB, you can develop a plug and play database. Many cool applications for this one.

To end the day, I went to the 9th annual Oracle ACE dinner hosted by Oracle at the St Francis Yacht Club. Great food, drinks, and networking were had by all. Then back to the hotel to write this blog post.

Now off to bed so I can swim the bay with some other crazy people tomorrow morning. Wish me luck. Brrr.

Later.

Kent

ODTUG KScope: Day 5 – Happy Trails

Well the final day of KScope12 finally arrived and it was another hot one with the final sessions and the Texas heat. Another bright red sunrise greeted us as it has all week.

Today I managed to get a picture of the group that showed up for Chi Gung every day at 7 AM. We even had some new people today (officially the last day). They all enjoyed the sessions and learned (hopefully) enough to practice a bit once they return home.

I am grateful to all the participants for showing up early each morning with enthusiasm and a willingness to try something new. It made my job to lead them much easier. (There will be a You Tube video sometime next week for people to review, so stay tuned)

The first order of business for the day (after Chi Gung) was the official KScope closing session. Even though there were still two sessions to go afterward we had the closing at 9:45 AM. We were entertained, yet again, with some photo and video footage taken throughout the week, including one interview with me! We also learned who got the presenter awards for each track and for the entire event.

Then we all got beads to remind us to go to KScope13 in New Orleans.

Next was my final session for the event: Reverse Engineering (and Re-Engineering) an Existing Database with Oracle SQL Developer Data Modeler.

I had a surprising number of people for the last day after the closing session. I think there were about 70 people wanting to learn more about SDDM. Apparently most people are unaware of the features of the tool (which I have written about in several posts).

So, that was nice.

Finally I went to JP Dijcks’ talk about Big Data and Predicting the Future.

His basic premise is that we should now never throw away any data as it all can be used to extend the depth of analytics. We can react to events in real time and proactively change outcomes of those events.

The diagram above shows the basics of one way that data moves through the world and into the Hadoop file systems. I am oversimplifying but it is a cool diagram.

Part of the challenge is uncovering un-modeled data. I guess that is where the recent Oracle acquisition, Endeca, comes in with their Data Discovery tool (again oversimplifying).

And that was pretty much it for the show. It was a great week with lots of learning and networking (and tweeting). We all had a good time and learned enough to make our heads explode.

I look forward to meeting folks again next year at KScope13 in New Orleans.

Kent

ODTUG KScope12: Day 4 – Another Day in Paradise

Well folks, it is really late/early so, for now I am just putting up some pictures without a lot of detail.

This was the sunrise that greeted me on the way to Morning Chi Gung. Going to be another hot one!

This is the view of the bluff and waterfall that we see every day when we practice Chi Gung on the lawn. Very soothing and relaxing. It definitely enhances the experience and feeling of connectedness to the earth.

First session of the day was Maria Colgan (the optimizer queen) talking about Tuning SQL in a Data Warehouse. A huge amount of information to digest. Mostly over my head but very useful for a data warehouse dba.

She did however forget her glasses this morning and could not really see the people in the back row too well. 🙂

Next up was Mr. Kevin McGinley (BI Track Lead) giving us his thoughts about Exalytics and what is meant by “Speed of Thought”.

This is a picture of his four kids before the session. They did Kevin’s introduction today. Quite cute.

Not sure where they got it from (just say’n). 😉

After Kevin’s entertaining talk, I went to see John Jeffries talk about Oracle GoldenGate. John is one of the world experts on GoldenGate, having published the Oracle GoldenGate 11g Implementer’s Guide.

John had a nice diagram (below) of what you can use GoldenGate for. Very useful.

After lunch I went to see Dr. Holger Friedrich who gave us a comparison between ODI 11g (Oracle Data Integrator) and OWB (Oracle Warehouse Builder). OWB is going away in the not too distant future so it is important for OWB shops to get a handle on it and start to learn about ODI. This presentation was a great start.

Holger is a very interesting and intelligent guy. He is from Switzerland and holds a PhD in Robotics and Machine Learning.

Not sure how he ended up doing Oracle data warehousing.

Tonight was our BIG EVENT: Dinner and Rodeo and Dancing at the Knibbe Ranch. It was a hoot!

Here I am with my armadillo. I actually “won” an armadillo race.

We had a great BBQ dinner and a great country band to listen to. Hard to find a bad band in this neck of the woods.

After dinner was the main event: rodeo. This is the big show: The board of directors for ODTUG got to ride into the arena on horseback for the opening ceremonies of the rodeo. It did appear that Edward Roske (conference chair) really knew how to ride a horse.

That’s it for now. I have to sleep a little before Morning Chi Gung. Then I have my last presentation tomorrow morning.

Good thing I tested everything and got ready before going to the special event.

Check back in a day or so as I will fill in some details on the technical presentations.

Adios for now.

Kent

ODTUG KScope12: Day 1 Symposium Sunday

Wow. What a day!

Started off with leading a Chi Gung class at 7 AM to about 18 attendees. Great start to the day.

Then it was off to the races with the kick off of the BI Symposium, chaired by Kevin McGinley. I got to be “interviewed” about my Data Vault Modeling session on Monday (I will report on that tomorrow), along with several other presenters. That was followed by a lively talk show-style discussion led by Kevin and Stewart Bryson. Below see the room and audience in attendance at 9:00 AM on a Sunday. (Pretty good turn out – way better than last year!)

The panel discussion was followed by a series of talks from Oracle BI product management. There was lots of talk about mobile BI, Oracle’s acquisition of Endeca and of course BI in the Cloud.

(At this point I switched tracks to the DB development symposium chaired by Chet Justice aka @Oraclenerd)

The next talk I attended was by Kris Rice (@krisrice) who gave an intro to Oracle SQL Developer Data Modeler. (Nicely he plugged my Data Modeler talk on Thursday)

Some review (for me) and some new stuff too. I learned his trick for showing the joins between views – use the view to table utility to convert the views to tables, add PKs, then use the Discover Foreign Keys feature. This creates FKs based on column names and known PKs.

Cool trick. Just gotta remember to set “generate DDL” to “No”.
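To make the idea behind that trick concrete, here is a minimal Python sketch of discovering foreign keys from column names and known PKs. It is only an illustration of name matching, not how SQL Developer Data Modeler implements the feature; the Table class, the discover_foreign_keys function, and the sample tables are all made up for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a table (or a view converted to a table)."""
    name: str
    columns: Tuple[str, ...]
    primary_key: Tuple[str, ...]


def discover_foreign_keys(tables: List[Table]) -> List[Tuple[str, Tuple[str, ...], str]]:
    """Return (child_table, fk_columns, parent_table) for every name match."""
    found = []
    for child in tables:
        for parent in tables:
            if child.name == parent.name or not parent.primary_key:
                continue
            # A candidate FK exists when every PK column of the parent
            # also appears, by name, among the child's columns.
            if all(col in child.columns for col in parent.primary_key):
                found.append((child.name, parent.primary_key, parent.name))
    return found


# Made-up sample model for illustration only.
tables = [
    Table("CUSTOMER", ("CUSTOMER_ID", "NAME"), ("CUSTOMER_ID",)),
    Table("ORDERS", ("ORDER_ID", "CUSTOMER_ID", "ORDER_DATE"), ("ORDER_ID",)),
    Table("ORDER_LINE", ("ORDER_LINE_ID", "ORDER_ID", "QTY"), ("ORDER_LINE_ID",)),
]

for child, cols, parent in discover_foreign_keys(tables):
    print(f"{child}({', '.join(cols)}) -> {parent}")
```

Pure name matching will happily propose relationships that are only coincidences (two unrelated tables with an ID column, say), so the results still need a human review before you trust the diagram.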

Quick switch back to the BI Symposium to see some screen shots of a new look and feel for OBIEE with modern mobile themes.

More coolness…especially if you are an iPad sort of geek.

Back to DB dev land (is it lunch yet?) to hear Oracle product manager Jeff Smith (@thatjeffsmith) talk about full lifecycle development using SQL Developer.

Lots of great tips from Jeff about generating table APIs, using version control, doing schema diffs, and unit testing.

SQL Developer definitely has lots of features I did not know about. Being able to define unit tests inside the tool seems like a valuable option. I will be getting folks at my client site to try it out next week!

Oh yeah – he also mentioned DB Doc for creating HTML documentation on your code because code is never really self-documenting. Gotta check into that more too…

<Lunch break – yummy Italian selection of salads and food>

Post-lunch back to BI and Mike Donohue from Oracle talking about reporting on data from “beyond the data warehouse.”

Heaven forbid! (well I guess we gotta deal with it now)

So, Mike talked a bit about how Endeca Information Discovery can be used to gain understanding and build analytics on big and unstructured data. Mentioned “faceted data model” and generating a key value store. Sounds cool. Have to look into that too.
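For anyone else trying to picture what a faceted key value approach might look like, here is a rough Python sketch. To be clear, this is my own toy illustration and not Endeca's actual design; the sample records and the search and facet_counts functions are invented for the example.

```python
from collections import defaultdict

# Made-up records: each one carries whatever attributes it happens to have,
# so there is no fixed relational schema.
records = {
    1: {"type": "tweet", "product": "laptop", "sentiment": "negative"},
    2: {"type": "review", "product": "laptop", "rating": "4"},
    3: {"type": "tweet", "product": "phone", "sentiment": "positive"},
}

# Inverted index: (key, value) pair -> ids of the records that contain it.
index = defaultdict(set)
for record_id, attributes in records.items():
    for key, value in attributes.items():
        index[(key, value)].add(record_id)


def search(**filters):
    """Return ids of records matching every key=value filter."""
    matches = set(records)
    for key, value in filters.items():
        matches &= index.get((key, value), set())
    return matches


def facet_counts(record_ids, key):
    """Count how many of the given records carry each value of one key."""
    counts = defaultdict(int)
    for record_id in record_ids:
        value = records[record_id].get(key)
        if value is not None:
            counts[value] += 1
    return dict(counts)


laptop_ids = search(product="laptop")
print(sorted(laptop_ids))
print(facet_counts(laptop_ids, "type"))
```

The point of the sketch is that records with different attributes can sit side by side, and you can still filter on any key and then count the values of another key to drill down.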

Mike also discussed using BI Publisher to allow users access to local data (in Excel, XML, OLAP, etc.) so they can build their own reports. Scary thought, but in some businesses it will make sense because in reality not all data is in an ERP system or a well built RDBMS.

Whata gonna do?

<Back to DB Dev>

Now to hear the world-famous Tom Kyte (of Ask Tom fame) talk about his approach to tuning. It was, as expected, a full house.

Tom’s main point was not to necessarily tune the specific problem query but more holistically to look at the overall algorithm (or approach) that was taken to solve the problem in the first place.

In his experience many queries can’t be tuned all that much when what was written was not even the best way to solve the problem. He gave quite a few eye-opening examples where there was simply a much better way to accomplish a task than the SQL that was originally written. Seems many situations really require re-engineering the solution.
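I did not capture Tom's actual examples, so here is a small sketch of my own in Python (using the built-in sqlite3 module rather than Oracle) of the kind of re-engineering he was describing: a row-by-row loop replaced with one set-based statement. The orders table, its columns, and both function names are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with a made-up orders table: amounts 1.0 through 1000.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders (amount, status) VALUES (?, 'NEW')",
    [(float(n),) for n in range(1, 1001)],
)


def flag_large_orders_row_by_row(conn, threshold):
    """Procedural approach: fetch every row, decide in code, update one at a time."""
    rows = conn.execute("SELECT id, amount FROM orders").fetchall()
    for order_id, amount in rows:
        if amount > threshold:
            conn.execute("UPDATE orders SET status = 'LARGE' WHERE id = ?", (order_id,))


def flag_large_orders_set_based(conn, threshold):
    """Re-engineered approach: one statement states the whole intent."""
    conn.execute("UPDATE orders SET status = 'LARGE' WHERE amount > ?", (threshold,))


flag_large_orders_set_based(conn, 900)
large = conn.execute("SELECT COUNT(*) FROM orders WHERE status = 'LARGE'").fetchone()[0]
print(large)
```

Both functions end up with the same rows flagged, but no amount of tuning the single-row UPDATE inside the loop would get you what simply dropping the loop does.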

A nice take away (that makes you go “duh”):

More code = More bugs

Less code = Less bugs

Moral of the story – find the simplest solution. If the code is really complex, you are probably wrong (or at least over complicating it). Try again.

Last symposium session for the day (for me) was Maria Colgan (Oracle) talking about tips to get the most out of the Oracle Cost Based Optimizer.

Maria is the queen of the optimizer. She explained what the optimizer will do in several situations and why and if it is wrong, what you need to change to get it right.

Okay – already on brain overload (and it is just day 1!).

Need sleep.

Have my own presentation tomorrow.

And Chi Gung at 7AM.

C’ya

Kent

P.S. There were lots of tweets all day with more pictures of the event. To see them look for #kscope and @ODTUG on Twitter (or follow me @kentgraziano).
