The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “big data”

#Kscope16 Blog Hop: #BigData and #AdvancedAnalytics Sessions Not to Miss

You are attending #KScope16 right?

Me too.

But there are so many sessions to choose from (mine included), which do you pick? How do you pick?

Well, my fellow bloggers and I are here to help you out with a Blog Hop. We are going to give you our top picks for each track. In this post, I will give you my picks for the Big Data and Advanced Analytics track.

Big Data and Advanced Analytics Sessions

Why did I pick this track? Really because it is a necessary adjunct to BI and Data Warehousing. In fact I find it hard to imagine that these two won’t merge over the next few years (at my company, Snowflake, they really have already). Every company that is investing in BI/DW is also finding that they need to deal with Big Data too. And Advanced Analytics is, to me, the logical extension to BI.

So after looking at the agenda, really most of the sessions are of interest to me (sigh). But in reality I am sure I will not be able to attend them all, so here are my top 5 picks to see at KScope16:

  1. How to Build an Internet of Things Data Pipeline presented by Rex Eng
  2. Oracle Big Data Discovery: Extending into Machine Learning and Advanced Visualizations presented by Mark Rittman
  3. Introduction to Apache Kafka and Real-Time ETL presented by Gwen Shapira
  4. Getting Started with a Data Discovery Lab: You Don’t Have to Go Big to Gain Big presented by Kathryn Watson
  5. Getting Started with Oracle R and OBIEE presented by Kevin McGinley
Why those? Simply because they hit on all the top issues and topics that I see being discussed (or written about) in the field, and I need to get a better grip on these things:
  • IoT – it is here already
  • Machine Learning – I am pretty clueless about this one so far
  • Kafka – ETL/ELT in the cloud
  • Data Discovery – the next step beyond BI
  • R – the language of choice for data scientists

And I actually know all but one of the presenters, so I am sure they will be very informative and lively talks.

The rest of the blog hop:

Thanks for attending this ODTUG blog hop!

Looking for some other juicy cross-track sessions to make your Kscope16 experience more educational? Check out the following session recommendations from fellow experts!

I hope this gives you some great ideas on what to see at KScope16!

See you in Chicago.


The Data Warrior

P.S. Don’t forget to make time to attend my Morning Chi Gung sessions down by the river to get each day started right with a clear mind and strong heart. Look for signs at the hotel.


KScope13 Day Four: Agile, Big Data, and a Very Special Event

Mid-week. Hump day. The day of the BIG event for KScope13.

Lots of anticipation for the annual Special Event… (which I will write about in a minute or so)

Morning Chi Gung as usual, but with 24 people showing up. Biggest group this week. We even have a few locals joining us now. Everyone seems to be enjoying these sessions.

KScope attendees starting the day with Morning Chi Gung on the plaza in front of Harrahs casino.

In fact, the Chi Gung class at KScope may be the original crossover session! Attendees come from across the spectrum: DBAs, developers, Hyperion/EPM folks, and spouses of attendees.

There is something for everyone in Morning Chi Gung.

Kanban and Scrum

Everyone wants to be “agile” these days. Stew Stryker of Dartmouth College came to KScope to share with us his experience in applying first Kanban then Scrum to the software development life cycle in his IT department.

Stew Stryker, from Dartmouth College, discusses how his team has used Kanban, and now SCRUM, to improve their software development process.

One of Stew’s insights was that to effectively implement a change in methodology like this and get adoption, it is first necessary for the powers-that-be to recognize that the current approach (usually waterfall) is failing.

If you do not know you have a problem, there is no motivation to fix it, right?

A key recommendation he had was to get a consultant that knows and has implemented Kanban for database projects to come in and work with you. Don’t try to do it by just reading articles and books or going to training. There are too many nuances and organizational dynamics to account for.

A simple comparison of aspects of a traditional waterfall methodology compared to the Kanban approach.

Another key to success was to prevent context switching – that is, keep everyone focused on the task at hand for the duration of the interval (or sprint). He did a great little exercise with us that really showed how task switching costs a lot of time. In some cases up to 10 times longer.
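To put rough numbers on that point, here is a minimal sketch in Python. It is not the exercise from the session; the task sizes, the one-unit time slices, and the `switch_cost` parameter are all assumptions chosen for illustration. It compares finishing tasks one at a time against rotating between them and paying a fixed cost on every switch.

```python
def sequential_finish_times(task_sizes):
    """Finish each task completely before starting the next one."""
    finish, clock = [], 0
    for size in task_sizes:
        clock += size
        finish.append(clock)
    return finish


def round_robin_finish_times(task_sizes, switch_cost):
    """Do one unit of work per task in rotation.

    Every change from one task to a different one adds `switch_cost`
    time units (an assumed, made-up penalty for regaining focus).
    """
    remaining = list(task_sizes)
    finish = [0] * len(task_sizes)
    clock = 0
    current = None  # index of the task worked on most recently
    while any(remaining):
        for i in range(len(remaining)):
            if remaining[i] == 0:
                continue
            if current is not None and current != i:
                clock += switch_cost
            current = i
            clock += 1
            remaining[i] -= 1
            if remaining[i] == 0:
                finish[i] = clock
    return finish


if __name__ == "__main__":
    tasks = [3, 3, 3]  # three hypothetical tasks, three work units each
    print("one at a time:", sequential_finish_times(tasks))
    print("switching    :", round_robin_finish_times(tasks, switch_cost=1))
```

Even in this toy model, the first task is delivered far later when the work is interleaved, and the total elapsed time grows with every switch.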

It was great to hear real world experiences that we could all take back to our offices and implement and discuss. His team has experienced some great success but with lots of lessons learned, which he shared.

They have now switched to SCRUM with even more success.

Hands On Lab #2

I attended my second lab of the conference to learn from Maria Colgan (@SQLMaria) on how to prevent sub-optimal plans on SQL Statements.

Oracle Senior Product Manager Maria Colgan walks us through how to analyze and tune some queries.

It was a great session using the Oracle Demo Days virtual box image again (from OTN). Maria walked us through several queries with Explain Plans that did not seem quite right and showed us how to diagnose and fix the potential problems.

It was a little tough for those of us who have not used Linux/Unix or the command line in a few years, but I did learn a lot and should be able to apply that knowledge when we have poorly performing queries at my clients. Worst case, I can always start up the VM again and run through the lab.

Inside the Oracle 12c Optimizer

Another killer session from Maria showing us enhancements and new features to the query optimizer in the recently released Oracle 12c.

Overview of how adaptive query optimization works on Oracle 12c

How the new Adaptive Execution Plans work in Oracle 12c

The key phrases for 12c are “self-healing” and “adaptive”. Remember when there were just 17 rules for the optimizer that we could control with the syntax of the query?

Long ago.

I guess this is better, but there are still rules to know to make the optimizer work well.

And Maria definitely knows them!

Big Data

These days every tech event has to talk about big data. KScope13 is no different.

Alex Shlepakov, from Accenture’s Oracle BI practice, gave a nice talk about integrating Hadoop with OBIEE using ODI.

He did a really nice job explaining all the concepts and moving parts and how Oracle addressed these things.

Alex presented about doing big data analysis using Oracle BI tools.

All the Oracle products that support the analysis of data in a Hadoop environment

Pretty sure these products cost lots of money too! But if you want to get value out of your big data, you may have to spend big money for the tools to help (unless you have a lot of programmers with really big brains).

My main takeaway from this session is that the tools to support Hadoop and big data analysis are evolving to make it easier for most programmers to get to the data without having to be MapReduce programmers.

But it will still be pretty hard, so you better have a good business case for digging into it.
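For anyone who has not seen what being a MapReduce programmer involves, here is a minimal single-machine sketch of the pattern in plain Python. It does not use Hadoop or any Oracle tool, and the log format, the function names, and the sample data are assumptions made up for illustration. The point is only to show the map, shuffle, and reduce steps that the higher-level tools hide.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


def page_hits_mapper(log_line):
    # Hypothetical log format: "<user> <page>"
    _user, page = log_line.split()
    yield page, 1


def sum_reducer(_key, values):
    return sum(values)


if __name__ == "__main__":
    log = ["alice /home", "bob /home", "alice /cart"]
    print(run_job(log, page_hits_mapper, sum_reducer))
```

Even a simple count of page hits has to be expressed as a mapper and a reducer, which is why tools that generate this plumbing for you matter.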

Special Event (aka the big party)

As in past years, ODTUG really did it up right. This was truly a special event to remember – we went to Mardi Gras World!

The annual Special Event was held at Mardi Gras World where we got to see some of the big floats from the famous parade.

What a treat to see some of the big floats used in the famous parade. I even found a full scale replica of the Bat Boat tucked away in the back. (There was a huge Batman statue as well)

The Oracle Data Warrior finds Batman’s boat!

The tour of the Mardi Gras warehouse included plenty of bead throwing from the floats by the board of directors and the various KScope vendors. This was followed by a nice evening of drinks and a buffet dinner with lots of great food (even some gluten free and vegetarian options). There was plenty of dancing to a great cover band called The Mixed Nuts.

We finished the evening with a spectacular fireworks display (which seems to be becoming a standard at this event).

We had a spectacular fireworks display (shot off a barge) at the annual KScope Big Event

Over too soon, it was last call, last dance, then back to the buses and a short ride to the hotel.

And then there were the after parties….

Stay tuned for my notes on our final day in New Orleans.



Oracle OpenWorld 2012: Day 4

It was another beautiful, sunny day in San Francisco. I started the day, again, with some morning Chi Gung, then I enjoyed the morning keynote watching the big screen in Yerba Buena Gardens. Quite a pleasant way to listen to these talks.

It was a light day session-wise for me, but it did set off a few light bulbs.

The best session of the day (and for me the whole conference) was the talk by Gwen Shapira (from Pythian) on building an integrated data warehouse with Hadoop. Gwen did a superb job of explaining what Big Data is and what it isn’t.

Her simple, and straightforward, definition:

Big Data Defined

The “cheaply” part seems to be the key. Oracle, and other databases, can handle really HUGE amounts of data. Petabytes in fact. But putting all that data into an RDBMS can cost a lot more money than having it stored in a less sophisticated file system on commodity drives (like HDFS).

So just having lots of data in your warehouse does not mean you have Big Data, you just have a Very Large Data Warehouse (VLDW).

She went on to expand the definition:

Big Data Defined 2

This part shed even more light on Big Data for me. This really helped clarify even more when you might be dealing with Big Data.

The talk was filled with lots of technical details, limitations, and tools (Sqoop, Flume, Fuse-DFS) you can look at for integrating Hadoop into Oracle. Of course there are Oracle’s offerings as well, like Oracle Loader for Hadoop and Oracle Direct Connector for HDFS.

Gwen also gave us several use case examples that illustrated when to use Hadoop. Bottom line – learn to use Hadoop appropriately, not just because it is cool. With tech we can:

Make the impossible, possible. That might not make the possible easy.

If you went to OOW, find and download Gwen’s slides. And follow her on twitter (@gwenshap).

It was Big Data day for me. My other session was Ian Abramson’s session on Agile & Data. Two of my favorite topics.

Ian discussed the Agile Manifesto and Big Data and how he has been able to use agile techniques to make his projects successful.

To start, here is his simple definition on Big Data:

What is Big Data?

Ian had a nice picture of the overall architecture as well:

Another Big Data Picture

To be successful in applying agile to data projects, Ian has determined the projects must be driven by data value – that is the sprint priorities are set based on the data that can best help the customer achieve their goals. To stay on track and keep velocity, it is important to have daily touch-points with the team members as well. Ian does a daily stand-up for 15 minutes.

Ian shared lots of details and answered a lot of my annoying questions too. He came up with a great tree graphic to illustrate important factors in having a high performance project:

Agile Tree

Again, find and download the slides once Oracle uploads them. In the meantime, follow Ian on twitter (@iabramson). A data-centric agilist is hard to find. For more on agile and data warehousing check out my classic white paper on the subject.

After Ian’s session I got to go to my first Oracle blogger meet up. It was nice to put more faces to names. Thanks to Pythian and OTN for sponsoring it.

Blogger Meetup

Then back to the hotel to pack and then stand in line (for an hour!) to get to the appreciation event and see Pearl Jam live. It was a good concert. Hard to beat live music outdoors!

Huge crowd for Pearl Jam

Pearl Jam Live!

Well that’s it for me on OOW2012. I am back home in Houston now and heading into the office tomorrow. Then I need to write another abstract or two for KScope13 and RMOUG TD2013. Then it will be time to plan for OOW2013 and The America’s Cup finals…

Nap time.


Oracle OpenWorld 2012: User Group Sunday

Yes, today was the first day for #OOW 2012. Affectionately known to many of us as User Group Sunday. Along with a ton of other activities, this is the day the various Oracle user groups get to “own” the agenda and put together the sessions they think Oracle customers, and their members, might want to see.

By users; for users.

For the 2nd year, ODTUG asked me to curate their agenda. I was fortunate enough to “recruit” some great track leads who invited and vetted speakers and sessions to fill five rooms for most of the day. It was quite successful. (Thanks for the hard work guys.)

I attended quite a few myself and captured a few photos and thoughts. I was tweeting all day so you can also go to Twitter and search on @Kentgraziano to see my twitter stream.

After checking in at the User Group kiosk, I went to my first session, given by Gwen Shapira and Robyn Sands, who spoke about Flexible Design and Data Modeling. Great topic. They gave some very practical advice on do’s and don’ts if you want to be more agile.

“Just good enough” does not scale.

Plan for Change

Worst Practices for Database Design

If you want some more modeling best practices, check out my ebook on Amazon:

Next I went on to see Kellyn Pot’vin and Stewart Bryson do a DBA vs Developers show down with No Surprises Development.

Release Planning Questions

Best advice – practice your deployments several times before going live…

Next: Guy Harrison talked about Hadoop, Big Data, and Exadata. This was a very helpful intro talk about the space. I have been trying to wrap my mind around Hadoop, NoSQL, unstructured data, etc. and how we deal with it. Lots of great diagrams and examples to help explain.

Google’s Software Architecture

The Hadoop Ecosystem

Sigh…more to learn.

Next was a very interesting session by Mark Rittman about the Oracle Endeca software, how it can be used in a BI environment, and how it complements OBIEE.

This gives a quick view of what is involved with the Oracle Endeca Platform.

Oracle Endeca Information Discovery Platform

It looks like a very interesting platform that uses key value pairs to store the data. This enables search and analytics on some relatively unstructured data stores (i.e., not relational tables).
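To make the key value idea a bit more concrete, here is a small Python sketch. It is not how Endeca is implemented and does not use any Endeca API; the `KeyValueStore` class, its methods, and the sample records are all hypothetical. It only illustrates how records with different attributes can sit side by side without a fixed table schema and still be searched and counted.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy store: each record is a free-form set of key/value attributes."""

    def __init__(self):
        self.records = {}
        # Inverted index: (key, value) -> ids of records with that pair.
        # Values must be hashable (strings, numbers) in this sketch.
        self.index = defaultdict(set)

    def add(self, record_id, attributes):
        self.records[record_id] = dict(attributes)
        for key, value in attributes.items():
            self.index[(key, value)].add(record_id)

    def find(self, **criteria):
        """Return sorted ids of records matching every key=value given."""
        matches = None
        for key, value in criteria.items():
            ids = self.index.get((key, value), set())
            matches = set(ids) if matches is None else matches & ids
        if matches is None:  # no criteria: return everything
            return sorted(self.records)
        return sorted(matches)

    def facet_counts(self, key):
        """Count records per distinct value of one attribute."""
        return {value: len(ids)
                for (k, value), ids in self.index.items() if k == key}


if __name__ == "__main__":
    store = KeyValueStore()
    store.add("r1", {"type": "review", "product": "bike", "rating": 5})
    store.add("r2", {"type": "tweet", "product": "bike"})
    store.add("r3", {"type": "review", "product": "helmet"})
    print(store.find(product="bike"))
    print(store.facet_counts("type"))
```

Note that the three records do not share the same set of attributes, which is exactly what would be awkward to model up front in relational tables.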

Final talk of the day (for me) was Jon Mead telling us about how they helped a customer develop event driven analytics using ODI and OBIEE and the Oracle Reference Architecture for data warehousing.

After all this, a little break and networking, then on to the opening keynote.

It started with the Corporate Sr VP of Fujitsu, who talked about some cloud applications they have deployed in Japan. They have the Agricultural Cloud project to help farmers be more efficient and bring more and better crops to market. They also have developed a Healthcare Cloud Service for optimizing patient care and early diagnosis.

Very cool cloud applications.

Last up was CEO Larry Ellison, who announced Oracle 12c and Pluggable Databases (to support cloud deployments). I had heard about these (under NDA) at the ACE Directors meeting, so now I can share a few pictures related to those since it is now public information.

Oracle 12c

Bigger, badder, faster…

Oracle Cloud Ecosystem

Pluggable Database Architecture

With PDB, you can develop a plug and play database. Many cool applications for this one.

To end the day, I went to the 9th annual Oracle ACE dinner hosted by Oracle at the St Francis Yacht Club. Great food, drinks, and networking were had by all. Then back to the hotel to write this blog post.

Now off to bed so I can swim the bay with some other crazy people tomorrow morning. Wish me luck. Brrr.



Oracle ACE Director Meeting: Day 2

So the 2nd day was as filled with brainiac conversations as day one…and most hush hush under NDA.

However, I was able to get shots of a few more slides today that were not under NDA.

First, here is the agenda. Sadly the Exadata session got canceled. 😦

Day 2 Agenda

Oracle has been recognized in the Gartner Magic Quadrants in several areas related to integration & SOA.

You will see this as a major theme at OOW2012 – “Integration Everywhere”. Lots of sessions related to apps and the cloud.

Well, you have to have something about Big Data. What is Big Data? Opinions vary. At OOW you will hear that there is Big and there is Fast. Oracle believes they have constructed solutions that handle both.

Have you heard about Oracle GoldenGate? I have. My current client even did a pretty decent POC on it. Basically it is the newest approach to streaming and replicating data for a variety of use cases. It does replace Oracle Streams as the approach of choice for real-time and near-real-time data movement.

The evolution of Oracle Portal – meet Oracle WebCenter.

My good friend JP came and talked to us about Big Data and did this great drawing about the architectures and where all the parts, tools, and engineered systems fit in the overall picture. Pretty cool. As you see below – no PowerPoint slides were used. Instead he used a tool called Paper Show and drew as he talked (like the old days with transparencies).

JP and his Big Data Picture

Here is just a shot of all the ACE Directors hard at work listening to a session, asking questions, pondering, taking notes, pictures, and of course Tweeting up a storm.

Oracle ACE Directors at Work

So that is it for my very first ACE Director meeting. Thanks to OTN and Oracle for putting this on and letting me participate. Quite the learning experience.

On to OOW and the Oracle Music Festival (with a rest for my brain first on Saturday).

I’ll give you another update in a few days (and maybe fill in a few blanks).



ACE Director and Oracle Data Warrior
