I recently had an opportunity to present on PowerBI at a Microsoft technology roadmap event here in the Denver area. In preparing for that presentation I had to take a stroll down memory lane… well, more like a dark alley of memory where monsters I’d rather forget are still lurking.
Back in the day, all of fifteen years ago, the process for delivering reports – the terms “BI” and “Business Intelligence” had yet to filter into regular use – was clunky at best. Reports were delivered through one of several “processes”.
In the first process, an executive would decide they needed reports on some topic. The IT department had a backlog of projects and could not meet the department’s or executive’s needs in a timely manner, so the department would find someone with basic Microsoft Access skills. The Access person would then ask for significant access (the puns just write themselves) to the production transactional systems.
As an aside, why were they writing reports off the OLTP system? Because so few of us were using OLAP back then. SQL Server 7 had only the most rudimentary OLAP capabilities.
In any event, the DBA would say no to the Access access request (the puns just keep coming). In my case the response was some variation of “over my rotting corpse”. The Access person would go to his or her executive leadership, who would then go to the CIO who would, in turn, order the DBA to grant access. The Access person would then start creating reports, usually accomplishing three things in the “process”:
- They would invariably crash the production database with a malformed query containing so many joins it ran as long as a Stephen King novel,
- Lock key tables, causing errors in user applications, and fill up file servers with thousands of Access databases, each containing one-off reports kept in case they were called for at a later date,
- Get the DBA chewed out because the databases had become so slow and unresponsive.
Another “process” – by now you see I’m using that term loosely – was the “never-ending project.” IT would recognize it could not give unfettered access to untrained business analysts. Someone, typically a junior DBA, would be assigned to create reports. This was before Reporting Services, so features like drilldowns were difficult to handle, and the process became iterative. The junior DBA would create a report; the business would review it and come back with additional questions, which required the DBA to create a new report, and so on, repeating for a month before starting over again with the new month’s numbers.
We have, of course, evolved since these bad old days. It is a rare enterprise that has not built out more robust BI systems to generate reports using SQL Server, SSAS and SSRS, or their equivalents on other RDBMS platforms. These reports have drill-down capability and rich graphics. Microsoft began empowering self-service BI with the introduction of PowerPivot for both SharePoint and Excel. With the release of the full suite of PowerBI tools and their incorporation into Office 365, many of the monsters lurking in the dark alleys of my memory have become powerless.
The Power of Familiar Tools
It would be difficult to identify a tool more familiar to business users than Excel. My own mother, who views her computer as an electric typewriter that shows pictures of her grandchildren, understands Excel. By incorporating PowerBI into one of the most popular business tools available, and by enhancing SQL Server so that it can scale to handle multiple petabytes of data quickly and efficiently, Microsoft has significantly altered the terrain of BI.
There is, of course, still a need for “traditional” BI, including “canned” reports surfaced through SSRS or SharePoint. These reports provide the metrics by which executives measure the health of the business over extended periods, such as year-over-year sales. They must be surfaced in a consistent format on a regularly scheduled basis. Often these reports are given to investors, so it is vital that the underlying calculations are accurate, consistent and repeatable so investors can review performance over time.
Outside of these relatively few reports, however, there is a need to provide the business with a flexible set of tools that allow them to explore, ask questions and present data in the most effective way possible for a given situation. Access, while useful, does not fit the bill. Access requires some ability to write code, and that specialized knowledge was usually held by the one “Access person” in a department. When that person left, the department would turn to IT to support the Access application, a chore most in IT found dull at best. The massive amounts of space taken up by Access databases, and the associated storage, backup and security overhead for these applications, did not endear them to CIOs either.
Microsoft has addressed these issues with new tools for Excel and Office 365, collectively called PowerBI. Power Query allows users to gather data from multiple, disparate sources both internal and external to the enterprise. PowerPivot stores this data in a compressed column store format that requires remarkably little storage space for the data it holds.
Defining New Roles
These new tools create an interesting and, it seems to me, far more logical and efficient methodology for delivering BI. With the exception of a relatively few standardized reports exposed through dashboards and portals, IT can get out of the report-writing business and do what it does best: provision data and computing power for users. The business can do what it is in a far better position to do: analyze that data.
This is, after all, a logical division of labor that plays to each group’s strengths. IT understands servers, data structures, data security, SQL, OLAP, MDX and a host of other technologies required to provide data to users. Each business department understands its own data far better and more deeply than most IT professionals can. By giving IT a robust suite of tools to manage data in the SQL Server stack, while giving the business user a robust set of tools to analyze that data, Microsoft lets both the business and IT do what they do best while getting out of each other’s way.
Tools in Action
To illustrate just how easy it is for a user to create a report, let us build a sample report on internet sales by geography using the Adventure Works sample data warehouse. This assumes that the PowerBI add-ins have been activated in Excel 2013. Our user must create a report showing internet sales in the United States by geography and must present this data on a map.
In the past this would have been an involved report even for an experienced programmer. With PowerBI and some simple string concatenation, this report can be generated within minutes.
First, open Excel and create a new workbook. Navigate to the PowerPivot tab on the ribbon and select “Manage”.
This will open the PowerPivot window, which can be used for importing data into Excel, relating the tables if needed, and managing the resulting data model.
In the PowerPivot window, select “From Database”.
Since we are using AdventureWorksDW2012 as our data source, select “From SQL Server” and a wizard will appear. Complete the database connection information and click “Next”.
This needs to be as easy as possible for the user community. Generally, they do not know how to write SQL scripts, nor do they want to, so select the “Select from a list of tables and views to choose the data to import” radio button, then select “Next”.
We are required to create a report on total internet sales by location within the United States. Select the “FactInternetSales”, “DimCustomer” and “DimGeography” tables so that we can relate each sale to the buyer’s location, then select “Preview & Filter”, making sure the “DimGeography” table is highlighted.
The report deals with sales in the USA only, so we want to limit the geography to the United States. The preview window will look very familiar to the user, as filtering here works the same way it does in Excel itself. In the “CountryRegionCode” field, click the drop-down arrow and make sure “US” is the only country selected. Alternatively, select the “EnglishCountryRegionName” field and choose “United States”.
Select “OK” then “Finish”. Once the import wizard is completed, click “Close”.
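For readers curious about what happened behind the scenes, the filter on DimGeography simply means the wizard pulls back only the US rows, roughly equivalent to the T-SQL below (a sketch only; the wizard generates its own query, and the fact and customer tables are imported unfiltered):

-- Approximation of what the filtered DimGeography import retrieves
SELECT *
FROM dbo.DimGeography
WHERE CountryRegionCode = 'US';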
The user will now see the data from these three tables displayed in tabs similar to Excel. In addition, two areas are shown where one can add calculated measures, additional data or data derived from the existing data. We will be using the Bing mapping service (more on that later) to create the map display. This service requires geographic data in a proper format, so we will handle that requirement now in PowerPivot.
Select the “DimGeography” tab and scroll to the right where you will see the phrase “Add Column”. This is where we will concatenate the city and state fields for the Bing mapping service. On a very joyful note, we will not need latitude and longitude data for the map (and there was much rejoicing).
Click on “Add Column” so that the column is highlighted. In the formula bar, type the following:
=CONCATENATE(CONCATENATE([City], ", "), [StateProvinceCode])
This will concatenate the fields containing the city name and state abbreviation to properly format the data for the Bing mapping service (e.g. Denver, CO). Rename the resulting column to “CityState” so we know what data is in there later.
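As an aside, if IT preferred to provision this column on the server side instead – keeping with the division of labor discussed earlier – a simple view along these lines would deliver the same value ready-made (just a sketch; the view name vw_DimGeographyUS is hypothetical):

-- Hypothetical view exposing a pre-built CityState column for report writers
CREATE VIEW dbo.vw_DimGeographyUS AS
SELECT g.GeographyKey,
       g.City,
       g.StateProvinceCode,
       g.City + ', ' + g.StateProvinceCode AS CityState
FROM dbo.DimGeography AS g
WHERE g.CountryRegionCode = 'US';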
One additional note before we leave the PowerPivot window. If you select the “Diagram View” button you will see that PowerPivot has already related the tables, capturing the relationships from the foreign key constraints that exist in the database already. I cannot emphasize enough the need for proper database design and enforcement of relational integrity for the data provided to the user community. This is where IT plays a key role. Expecting a business user to create and enforce these relationship is unreasonable and setting the user up to fail. Remember the division of labor here. IT provisions the data and the users create the reports or otherwise consume the data.
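For reference, the relationships PowerPivot picks up come from foreign key constraints along these lines, which already exist in AdventureWorksDW2012 (the constraint names here are illustrative; the key columns are as in the sample database):

-- Each internet sale points to a customer, and each customer to a geography
ALTER TABLE dbo.FactInternetSales
    ADD CONSTRAINT FK_FactInternetSales_DimCustomer
    FOREIGN KEY (CustomerKey) REFERENCES dbo.DimCustomer (CustomerKey);

ALTER TABLE dbo.DimCustomer
    ADD CONSTRAINT FK_DimCustomer_DimGeography
    FOREIGN KEY (GeographyKey) REFERENCES dbo.DimGeography (GeographyKey);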
We have our data, so now we must create our visualization. Click on the “Pivot Table” button.
Select either “New Worksheet” or “Existing Worksheet” and select “OK”. This will bring up a worksheet with the pivot table fields on the right.
We will measure internet sales by sales amount, so select the “SalesAmount” field from the FactInternetSales table. Note that Excel has placed this in the “Values” area and automatically totaled all sales. We want these sales by location, so select the “CityState” field we created in PowerPivot. Note that we will not use any data from the customer dimension; that table is included only to provide the relationship between the sales and the geography. Once selected, the “City, State” values will appear as row labels in the spreadsheet.
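Under the covers, the pivot table is computing the same aggregate we could write by hand against the source database. The T-SQL below is shown only to illustrate what the drag-and-drop operation produces; the table and column names are as they appear in AdventureWorksDW2012:

-- Total internet sales per US city, mirroring the pivot table
SELECT g.City + ', ' + g.StateProvinceCode AS CityState,
       SUM(f.SalesAmount) AS TotalInternetSales
FROM dbo.FactInternetSales AS f
JOIN dbo.DimCustomer AS c ON c.CustomerKey = f.CustomerKey
JOIN dbo.DimGeography AS g ON g.GeographyKey = c.GeographyKey
WHERE g.CountryRegionCode = 'US'
GROUP BY g.City, g.StateProvinceCode
ORDER BY TotalInternetSales DESC;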
You have no doubt observed a flaw in this report. It is, frankly, boring. We need some graphical display to liven this up. Since this is sales by geography, we should add a map.
On the “Insert” tab, select “Map” and then “Launch Power Map”.
Again, you will see the data fields on the right. Select the “CityState” field we created earlier and note that the locations are displayed on the map. Select “Next”. Here we can select our layers and the type of display we want. The default is a bar chart with heights relative to total sales for that location. You might also try the bubble or heat map displays, and I encourage you to play with these options. I will accept the default for this purpose.
Select “Capture Screen”, then paste the resulting map into your spreadsheet.
Conclusion
Creating visually interesting reports is now easy with the array of powerful tools Microsoft has made available to both IT and the user community.