Cloud-based modern data analytics platforms make it easy to ingest massive volumes of data. It’s also all too easy to fall into the trap of neglecting how all this data is organised.
Imagine moving from a small apartment, where storage is at a premium and you need to organise everything meticulously, to a large house. You quickly fill the attic and garage with unmarked boxes of unsorted items. Six months later, you’re asking yourself: "Exactly where did I put the base for the Christmas tree?"
In other words, your data must be organised so that you can easily exploit it, whether you’re creating dashboards and visualisations for business users, feeding it into a recommendation engine, or using it to optimise processes such as production and delivery schedules.
Good organisation also helps you to create and maintain stringent governance and privacy controls, and makes it easier to implement effective fine-grained security at the table, column and row level. With good organisation, you’ll find it more straightforward to prevent data loss and to comply with legal obligations such as GDPR.
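To make the idea of table, column and row level security concrete, here is a minimal Python sketch. The role names, the `POLICIES` structure and the `read_orders` function are all hypothetical, invented for illustration; in practice you would normally rely on the access controls built into your analytics platform rather than on application code like this.

```python
# Hypothetical policy table: for each role, the columns it may see and
# the regions (rows) it may read. A role with no entry has no table access.
POLICIES = {
    "analyst_uk": {
        "columns": {"order_id", "region", "amount"},
        "regions": {"UK"},
    },
    "finance": {
        "columns": {"order_id", "region", "amount", "customer_email"},
        "regions": {"UK", "FR"},
    },
}

# Illustrative data only.
ORDERS = [
    {"order_id": "O1", "region": "UK", "amount": 120.0, "customer_email": "a@example.com"},
    {"order_id": "O2", "region": "FR", "amount": 80.0, "customer_email": "b@example.com"},
]


def read_orders(role):
    """Return only the rows and columns of ORDERS that the role may see."""
    policy = POLICIES.get(role)
    if policy is None:
        # Table level: roles without a policy cannot read the table at all.
        raise PermissionError(f"role {role!r} has no access to the orders table")
    visible = []
    for row in ORDERS:
        # Row level: skip rows outside the role's permitted regions.
        if row["region"] in policy["regions"]:
            # Column level: drop columns the role is not allowed to see.
            visible.append({k: v for k, v in row.items() if k in policy["columns"]})
    return visible


analyst_view = read_orders("analyst_uk")
finance_view = read_orders("finance")
```

Here the UK analyst sees only UK orders and never sees the email column, while the finance role sees both rows in full. Rules like these are much simpler to state when every table has a clear owner, purpose and layout.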
What is a good organisation model?
We recommend following a factory-style model, with a multi-layered approach that allows the raw materials (source data) to be delivered, checked (cleaned), stored (archived), and finally manufactured (transformed) into high-value products (data assets).
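As a rough illustration of the factory analogy, the four stages can be sketched as four small Python functions. Everything here is an assumption made for the example: the function names mirror the wording above, the sample orders are invented, and a real platform would implement each layer with its own storage and tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    customer: str
    amount: float


def deliver():
    """Delivered: raw records exactly as the source system sent them."""
    return [
        {"order_id": "A1", "customer": " alice ", "amount": "10.50"},
        {"order_id": "A2", "customer": "Bob", "amount": "not-a-number"},
        {"order_id": "A3", "customer": "alice", "amount": "4.50"},
    ]


def check(raw_rows):
    """Checked: keep rows that parse cleanly, set aside those that do not."""
    clean, rejected = [], []
    for row in raw_rows:
        try:
            clean.append(
                Order(
                    order_id=row["order_id"],
                    customer=row["customer"].strip().lower(),
                    amount=float(row["amount"]),
                )
            )
        except (KeyError, ValueError):
            rejected.append(row)
    return clean, rejected


def store(clean_rows, archive):
    """Stored: append the cleaned rows to the archive layer."""
    archive.extend(clean_rows)
    return archive


def manufacture(archive):
    """Manufactured: turn archived rows into a data asset (spend per customer)."""
    totals = {}
    for order in archive:
        totals[order.customer] = totals.get(order.customer, 0.0) + order.amount
    return totals


raw = deliver()
clean, rejected = check(raw)
archive = store(clean, [])
spend_by_customer = manufacture(archive)
```

Keeping each layer separate means a bad record, such as order A2 above, is caught and set aside at the checking stage instead of quietly distorting the finished data asset.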
Organising data for consumption or presentation is especially important. Data assets prepared for consumption typically consist of facts and dimensions:
- Facts describe business events, such as orders, ratings, and enquiries. Many organisations add millions of new facts each day to their data analytics platform and can quickly acquire petabytes of data.
- Dimensions describe entities that change less frequently, such as people, places and products. It’s generally good practice to track history on dimensions to support easy point-in-time reporting.
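The sketch below shows one common way of tracking history on a dimension so that facts can be reported as of the date they happened. It assumes each dimension row carries a validity period (a pattern often called a Type 2 slowly changing dimension); the class names, field names and sample data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CustomerVersion:
    """One historical version of a customer dimension row."""
    customer_id: str
    city: str
    valid_from: date           # inclusive
    valid_to: Optional[date]   # exclusive; None marks the current version


@dataclass(frozen=True)
class OrderFact:
    """A business event that refers to the customer dimension."""
    order_id: str
    customer_id: str
    order_date: date
    amount: float


def version_as_of(versions, customer_id, as_of):
    """Return the dimension version that was valid on the given date, if any."""
    for v in versions:
        if (
            v.customer_id == customer_id
            and v.valid_from <= as_of
            and (v.valid_to is None or as_of < v.valid_to)
        ):
            return v
    return None


# The customer moved from Leeds to Bristol on 1 June 2023.
customer_history = [
    CustomerVersion("C1", "Leeds", date(2022, 1, 1), date(2023, 6, 1)),
    CustomerVersion("C1", "Bristol", date(2023, 6, 1), None),
]

orders = [
    OrderFact("O1", "C1", date(2023, 3, 15), 120.0),
    OrderFact("O2", "C1", date(2023, 9, 2), 80.0),
]

# Point-in-time report: attribute each order to the city at the time of the order.
sales_by_city = {}
for order in orders:
    version = version_as_of(customer_history, order.customer_id, order.order_date)
    city = version.city if version else "unknown"
    sales_by_city[city] = sales_by_city.get(city, 0.0) + order.amount
```

Without the history, both orders would be attributed to Bristol, the customer's current city, and the March sale in Leeds would disappear from any regional report.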
How do you transform cleansed data for analysis?
It takes skill and knowledge to transform cleansed data into the right set of data assets to support your organisation’s particular needs. For example, it’s usual to organise data assets in the presentation layer according to the business process they support (such as sales, logistics, marketing and operations), with a general pool of common assets, such as customers, employees and dates.
For Ancoris customer Causeway, however, we took a different approach. Causeway provides cloud solutions to the construction industry and we determined that, to provide insights to both Causeway’s product managers and the customers who use its systems, it was more appropriate to group assets around each of the company’s cloud products, rather than its business processes.
To find out more about how good data organisation fits into building a successful modern data analytics platform, why not download our white paper on the 7 rules for a successful modern data platform? Or come and talk to the experts in our Data Analytics team about your specific challenges and the opportunities that are open to you.