There is more than one kind of data lineage

What is data lineage?

Data lineage draws a picture of where your data starts and where it ends up. This picture can contain different levels of detail depending on the audience that needs it. Data lineage can be created manually or with software; the key objective is to get it done one way or another.

This concept means different things to different teams, and all of those meanings matter. Understanding why you need several kinds of data lineage will help your data governance initiative.

Data governance includes the rules and procedures that firms use to manage and control data. Data lineage is an essential part of data governance because it identifies (and documents) where data starts, how it gets produced, how it transforms and moves through the firm’s systems, and how it gets to the end-users.

There are different levels of detail within data lineage. Which level you use will depend on who uses the lineage and what they use it for.  

A business user will have a data lineage picture that focuses more on the who, when and how of the processes for producing, moving and using data. (This is also sometimes called data provenance.)  

A technical (as in IT) user will have a data lineage picture with granular details of the specific data tables, labels and fields and how they move, transform and get used across the firm.

You need both kinds of data lineage for a complete picture (the data ‘map’) of what’s going on with the firm’s data.  

What do firms use data lineage for?

Data lineage helps the firm ensure that reliable data is being used to drive business decisions. Without data lineage, it is nearly impossible for you to understand whether or not the correct data is being used, what it means, where it comes from, and whether or not it is complete.  

Data lineage can also help you fix issues or perform system migrations, and it enables you to ensure the confidentiality and integrity of data by tracking changes, how they were performed, and who made them.

Some firms create these lineage documents manually by interviewing stakeholders and interrogating code; some firms use data visualisation (or lineage) software that examines the code for them.  

However your firm creates them, these lineage documents are the magic ingredient that helps you achieve trusted data and support for data governance.  

What is the most common data lineage mistake? 

It’s disheartening to speak to IT teams in law firms about data lineage. They are 100% confident that they’ve got it covered, and there is no easy way for me to tell them they’re right, but also wrong.

When someone says ‘data lineage’, they could mean one of four different things:

  • Business data lineage – to help teams understand data’s journey through the firm by linking it to business processes.  
  • Conceptual data lineage – to help senior stakeholders understand, at a very high level, the data and the decisions they’re being asked to take.
  • Logical data lineage – to describe the information you need around a business term (like ‘client’).
  • Physical data lineage – mapping each piece of data to the actual rows and columns in the firm’s databases.  

Law firm IT departments rarely have more than one or two of these, and almost never the business data lineage!
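To make the four kinds concrete, here is a minimal sketch of how a firm might record them for a single business term and check which ones are still undocumented. The `LineageViews` class, its field names and the example table and column names are all hypothetical, made up for illustration rather than taken from any particular lineage tool.

```python
from dataclasses import dataclass, field


@dataclass
class LineageViews:
    """The four kinds of lineage for one business term (illustrative structure only)."""
    term: str
    business: list = field(default_factory=list)    # business process steps the data passes through
    conceptual: str = ""                            # one-line summary for senior stakeholders
    logical: list = field(default_factory=list)     # information needed around the term
    physical: dict = field(default_factory=dict)    # attribute -> table and column

    def missing_views(self):
        """Return the names of the lineage kinds that have not been documented yet."""
        views = {
            "business": self.business,
            "conceptual": self.conceptual,
            "logical": self.logical,
            "physical": self.physical,
        }
        return [name for name, content in views.items() if not content]


# Hypothetical example: IT has documented the logical and physical views only.
client = LineageViews(
    term="client",
    logical=["client name", "client number", "responsible partner"],
    physical={
        "client name": "crm.clients.name",
        "client number": "crm.clients.client_id",
        "responsible partner": "pms.matters.partner_id",
    },
)

print(client.missing_views())  # ['business', 'conceptual']
```

In this made-up case the technical views exist but the business and conceptual views do not, which is the typical gap described above.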

How does data lineage help law firms?

Data lineage enables business users to better understand processed data by viewing how it got transformed as it moved through the firm’s systems. This helps improve business operations and make improvements to client services.  

Data lineage also helps firms keep track of different datasets as collection techniques and technologies evolve. This allows the firm to make optimal use of old and new datasets.

IT teams find it easier to upgrade systems, migrate data or fix system issues because data lineage helps them understand the location and lifecycle of data and data sources.

Another impact is that data lineage helps data governance, because lineage provides detailed visibility of the data lifecycle. This allows the firm to manage risks, comply with regulations and perform audits of its data.

Data lineage also helps business intelligence teams identify the root cause of data errors. For example, suppose the HR system and the Finance system report different headcount numbers. Data lineage can help explain the difference and show whether modifications made during processing are to blame.
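As a rough illustration of the headcount example, the sketch below compares the documented lineage of the two figures and lists the processing steps that appear in one but not the other. The step descriptions and the `lineage_differences` helper are hypothetical; a real lineage record would come from your own documentation or lineage software.

```python
# Hypothetical lineage for each headcount figure, written as ordered processing steps.
hr_headcount = [
    "source: HR system employee records",
    "filter: active employees",
    "count: one per person",
]
finance_headcount = [
    "source: HR system employee records",
    "filter: active employees",
    "filter: exclude contractors",
    "count: full-time equivalents",
]


def lineage_differences(first, second):
    """Return the steps found only in the first lineage and only in the second."""
    only_first = [step for step in first if step not in second]
    only_second = [step for step in second if step not in first]
    return only_first, only_second


only_hr, only_finance = lineage_differences(hr_headcount, finance_headcount)
print("Only in HR lineage:", only_hr)
print("Only in Finance lineage:", only_finance)
```

In this invented case the two figures start from the same source, but Finance excludes contractors and counts full-time equivalents. That gives a reasonable explanation for the mismatch without anyone having to dig through code.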

The final impact of data lineage on the firm is that it can help with change impact analysis. A detailed lineage lets you identify the data elements, affected downstream systems and reports, key stakeholders and affected end-users before you do anything. Assessing the impact of the change helps you decide what steps to take to make that change effectively – or even if you should do it at all.  
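Change impact analysis of this kind can be sketched as a walk through the lineage 'map'. The example below assumes the lineage is held as a simple dictionary that links each data element to the things fed directly by it. The `DOWNSTREAM` contents and the `impacted_by` function are hypothetical names used for illustration only.

```python
from collections import deque

# Hypothetical lineage map: each data element points to what is fed directly from it.
DOWNSTREAM = {
    "crm.clients.client_id": ["warehouse.dim_client", "billing.invoices"],
    "warehouse.dim_client": ["report: client profitability"],
    "billing.invoices": ["report: client profitability", "report: aged debt"],
}


def impacted_by(element, downstream):
    """List everything downstream of a data element, nearest dependants first."""
    seen = set()
    impacted = []
    queue = deque([element])
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, []):
            if child not in seen:  # skip anything already reached by another route
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


for item in impacted_by("crm.clients.client_id", DOWNSTREAM):
    print(item)
```

Before changing the client identifier in this made-up example, you would know that two systems and two reports are affected. The same approach extends to stakeholders and end-users if they are recorded against each system or report.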

Why is understanding data lineage so important? 

Data lineage has an impact on several areas of the firm. Still, it’s often forgotten: creating and maintaining lineage documentation can fall between teams and be seen as too difficult to achieve, especially in the early days of a data governance initiative when it is being done manually.

But it’s crucial that you grasp the thistle and figure out how to get data lineage done because it will positively impact six areas of the firm’s activities.  Data lineage:  

  1. Enables business users to better understand processed data by viewing how it is transformed through the firm’s systems. This helps improve business operations and make improvements to client services. 
  2. Helps the firm keep track of different datasets as collection techniques and technologies evolve. This allows the firm to make optimal use of old and new datasets.
  3. Supports IT teams when upgrading systems, migrating data, or fixing system issues because data lineage helps them understand the location and lifecycle of data and data sources.   
  4. Provides detailed visibility for the data lifecycle. This allows the firm to manage risks, comply with regulations and perform audits of its data.  
  5. Helps business intelligence teams identify the root cause of data errors. For example, if the HR and Finance systems report different headcount numbers, data lineage can help explain the difference and show whether inconsistencies in the processing of the data are to blame.
  6. Can help the firm with change impact analysis. A detailed data lineage lets you identify the data elements, affected downstream systems and reports, key stakeholders and affected end-users, all before you do anything. Assessing the impact of the change helps you decide what steps to take to make that change effective.  

So should we do it now if we don’t have it? 

Documenting data lineage from both a technical and business perspective has such a big impact on the overall management of data that it’s difficult to see how you can successfully manage data as an asset without it. So many teams, processes, projects, integrations, and decisions will depend on having clarity of data lineage that it’s a critical foundation for doing more and better with data. 

If you haven’t got it already, you should start by building out the lineage for the critical use-cases in flight or planned to support your data strategy and data governance efforts.

HINT: a lot of this lineage stuff will be in people’s heads, so it’s usually a case of getting it down on paper and sharing it more broadly.

Innovative law firms have big goals for improving the client experience through data innovation.

Through our extensive law firm background, we have developed a unique data governance road-mapping approach to help law firm leaders launch the proper foundation for data governance.

If you want to chat confidentially about how Iron Carrot can help your firm with its Data Governance initiatives, then why not book a call to talk to us?