Getting Started with Azure Purview: Unified Data Governance Solution

Azure Purview: Unified Data Governance Solution

GDPR and CCPA made many organisations realise that if they don’t know what data they have, where it’s housed, who has access to it, and where it’s being transferred, they are taking on an unknown amount of risk. Most businesses see their data as a strategic asset and are looking for the right technology to minimise the risks involved in storing, mining, and managing it. Hence the growing need for Azure Purview.

Documents and emails stored on SharePoint Online, OneDrive for Business, and Exchange Online, as well as file shares and SharePoint server on-premises, are protected by Information Protection and Data Loss Prevention in Microsoft 365 (M365).

Let us get started with Azure Purview, a unified data governance solution!

What is Data Governance?

According to Gartner, “Data governance is the specification of decision rights and an accountability framework to ensure the appropriate behavior in the valuation, creation, consumption and control of data and analytics.”

Data governance ensures that data is usable, accessible, and secure. It also supports better-informed analytics, so a business can reach well-founded decisions. Data governance also improves data consistency, eliminates redundancies, and helps make sense of poor-quality data, which can spare an organisation major decision-making problems.

Azure Purview: Unified Data Governance Solution Glossary

Here is a glossary of some common terms related to Azure Purview, a unified data governance solution:

  1. Data Catalog: A database of metadata that describes the data assets in an organization, including their structure, location, and usage.
  2. Data Classification: The process of categorizing data based on its sensitivity, so that appropriate security measures can be applied to protect it.
  3. Data Governance: The set of policies, procedures, and standards that define how data is managed, stored, and used within an organization.
  4. Data Lineage: The process of tracking the flow of data from its origin to its current location, including all transformations and processing steps.
  5. Data Profiling: The process of analyzing and understanding the quality and characteristics of data, such as its completeness, accuracy, and consistency.
  6. Glossary: A database of business terms and definitions that are used in an organization, to ensure that everyone is speaking the same language.
  7. Purview Studio: The web-based user interface for Azure Purview, where users can create, manage, and analyze data assets.
  8. Scanner: A tool that automatically discovers and catalogs data assets within an organization, and extracts metadata from them.

Benefits of Data Governance

Organizations benefit from data governance through:
  • Consistency of data
  • Reduced data management expenses
  • Increased data availability for all parties involved, enabling better data-driven decisions
  • Enhanced employee experience (and thus higher engagement and productivity)
  • Improved customer experience, by facilitating 360-degree views that deliver tailored experiences at scale and allow quicker insights into consumer behaviour and patterns
  • Higher brand value in general

Why Data Governance?

Data is the new currency of the digital era, and the volume of data within businesses is growing exponentially. 90 percent of today’s data was produced in the previous two years. By 2025, unstructured data will account for 80% of all data. This surge has multiplied the size of organisations’ data estates, and of the issues that come with them, by a factor of ten.

To derive true business value from data, the company must understand:
  • What data is available inside the organisation?
  • Who is the owner of the data?
  • Who has access to the data?
  • For what purposes can they properly and ethically use the data?
  • How does the data flow, and where is it used in solutions (traceability)?
  • Where is data duplicated?
  • What is the quality of the data, and is there a shared taxonomy?
  • Is data capture secure and compliant?
  • Where and how is the data saved or archived, and what is its overall lifespan?

What’s Microsoft Azure Purview?

Microsoft Azure Purview is a fully managed, unified data governance solution that helps you manage and govern your data across on-premises, multi-cloud, and SaaS environments. Purview uses automated data discovery, sensitive data classification, and end-to-end data lineage to produce a comprehensive, up-to-date map of your data environment. Purview enables data consumers to locate relevant, reliable data.

It is based on Apache Atlas, an open-source project for metadata management and data asset governance. Azure also includes a data sharing mechanism that securely shares data with external business partners without the need for additional FTP nodes or duplicate copies of huge datasets. Azure Purview does not transfer or keep customer data outside the region in which it is deployed.

Purview is Available for Public Preview

Purview presently has no licensing fees; you pay only for what you use. Microsoft’s pay-per-use approach, available as part of the Public Preview, appeals to Microsoft customers that want to move swiftly without having to build a business case to secure more funding. Azure Purview saves money in several ways, including reducing the manual and custom effort needed to locate and categorise data and removing the hidden and explicit costs of maintaining homegrown systems and Excel-based solutions.

Data Sources Supported by Azure Purview

At the time of writing, Azure Purview supported the following data sources:
  • On-premises SQL Server
  • Azure Data Lake Storage Gen1
  • Azure Data Lake Storage Gen2
  • Azure Blob Storage
  • Azure Data Explorer
  • Azure SQL Database
  • Azure SQL Database Managed Instance
  • Azure Synapse Analytics (formerly SQL DW)
  • Azure Cosmos DB
  • Power BI
  • Teradata
  • ERP sources such as SAP S/4HANA and SAP ECC
  • Oracle DB


Critical Capabilities of Azure Purview

Azure Purview has the following essential features:

Azure Purview Data Map

The Azure Purview Data Map lays the groundwork for successful data governance and data discovery. It is a cloud-native PaaS service that gathers metadata about corporate data in analytics and operational systems, on-premises and in the cloud. Built-in automated scanning and classification keep the Data Map up to date. Business users can set up and use the Purview Data Map through an easy UI, and developers can interact with it programmatically using the open-source Apache Atlas 2.0 APIs.
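
As a rough illustration of the programmatic route, the sketch below builds (but does not send) a request that would register one asset through an Apache Atlas v2 entity endpoint. The account name, the endpoint path, and the qualified name are assumptions made for illustration only, and a real call would also need a valid Azure AD access token.

```python
import json
import urllib.request

# Hypothetical account name; replace with your own Purview account.
ACCOUNT = "contoso-purview"
# Assumed endpoint layout for the Atlas v2 entity API; check your account's
# actual endpoint before using it.
ENTITY_URL = f"https://{ACCOUNT}.purview.azure.com/catalog/api/atlas/v2/entity"


def build_entity_payload(type_name, qualified_name, name, description=""):
    """Return an Atlas v2 'entity' payload describing a single asset."""
    return {
        "entity": {
            "typeName": type_name,
            "attributes": {
                "qualifiedName": qualified_name,
                "name": name,
                "description": description,
            },
        }
    }


def build_request(payload, token="<access-token>"):
    """Build the POST request object without sending it."""
    return urllib.request.Request(
        ENTITY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


payload = build_entity_payload(
    type_name="DataSet",  # generic Atlas base type; real catalogs use more specific types
    qualified_name="mssql://example.database.windows.net/sales/dbo/orders",  # made-up asset
    name="orders",
    description="Customer orders table",
)
request = build_request(payload)
```

Sending the request is deliberately left out; passing `request` to `urllib.request.urlopen` would perform the call once a real token and endpoint are in place.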

The Purview Data Map is the engine that drives the Purview Data Catalog and Purview Data Insights as unified experiences within Purview Studio. Azure Purview builds an automated system to manage your metadata from hybrid and disparate sources, while also using built-in data classifiers and data protection to guard sensitive data against misuse. It does this with Microsoft Information Protection sensitivity labels.
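
To make the idea of a data classifier concrete, here is a minimal sketch that labels a column when enough of its sampled values match a pattern. The rules, label names, and threshold are hypothetical and far simpler than Purview’s built-in classifiers; the sketch only shows the general technique.

```python
import re

# Hypothetical classification rules: label -> pattern a value must fully match.
RULES = {
    "Email Address": re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}"),
    "Credit Card Number": re.compile(r"(?:\d{4}[- ]?){3}\d{4}"),
}


def classify_column(values, threshold=0.6):
    """Return labels whose pattern matches at least `threshold` of the non-empty values."""
    samples = [v.strip() for v in values if v and v.strip()]
    if not samples:
        return []
    labels = []
    for label, pattern in RULES.items():
        hits = sum(1 for v in samples if pattern.fullmatch(v))
        if hits / len(samples) >= threshold:
            labels.append(label)
    return labels


# Made-up sample data standing in for scanned column values.
columns = {
    "contact": ["ann@example.com", "bob@example.org", "n/a"],
    "card": ["4111 1111 1111 1111", "5500-0000-0000-0004"],
    "city": ["Oslo", "Lima"],
}
classifications = {name: classify_column(vals) for name, vals in columns.items()}
```

A threshold below 100% lets a column be labelled even when a few values are missing or malformed, which is the usual situation in real data.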

Purview Data Catalog

Purview’s Data Catalog offers rich data discovery: users can search by business and technical terms and understand data by viewing the related technical, business, semantic, and operational metadata. Semantic search helps you find data and presents it so that it is quick and easy to understand. You can also check whether the data of interest comes from a trusted source, and its sensitivity labels are preserved.

The data catalogue, coupled with data source metadata and interactive data lineage visualisation, provides data scientists, engineers, and analysts with business context to drive BI, analytics, AI, and machine learning activities. Purview assists businesses in comprehending their data supply chain, from raw data to business insights. Purview now supports the following data lineages:

  • With a few clicks, scan your Power BI environment and Azure Synapse Analytics workspaces and instantly publish all detected assets and lineage to the Purview Data Map.
  • Connect Azure Purview to Azure Data Factory instances to obtain data integration lineage automatically. Determine which analytics and reports already exist without having to reinvent the wheel.
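
Lineage of this kind can be pictured as a graph in which each process reads some assets and writes others, which is how Apache Atlas models it. The sketch below walks such a graph backwards to find everything that feeds a report. The process and asset names are invented for illustration, and the structure is a simplification rather than Purview’s actual storage format.

```python
from collections import deque

# Hypothetical lineage records: each process reads inputs and writes outputs.
PROCESSES = [
    {"name": "copy_raw_orders", "inputs": ["sql.orders"], "outputs": ["lake.raw_orders"]},
    {"name": "clean_orders", "inputs": ["lake.raw_orders", "lake.customers"], "outputs": ["synapse.orders_clean"]},
    {"name": "sales_dataset", "inputs": ["synapse.orders_clean"], "outputs": ["powerbi.sales_report"]},
]


def upstream_assets(asset, processes):
    """Return every asset that feeds `asset`, directly or indirectly."""
    # Map each output asset to the inputs of the processes that produce it.
    producers = {}
    for proc in processes:
        for out in proc["outputs"]:
            producers.setdefault(out, set()).update(proc["inputs"])

    # Breadth-first walk from the asset back towards its sources.
    seen = set()
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for parent in producers.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


sources = upstream_assets("powerbi.sales_report", PROCESSES)
```

The same walk run in the other direction (outputs of processes that consume an asset) answers the impact question: which reports break if a source table changes.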

Purview Data Insights

Purview Data Insights gives data officers and security officers a bird’s-eye view of the organisation’s data environment, so they can see at a glance what data is being scanned, where critical data is, and how it flows. By quickly identifying which analytics and reports already exist, it helps stakeholders manage and use the organisation’s data. This view provides critical information such as data distribution across environments, data movement, and sensitive data storage.

Purview Studio

Purview Studio is the environment that becomes available once you create a Purview account, and it is where you work with the Azure Purview services. It serves as a central control point for developers, administrators, and end users to interact with Purview. Opening the studio is the next step after creating the account.

Challenges of Azure Purview

Azure Purview is still in its early stages, and there are a few gaps that need to be filled. Here are a few of Azure Purview’s limitations:

  • Purview supports a limited set of data sources; even most Azure data services, let alone other complex management systems and BI tools, cannot be scanned.
  • The data catalogue lacks fundamental data management features in the user interface. For example, once assets have been categorised, they cannot be deleted via the UI.
  • There is no support for categorising zip file content.
  • There is no support for Data Marketplace.
  • There is no support for automation or alerting.
  • Asset relationships are formed manually, and the kind or nature of the relationship cannot be defined.
  • The maximum length of an asset name or categorization name is just 4 KB.
  • Azure Purview currently provides only 10 GB of storage capacity for four-capacity-unit platforms and 40 GB for sixteen-capacity-unit platforms.

While Azure Purview is not yet a one-stop shop for enterprise-level data governance, the published roadmap suggests it won’t be long before the Purview team covers enough ground to make Azure Purview an enterprise-grade data governance package.

How Azure Purview helps with Data as Asset

Azure Purview is designed to help you better manage your data, and here’s how it can help you process and transform your data into an asset:

a) Inventory – Azure Purview enables you to organise your data and apply a personalised tag to it, allowing you, the end-user, to better identify and comprehend it.

b) Quality assurance – It also helps you maintain data quality, where data must be complete, unique, valid, accurate, consistent, relevant, dependable, and accessible. Governance tools such as the data catalogue will aid you in this endeavour.

c) Security Observance – It is your responsibility as an organisation to ensure the highest level of security for end-user data. Under government rules and data mandates, end users can demand at any time that their data be changed or removed from company servers; Azure Purview allows you to design an automated procedure that speeds up these service requests and delivers the paperwork required by law.

d) A Single Roadmap – It creates a single map of all of your data assets, which aids in the development of an efficient data governance framework.

e) Allows for Semantic Search – You can search using technical, business, or operational terms, identify the sensitivity level of the data, and explore the interactive data lineage.

f) Constant Data Update Flow – The system gives you real-time updates on the location of your data and insight into its movement across your multi-layered data ecosystem. In addition, Azure Purview offers features such as a data catalogue and a business glossary.

g) Catalog of Data – A key component of any data governance software, the catalogue scans all registered data sources and identifies, indexes, links, and categorises their data sets.

h) Business Dictionary – It is a set of terms with brief definitions that are linked to other concepts. The Business Glossary can be used to automate the process of categorising data sets and annotating them with the correct business terms, so that end users can understand them more easily. A business glossary serves as the foundation of the semantic layer that a company uses to establish a common language for its business.
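
A toy version of glossary-driven annotation might look like the following: technical column names are split into tokens, and any token that matches a glossary synonym attaches the business term. The glossary entries, synonyms, and column names are all hypothetical, and this is only a sketch of the general idea, not how Purview implements it.

```python
import re

# Hypothetical business glossary: term -> definition plus the technical
# name fragments that should map to it.
GLOSSARY = {
    "Customer": {
        "definition": "A person or company that buys from us.",
        "synonyms": {"cust", "customer", "client"},
    },
    "Revenue": {
        "definition": "Income from sales before costs.",
        "synonyms": {"rev", "revenue", "sales"},
    },
}


def tokens(column_name):
    """Split a technical name such as 'cust_id' or 'salesAmount' into lowercase tokens."""
    # Insert a space at camelCase boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", column_name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def annotate(columns, glossary):
    """Map each column name to the sorted list of glossary terms it matches."""
    result = {}
    for col in columns:
        parts = set(tokens(col))
        result[col] = sorted(
            term for term, entry in glossary.items() if parts & entry["synonyms"]
        )
    return result


annotations = annotate(["cust_id", "salesAmount", "created_at"], GLOSSARY)
```

Columns that match no term stay unannotated, which gives data stewards a natural work list of names the glossary does not yet cover.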

With capabilities like these, Microsoft Azure Purview transforms your data into a valuable asset.

Summary

Data governance is a must-have for any business that wants to leverage data as an asset. It is difficult to get right, yet it is a critical pillar of every enterprise’s data journey. Data governance contributes to the responsible democratization of data at scale by making business data accessible, trustworthy, and integrated. Microsoft Azure Purview is a good place to start when looking for cloud-native data governance solutions. Azure Purview answers the who, what, when, how, where, and why questions about data.
