Sharing is caring, Azure Data Share

We all know this situation very well: we have just delivered or attended a workshop and agreed to follow up on some action items. A very typical one is sharing a specific document or other related data that was discussed or touched on.

E-mail and FTP are still preferred tools for sharing information, but services like Dropbox, Google Drive, OneDrive and the like have also become very popular. But isn’t one of the biggest challenges with those tools, and especially e-mail, that you have no control over who has access to the data you shared, and cannot even see who has done what? While sharing is caring, it should be fit for purpose.

Azure Data Share

Azure Data Share was released some time ago, during Inspire 2019, but things have been rather quiet around this service. So, looking into the details, what is this new data sharing service from Azure?

The new service hits the market at a time when many businesses have serious concerns about the use and monetization of data by tech companies. Sharing, both inside and outside the enterprise, can also be complicated, depending on the volume of data. And with COVID-19, there is a real need to support remote workers and partners with proper tooling to share data securely.

Azure Data Share helps you to safely share your data and meet enterprise compliance, regulatory, control, and data privacy needs. The sharing service works through the Azure Portal and doesn’t require any additional infrastructure to grant access to another organization. Huh, an organization? Yes, the approach is different from consumer-oriented data sharing tools. Azure Data Share uses managed identities for Azure resources and integrates with Azure Active Directory (Azure AD) to manage credentials and permissions.

So how does it work?

Delivered as a Platform-as-a-Service (PaaS) offering, it requires no additional infrastructure to set up or manage; everything comes out of Azure.

Azure Data Share currently offers “snapshot-based sharing” and “in-place sharing”. Think in scenarios where you have a Data Provider and a Data Consumer. In snapshot-based sharing, data moves from the data provider’s Azure subscription to the data consumer’s Azure subscription. A recipient of your data sharing invitation basically gets a full snapshot of the data selected for sharing, which lands in the Data Consumer’s storage account.

Data Consumers can even receive regular or incremental updates to the shared data, so they always have the latest version. Snapshot schedules are offered on an hourly or a daily basis, which is beneficial where the shared data is updated regularly.
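To make the schedule idea a bit more tangible, here is a minimal Python sketch that works out when the next few snapshots would fire for an hourly or a daily recurrence. This is not the Azure Data Share API; the helper `upcoming_snapshots`, its parameters, and the `INTERVALS` table are made up purely for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical helper, not part of any Azure SDK. The two recurrence
# intervals mirror the hourly and daily options mentioned above.
INTERVALS = {"hour": timedelta(hours=1), "day": timedelta(days=1)}


def upcoming_snapshots(start, recurrence, now, count=3):
    """Return the next `count` snapshot times strictly after `now`."""
    step = INTERVALS[recurrence]
    if now < start:
        # The schedule has not started yet, so the first run is `start`.
        first = start
    else:
        # Whole intervals elapsed since the schedule started.
        elapsed = (now - start) // step
        first = start + (elapsed + 1) * step
    return [first + i * step for i in range(count)]


if __name__ == "__main__":
    start = datetime(2020, 6, 1, 8, 0)
    now = datetime(2020, 6, 1, 10, 30)
    for ts in upcoming_snapshots(start, "hour", now):
        print(ts.isoformat())
```

With a schedule starting at 08:00 and the current time at 10:30, the hourly recurrence yields 11:00, 12:00 and 13:00 as the next three runs.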

With in-place sharing, data providers can share data where it resides without copying it. After the sharing relationship is established through the invitation flow, a symbolic link is created between the data provider’s source data store and the data consumer’s target data store.
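The difference between the two modes can be shown with a small toy model in Python. It is only a conceptual sketch and says nothing about how the service is implemented: the classes `ProviderStore`, `SnapshotShare` and `InPlaceShare` are hypothetical names, and a plain dictionary stands in for the data store.

```python
import copy


class ProviderStore:
    """Stands in for the data provider's source data store (hypothetical)."""

    def __init__(self, data):
        self.data = data


class SnapshotShare:
    """Consumer holds its own copy, refreshed only when a snapshot runs."""

    def __init__(self, provider):
        self._provider = provider
        self.received = {}

    def take_snapshot(self):
        # Data moves: the consumer gets an independent copy.
        self.received = copy.deepcopy(self._provider.data)


class InPlaceShare:
    """Consumer reads through a link to the provider's store; nothing is copied."""

    def __init__(self, provider):
        self._provider = provider

    @property
    def received(self):
        return self._provider.data


if __name__ == "__main__":
    provider = ProviderStore({"report.csv": "v1"})
    snap = SnapshotShare(provider)
    snap.take_snapshot()
    live = InPlaceShare(provider)

    provider.data["report.csv"] = "v2"  # provider updates the source
    print(snap.received["report.csv"])  # still the old copy
    print(live.received["report.csv"])  # sees the update right away
```

In this model the snapshot consumer keeps seeing the old version until the next snapshot runs, while the in-place consumer always reads the provider's current data.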

At the moment, Blob Storage, Data Lake, SQL Database and Synapse Analytics (formerly Azure SQL DW) are supported for snapshot-based sharing. In-place sharing is in public preview for Azure Data Explorer. Check this article for the latest changes to the support matrix.

The setup is very straightforward: basically, you select the data store to share and add the recipients. An e-mail with an invitation is then sent, which the recipient must accept. The mail looks like this:

Invite to Azure Data Share

Access to the data is managed within Azure, which gives full transparency. In a nutshell, Azure Data Share lets you easily control what you share, who receives your data, and the terms of use. If you want to know more about this Azure service, there is a podcast where Program Manager Jie Feng goes into more detail.

There’s just one more thing…

Azure Data Share is not yet available in all Azure regions; for example, the datacenters in Switzerland are not listed. You might want to check availability for your desired region before deciding on a strategy.

As an alternative, SFTP might also be of interest. Although it is not deployed as a native Azure service, there is a Docker template that provides a good workaround for a cost-effective SFTP solution in Azure, backed by durable persistent storage. The Azure Container Instances (ACI) service is very inexpensive and requires very little maintenance, while the data is stored in Azure Files, a fully managed SMB service in the cloud. Note that the cost of blob storage is ~40% lower than that of file storage.
