Set up your Azure Data Lake Storage Gen2 connection

This article outlines the steps to create an Azure Data Lake Storage Gen2 connection for pipelines and Dataflow Gen2 in Microsoft Fabric.

Supported authentication types

The Azure Data Lake Storage Gen2 connector supports the following authentication types for copy and Dataflow Gen2.

  • Account key
  • Organizational account
  • Service Principal
  • Shared Access Signature (SAS)
  • Workspace Identity

Set up your connection for Dataflow Gen2

You can connect Dataflow Gen2 to Azure Data Lake Storage Gen2 in Microsoft Fabric using Power Query connectors. Follow these steps to create your connection:

  1. Check capabilities, limitations, and considerations to make sure your scenario is supported.
  2. Complete prerequisites for Azure Data Lake Storage Gen2.
  3. Go to Get data.
  4. Connect to Azure Data Lake Storage Gen2.

Capabilities

  • Import
  • File System View
  • CDM Folder View

Prerequisites

  • An Azure subscription. Go to Get Azure free trial.

  • A storage account that has a hierarchical namespace. To create one, follow the instructions at Create a storage account. This article assumes that you created a storage account named myadlsg2.

  • Ensure you're granted one of the following roles for the storage account: Storage Blob Data Reader, Storage Blob Data Contributor, or Storage Blob Data Owner.

  • A sample data file named Drivers.txt located in your storage account. You can download this sample from Azure Data Lake Git Repository, and then upload that file to your storage account.

Get data

To get data in Data Factory:

  1. On the left side of Data Factory, select Workspaces.

  2. From your Data Factory workspace, select New > Dataflow Gen2 to create a new dataflow.

    Screenshot showing the workspace where you choose to create a new dataflow.

  3. In Power Query, either select Get data in the ribbon or select Get data from another source in the current view.

    Screenshot showing the Power Query workspace with the Get data option emphasized.

  4. In the Choose data source page, use Search to search for the name of the connector, or select View more on the right-hand side to see a list of all the connectors available in the Power BI service.

    Screenshot of the Data Factory Choose data source page with the search box and the view more selection emphasized.

  5. If you choose to view more connectors, you can still use Search to search for the name of the connector, or choose a category to see a list of connectors associated with that category.

    Screenshot of the Data Factory Choose data source page displayed after selecting view more, with the list of connectors.

Connect to Azure Data Lake Storage Gen2

  1. Select the Azure Data Lake Storage Gen2 option in the get data experience. Different apps have different ways of getting to the Power Query Online get data experience. For more information about how to get to the Power Query Online get data experience from your app, go to Where to get data.

    Screenshot of the get data window with Azure Data Lake Storage Gen2 emphasized.

  2. In Connect to data source, enter the URL to your Azure Data Lake Storage Gen2 account. Refer to Limitations and considerations to determine the URL to use.

    Screenshot of the Connect to data source page for Azure Data Lake Storage Gen2, with the URL entered.

  3. Select whether you want to use the file system view or the Common Data Model folder view.

  4. If needed, select the on-premises data gateway in Data gateway.

  5. Select Sign in to sign into the Azure Data Lake Storage Gen2 account. You're redirected to your organization's sign-in page. Follow the prompts to sign in to the account.

  6. After you successfully sign in, select Next.

  7. The Choose data page shows all files under the URL you provided. Verify the information and then select Transform Data to transform the data in Power Query.

    Screenshot of the Choose data page, containing the data from the Drivers.txt file.
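The URL that step 2 asks for follows a predictable pattern. As a minimal sketch, assuming the standard public-cloud endpoint suffix dfs.core.windows.net (the account name myadlsg2 is the one this article assumes; the container name is illustrative):

```python
# Illustrative helper (not part of the product UI): assembles the
# Data Lake Storage Gen2 endpoint URL that the connector asks for.
def adls_gen2_url(account_name: str, container: str = "", path: str = "") -> str:
    """Build https://<account>.dfs.core.windows.net[/<container>[/<path>]]."""
    url = f"https://{account_name}.dfs.core.windows.net"
    if container:
        url += f"/{container}"
        if path:
            url += f"/{path.lstrip('/')}"
    return url

# Example with this article's storage account and sample file
# (the container name "my-container" is made up):
print(adls_gen2_url("myadlsg2", "my-container", "Drivers.txt"))
# https://myadlsg2.dfs.core.windows.net/my-container/Drivers.txt
```

Depending on the view you pick in step 3, you can point the URL at the account root or at a specific container or folder.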

Limitations and considerations

  • Connections for trusted workspace access only work in OneLake shortcuts and pipelines.
  • Connections for trusted workspace access can't be created from the Manage Gateways and connections experience.
  • Existing connections that work for trusted workspace access can't be modified in the Manage Gateways and connections experience.
  • Connections to firewall-enabled Storage accounts have the status Offline in Manage connections and gateways.
  • Checking the status of a connection with workspace identity as the authentication method isn't supported.

Set up connections for trusted workspace access

  1. Configure a workspace identity in the workspace where the connection will be used. For more information, see Workspace identity.

  2. Grant the workspace identity, organizational account, or service principal access to the storage account. For more information, see Create a OneLake shortcut to storage account with trusted workspace access.

  3. Configure a resource instance rule. For more information, see Resource instance rule.

  4. Follow steps from Set up your connection to create the connection.

Set up your connection for a pipeline

The following table contains a summary of the properties needed for a pipeline connection:

| Name | Description | Required | Copy |
| --- | --- | --- | --- |
| Connection name | A name for your connection. | Yes | |
| Connection type | Select a type for your connection. | Yes | |
| Server | Enter the name of your Azure Data Lake Storage Gen2 server, for example, https://contosoadlscdm.dfs.core.windows.net. | Yes | |
| Full path | Enter the full path of your Azure Data Lake Storage Gen2 container. | Yes | |
| Authentication | Go to Authentication. | Yes | Go to Authentication. |
| Privacy Level | The privacy level that you want to apply. Allowed values are Organizational, Private, and Public. | Yes | |

For specific instructions to set up your connection in a pipeline, follow these steps:

  1. From the page header in the Data Integration service, select Settings > Manage connections and gateways.

    Screenshot showing how to open manage gateway.

  2. Select New at the top of the ribbon to add a new data source.

    Screenshot showing the new page.

    The New connection pane shows up on the left side of the page.

    Screenshot showing the New connection pane.

  3. In the New connection pane, choose Cloud, and specify the following fields:

    Screenshot showing how to set a new connection.

    • Connection name: Specify a name for your connection.
    • Connection type: Select a type for your connection.
    • Server: Enter your Azure Data Lake Storage Gen2 server name, for example, https://contosoadlscdm.dfs.core.windows.net. To find it, go to your Azure Data Lake Storage Gen2 account, browse to the Endpoints section, and copy the Data Lake Storage endpoint.
    • Full path: Enter the full path to your Azure Data Lake Storage Gen2 container name.
  4. Under Authentication method, select your authentication from the drop-down list and complete the related configuration. The Azure Data Lake Storage Gen2 connector supports the following authentication types:

    Screenshot showing the authentication method for Azure Data Lake Storage Gen2.

  5. Optionally, set the privacy level that you want to apply. Allowed values are Organizational, Private, and Public. For more information, see privacy levels in the Power Query documentation.

  6. Select Create. If the credentials are correct, the connection is tested and saved. If they aren't, the creation fails with errors.

Set up your connection in any Fabric item

  1. In any Fabric item, select the Azure Data Lake Storage Gen2 option in the Get Data selection, and then select Connect.

    Screenshot showing the Connect to data source page of a Fabric item for Azure Data Lake Storage Gen2, with the URL entered.

  2. You can select the data source you created in the previous steps, or create a new connection by selecting Azure Data Lake Storage Gen2.

  3. In Connect to data source, enter the URL to your Azure Data Lake Storage Gen2 account. Refer to Limitations to determine the URL to use.

  4. Select whether you want to use the file system view or the Common Data Model folder view.

  5. If needed, select the on-premises data gateway in Data gateway (only supported in Dataflow Gen1, Dataflow Gen2, and Semantic Models).

  6. Select Sign in to sign into the Azure Data Lake Storage Gen2 account. You are redirected to your organization's sign-in page. Follow the prompts to sign in to the account.

  7. After you've successfully signed in, select Next.

Authentication

The Azure Data Lake Storage Gen2 connector supports the following authentication types:

Key authentication

Account key: Specify your Azure Data Lake Storage Gen2 account key. Go to your Azure Data Lake Storage Gen2 account interface, browse to the Access keys section, and get your account key.

Screenshot showing the key authentication method for Azure Data Lake Storage Gen2.
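The connector handles key authentication for you; the following is only a hedged sketch of what the key is used for under the hood. Azure Storage's Shared Key Lite scheme signs each REST request with an HMAC-SHA256 over a fixed string-to-sign, keyed with the Base64-decoded account key, which is why the key must be kept secret. The account name, key, and header values below are made up:

```python
import base64
import hashlib
import hmac

def shared_key_lite_header(account: str, account_key_b64: str, verb: str,
                           canonicalized_resource: str, date: str,
                           ms_headers: dict[str, str]) -> str:
    """Build a Shared Key Lite Authorization header value (illustrative)."""
    # Canonicalized x-ms-* headers: lowercase names, sorted, "name:value\n" each.
    canon_headers = "".join(
        f"{k.lower()}:{v}\n" for k, v in sorted(ms_headers.items())
    )
    # Shared Key Lite string-to-sign for the Blob/Data Lake endpoint:
    # VERB \n Content-MD5 \n Content-Type \n Date \n headers + resource
    string_to_sign = f"{verb}\n\n\n{date}\n{canon_headers}{canonicalized_resource}"
    key = base64.b64decode(account_key_b64)
    sig = base64.b64encode(
        hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    ).decode()
    return f"SharedKeyLite {account}:{sig}"

# Example (fake key), signing a GET of a container path:
header = shared_key_lite_header(
    "myadlsg2", base64.b64encode(b"fake-key").decode(), "GET",
    "/myadlsg2/my-container", "Thu, 01 Jan 2026 00:00:00 GMT",
    {"x-ms-version": "2021-08-06"},
)
print(header.startswith("SharedKeyLite myadlsg2:"))  # True
```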

Organizational account authentication

Screenshot showing the OAuth2 authentication method for Azure Data Lake Storage Gen2.

Open Edit credentials. The sign-in interface opens. Enter your account and password to sign in to your account. After signing in, you'll come back to the New connection page.

Grant the organizational account proper permission. For examples of how permission works in Azure Data Lake Storage Gen2, go to Access control lists on files and directories.

  • As source, in Storage Explorer, grant at least Execute permission for all upstream folders and the file system, along with Read permission for the files to copy. Alternatively, in Access control (IAM), grant at least the Storage Blob Data Reader role.
  • As destination, in Storage Explorer, grant at least Execute permission for all upstream folders and the file system, along with Write permission for the destination folder. Alternatively, in Access control (IAM), grant at least the Storage Blob Data Contributor role.

Shared access signature authentication

Screenshot showing the shared access signature authentication method for Azure Data Lake Storage Gen2.

SAS token: Specify the shared access signature token for your Azure Data Lake Storage Gen2 container.

If you don’t have a SAS token, switch to Shared access signature in your Azure Data Lake Storage Gen2 account interface. Under Allowed resource types, select Container, and then select Generate SAS and connection string. You can get your SAS token from the generated content that appears. The shared access signature is a URI that encompasses in its query parameters all the information necessary for authenticated access to a storage resource. To access storage resources with the shared access signature, the client only needs to pass in the shared access signature to the appropriate constructor or method. For more information about shared access signatures, go to Shared access signatures: Understand the shared access signature model.
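For illustration only, the account SAS signing scheme that the portal applies when you select Generate SAS and connection string can be sketched as follows. This assumes service version 2019-12-12 without an encryption scope, and the account name and key are made up; in practice, generate the token from the portal as described above:

```python
import base64
import hashlib
import hmac
import urllib.parse

def account_sas(account: str, key_b64: str, permissions: str = "rl",
                services: str = "b", resource_types: str = "co",
                start: str = "2026-01-01T00:00:00Z",
                expiry: str = "2026-01-02T00:00:00Z",
                version: str = "2019-12-12") -> str:
    """Build an account SAS token query string (illustrative sketch)."""
    # The order of fields in the string-to-sign is fixed by the SAS spec:
    # account, permissions, services, resource types, start, expiry,
    # IP range (empty), protocol, version, trailing newline.
    string_to_sign = "\n".join([
        account, permissions, services, resource_types,
        start, expiry, "", "https", version, ""
    ])
    sig = base64.b64encode(
        hmac.new(base64.b64decode(key_b64),
                 string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    ).decode()
    return urllib.parse.urlencode({
        "sv": version, "ss": services, "srt": resource_types,
        "sp": permissions, "st": start, "se": expiry,
        "spr": "https", "sig": sig,
    })

# Example with a fake key; the result has the familiar sv=...&sig=... shape.
token = account_sas("myadlsg2", base64.b64encode(b"fake-key").decode())
print("sig=" in token and token.startswith("sv="))  # True
```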

Service principal authentication

Screenshot showing the service principal authentication method for Azure Data Lake Storage Gen2.

  • Tenant Id: Specify the tenant information (domain name or tenant ID) under which your application resides. Retrieve it by hovering over the upper-right corner of the Azure portal.
  • Service principal ID: Specify the application (client) ID.
  • Service principal key: Specify your application's key.

To use service principal authentication, follow these steps:

  1. Register an application entity in Microsoft Entra ID by following Register your application with a Microsoft Entra tenant. Make note of these values, which you use to define the connection:

    • Tenant ID
    • Application ID
    • Application key
  2. Grant the service principal proper permission. For examples of how permission works in Azure Data Lake Storage Gen2, go to Access control lists on files and directories.

    • As source, in Storage Explorer, grant at least Execute permission for all upstream folders and the file system, along with Read permission for the files to copy. Alternatively, in Access control (IAM), grant at least the Storage Blob Data Reader role.
    • As destination, in Storage Explorer, grant at least Execute permission for all upstream folders and the file system, along with Write permission for the destination folder. Alternatively, in Access control (IAM), grant at least the Storage Blob Data Contributor role.

    Note

    If you use a UI to author and the service principal isn't set with the "Storage Blob Data Reader/Contributor" role in IAM, when doing a test connection or browsing/navigating folders, choose Test connection to file path or Browse from specified path, and then specify a path with Read + Execute permission to continue.
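Behind service principal authentication, the connector acquires a Microsoft Entra token through the OAuth 2.0 client credentials grant, using the tenant ID, application ID, and application key noted in step 1. A hedged sketch that only builds the token request without sending it (the tenant ID, client ID, and secret below are placeholders):

```python
import urllib.parse

def client_credentials_request(tenant_id: str, client_id: str,
                               client_secret: str) -> tuple[str, bytes]:
    """Build the Entra token endpoint URL and form body (illustrative)."""
    token_url = (
        f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    )
    # Azure Storage tokens are requested for the storage resource scope.
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://storage.azure.com/.default",
    }).encode()
    return token_url, body

# Placeholder values; a real caller would POST body to url and read
# the access_token field from the JSON response.
url, body = client_credentials_request(
    "00000000-0000-0000-0000-000000000000", "my-app-id", "my-secret")
print("grant_type=client_credentials" in body.decode())  # True
```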

Workspace identity authentication

Workspace identity: Select workspace identity from the authentication method dropdown. A Fabric workspace identity is an automatically managed service principal that can be associated with a Fabric workspace. Fabric workspaces with a workspace identity can securely read or write to Azure Data Lake Storage Gen2 accounts through OneLake shortcuts and pipelines. When you select this option in the connector, make sure that the workspace has a workspace identity and that the identity can read or write to the intended Azure Data Lake Storage Gen2 account. For more information, see Workspace identity.

Note

Connections with workspace identity have the status Offline in Manage connections and gateways. Checking the status of a connection with workspace identity isn't supported.