Merge pull request #37 from InfuseAI/feature/azureblob-support
Support azure blob storage
popcornylu authored Apr 12, 2022
2 parents 6848e7a + 22fbf71 commit 0475d74
Showing 11 changed files with 483 additions and 9 deletions.
2 changes: 1 addition & 1 deletion cmd/clone.go
@@ -31,7 +31,7 @@ var cloneCommand = &cobra.Command{
return
}

- if strings.HasPrefix(repo, "http") {
+ if strings.HasPrefix(repo, "http") && !repository.IsAzureStorageUrl(repo) {
exitWithError(errors.New("clone not support under http(s) repo"))
return
}
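
The guard above can be sketched in isolation. The body of `repository.IsAzureStorageUrl` is not shown in this part of the diff, so `isAzureStorageURL` below is a hypothetical stand-in: it assumes a URL counts as Azure Blob Storage when it uses `https` and its host ends with `.blob.core.windows.net`. The helper name `rejectsAsPlainHTTP` is also made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isAzureStorageURL is a hypothetical stand-in for repository.IsAzureStorageUrl.
// Assumption: an Azure Blob Storage URL uses https and its host ends with
// ".blob.core.windows.net". The real function may apply different rules.
func isAzureStorageURL(repo string) bool {
	u, err := url.Parse(repo)
	if err != nil {
		return false
	}
	return u.Scheme == "https" && strings.HasSuffix(u.Hostname(), ".blob.core.windows.net")
}

// rejectsAsPlainHTTP mirrors the condition in the diff: http(s) repositories
// are refused unless they point at Azure Blob Storage.
func rejectsAsPlainHTTP(repo string) bool {
	return strings.HasPrefix(repo, "http") && !isAzureStorageURL(repo)
}

func main() {
	repos := []string{
		"https://mystorageaccount.blob.core.windows.net/mycontainer/path/to/mydataset",
		"https://example.com/data",
		"s3://mybucket/path/to/data",
	}
	for _, repo := range repos {
		fmt.Printf("rejected=%v  %s\n", rejectsAsPlainHTTP(repo), repo)
	}
}
```

Under these assumptions only the plain `https://example.com/data` repository is refused; the Azure URL and the non-HTTP scheme pass the guard.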
2 changes: 1 addition & 1 deletion cmd/config.go
@@ -47,7 +47,7 @@ var configCommand = &cobra.Command{
key := args[0]
value := args[1]
if key == "repo.url" {
- if strings.HasPrefix(value, "http") {
+ if strings.HasPrefix(value, "http") && !repository.IsAzureStorageUrl(value) {
exitWithError(errors.New("http(s) repository is not supported"))
return
}
2 changes: 1 addition & 1 deletion cmd/init.go
@@ -29,7 +29,7 @@ var initCommand = &cobra.Command{
return
}

- if strings.HasPrefix(repo, "http") {
+ if strings.HasPrefix(repo, "http") && !repository.IsAzureStorageUrl(repo) {
exitWithError(errors.New("init not support under http(s) repo"))
return
}
8 changes: 4 additions & 4 deletions docs/content/en/_index.md
@@ -45,19 +45,19 @@ ArtiVC is a CLI tool. No server or gateway is required to install and operate.

### Multiple backends support

- ArtiVC natively supports local filesystem, remote filesystem (by SSH), S3, GCS as backend. And 40+ backends are supported through [Rclone](backends/rclone/) integration. [Learn more](backends/)
+ ArtiVC natively supports local filesystem, remote filesystem (by SSH), AWS S3, Google Cloud Storage, Azure Blob Storage as backend. And 40+ backends are supported through [Rclone](backends/rclone/) integration. [Learn more](backends/)

<--->

- ### Expose your data to public
+ ### Painless Configuration

- To serve a repository as a public HTTP endpoint, the repository turn to a http repository right away. Then the data consumer can download your data with an one-line command. [Learn more](use-cases/expose/)
+ No one likes to configure. So we leverage the existing configuration as much as possible. Use `.ssh/config` for SSH access, and use `aws configure`, `gcloud auth application-default login`, and `az login` for the cloud platforms.

<--->

### Efficient storage and transfer

- The file structure of repository is storage and transfer effiecntly by [design](design/how-it-works/). It prevent from storing duplicated content and minimum the round-trip time to determine change set to transfer. [Learn more](design/benchmark/)
+ The file structure of the repository is stored and transferred efficiently by [design](design/how-it-works/). It avoids storing duplicated content and minimizes the number of files to upload when pushing a new version. [Learn more](design/benchmark/)


{{< /columns >}}
1 change: 1 addition & 0 deletions docs/content/en/backends/_index.md
@@ -10,5 +10,6 @@ weight: 2
| Remote Filesystem (SSH) | `<host>:path/to/data` | [{{< icon "gdoc_link" >}}](ssh) |
| AWS S3 [{{< icon "gdoc_language" >}}](https://aws.amazon.com/s3/) | `s3://<bucket>/path/to/data` | [{{< icon "gdoc_link" >}}](s3) |
| Google Cloud Storage [{{< icon "gdoc_language" >}}](https://cloud.google.com/storage) | `gs://<bucket>/path/to/data` | [{{< icon "gdoc_link" >}}](gcs) |
| Azure Blob Storage [{{< icon "gdoc_language" >}}](https://azure.microsoft.com/services/storage/blobs/) | `https://<storageaccount>.blob.core.windows.net/<container>/path/to/data` | [{{< icon "gdoc_link" >}}](azureblob) |
| Rclone [{{< icon "gdoc_language" >}}](https://rclone.org/) | `rclone://<remote>/path/to/data` | [{{< icon "gdoc_link" >}}](rclone) |

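The Azure Blob Storage URL format in the table packs three pieces of information into one string: the storage account, the container, and the path inside the container. The following is a minimal sketch of splitting such a URL, assuming the public `.blob.core.windows.net` endpoint; `blobLocation` and `parseBlobURL` are hypothetical names and are not taken from the ArtiVC source.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// blobLocation is a hypothetical type holding the parts of a repository URL.
type blobLocation struct {
	Account   string // <storageaccount>
	Container string // <container>
	Prefix    string // path/to/data, may be empty
}

// parseBlobURL splits https://<storageaccount>.blob.core.windows.net/<container>/path/to/data.
// Assumption: only the public Azure endpoint suffix is recognized.
func parseBlobURL(raw string) (blobLocation, error) {
	const suffix = ".blob.core.windows.net"

	u, err := url.Parse(raw)
	if err != nil {
		return blobLocation{}, err
	}
	host := u.Hostname()
	if !strings.HasSuffix(host, suffix) {
		return blobLocation{}, fmt.Errorf("not an Azure Blob Storage host: %q", host)
	}

	account := strings.TrimSuffix(host, suffix)
	// The first path segment is the container; the rest is the prefix.
	parts := strings.SplitN(strings.Trim(u.Path, "/"), "/", 2)
	if account == "" || parts[0] == "" {
		return blobLocation{}, fmt.Errorf("missing storage account or container in %q", raw)
	}

	loc := blobLocation{Account: account, Container: parts[0]}
	if len(parts) == 2 {
		loc.Prefix = parts[1]
	}
	return loc, nil
}

func main() {
	loc, err := parseBlobURL("https://mystorageaccount.blob.core.windows.net/mycontainer/path/to/mydataset")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("account=%s container=%s prefix=%s\n", loc.Account, loc.Container, loc.Prefix)
}
```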
90 changes: 90 additions & 0 deletions docs/content/en/backends/azureblob.md
@@ -0,0 +1,90 @@
---
title: Azure Blob Storage
weight: 13
---

Use [Azure Blob Storage](https://azure.microsoft.com/services/storage/blobs/) as the repository backend.

## Configuration

Before using the backend, you have to set up credentials. There are two ways to configure them:

- **Use Azure CLI to login:** Suitable for a development environment.
- **Use environment variables:** Suitable for a production or CI environment.


{{< hint warning >}}
**Assign the Permission**\
The logged-in account requires the **Storage Blob Data Contributor** role on the storage account. Assign it in the **Azure Portal**:

*Storage Accounts* > *my account* > *Access Control (IAM)* > *Role assignments*

For more information, please see https://docs.microsoft.com/azure/storage/blobs/assign-azure-role-data-access
{{< /hint >}}

### Use Azure CLI to login

This backend can use the login account configured by the [Azure CLI](https://docs.microsoft.com/cli/azure/install-azure-cli). The following command opens the browser and starts the login process.

```
az login
```

It also supports the other login options provided by `az login`, such as:

```
az login --service-principal -u <client-id> -p <client-password> -t <tenant-id>
```

### Use Environment Variables

- Service principal with a secret

| Name | Description |
| --- | --- |
| AZURE_TENANT_ID | ID of the application's Azure AD tenant |
| AZURE_CLIENT_ID | Application ID of an Azure service principal |
| AZURE_CLIENT_SECRET | Password of the Azure service principal |

- Service principal with certificate

| Name | Description |
| --- | --- |
| AZURE_TENANT_ID | ID of the application's Azure AD tenant |
| AZURE_CLIENT_ID | ID of an Azure AD application |
| AZURE_CLIENT_CERTIFICATE_PATH | Path to a certificate file including private key (without password protection) |

- Username and password

| Name | Description |
| --- | --- |
| AZURE_CLIENT_ID | ID of an Azure AD application |
| AZURE_USERNAME | A username (usually an email address) |
| AZURE_PASSWORD | That user's password |

- Managed identity

[Managed identities](https://docs.microsoft.com/azure/active-directory/managed-identities-azure-resources/overview) eliminate the need for developers to manage credentials. By connecting to resources that support Azure AD authentication, applications can use Azure AD tokens instead of credentials.

| Name | Description |
| --- | --- |
| AZURE_CLIENT_ID | User-assigned managed identity client ID |

- Storage account key

| Name | Description |
| --- | --- |
| AZURE_STORAGE_ACCOUNT_KEY | The access key of the storage account |

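In a CI job it can help to check which of the variable sets above is complete before running `avc`. The following is a minimal preflight sketch; `credentialMode` is a hypothetical helper and the precedence order in the `switch` is an assumption made for illustration. It does not describe how ArtiVC or the Azure SDK actually selects a credential.

```go
package main

import (
	"fmt"
	"os"
)

// credentialMode reports which documented variable set is fully populated.
// Assumption: the order of the cases below is illustrative only.
func credentialMode(getenv func(string) string) string {
	// has returns true when every listed variable is non-empty.
	has := func(keys ...string) bool {
		for _, k := range keys {
			if getenv(k) == "" {
				return false
			}
		}
		return true
	}

	switch {
	case has("AZURE_STORAGE_ACCOUNT_KEY"):
		return "storage account key"
	case has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"):
		return "service principal with a secret"
	case has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_CERTIFICATE_PATH"):
		return "service principal with certificate"
	case has("AZURE_CLIENT_ID", "AZURE_USERNAME", "AZURE_PASSWORD"):
		return "username and password"
	case has("AZURE_CLIENT_ID"):
		return "managed identity"
	default:
		return "none"
	}
}

func main() {
	// Reads the real process environment; a result of "none" suggests
	// relying on `az login` instead.
	fmt.Println("detected credential variables:", credentialMode(os.Getenv))
}
```

Passing the lookup function as a parameter keeps the check easy to exercise with a plain map instead of the real environment.
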
## Usage

Init a workspace
```shell
avc init https://mystorageaccount.blob.core.windows.net/mycontainer/path/to/mydataset
```

Clone a repository
```shell
avc clone https://mystorageaccount.blob.core.windows.net/mycontainer/path/to/mydataset
cd mydataset/
```
11 changes: 10 additions & 1 deletion go.mod
@@ -22,6 +22,11 @@ require (
cloud.google.com/go v0.100.2 // indirect
cloud.google.com/go/compute v1.2.0 // indirect
cloud.google.com/go/iam v0.1.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azcore v0.21.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azidentity v0.13.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/internal v0.9.2 // indirect
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v0.3.0 // indirect
github.com/AzureAD/microsoft-authentication-library-for-go v0.4.0 // indirect
github.com/aws/aws-sdk-go-v2 v1.13.0 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.2.0 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.8.0 // indirect
@@ -37,20 +42,24 @@ require (
github.com/aws/smithy-go v1.10.0 // indirect
github.com/cpuguy83/go-md2man/v2 v2.0.1 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/golang-jwt/jwt v3.2.1+incompatible // indirect
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/golang/protobuf v1.5.2 // indirect
github.com/google/go-cmp v0.5.7 // indirect
github.com/google/uuid v1.2.0 // indirect
github.com/googleapis/gax-go/v2 v2.1.1 // indirect
github.com/inconshreveable/mousetrap v1.0.0 // indirect
github.com/jmespath/go-jmespath v0.4.0 // indirect
github.com/kr/fs v0.1.0 // indirect
github.com/kylelemons/godebug v1.1.0 // indirect
github.com/mattn/go-colorable v0.1.12 // indirect
github.com/mattn/go-isatty v0.0.14 // indirect
github.com/pkg/browser v0.0.0-20210115035449-ce105d075bb4 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/russross/blackfriday/v2 v2.1.0 // indirect
github.com/spf13/pflag v1.0.5 // indirect
go.opencensus.io v0.23.0 // indirect
golang.org/x/net v0.0.0-20220127200216-cd36cc0744dd // indirect
golang.org/x/net v0.0.0-20220407224826-aac1ed45d8e3 // indirect
golang.org/x/oauth2 v0.0.0-20211104180415-d3ed0bb246c8 // indirect
golang.org/x/sys v0.0.0-20220209214540-3681064d5158 // indirect
golang.org/x/text v0.3.7 // indirect