Update SCP docs #929

Open
wants to merge 2 commits into
base: main
14 changes: 8 additions & 6 deletions README.md
@@ -30,7 +30,7 @@ Skyplane is a tool for blazingly fast bulk data transfers between object stores
Skyplane is:
1. 🔥 Blazing fast ([110x faster than AWS DataSync](https://skyplane.org/en/latest/benchmark.html))
2. 🤑 Cheap (4x cheaper than rsync)
3. 🌐 Universal (AWS, Azure, IBM and GCP)
3. 🌐 Universal (AWS, Azure, IBM, SCP and GCP)

You can use Skyplane to transfer data:
* between object stores within a cloud provider (e.g. AWS us-east-1 to AWS us-west-2)
@@ -45,6 +45,7 @@ Skyplane currently supports the following source and destination endpoints (any
| Google Storage | :white_check_mark: | :white_check_mark: |
| Azure Blob Storage | :white_check_mark: | :white_check_mark: |
| IBM Cloud Object Storage | :white_check_mark: | :white_check_mark: |
| Samsung Cloud Platform Object Storage | :white_check_mark: | :white_check_mark: |
| Local Disk | :white_check_mark: | (in progress) |

Skyplane is an actively developed project. It will have 🔪 SHARP EDGES 🔪. Please file an issue or ask the contributors via [the #help channel on our Slack](https://join.slack.com/t/skyplaneworkspace/shared_invite/zt-1cxmedcuc-GwIXLGyHTyOYELq7KoOl6Q) if you encounter bugs.
@@ -67,10 +68,11 @@ $ pip install "skyplane[aws]"
# $ pip install "skyplane[azure]"
# $ pip install "skyplane[gcp]"
# $ pip install "skyplane[ibmcloud]"
# $ pip install "skyplane[scp]"
# $ pip install "skyplane[all]"
```

Skyplane supports AWS, Azure, IBM and GCP. You can install Skyplane with support for one or more of these clouds by specifying the corresponding extras. To install two out of three clouds, you can run `pip install "skyplane[aws,azure]"`.
Skyplane supports AWS, Azure, IBM, SCP and GCP. You can install Skyplane with support for one or more of these clouds by specifying the corresponding extras. For example, to install support for both AWS and Azure, you can run `pip install "skyplane[aws,azure]"`.

*GCP support on the M1 Mac*: If you are using an M1 Mac with the arm64 architecture and want to install GCP support for Skyplane, you will need to install as follows
`GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=1 GRPC_PYTHON_BUILD_SYSTEM_ZLIB=1 pip install "skyplane[aws,gcp]"`
@@ -110,10 +112,10 @@ IBM IAM key and credentials to your IBM Cloud object storage
---> For SCP:
$ # Create directory if required
$ mkdir -p ~/.scp
$ # Add the lines for "access_key", "secret_key", and "project_id" to scp_credential file
$ echo "access_key = <your_access_key>" >> ~/.scp/scp_credential
$ echo "secret_key = <your_secret_key>" >> ~/.scp/scp_credential
$ echo "project_id = <your_project_id>" >> ~/.scp/scp_credential
$ # Add the lines for "scp_access_key", "scp_secret_key", and "scp_project_id" to the scp_credential file
$ echo "scp_access_key = <your_access_key>" >> ~/.scp/scp_credential
$ echo "scp_secret_key = <your_secret_key>" >> ~/.scp/scp_credential
$ echo "scp_project_id = <your_project_id>" >> ~/.scp/scp_credential

```
After authenticating with each cloud provider, you can run `skyplane init` to create a configuration file for Skyplane.
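
Before running `skyplane init`, it can help to confirm that the SCP credential file contains all three keys. The following is a minimal sketch, not part of Skyplane: the helper names `parse_scp_credential` and `missing_keys` are hypothetical, and the parsing rule (one `key = value` pair per line) is an assumption inferred from the `echo` commands above.

```python
from pathlib import Path

# Key names taken from the echo commands in the README.
REQUIRED_KEYS = ("scp_access_key", "scp_secret_key", "scp_project_id")


def parse_scp_credential(text: str) -> dict:
    """Parse 'key = value' lines into a dict, skipping lines without '='."""
    creds = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            # A later duplicate key overwrites an earlier one.
            creds[key.strip()] = value.strip()
    return creds


def missing_keys(creds: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not creds.get(k)]


if __name__ == "__main__":
    path = Path.home() / ".scp" / "scp_credential"
    text = path.read_text() if path.exists() else ""
    print("missing keys:", missing_keys(parse_scp_credential(text)) or "none")
```

Because the `echo ... >>` commands append, running them twice leaves duplicate lines in the file. In this sketch the last occurrence wins; whether Skyplane resolves duplicates the same way is not confirmed here.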
1 change: 1 addition & 0 deletions docs/_api/skyplane.api.config.rst
@@ -22,6 +22,7 @@
AzureConfig
GCPConfig
IBMCloudConfig
SCPConfig
TransferConfig


3 changes: 3 additions & 0 deletions docs/architecture.md
@@ -38,5 +38,8 @@ While GCP VPCs are Global, in AWS for every region that is involved in a transfe

Firewall support for Azure is in the roadmap.

### SCP
Similar to AWS, in SCP (Samsung Cloud Platform), for every location (region) that is involved in a transfer, Skyplane creates a `skyplane` [VPC](https://github.com/skyplane-project/skyplane/blob/main/skyplane/compute/scp/scp_network.py#L267), subnet, and security group (SG). When the Gateway instance is created, port 22 is [registered](https://github.com/skyplane-project/skyplane/blob/main/skyplane/compute/scp/scp_cloud_provider.py#L194) first, and firewall rules for all IPs are then [registered](https://github.com/skyplane-project/skyplane/blob/main/skyplane/compute/scp/scp_cloud_provider.py#L225) for data relay. After the transfer, the firewall rules are [deleted](https://github.com/skyplane-project/skyplane/blob/main/skyplane/compute/scp/scp_cloud_provider.py#L228).
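
The register-then-delete lifecycle of the relay rules can be sketched with a context manager that guarantees cleanup even if the transfer fails. This is an illustration only and does not reproduce the linked Skyplane code: `InMemoryFirewall` and `relay_rules` are hypothetical names, the rule format is invented, and the sketch assumes that "all IPs" means the IPs of the gateways taking part in the transfer and that the port 22 rule outlives the relay rules.

```python
from contextlib import contextmanager
from typing import Iterable, Iterator, Set, Tuple

Rule = Tuple[str, str]  # (source CIDR, port spec); hypothetical rule format


class InMemoryFirewall:
    """Hypothetical stand-in for a cloud firewall; it only records rules."""

    def __init__(self) -> None:
        self.rules: Set[Rule] = set()

    def add_rule(self, cidr: str, port: str) -> None:
        self.rules.add((cidr, port))

    def remove_rule(self, cidr: str, port: str) -> None:
        self.rules.discard((cidr, port))


@contextmanager
def relay_rules(firewall: InMemoryFirewall, gateway_ips: Iterable[str]) -> Iterator[InMemoryFirewall]:
    """Register one relay rule per gateway IP and delete them all on exit."""
    added = [(f"{ip}/32", "all") for ip in gateway_ips]
    for cidr, port in added:
        firewall.add_rule(cidr, port)
    try:
        yield firewall
    finally:
        # Runs after a successful transfer and after an exception alike.
        for cidr, port in added:
            firewall.remove_rule(cidr, port)
```

Tying deletion to the `finally` clause means a failed transfer does not leave relay rules open.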

## Large Objects
Skyplane breaks large objects into smaller sub-parts (currently AWS and GCP only) to improve transfer parallelism (also known as [striping](https://ieeexplore.ieee.org/document/1560006)).
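
As an illustration of striping, the sketch below splits an object of a given size into contiguous byte ranges that could be transferred in parallel. The function name `split_into_parts` and the part size used in the example are assumptions for illustration; this is not Skyplane's chunking code, and it does not reflect Skyplane's default part size.

```python
from typing import List, Tuple


def split_into_parts(object_size: int, part_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover [0, object_size) in order."""
    if object_size < 0:
        raise ValueError("object_size must be non-negative")
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        (offset, min(part_size, object_size - offset))
        for offset in range(0, object_size, part_size)
    ]


if __name__ == "__main__":
    # A 10-byte object in 4-byte parts: the last part is shorter.
    for offset, length in split_into_parts(10, 4):
        print(f"bytes {offset}..{offset + length - 1}")
```

Each range is independent of the others, so separate workers can fetch and write the parts concurrently and the destination can reassemble them by offset.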
3 changes: 2 additions & 1 deletion docs/configure.md
@@ -55,7 +55,8 @@ By default, Skyplane will use a maximum of 1 VM in each region. This limit is co
* `ibmcloud_use_spot_instances`: If set, IBM will use spot instances instead of on-demand instances. (default False)
* `ibmcloud_default_region`: IBM region to use for provisioning. (default us-east)
* `ibmcloud_instance_class`: IBM instance class to use for provisioning. (default bx2-2x8)

* `scp_instance_class`: SCP instance class to use for provisioning. (default h1v32m128)
* `scp_default_region`: SCP region to use for provisioning. (default KR-WEST-1)

<!-- ### Transfer Chunk Sizes
* Skyplane will break up large objects into smaller chunk sizes to parallelize transfers more efficiently (AWS and GCP only).
9 changes: 9 additions & 0 deletions docs/installation.rst
@@ -12,6 +12,7 @@ We're ready to install Skyplane. It's as easy as:
# install support for other clouds as needed:
# $ pip install "skyplane[azure]"
# $ pip install "skyplane[gcp]"
# $ pip install "skyplane[scp]"
# $ pip install "skyplane[all]"

.. dropdown for M1 Macbook users
@@ -51,6 +52,14 @@ Once you have the CLI tools setup, log into each cloud provider's CLI:
---> For Azure:
$ az login

---> For SCP:
$ # Create directory if required
$ mkdir -p ~/.scp
$ # Add the lines for "scp_access_key", "scp_secret_key", and "scp_project_id" to the scp_credential file
$ echo "scp_access_key = <your_access_key>" >> ~/.scp/scp_credential
$ echo "scp_secret_key = <your_secret_key>" >> ~/.scp/scp_credential
$ echo "scp_project_id = <your_project_id>" >> ~/.scp/scp_credential

Now, you can initialize Skyplane with your desired cloud providers. Skyplane autodetects cloud credentials and valid regions from your CLI environment.

.. code-block:: bash
8 changes: 8 additions & 0 deletions docs/permissions.rst
@@ -40,4 +40,12 @@ Your Azure account must have the following roles:
Within Azure, it is not sufficient to have just the :code:`Owner` role to be able to access and write to containers in storage. The VMs that Skyplane provisions are assigned the sufficient storage permissions, but to be able to interact with Azure storage locally, check to make sure your personal Azure account has the roles listed above.


SCP
-----------------------------
To ensure that your authentication keys work and that Object Storage is accessible, please verify the following settings:

- :code:`Access allowance IP for Your SCP account: Disabled`
- :code:`Object Storage access control feature: Disabled`

Configure the newly created Gateway instances to enable authentication and Object Storage access.

5 changes: 3 additions & 2 deletions docs/summary.md
@@ -3,15 +3,15 @@
```
pip install skyplane[aws]
skyplane init
skyplane [sync/cp] [local/s3/gs/azure]://mybucket/big_dataset [local/s3/gs/azure]://mybucket2/
skyplane [sync/cp] [local/s3/gs/azure/scp]://mybucket/big_dataset [local/s3/gs/azure/scp]://mybucket2/
```
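
The `scheme://bucket/path` form used in the commands above can be split into its parts with the Python standard library. The sketch below is an illustration, not Skyplane's own path parser: `ObjectStoreURL` and `parse_object_store_url` are hypothetical names, the scheme list is copied from the usage line, and local paths are deliberately not handled.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Object store schemes copied from the usage line; "local" is left out.
SUPPORTED_SCHEMES = {"s3", "gs", "azure", "scp"}


class ObjectStoreURL(NamedTuple):
    provider: str
    bucket: str
    key: str


def parse_object_store_url(url: str) -> ObjectStoreURL:
    """Split 'scheme://bucket/key' into provider, bucket and key."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError(f"missing bucket in {url!r}")
    return ObjectStoreURL(parts.scheme, parts.netloc, parts.path.lstrip("/"))


if __name__ == "__main__":
    print(parse_object_store_url("scp://mybucket/big_dataset"))
    print(parse_object_store_url("s3://mybucket2/"))
```

An empty key, as in `s3://mybucket2/`, refers to the bucket root in this sketch.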

Skyplane is a tool for blazingly fast bulk data transfers between object stores in the cloud. It provisions a fleet of VMs in the cloud to transfer data in parallel while using compression and bandwidth tiering to reduce cost.

Skyplane is:
1. 🔥 Blazing fast ([110x faster than AWS DataSync](https://skyplane.org/en/latest/benchmark.html))
2. 🤑 Cheap (4x cheaper than rsync)
3. 🌐 Universal (AWS, Azure and GCP)
3. 🌐 Universal (AWS, Azure, IBM, SCP and GCP)

You can use Skyplane to transfer data:
* between object stores within a cloud provider (e.g. AWS us-east-1 to AWS us-west-2)
@@ -26,6 +26,7 @@ Skyplane currently supports the following source and destination endpoints (any
| Google Storage | ✅ | ✅ |
| Azure Blob Storage | ✅ | ✅ |
| Cloudflare R2 | ✅ | ✅ |
| Samsung Cloud Platform Object Storage | ✅ | ✅ |
| Local Disk | ✅ | (in progress) |

Skyplane is an actively developed project. It will have 🔪 SHARP EDGES 🔪. Please file an issue or ask the contributors via [the #help channel on our Slack](https://join.slack.com/t/skyplaneworkspace/shared_invite/zt-1cxmedcuc-GwIXLGyHTyOYELq7KoOl6Q) if you encounter bugs.