docs: remove unused files from being shown (#104)
* docs: remove unused files from being shown

* docs: remove version

* fix(linting): code formatting

---------

Co-authored-by: Fabiana Clemente <[email protected]>
Co-authored-by: Azory YData Bot <[email protected]>
3 people committed Jun 21, 2024
1 parent 862c18c commit 2fe118e
Showing 5 changed files with 234 additions and 11 deletions.
204 changes: 204 additions & 0 deletions docs/deployment_and_security/deployment/google/pre_deploy_checklist.md
@@ -0,0 +1,204 @@
# Checklist and Prerequisites

The deployment is executed with Terraform and is fully automated.
It is triggered by YData’s team, and its progress can be monitored on the client side.

As a precondition, the client must create a service account and share it with YData’s team.
The required permissions are listed in this document.

The bastion host allows YData’s team to provide technical support and troubleshoot issues with the usage
of the platform. This access will only be used for this purpose.

!!! Note "Prerequisites"

If you don't have a GCP subscription, create a ^^[free account](https://console.cloud.google.com/welcome?project=ydatasynthetic)^^ before you begin.

### Observations & prerequisites

- The deployment will create a public/private key pair to establish the connection to the bastion host.
- The deployment will also create a security group that allows YData’s IP to connect to the bastion host
via SSH. It should be deleted after the deployment and re-created only when access is needed.
- The bastion host can be stopped after the deployment to avoid unnecessary charges, and created or started again when support is needed.
- The private subnets will have a NAT Gateway attached. This is needed because GKE requires access to the public internet
to connect to the Data Sources and to pull images from public registries.
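
Stopping and starting the bastion host, as described above, could be scripted along these lines. This is only a sketch: the instance name and zone below are hypothetical placeholders, since this document does not state what the deployment names the bastion host, so look up the real values in your project first.

```shell
# Hypothetical values: replace with the bastion instance name and zone
# that appear in your project after the deployment.
BASTION_NAME="ydata-bastion"    # assumption, not the real instance name
BASTION_ZONE="europe-west1-b"   # assumption, use your own zone

# Stop the bastion host when no support session is in progress.
stop_bastion() {
  gcloud compute instances stop "$BASTION_NAME" --zone "$BASTION_ZONE"
}

# Start it again when YData's team needs access for support.
start_bastion() {
  gcloud compute instances start "$BASTION_NAME" --zone "$BASTION_ZONE"
}

# Usage: stop_bastion
```

A stopped instance no longer accrues compute charges, although its attached disks are still billed.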

## Basic Configuration

- **Project**: where the platform will be installed.
- **Location**: where to install the YData fabric inside the project.

## Enable APIs

Please check that the following APIs are enabled for the chosen project:

- ^^[API Keys API](https://console.cloud.google.com/apis/library/apikeys.googleapis.com)^^
- ^^[Artifact Registry API](https://console.cloud.google.com/apis/library/artifactregistry.googleapis.com)^^
- ^^[Certificate Manager API](https://console.cloud.google.com/apis/library/certificatemanager.googleapis.com)^^
- ^^[Cloud Resource Manager API](https://console.cloud.google.com/apis/library/cloudresourcemanager.googleapis.com)^^
- ^^[Cloud Key Management Service (KMS) API](https://console.cloud.google.com/apis/library/cloudkms.googleapis.com)^^
- ^^[Compute Engine API](https://console.cloud.google.com/apis/library/compute.googleapis.com)^^
- ^^[Kubernetes Engine API](https://console.cloud.google.com/apis/library/container.googleapis.com)^^
- ^^[Cloud DNS API](https://console.cloud.google.com/apis/library/dns.googleapis.com?project=ydata-410315)^^
- ^^[Cloud Filestore API](https://console.cloud.google.com/apis/library/file.googleapis.com)^^
- ^^[Cloud Run API](https://console.cloud.google.com/apis/library/run.googleapis.com)^^
- ^^[Identity and Access Management (IAM) API](https://console.cloud.google.com/apis/library/iam.googleapis.com)^^
- ^^[Service Networking API](https://console.cloud.google.com/apis/library/servicenetworking.googleapis.com)^^
- ^^[Cloud SQL Admin API](https://console.cloud.google.com/apis/library/sqladmin.googleapis.com)^^
- ^^[Cloud Storage](https://console.cloud.google.com/apis/library/storage-component.googleapis.com)^^
- ^^[Serverless VPC Access API](https://console.cloud.google.com/apis/library/vpcaccess.googleapis.com)^^
- ^^[Secret Manager API](https://console.cloud.google.com/apis/library/secretmanager.googleapis.com)^^
- ^^[Cloud Scheduler API](https://console.cloud.google.com/apis/library/cloudscheduler.googleapis.com)^^
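
As an alternative to checking each API in the console, the list above can be enabled from the command line. The sketch below assumes an authenticated `gcloud` CLI and takes the service names from the console links above; the function name `enable_required_apis` is only illustrative.

```shell
# Service names taken from the console URLs in the list above.
REQUIRED_APIS="
apikeys.googleapis.com
artifactregistry.googleapis.com
certificatemanager.googleapis.com
cloudresourcemanager.googleapis.com
cloudkms.googleapis.com
compute.googleapis.com
container.googleapis.com
dns.googleapis.com
file.googleapis.com
run.googleapis.com
iam.googleapis.com
servicenetworking.googleapis.com
sqladmin.googleapis.com
storage-component.googleapis.com
vpcaccess.googleapis.com
secretmanager.googleapis.com
cloudscheduler.googleapis.com
"

# enable_required_apis PROJECT_ID
# Enables every API in the list for the given project.
enable_required_apis() {
  # Word splitting on $REQUIRED_APIS is intentional: one argument per service.
  gcloud services enable $REQUIRED_APIS --project "$1"
}

# Usage: enable_required_apis "$PROJECT_ID"
```

Enabling an API that is already enabled has no effect, so the command should be safe to re-run. `gcloud services list --enabled --project "$PROJECT_ID"` shows the current state.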

## Permissions

The following service account should be created and transferred to YData so the deployment can be triggered.
It is recommended (but not required) that you create a new project for the YData platform. This makes it easier to control
costs and to ensure that YData only has access to its own resources.
You can create the service account with the provided gcloud CLI commands (recommended) or create the service account manually
in the Google Cloud console.

### GCloud CLI

The following commands will create a new service account with the required permissions to complete the deployment.
The generated JSON file must be sent to YData.

1. Download the following file: https://raw.githubusercontent.com/ydataai/gcp-deploy-permissions/main/clients_custom_role.yaml
2. Create the new SA for the deployment

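```shell
# Set both values before running the remaining commands.
export PROJECT_ID=
export SERVICE_ACCOUNT_NAME=

# Make the chosen project the default for the following gcloud commands.
gcloud config set project "$PROJECT_ID"
```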
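
The download in step 1 can also be done from the terminal. This is a minimal sketch that assumes `curl` is installed; the function name `download_role_file` is illustrative, and the URL is the one given in step 1.

```shell
# URL from step 1; the local file name is derived from it.
ROLE_FILE_URL="https://raw.githubusercontent.com/ydataai/gcp-deploy-permissions/main/clients_custom_role.yaml"
ROLE_FILE="$(basename "$ROLE_FILE_URL")"

# download_role_file
# Saves the custom role definition into the current directory.
# -f fails on HTTP errors, -sS hides progress but keeps errors, -L follows redirects.
download_role_file() {
  curl -fsSL "$ROLE_FILE_URL" -o "$ROLE_FILE"
}

# Usage: download_role_file && cat "$ROLE_FILE"
```

Reviewing the file before creating the role is a reasonable way to confirm which permissions will be granted.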

- Create a new SA

```shell
gcloud iam service-accounts create $SERVICE_ACCOUNT_NAME --display-name "GCP Service Account for the Ydata platform"
```

- Get the new key file for the created SA

```shell
export SA_EMAIL=$(gcloud iam service-accounts list --filter $SERVICE_ACCOUNT_NAME --format 'value(email)')

gcloud iam service-accounts keys create gcp-ydata-platform-service-account.json --iam-account $SA_EMAIL
```
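
The `--filter` lookup above can return more than one e-mail address if several service accounts match the name. As a cross-check, the address can be built directly, since user-managed service accounts normally follow the `NAME@PROJECT_ID.iam.gserviceaccount.com` pattern. The helper name `sa_email_for` is illustrative, and the sketch assumes a project ID without a domain prefix.

```shell
# sa_email_for SERVICE_ACCOUNT_NAME PROJECT_ID
# Prints the e-mail address a user-managed service account is expected to have.
sa_email_for() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Usage: compare the result with the value of $SA_EMAIL
#   sa_email_for "$SERVICE_ACCOUNT_NAME" "$PROJECT_ID"
```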

- Create a new role and associate this role to the new SA

```shell
gcloud iam roles create ydata_platform_gcp_iam_role --project $PROJECT_ID --file clients_custom_role.yaml

gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "projects/$PROJECT_ID/roles/ydata_platform_gcp_iam_role"
```

- Activate the new SA locally

```shell
gcloud auth activate-service-account --project=$PROJECT_ID --key-file=gcp-ydata-platform-service-account.json
```

- Test the new SA by setting the new account

```shell
gcloud config set account $SA_EMAIL
gcloud config set project $PROJECT_ID
```

- Check if you are logged in with the new SA:

```shell
gcloud auth list
```

- Try a command.

```shell
gcloud container clusters list
```

### GCP Console

Go to *IAM -> Service Accounts -> Create Service Account*.
Choose a name for the service account and click *“Create and Continue”*.
For the roles, add the following (you can search by these terms and select the resulting role):

- `roles/container.admin`
- `roles/compute.admin`
- `roles/iam.serviceAccountAdmin`
- `roles/dns.admin`
- `roles/iam.roleAdmin`
- `roles/resourcemanager.projectIamAdmin`
- `roles/cloudsql.admin`
- `roles/servicenetworking.networksAdmin`
- `roles/iam.serviceAccountKeyAdmin`
- `roles/serviceusage.serviceUsageAdmin`
- `roles/file.editor`
- `roles/storage.admin`
- `roles/cloudkms.admin`
- `roles/serviceusage.apiKeysAdmin`
- `roles/artifactregistry.admin`
- `roles/secretmanager.admin`
- `roles/vpcaccess.admin`
- `roles/run.admin`
- `roles/deploymentmanager.editor`
- `roles/cloudscheduler.admin`

When you have finished, click *Continue* and then *Done*.
Open the service account and create a new JSON key.
The key you transfer will be used by YData.
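
If you prefer to script the console steps, the same predefined roles could be bound with `gcloud`. This is a sketch rather than the documented procedure (the GCloud CLI section above uses a custom role instead); the function name `bind_console_roles` is illustrative, and it assumes the service account already exists.

```shell
# Predefined roles copied from the list above.
CONSOLE_ROLES="
roles/container.admin
roles/compute.admin
roles/iam.serviceAccountAdmin
roles/dns.admin
roles/iam.roleAdmin
roles/resourcemanager.projectIamAdmin
roles/cloudsql.admin
roles/servicenetworking.networksAdmin
roles/iam.serviceAccountKeyAdmin
roles/serviceusage.serviceUsageAdmin
roles/file.editor
roles/storage.admin
roles/cloudkms.admin
roles/serviceusage.apiKeysAdmin
roles/artifactregistry.admin
roles/secretmanager.admin
roles/vpcaccess.admin
roles/run.admin
roles/deploymentmanager.editor
roles/cloudscheduler.admin
"

# bind_console_roles PROJECT_ID SA_EMAIL
# Adds one project-level IAM binding per role for the service account.
bind_console_roles() {
  for role in $CONSOLE_ROLES; do
    gcloud projects add-iam-policy-binding "$1" \
      --member "serviceAccount:$2" \
      --role "$role"
  done
}

# Usage: bind_console_roles "$PROJECT_ID" "$SA_EMAIL"
```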

## Resource Compute Quotas

Check and set (if needed) new quotas for the region where Fabric will be installed.

- Go to IAM & Admin
- Click “Quotas & System Limits” on the left
- Filter by your region and check for the following quotas

| Quota | Recommended |
| --- | --- |
| CPUs (all regions) | >200\*\* |
| C2D CPUs | 200\*\* |
| N2D CPUs | 24\*\* |
| Zonal & Regional 1-10 TiB (Enterprise) capacity (GB) per region | 1024 GiB |

\*\* *Each limit depends on the platform usage and on each client's requirements.*

- If needed, request a new limit from Google's support team:

![google support request](../../../assets/deployment_security/google/google_supporteam_request.png){: style="width:35%"}
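
The CPU quotas can also be inspected from the command line. The sketch below assumes an authenticated `gcloud` CLI and that the CPU quotas appear under Compute Engine metric names containing `CPUS` (for example `C2D_CPUS`); treat those names as an assumption and confirm them in the console. The Filestore capacity quota is not covered here, and the function names are illustrative.

```shell
# show_region_cpu_quotas REGION
# Lists the CPU-related Compute Engine quotas (metric, limit, usage) for one region.
show_region_cpu_quotas() {
  gcloud compute regions describe "$1" \
    --flatten="quotas[]" \
    --format="table(quotas.metric,quotas.limit,quotas.usage)" | grep 'CPUS'
}

# show_project_cpu_quotas PROJECT_ID
# Lists the project-wide CPU quotas, where an "all regions" limit would appear.
show_project_cpu_quotas() {
  gcloud compute project-info describe --project "$1" \
    --flatten="quotas[]" \
    --format="table(quotas.metric,quotas.limit,quotas.usage)" | grep 'CPUS'
}

# Usage: show_region_cpu_quotas europe-west1
```
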

## Network configuration

Choose how you want to connect to the platform.

In GCP, it’s possible to connect to YData Fabric using your own custom DNS domain, for example: ydatafabric.yourdomain.com.
(It’s necessary to have a domain registered).

### Domain Name and GCP Cloud DNS Zone

If your domain is registered in GCP Cloud DNS, you can provide the Zone Name and the Domain Name, and the deployment will
create a Managed Certificate and the Cloud DNS record pointing to the Load Balancer used to connect to the platform.

Otherwise, if your domain is registered with another provider, it is recommended to create a *Public Cloud DNS Zone*,
create a new record at your provider pointing to Google's name servers (NS), and pass this Zone Name and Domain Name, so
that the deployment occurs without any issues.

If you don’t want to create the Public Cloud DNS Zone, you can point your domain to the IP address available after the installation by creating
an A record.

These parameters will be used during the deployment process.
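
For a domain registered with another provider, creating the Public Cloud DNS Zone and finding Google's name servers could look like the sketch below. It assumes an authenticated `gcloud` CLI; the function names, the zone name and the description text are illustrative, and the domain is the example used above.

```shell
# create_public_zone ZONE_NAME DOMAIN
# Creates a public Cloud DNS zone for the domain (a trailing dot is appended).
create_public_zone() {
  gcloud dns managed-zones create "$1" \
    --dns-name "$2." \
    --description "Public zone for YData Fabric" \
    --visibility public
}

# show_zone_name_servers ZONE_NAME
# Prints the Google name servers to configure as NS records at your provider.
show_zone_name_servers() {
  gcloud dns managed-zones describe "$1" --format "value(nameServers)"
}

# Usage (hypothetical zone name):
#   create_public_zone fabric-zone ydatafabric.yourdomain.com
#   show_zone_name_servers fabric-zone
```
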

## Login Provider

Choose how you want to log in to the platform.
You can currently log in to our app using the following providers. At **least one is required**,
but you can choose multiple ones:

- Google
- Microsoft
- Cognito
- GitHub

You can find detailed instructions for each type of login provider in the [Login Providers page](../login_support/login_providers.md).
After configuring your login provider, please save the values. These values will be used during the deployment process.

If you require another authentication method, please open a support case at ^^[support.ydata.ai](http://support.ydata.ai)^^.
27 changes: 27 additions & 0 deletions docs/deployment_and_security/deployment/login_support/support.md
@@ -0,0 +1,27 @@
# Support

The YData Fabric support ticketing mechanism is designed to ensure that our users receive timely and efficient assistance
for any issues they encounter while using our platform. This guide provides an in-depth overview of how the support ticketing
system works, including how to submit a ticket and communicate with our support team.

## Submitting a Support Ticket

While logged into your YData Fabric instance, navigate to the **Support** section from the main dashboard,
as shown in the image below.

![Fabric support](../../../assets/deployment_security/login_support/support_ticket.webp){: style="width:75%"}

To create a new ticket, make sure to fill in the following fields:

- **Subject**: The subject summary of your problem
- **Description**: The detailed description of your issue. Please be thorough in your description, as it will
help the team to provide you with better support. If you can, describe the steps you took before you encountered the
issue or blocker that you are asking support for.
- **Fabric Modules**: Optional, but highly recommended. If the issue happened while creating or interacting with the *Data Catalog*,
*Labs* or *Synthetic Data* generation module, users can attach the operational logs (which the platform collects).
The logs are fully operational and relate only to the selected component. They include no user data whatsoever (for instance, datasets are never sent).
The files are uploaded in the background to a location accessible by YData’s support team (a private Amazon S3 bucket in the eu-west-1 region).

Attaching these logs considerably increases the ability of YData’s support team to offer timely and effective support.
After receiving the ticket (and any attached logs), YData’s support team will diagnose the issue and follow up via e-mail as soon as possible.
E-mail is used as the default communication channel from that moment onwards.
8 changes: 3 additions & 5 deletions docs/sdk/index.md
@@ -2,15 +2,13 @@
![Pythonversion](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10-blue)
[![downloads](https://pepy.tech/badge/ydata-sdk/month)](https://pepy.tech/project/ydata-sdk)

The *Fabric SDK* is an ecosystem of methods that allows users to, through a python interface, adopt data development focused on improving the quality of the data.
The solution includes a set of integrated components for data ingestion, standardized data quality evaluation and data improvement, such as *synthetic data generation*, allowing an iterative improvement of the datasets used in high-impact business applications.

!!! note "YData Fabric SDK for improved data quality everywhere!"

To start using it, create a Fabric community account at ^^[ydata.ai/register](https://ydata.ai/ydata-fabric-free-trial)^^

## Overview

The *Fabric SDK* is an ecosystem of methods that allows users to, through a python interface, adopt data development focused on improving the quality of the data.
The solution includes a set of integrated components for data ingestion, standardized data quality evaluation and data improvement, such as *synthetic data generation*, allowing an iterative improvement of the datasets used in high-impact business applications.

## Benefits

The Fabric SDK interface makes it possible to integrate data quality tooling with other platforms, offering several benefits in the realm of
Empty file removed docs/sdk/reference/changelog.md
Empty file.
6 changes: 0 additions & 6 deletions mkdocs.yml
@@ -22,7 +22,6 @@ nav:
- 'data_catalog/connectors/index.md'
- How to create a connector?: 'data_catalog/connectors/create_connector.md'
- Supported connectors: 'data_catalog/connectors/supported_connections.md'
- Use a connector in Labs: 'data_catalog/connectors/use_in_labs.md'
- Datasources:
- 'data_catalog/datasources/index.md'
- Warnings: 'data_catalog/datasources/warnings.md'
@@ -61,7 +60,6 @@ nav:
- Privacy Level: "sdk/examples/synthesize_with_privacy_control.md"
- Conditional Sampling: "sdk/examples/synthesize_with_conditional_sampling.md"
- Reference:
- Changelog: 'sdk/reference/changelog.md'
- API:
- Client: 'sdk/reference/api/common/client.md'
- Connectors:
@@ -74,9 +72,7 @@ nav:
- Regular: 'sdk/reference/api/synthesizers/regular.md'
- TimeSeries: 'sdk/reference/api/synthesizers/timeseries.md'
- MultiTable: 'sdk/reference/api/synthesizers/multitable.md'
- Types: 'sdk/reference/api/common/types.md'
- Deployment & Security:
- 'deployment_and_security/index.md'
- Deployment:
- AWS:
- 🕒 Pre-deploy checklist: 'deployment_and_security/deployment/aws/pre_deploy_checklist.md'
@@ -178,8 +174,6 @@ extra_javascript:
- https://polyfill.io/v3/polyfill.min.js?features=es6
- https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
extra:
version:
provider: mike
generator: false
social:
- icon: fontawesome/brands/linkedin
