Update genai notebooks - Cameron J. #94
please address the comments I've left and let me know if you have any questions!
@@ -0,0 +1,234 @@
# Setting Up Azure Environment for Azure GenAI Cloud Lab

Welcome! This guide will help you set up your Azure environment to complete the activities in the [Azure GenAI](../) directory of the NIH Cloud Lab. We will walk you through the steps required to configure PowerShell, deploy necessary resources using an ARM template, upload local files to Azure Storage Account, and acquire keys and secrets for `.env` variables.
I believe that if the user is using Azure Machine Learning and creates a notebook there then the CLI is already installed there. It might be the same for the VMs created in Azure. If so, then that would be a helpful note to add here.
Added a note to Prerequisites covering this case. Users working in such an environment are encouraged to skip step 1 and move directly to step 2.
Proposing that we also add steps for deploying the Azure resources manually from the Azure portal to this tutorial. That gives us a unified landing zone where users can configure their working environment for all tutorials in the GenAI notebooks, using either the ARM file or manual deployment. In the prerequisites of each tutorial we can then link users to this landing zone if they have not yet configured the environment and resources.
## Prerequisites

- An active Azure subscription
- PowerShell installed on your machine (option 1)
is there a particular reason why the user should use powershell?
Choosing between Azure CLI and PowerShell comes down to personal preference and the working environment. A brief overview has been added to the Prerequisites section to help users choose.
@@ -0,0 +1,70 @@
import streamlit as st
Potential Blocker: I couldn't seem to get the Streamlit UI to launch in my Azure account. It could be that the NIH environment blocks this. I'll be testing this out some more and will let you know if the issue persists.
Streamlit expects to serve the demo application locally on port 8501, which explains why the Streamlit site does not launch when the demo is run from Azure ML. A workaround is to use ngrok, which provides a secure tunnel for the Streamlit demo app and exposes the application to the internet. This is a secure and efficient way to reach the Streamlit app without running it locally on port 8501.
An ngrok script has been created and successfully tested with the Streamlit demo in my Azure ML env. The demo now works as expected. Documentation on what this script does and how to execute it has been added to the readme, in the section "## Executing the Azure OpenAI Demo w/ Streamlit Frontend".
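For readers unfamiliar with the tunnel workaround discussed in this thread, here is a minimal sketch of what such a launcher might look like. It is not the script added in this PR, whose contents are not shown here. It assumes the third-party `pyngrok` package is installed and an ngrok auth token is already configured; the helper names `streamlit_command` and `launch_with_tunnel` are hypothetical.

```python
import subprocess
import sys


def streamlit_command(script, port=8501):
    """Build the command that serves a Streamlit script on the given port."""
    return [
        sys.executable, "-m", "streamlit", "run", script,
        "--server.port", str(port),
    ]


def launch_with_tunnel(script, port=8501):
    """Open an ngrok tunnel to the port, then run Streamlit in the foreground."""
    # Assumption: pyngrok is installed and an ngrok auth token is configured.
    from pyngrok import ngrok

    tunnel = ngrok.connect(str(port))
    print(f"Streamlit should be reachable at: {tunnel.public_url}")
    try:
        subprocess.run(streamlit_command(script, port), check=True)
    finally:
        # Close the tunnel when Streamlit exits or is interrupted.
        ngrok.disconnect(tunnel.public_url)
```

Calling `launch_with_tunnel("Demo_Suite.py")` from the `embedding_demos` directory would then print a public URL and start the demo. Because the tunnel exposes the app to the internet, it is worth shutting it down as soon as the session ends.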
# load in .env variables
load_dotenv()

# configure azure openai keys
if this isn't required I would advise deleting it so as to not confuse new users.
This function is crucial for ensuring that all environment secrets and keys are imported into the Python script, keeping sensitive information secure by separating it from the codebase. The code has been updated with more detailed documentation. However, please note that this particular page of the demo may not execute as expected when launched from an Azure ML notebook or VM with the identified ngrok workaround. This is because the Streamlit app will be launched to the internet, with limited access to the local CSV file needed to generate embeddings and chat over. I recommend adding a note that this particular page of the demo should be executed from a local machine. Alternatively, we should consider whether it's best to archive this page from the demo site.
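As an illustration of the point about keeping secrets out of the codebase, the sketch below checks that the expected variables are present once `load_dotenv()` has populated the environment. It uses only the standard library so it runs without a `.env` file. The helper `missing_env_vars` is hypothetical, and so are the variable names other than `AZURE_EMBEDDINGS_DEPLOYMENT`, which appears in the diff.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Hypothetical names, except AZURE_EMBEDDINGS_DEPLOYMENT (taken from the diff).
REQUIRED = ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_KEY", "AZURE_EMBEDDINGS_DEPLOYMENT"]

# Stand-in for the environment after load_dotenv(); one key empty, one absent.
example_env = {"AZURE_OPENAI_ENDPOINT": "https://example.invalid", "AZURE_OPENAI_KEY": ""}
print(missing_env_vars(REQUIRED, example_env))
# → ['AZURE_OPENAI_KEY', 'AZURE_EMBEDDINGS_DEPLOYMENT']
```

Failing early with the list of missing names tends to be easier for new users to act on than an authentication error raised later by the client.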
@@ -0,0 +1,234 @@
# Setting Up Azure Environment for Azure GenAI Cloud Lab
Could you add an estimate of how much running this README in one sitting could cost a user?
Executing the ARM template in this README does not incur any cost. However, the resources deployed by the ARM template will incur costs over time, based on Azure's pricing for each resource. These are the same resources that were originally used in the Cloud Lab around AOAI, such as AI Search, Blob Storage, and Azure OpenAI service. The ARM template is an automated way to deploy these resources to a resource group, rather than deploying each one manually from the Azure portal. "## Resources and Cost Breakdown" has been added to the readme with estimated costs from the Azure Pricing Calculator, based on the SKUs found in the ARM template for each resource.
@@ -0,0 +1,234 @@
# Setting Up Azure Environment for Azure GenAI Cloud Lab
Please structure the tutorial to follow the outline laid out in the tutorial checklist
Tutorial has been structured to follow the outline laid out in checklist. Skill level identification has not yet been added. Where should skill level identification be placed in the tutorial structure?

- Navigate to the /GenAI directory:
```sh
cd .\notebooks\GenAI
please change backslashes to forward slashes
Backslashes have been changed to forward slashes.

4. Navigate to the /embeddings directory (location of the Streamlit demo)
```sh
cd .\notebooks\GenAI\embedding_demos
we are already in the GenAI directory, please change to `cd ./embedding_demos` and fix slashes
Corrections have been made to directory path and slashes.

5. Execute the Streamlit demo
```sh
streamlit run Demo_Suite.py
streamlit wasn't able to launch in Jupyter Lab.
An ngrok script has been created and successfully tested with the Streamlit demo in my Azure ML env. The demo now works as expected. Documentation on what this script does and how to execute it has been added to the readme, in the section "## Executing the Azure OpenAI Demo w/ Streamlit Frontend".
the import error in the Jupyter notebook tutorial stopped me from running the rest of the notebook
" \n",
"# For handling Azure credentials \n",
"from azure.core.credentials import AzureKeyCredential \n",
"from azure.identity import DefaultAzureCredential \n",
ran into an error with importing this python package: ImportError: cannot import name 'AccessTokenInfo' from 'azure.core.credentials' (/anaconda/envs/jupyter_env/lib/python3.8/site-packages/azure/core/credentials.py)
Were you able to successfully execute the pip command in cell one to install packages in the kernel? If so, there may be a discrepancy between the Python version you are using and the Python version the code was built on. The notebook uses Python 3.11.9. Please confirm whether you are using a different version.
"from dotenv import load_dotenv \n",
" \n",
"# For utilizing OpenAI functionalities within Azure \n",
"from openai import AzureOpenAI \n",
another import error: ImportError: cannot import name 'Sequence' from 'typing_extensions' (/anaconda/envs/jupyter_env/lib/python3.8/site-packages/typing_extensions.py)
Were you able to successfully execute the pip command in cell one to install packages in the kernel? If so, it's possible there can be a discrepancy between the Python version you are using and the Python version which the code was built on top of. The notebook is using Python 3.11.9. Please confirm if you are using a separate version from this.
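One way to answer the version question in these replies is to print the interpreter version alongside the installed versions of the packages named in the failing imports. The sketch below is a diagnostic only and does not assume what caused the errors. The helper name `environment_report` is hypothetical, and the package list is taken from the imports shown in the diff.

```python
import sys
from importlib import metadata


def environment_report(packages):
    """Return the Python version and the installed version of each package."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Run this in the same kernel that raised the ImportError.
for name, version in environment_report(
    ["azure-core", "azure-identity", "openai", "typing_extensions"]
).items():
    print(f"{name}: {version}")
```

Pasting this output into the thread gives both sides the same facts to compare against the Python 3.11.9 environment the notebook was built on.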
df['embedding'] = df['text'].apply(lambda x: get_embedding(x))
df.to_csv('microsoft-earnings_embeddings.csv', index=False)
df['embedding'] = df['text'].apply(lambda x:get_embedding(x, engine=os.getenv("AZURE_EMBEDDINGS_DEPLOYMENT")))
df.to_csv('.\\microsoft-earnings_embeddings.csv', index=False)
typo, please change `.\\` to `../`
os.path.join has been incorporated to ensure the output file can be saved to /example_scripts across different operating systems, including the Azure ML environment. A loading pattern has also been added so the user knows the script is generating embeddings rather than being stuck in a continuous loop.
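A small sketch of the `os.path.join` approach described in this reply, for reference. It is not the code from the PR; the helper name `output_path` and the choice of base directory are assumptions made for illustration.

```python
import os


def output_path(filename, base_dir=None):
    """Join `filename` onto `base_dir` (default: the current working directory)."""
    base_dir = os.getcwd() if base_dir is None else base_dir
    return os.path.join(base_dir, filename)


# The separator is chosen by the operating system, so no slash is hard-coded.
path = output_path("microsoft-earnings_embeddings.csv", base_dir="example_scripts")
print(path)
```

The result can be passed straight to `df.to_csv(path, index=False)`, which avoids both the `.\\` prefix flagged in the review and any hard-coded forward slash.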

#create cosine function
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# read in the embeddings .csv
# convert elements in 'embedding' column back to numpy array
df = pd.read_csv('microsoft-earnings_embeddings.csv')
df = pd.read_csv('.\\microsoft-earnings_embeddings.csv')
typo, please change `.\\` to `../`
these two example scripts may require a README to explain they are connected
os.path.join has been incorporated to ensure the input file "microsoft-earnings_embeddings.csv" can be read into the df across different operating systems, including the Azure ML environment. I agree that a README would make it clearer that the two scripts are connected. Proposing that we create a README for the /example_scripts directory that covers every script in it, describing what each script does and how the scripts relate to one another.
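To make the connection between the two scripts concrete: the first writes embeddings to the CSV, and the second reads them back and compares them by cosine similarity. The sketch below shows that second half using only the standard library as a stand-in for the numpy version in the diff. The helper `parse_embedding` is hypothetical and assumes each CSV cell holds a stringified list such as `"[0.1, 0.2]"`.

```python
import ast
import math


def parse_embedding(cell):
    """Turn a CSV cell such as "[0.1, 0.2]" back into a list of floats."""
    return [float(value) for value in ast.literal_eval(cell)]


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


stored = parse_embedding("[1.0, 0.0]")
print(cosine_similarity(stored, [1.0, 0.0]))  # same direction → 1.0
print(cosine_similarity(stored, [0.0, 1.0]))  # orthogonal → 0.0
```

A README for the directory could state this dependency directly: the search script only works after the embedding script has produced the CSV.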

### 5. Deploying the ARM Template

Deploy the [ARM template](/notebooks/GenAI/azure_infra_setup/arm_resources.json) to create the Azure Storage Account, Azure AI Search, and Azure OpenAI resources.
link didn't work, please change to arm_resources.json
Adjustment made to arm_resources.json.
Pull Request Template
Description
This pull request includes several updates and improvements within the GenAI directory:
Assignee
*Assignees: @zbyosufzai *
PR checklist
Please ensure the following: