From d8f7bff96d1908158ae53ed94c2aecc81b75c0be Mon Sep 17 00:00:00 2001
From: yangj1211
Date: Wed, 13 Nov 2024 14:13:19 +0800
Subject: [PATCH] change doc of search-pic

---
 .../Overview/matrixone-introduction.md | 4 +-
 .../MatrixOne/Tutorial/search-picture-demo.md | 314 +++++++++---------
 2 files changed, 157 insertions(+), 161 deletions(-)

diff --git a/docs/MatrixOne/Overview/matrixone-introduction.md b/docs/MatrixOne/Overview/matrixone-introduction.md
index 2126857d4..9867c1f95 100644
--- a/docs/MatrixOne/Overview/matrixone-introduction.md
+++ b/docs/MatrixOne/Overview/matrixone-introduction.md
@@ -4,9 +4,9 @@ MatrixOne is a hyper-converged cloud & edge native distributed database with a s

 MatrixOne touts significant features, including real-time HTAP, multi-tenancy, stream computation, extreme scalability, cost-effectiveness, enterprise-grade availability, and extensive MySQL compatibility. MatrixOne unifies tasks traditionally performed by multiple databases into one system by offering a comprehensive ultra-hybrid data solution. This consolidation simplifies development and operations, minimizes data fragmentation, and boosts development agility.

-![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/archi-en-1.png?raw=true)
+![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/architeture241113_en.png?raw=true)

-MatrixOne is optimally suited for scenarios requiring real-time data input, large data scales, frequent load fluctuations, and a mix of procedural and analytical business operations. It caters to use cases such as mobile internet apps, IoT data applications, real-time data warehouses, SaaS platforms, and more.
+MatrixOne is designed for scenarios that require real-time data ingestion, large-scale data management, fluctuating workloads, and multi-modal data management. It is particularly suited for environments that combine transactional and analytical workloads, such as generative AI applications, mobile internet applications, IoT data processing, real-time data warehouses, and SaaS platforms.

 ## **Key Features**

diff --git a/docs/MatrixOne/Tutorial/search-picture-demo.md b/docs/MatrixOne/Tutorial/search-picture-demo.md
index 7683cfac9..37d8b2b94 100644
--- a/docs/MatrixOne/Tutorial/search-picture-demo.md
+++ b/docs/MatrixOne/Tutorial/search-picture-demo.md
@@ -1,242 +1,238 @@
-# Example of application basis for graph search
+# Basic example of an image search application using images (or text)

-Currently, graphic and text search applications cover a wide range of areas. In e-commerce, users can search for goods by uploading images or text descriptions; in social media platforms, content can be found quickly through images or text to enhance the user's experience; and in copyright detection, image copyright can be identified and protected. In addition, text search is widely used in search engines to help users find specific images through keywords, while graphic search is used in machine learning and artificial intelligence for image recognition and classification tasks.
+At present, image-based and text-based image search applications cover a wide range of fields. In e-commerce, users can search for products by uploading images or text descriptions; on social media platforms, users can quickly find relevant content through images or text, which enhances the user experience; in copyright detection, image search can help identify and protect image copyrights. In addition, text-based image search is widely used in search engines to help users find specific images through keywords, while image-based image search is used for image recognition and classification tasks in machine learning and artificial intelligence.

-The following is a flow chart of a graphic search:
+The following flow chart shows how pictures are searched using pictures (or text):

- +
-As you can see, vectorized storage and retrieval of images is involved in building graph-to-text search applications, while MatrixOne's vector capabilities and multiple retrieval methods provide critical technical support for building graph-to-text search applications.
+As the flow chart shows, building an image (text) search application involves vectorized storage and retrieval of images. MatrixOne's vector capabilities and multiple retrieval methods provide critical technical support for building such an application.
+In this chapter, we will combine the vector capabilities of MatrixOne with Streamlit to build a simple web application for image (text) search.

-In this chapter, we'll build a simple graphical (textual) search application based on MatrixOne's vector capabilities.
+## Preparation before starting

-## Prepare before you start
+### Related knowledge

-### Relevant knowledge
+**Transformers**: Transformers is an open source natural language processing library that provides a wide range of pre-trained models. Through the Transformers library, researchers and developers can easily use and integrate CLIP models into their projects.

-**Transformers**: Transformers is an open source natural language processing library that provides a wide range of pre-trained models through which researchers and developers can easily use and integrate CLIP models into their projects.
+**CLIP**: The CLIP model is a deep learning model released by OpenAI. Its core idea is to process text and images in a unified way through contrastive learning, so that tasks such as image classification can be completed through text-image similarity without directly optimizing for the task. It can be combined with a vector database to build a tool for image (text) search. The CLIP model extracts high-dimensional vector representations of images that capture their semantic and perceptual features, encoding the images into an embedding space. At query time, a sample image is passed through the same CLIP encoder to obtain its embedding, and a vector similarity search then efficiently finds the top k closest image vectors in the database.

-**CLIP**: The CLIP model is a deep learning model published by OpenAI. At its core is the unified processing of text and images through contrastive learning, enabling tasks such as image classification to be accomplished through text-image similarity without the need for direct optimization tasks. It can be combined with a vector database to build tools to search graphs. High-dimensional vector representations of images are extracted through CLIP models to capture their semantic and perceptual features, and then encoded into an embedded space. At query time, the sample image gets its embedding through the same CLIP encoder, performing a vector similarity search to effectively find the first k closest database image vectors.
+**Streamlit**: Streamlit is an open source Python library for quickly building interactive, data-driven web applications. It is designed to be simple and easy to use: developers can create interactive dashboards and interfaces with very little code, which makes it especially suitable for showcasing machine learning models and data visualizations.

-### Software Installation
+### Software installation

-Before you begin, confirm that you have downloaded and installed the following software:
+Before you begin, make sure you have downloaded and installed the following software:

-- Verify that you have completed the [standalone deployment of](../Get-Started/install-standalone-matrixone.md) MatrixOne.
+- Confirm that you have completed the [stand-alone deployment of MatrixOne](../Get-Started/install-standalone-matrixone.md).

-- Verify that you have finished installing [Python 3.8 (or plus)](https://www.python.org/downloads/). Verify that the installation was successful by checking the Python version with the following code:
+- Make sure you have installed [Python 3.8 (or later)](https://www.python.org/downloads/). Use the following command to check the Python version and confirm that the installation was successful:

 ```
-python3 -V
+python3 -V
 ```

-- Verify that you have completed installing the MySQL client.
+- Confirm that you have completed installing the MySQL client.

-- Download and install the `pymysql` tool. Download and install the `pymysql` tool using the following code:
+- Download and install the `pymysql` tool using the following command:

 ```
-pip install pymysql
+pip install pymysql
 ```

-- Download and install the `transformers` library. Download and install the `transformers` library using the following code:
+- Download and install the `transformers` library using the following command:

 ```
-pip install transformers
+pip install transformers
 ```

-- Download and install the `Pillow` library. Download and install the `Pillow` library using the following code:
+- Download and install the `Pillow` library using the following command:

 ```
-pip install pillow
+pip install pillow
 ```

-## Build your app
+- Download and install the `streamlit` library using the following command:

-### Building table
+```
+pip install streamlit
+```
+
+## Build the application
+
+### Create table and enable vector index

-Connect to MatrixOne and create a table called `pic_tab` to store picture path information and corresponding vector information.
+Connect to MatrixOne and create a table named `pic_tab` to store picture path information and corresponding vector information.

 ```sql
-create table pic_tab(pic_path varchar(200), embedding vecf64(512));
+create table pic_tab(pic_path varchar(200), embedding vecf64(512));
+SET GLOBAL experimental_ivf_index = 1;
+create index idx_pic using ivfflat on pic_tab(embedding) lists=3 op_type "vector_l2_ops";
 ```

-### Load Model
+### Build the application
+
+Create the Python file `pic_search_example.py` and write the following content. This script uses the CLIP model to extract a high-dimensional vector representation of each image and stores it in MatrixOne. At query time, a sample image is passed through the same CLIP encoder to obtain its embedding, and a vector similarity search then efficiently finds the top k closest image vectors in the database.

 ```python
+import streamlit as st
+import pymysql
+from PIL import Image
+import matplotlib.pyplot as plt
+import matplotlib.image as mpimg
 from transformers import CLIPProcessor, CLIPModel
+import os
+from tqdm import tqdm
+
+# Database connection
+conn = pymysql.connect(
+    host='127.0.0.1',
+    port=6001,
+    user='root',
+    password="111",
+    db='db1',
+    autocommit=True
+)
+
+cursor = conn.cursor()

 # Load model from HuggingFace
 model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
 processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
-```
-
-### Traversing the Picture Path

-The definition method `find_img_files` traverses the local images folder, where I pre-stored images of fruit in five categories, apple, banana, blueberry, cherry, and apricot, several in each category, in `.jpg` format.
-
-```python
+# Traverse image path
 def find_img_files(directory):
-    img_files = []  # Used to store found .jpg file paths
+    img_files = []
     for root, dirs, files in os.walk(directory):
         for file in files:
-            if file.lower().endswith('.jpg'):
+            if file.lower().endswith('.jpg'):
                 full_path = os.path.join(root, file)
-                img_files.append(full_path)  # Build the full file path
+                img_files.append(full_path)
     return img_files
-```
-
-- Image vectorized and stored in MatrixOne
-
-Define the method `storage_img` to map the picture into a vector, normalize it (not required) and store it in MatrixOne. MatrixOne supports L2 normalization of vectors using the `NORMALIZE_L2()` function. In some cases, features of the data may be distributed at different scales, which may cause some features to have a disproportionate effect on distance calculations. By normalizing, this effect can be reduced and the contribution of different characteristics to the end result more balanced. And when using the L2 distance measure, L2 normalization avoids vectors of different lengths affecting distance calculations.
-
-```python
-import pymysql
-from PIL import Image
-
-conn = pymysql.connect(
-    host = '127.0.0.1',
-    port = 6001,
-    user = 'root',
-    password = "111",
-    db = 'db1',
-    autocommit = True
-    )
-
-cursor = conn.cursor()
-
-# Map the image into vectors and store them in MatrixOne
-def storage_img():
-    for file_path in jpg_files:
-        image = Image.open(file_path)
-        if image.mode != 'RGBA':
-            image = image.convert('RGBA')
-        inputs = processor(images=image, return_tensors="pt", padding=True)
-        img_features = model.get_image_features(inputs["pixel_values"])  # Using models to acquire image features
-        img_features = img_features .detach().tolist()  # Separate tensor, convert to list
-        embeddings = img_features [0]
-        insert_sql = "insert into pic_tab(pic_path,embedding) values (%s, normalize_l2(%s))"
-        data_to_insert = (file_path, str(embeddings))
-        cursor.execute(insert_sql, data_to_insert)
-        image.close()
-```

-### View quantity in `pic_tab` table
+# Map image to vector and store in MatrixOne
+def storage_img(jpg_files):
+    for file_path in tqdm(jpg_files, total=len(jpg_files)):
+        image = Image.open(file_path)
+        if image.mode != 'RGBA':
+            image = image.convert('RGBA')
+        inputs = processor(images=image, return_tensors="pt", padding=True)
+        img_features = model.get_image_features(inputs["pixel_values"])
+        img_features = img_features.detach().tolist()
+        embeddings = img_features[0]
+        insert_sql = "INSERT INTO pic_tab(pic_path, embedding) VALUES (%s, normalize_l2(%s))"
+        data_to_insert = (file_path, str(embeddings))
+        cursor.execute(insert_sql, data_to_insert)
+        image.close()

-```sql
-mysql> select count(*) from pic_tab;
-+----------+
-| count(*) |
-+----------+
-| 4801 |
-+----------+
-1 row in set (0.00 sec)
-```
-
-As you can see, the data was successfully stored into the database.
-
-### Build Vector Index
-
-MatrixOne supports vector indexing in IVF-FLAT, where each search requires recalculating the similarity between the query image and each image in the database without an index. The index, on the other hand, reduces the amount of computation necessary by performing similarity calculations only on images marked as "relevant" in the index.
-
-```python
 def create_idx(n):
-    cursor.execute('SET GLOBAL experimental_ivf_index = 1')
-    create_sql = 'create index idx_pic using ivfflat on pic_tab(embedding) lists=%s op_type "vector_l2_ops"'
+    create_sql = 'create index idx_pic using ivfflat on pic_tab(embedding) lists=%s op_type "vector_l2_ops"'
     cursor.execute(create_sql, n)
-```
-
-### Search in graphic (text)

-Next, we define the methods `img_search_img` and `text_search_img` to implement graph and text search. MatrixOne has vector retrieval capabilities and supports multiple similarity searches, where we use `l2_distance` to retrieve.
-
-```python
-# search for maps
+# Image-to-image search
 def img_search_img(img_path, k):
     image = Image.open(img_path)
     inputs = processor(images=image, return_tensors="pt")
     img_features = model.get_image_features(**inputs)
     img_features = img_features.detach().tolist()
     img_features = img_features[0]
-    query_sql = "select pic_path from pic_tab order by l2_distance(embedding,normalize_l2(%s)) asc limit %s"
+    query_sql = "SELECT pic_path FROM pic_tab ORDER BY l2_distance(embedding, normalize_l2(%s)) ASC LIMIT %s"
     data_to_query = (str(img_features), k)
     cursor.execute(query_sql, data_to_query)
-    global data
-    data = cursor.fetchall()
+    return cursor.fetchall()

-# search for pictures by writing
-def text_search_img(text,k):
+# Text-to-image search
+def text_search_img(text, k):
     inputs = processor(text=text, return_tensors="pt", padding=True)
     text_features = model.get_text_features(inputs["input_ids"], inputs["attention_mask"])
     embeddings = text_features.detach().tolist()
-    embeddings = embeddings[0]
-    query_sql = "select pic_path from pic_tab order by l2_distance(embedding,normalize_l2(%s)) asc limit %s"
-    data_to_query = (str(embeddings),k)
+    embeddings = embeddings[0]
+    query_sql = "SELECT pic_path FROM pic_tab ORDER BY l2_distance(embedding, normalize_l2(%s)) ASC LIMIT %s"
+    data_to_query = (str(embeddings), k)
     cursor.execute(query_sql, data_to_query)
-    global data
-    data = cursor.fetchall()
-```
-
-### Search Results Showcase
-
-When retrieving a relevant image from an image or text, we need to print the results, where we use Matplotlib to present the search results.
-
-```python
-import matplotlib.pyplot as plt
-import matplotlib.image as mpimg
-
-def show_img(img_path,rows,cols):
-    if img_path:
-        result_path = [img_path] + [path for path_tuple in data for path in path_tuple]
+    return cursor.fetchall()
+
+# Show results
+def show_img(result_paths):
+    fig, axes = plt.subplots(nrows=1, ncols=len(result_paths), figsize=(15, 5), squeeze=False)  # 2-D axes even for one result
+    for ax, result_path in zip(axes[0], result_paths):
+        image = mpimg.imread(result_path[0])  # Read image
+        ax.imshow(image)  # Display image
+        ax.axis('off')  # Remove axes
+        ax.set_title(result_path[0])  # Set subtitle
+    plt.tight_layout()  # Adjust subplot spacing
+    st.pyplot(fig)  # Display figure in Streamlit
+
+# Streamlit interface
+st.title("Image and Text Search Application")
+
+# Prompt for local directory path input
+directory_path = st.text_input("Enter the local image directory")
+
+# Once user inputs path, search for images in the directory
+if directory_path:
+    if os.path.exists(directory_path):
+        jpg_files = find_img_files(directory_path)
+        if jpg_files:
+            st.success(f"Found {len(jpg_files)} images in the directory.")
+            if st.button("uploaded"):
+                storage_img(jpg_files)
+                st.success("Upload successful!")
+        else:
+            st.warning("No .jpg files found in the directory.")
+    else:
+        st.error("The specified directory does not exist. Please check the path.")
+# Image upload option
+uploaded_file = st.file_uploader("Upload an image for search", type=["jpg", "jpeg", "png"])
+if uploaded_file is not None:
+    # Display uploaded image
+    img = Image.open(uploaded_file)
+    st.image(img, caption='Uploaded image', use_column_width=True)
+
+    # Perform image-to-image search
+    if st.button("Search by image"):
+        result = img_search_img(uploaded_file, 3)  # Image-to-image search
+        if result:
+            st.success("Search successful. Results are displayed below:")
+            show_img(result)  # Display results
+        else:
+            st.error("No matching results found.")
+
+# Text input for text-to-image search
+text_input = st.text_input("Enter a description for search")
+if st.button("Search by text"):
+    result = text_search_img(text_input, 3)  # Text-to-image search
+    if result:
+        st.success("Search successful. Results are displayed below:")
+        show_img(result)  # Display results
     else:
-        result_path = [path for path_tuple in data for path in path_tuple]
-    # Create a new graph and axes
-    fig, axes = plt.subplots(nrows=rows, ncols=cols, figsize=(10, 10))
-    # Loop over image paths and axes
-    for i, (result_path, ax) in enumerate(zip(result_path, axes.ravel())):
-        image = mpimg.imread(result_path)  # Read image
-        ax.imshow(image)  # Show picture
-        ax.axis('off')  # Remove Axis
-        ax.set_title(f'image{i + 1}')  # Setting the Submap Title
-    plt.tight_layout()  # Adjusting subgraph spacing
-    plt.show()  # Display the entire graph
+        st.error("No matching results found.")
 ```

-### View Results
+**Code Interpretation:**

-Run the program by entering the following code in the main program:
+1. Connect to the local MatrixOne database through pymysql to insert image features and query similar pictures.
+2. Use HuggingFace's transformers library to load the OpenAI pre-trained CLIP model (clip-vit-base-patch32). The model supports processing both text and images, converting them into vectors for similarity calculations.
+3. Define the method `find_img_files` to traverse the local picture folder. In this example, the folder contains pre-stored fruit pictures in five categories (apples, bananas, blueberries, cherries, and apricots), with several `.jpg` pictures in each category.
+4. Define the method `storage_img` to store the image features in the database: it uses the CLIP model to extract the embedding vector of each image, then stores the image path and the L2-normalized embedding vector in the `pic_tab` table.
+5. Define the methods `img_search_img` and `text_search_img` to implement image search and text search. MatrixOne has vector retrieval capabilities and supports multiple similarity searches. Here we use the Euclidean (L2) distance, via `l2_distance`, for retrieval.
+6. Define the method `show_img` to display the results: it uses matplotlib to render the images whose paths were returned by the database query and shows them in the Streamlit web interface.

-```python
-if __name__ == "__main__":
-    directory_path = '/Users/admin/Downloads/fruit01'  # Replace with the actual directory path
-    jpg_files = find_img_files(directory_path)
-    storage_img()
-    create_idx(4)
-    img_path = '/Users/admin/Downloads/fruit01/blueberry/f_01_04_0450.jpg'
-    img_search_img(img_path, 3)  # search for maps
-    show_img(img_path,1,4)
-    text = ["Banana"]
-    text_search_img(text,3)  # search for pictures by writing
-    show_img(None,1,3)
-```
-
-Using the results of the chart search, the first chart on the left is a comparison chart. As you can see, the searched picture is very similar to the comparison chart:
-
- -
-
+### Running results

-As you can see from the text search results, the searched image matches the input text:
+```bash
+streamlit run pic_search_example.py
+```
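
Step 4 of the code interpretation passes each embedding to MatrixOne as a string produced by `str(embeddings)`. The sketch below illustrates that conversion with an added length check. The helper names `to_vector_literal` and `build_insert` are hypothetical and are not part of the tutorial script; the default of 512 is assumed from the `vecf64(512)` column definition.

```python
EMBEDDING_DIM = 512  # assumed to match the vecf64(512) column of pic_tab


def to_vector_literal(embedding, dim=EMBEDDING_DIM):
    """Return the string form, e.g. "[0.1, 0.2]", that the script passes to SQL."""
    values = [float(x) for x in embedding]
    if len(values) != dim:
        raise ValueError(f"expected {dim} values, got {len(values)}")
    return str(values)


def build_insert(pic_path, embedding, dim=EMBEDDING_DIM):
    """Build the parameterized INSERT statement and its arguments (hypothetical helper)."""
    sql = "INSERT INTO pic_tab(pic_path, embedding) VALUES (%s, normalize_l2(%s))"
    return sql, (pic_path, to_vector_literal(embedding, dim))


# Toy 3-dimensional example; the CLIP embeddings in this tutorial have 512 values.
sql, params = build_insert("fruit/apple_01.jpg", [1, 2.5, 3], dim=3)
print(params)  # ('fruit/apple_01.jpg', '[1.0, 2.5, 3.0]')
```

The returned pair could be passed to `cursor.execute(sql, params)`. Checking the length first is meant to surface a mismatched embedding in Python before it is sent to the database.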
- +
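
Step 5 of the code interpretation ranks pictures with `l2_distance` after applying `normalize_l2`. The sketch below reproduces that ranking in plain Python on toy 2-dimensional vectors, only to illustrate what the query expresses. It is not how MatrixOne or the IVF-FLAT index evaluates the query, and `top_k` is a hypothetical helper name. For unit-length vectors the squared L2 distance equals 2 minus twice the cosine similarity, so this ordering matches a cosine-similarity ordering.

```python
import math


def normalize_l2(vec):
    """Scale a vector to unit length (plain-Python stand-in for NORMALIZE_L2())."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def l2_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, rows, k):
    """Return the k paths whose stored embeddings are closest to the normalized query."""
    q = normalize_l2(query)
    ranked = sorted(rows, key=lambda row: l2_distance(row[1], q))
    return [path for path, _ in ranked[:k]]


# Toy "table": (pic_path, embedding normalized at insert time), as storage_img does.
rows = [
    ("apple.jpg", normalize_l2([1.0, 0.0])),
    ("banana.jpg", normalize_l2([0.0, 1.0])),
    ("cherry.jpg", normalize_l2([3.0, 1.0])),
]

print(top_k([10.0, 1.0], rows, 2))  # ['apple.jpg', 'cherry.jpg']
```

This brute-force version compares the query with every row. An IVF-FLAT index is intended to avoid that by restricting the comparison to a subset of the stored vectors.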
-## Reference Documents
+## Reference documentation

-- [Vector Type](../Develop/Vector/vector_type.md)
-- [Vector retrieval](../Develop/Vector/vector_search.md)
+- [Vector type](../Develop/Vector/vector_type.md)
+- [Vector Search](../Develop/Vector/vector_search.md)
 - [CREATE INDEX...USING IVFFLAT](../Reference/SQL-Reference/Data-Definition-Language/create-index-ivfflat.md)
 - [L2_DISTANCE()](../Reference/Functions-and-Operators/Vector/l2_distance.md)
 - [NORMALIZE_L2()](../Reference/Functions-and-Operators/Vector/normalize_l2.md)
\ No newline at end of file