This sample uses the Document AI API to detect the languages in a multi-page document.
-
Install the prerequisites:
pip install -r requirements.txt
-
Update the following values with information from your project:
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "us"  # Format is 'us' or 'eu'
PROCESSOR_ID = "YOUR_PROCESSOR_ID"  # Create processor in Cloud Console
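As a rough sketch of how these three values usually fit together: Document AI addresses a processor by a resource name of the form projects/.../locations/.../processors/..., and requests go to a location-specific endpoint. The helper names below are hypothetical and are not taken from the sample script; they only illustrate the string formats.

```python
def processor_name(project_id: str, location: str, processor_id: str) -> str:
    """Build the full resource name that identifies a processor."""
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


def api_endpoint(location: str) -> str:
    """Build the regional API endpoint for a location such as 'us' or 'eu'."""
    return f"{location}-documentai.googleapis.com"


# Placeholder values, matching the settings shown above.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "us"
PROCESSOR_ID = "YOUR_PROCESSOR_ID"

print(processor_name(PROJECT_ID, LOCATION, PROCESSOR_ID))
print(api_endpoint(LOCATION))
```

If LOCATION does not match the region where the processor was created, requests will likely fail, so it is worth checking the two together.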
-
Run the sample:
python extract_languages.py
-
Your output should look like this if using the sample document:
$ python3 extract-languages.py
Document processing complete.
    page_number  language_code  confidence
0             1             en         98%
1             1            und          2%
2             2             th         62%
3             2            und         20%
4             2             en         15%
5             2             bs          2%
6             2             it          1%
7             3             en         97%
8             3             de          1%
9             3            und          1%
10            3             so          1%
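For orientation, here is a minimal sketch of how rows like these could be derived. It assumes each page of the processed document carries a list of detected languages, each with a language code and a confidence between 0 and 1. Plain dictionaries stand in for the API response, and the function name language_rows is hypothetical; the actual script may build its table differently.

```python
def language_rows(pages):
    """Flatten per-page detected languages into one row per (page, language)."""
    rows = []
    for page in pages:
        for lang in page["detected_languages"]:
            rows.append(
                {
                    "page_number": page["page_number"],
                    "language_code": lang["language_code"],
                    # Format the 0..1 confidence as a whole-number percentage.
                    "confidence": f"{lang['confidence']:.0%}",
                }
            )
    return rows


# Stand-in data mirroring page 1 of the sample output (not a real API response).
pages = [
    {
        "page_number": 1,
        "detected_languages": [
            {"language_code": "en", "confidence": 0.98},
            {"language_code": "und", "confidence": 0.02},
        ],
    }
]

rows = language_rows(pages)
for row in rows:
    print(row["page_number"], row["language_code"], row["confidence"])
```

The code "und" is the standard tag for an undetermined language, which is why it appears alongside the detected ones.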