This repository has been archived by the owner on Jun 15, 2024. It is now read-only.

Cannot detect the regional variations/ dialects in the Language with gcld3 #89

Open
PriyankaKB opened this issue Oct 17, 2023 · 0 comments

@PriyankaKB
I am trying to detect regional language variations using gcld3. Below is the code I have tried so far:

import gcld3

def detect_language_with_region(text):
    # Create a language detector object
    detector = gcld3.NNetLanguageIdentifier(min_num_bytes=0, max_num_bytes=1000)

    # Detect the language
    result = detector.FindLanguage(text)

    # Extract detected language
    detected_language = result.language if result.is_reliable else "undetermined"

    return detected_language

# Example usage
text = "This is a sample text in English."
detected_language = detect_language_with_region(text)
print("Detected language:", detected_language)


This code prints:

Detected language: en

I want to detect regional variations/dialects such as "en-US", "en-GB", and "en-AU", according to country or region.
Is it possible to detect such dialects with gcld3?

Please help with this. Any suggestions are welcome.
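
To make the desired output concrete, here is a minimal sketch of the kind of post-processing step that could run after the detector returns "en". It is not a gcld3 feature: the function name `guess_english_region` and the `REGION_MARKERS` word lists are hypothetical, chosen only for illustration, and a spelling-based heuristic like this is likely to be unreliable on short or neutral text.

```python
import re
from collections import Counter

# Hypothetical marker lists for illustration only; not part of gcld3.
# Each set holds spellings that are typical of one regional variant.
REGION_MARKERS = {
    "en-US": {"color", "center", "organize", "favorite", "neighbor", "theater"},
    "en-GB": {"colour", "centre", "organise", "favourite", "neighbour", "theatre"},
}


def guess_english_region(language, text):
    """Return a region-tagged code such as "en-GB" when marker words
    point to exactly one region; otherwise return the bare language code."""
    if language != "en":
        # The marker lists only cover English, so pass other codes through.
        return language

    words = re.findall(r"[a-z]+", text.lower())

    # Count how many words in the text match each region's marker list.
    counts = Counter()
    for region, markers in REGION_MARKERS.items():
        counts[region] = sum(1 for word in words if word in markers)

    ranked = counts.most_common(2)
    best_region, best_count = ranked[0]

    # No evidence, or a tie between regions: keep the bare language code.
    if best_count == 0 or (len(ranked) > 1 and ranked[1][1] == best_count):
        return language
    return best_region
```

In the code above, the value returned by `detect_language_with_region(text)` would be passed as the `language` argument, for example `guess_english_region(detected_language, text)`. Text without any marker words, such as the sample sentence, stays tagged as plain "en".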
