Retrieving full caption and entities for messagePhoto. #3265

Open
gramern opened this issue Mar 4, 2025 · 0 comments

Hello, I'm trying to build my own Telegram client in Python using TDLib, and I have trouble getting complete captions and entities for some messagePhoto messages delivered via updateNewMessage. As a result, some photo messages are shown with apparently incomplete captions compared to the native Telegram clients.
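
As a point of reference, here is a minimal sketch of how a caption and its entities can be read from an updateNewMessage payload. It assumes the usual TDLib JSON layout, where `message.content.caption` is a formattedText object with `text` and `entities`. The helper names `utf16_slice` and `caption_with_entities` are hypothetical and not part of TDLib. TDLib reports entity offsets and lengths in UTF-16 code units, so slicing a Python `str` directly can truncate or shift entities when the caption contains emoji.

```python
from typing import Any


def utf16_slice(text: str, offset: int, length: int) -> str:
    """Slice `text` by UTF-16 code units, the unit TDLib uses for entities."""
    raw = text.encode("utf-16-le")  # two bytes per UTF-16 code unit
    return raw[2 * offset : 2 * (offset + length)].decode("utf-16-le")


def caption_with_entities(update: dict[str, Any]) -> tuple[str, list[tuple[str, str]]]:
    """Return (caption text, [(entity type, covered text), ...]) for a photo update.

    Hypothetical helper: returns ("", []) when the update is not a messagePhoto.
    """
    content = update.get("message", {}).get("content", {})
    if content.get("@type") != "messagePhoto":
        return "", []
    caption = content.get("caption", {})
    text = caption.get("text", "")
    entities = [
        (e["type"]["@type"], utf16_slice(text, e["offset"], e["length"]))
        for e in caption.get("entities", [])
    ]
    return text, entities
```

For a caption such as "😀 Hello #tdlib", the hashtag entity starts at UTF-16 offset 9, while the same substring starts at index 8 in a Python `str`, which is the kind of mismatch that can make captions or entities look cut off.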

As mentioned in #3167 (even if for Bot API): "Also, apps may need to get some data from TDLib to display, for example, messages or full user profile info," I implemented some strategies to get that missing content while avoiding getMessages (#2605). I'm pasting those below:

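# Excerpt from an async method of the client class. It assumes `import asyncio`,
# a module-level `logger`, and that chat_id, message_id and caption_text
# are already in scope.

# Try multiple strategies to get the complete message
target_message = None

# Each strategy polls td_receive() at most max_attempts times
max_attempts = 3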

# Strategy 1: Use searchChatMessages with a photo filter
if not target_message:
    logger.info("Strategy 1: Using searchChatMessages with empty query for recent photos")
    await self.td_send({
        '@type': 'searchChatMessages',
        'chat_id': chat_id,
        'query': '',  # Empty query to get recent messages
        'sender_id': None,
        'from_message_id': 0,
        'offset': 0,
        'limit': 20,  # Increased limit to find more recent photos
        'filter': {'@type': 'searchMessagesFilterPhoto'},
        'message_thread_id': 0
    })
    
    attempts = 0
    while attempts < max_attempts and not target_message:
        update = await self.td_receive()
        
        if update and update.get('@type') == 'foundChatMessages' and update.get('messages'):
            messages = update.get('messages')
            
            # First try exact ID match
            for msg in messages:
                if msg and msg.get('id') == message_id:
                    target_message = msg
                    logger.info("Found message using Strategy 1 (exact ID match)")
                    break
            
            # If there is no exact match, fall back to a photo whose caption
            # shares a 20-character prefix with the caption we already have
            if not target_message:
                for msg in messages:
                    if msg and msg.get('content', {}).get('@type') == 'messagePhoto':
                        caption = msg.get('content', {}).get('caption', {}).get('text', '')
                        # Check if captions are similar
                        if caption and caption_text and (caption.startswith(caption_text[:20]) or caption_text.startswith(caption[:20])):
                            target_message = msg
                            logger.info("Found message using Strategy 1 (content similarity)")
                            break
        
        attempts += 1
        if not target_message and attempts < max_attempts:
            await asyncio.sleep(0.3)

# Strategy 2: Use getChatHistory as a last resort
if not target_message:
    logger.info("Strategy 2: Using getChatHistory to find recent messages")
    await self.td_send({
        '@type': 'getChatHistory',
        'chat_id': chat_id,
        'from_message_id': message_id,  # Get from the last message received from updates
        'offset': 0,
        'limit': 5,
        'only_local': False
    })
    
    attempts = 0
    while attempts < max_attempts and not target_message:
        update = await self.td_receive()
        
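        if update and update.get('@type') == 'messages' and update.get('messages'):
            messages = update.get('messages')
            # Prefer the message whose ID matches; another photo would carry a different caption
            for msg in messages:
                if msg and msg.get('id') == message_id:
                    target_message = msg
                    logger.info("Found message using Strategy 2 (exact ID match)")
                    break
            # If we can't find exact ID, take the newest photo message
            if not target_message:
                for msg in messages:
                    if msg and msg.get('content', {}).get('@type') == 'messagePhoto':
                        target_message = msg
                        logger.info("Found photo message using Strategy 2 (newest messagePhoto)")
                        break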
        
        attempts += 1
        if not target_message and attempts < max_attempts:
            await asyncio.sleep(0.3)

These strategies don't solve my issues completely, as they sometimes work and sometimes don't. I'm certainly doing something wrong or missing something important. How can I always get complete captions and entities for new messagePhoto retrieved on updateNewMessage?
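
One possible reason for the intermittent results is that each loop above inspects only the next three objects returned by td_receive(), and those may be unrelated updates rather than the response to the request just sent. TDLib's JSON interface copies an `@extra` field from a request into its response, which lets a client match the two. The following is a minimal sketch under that assumption: `td_request` is a hypothetical helper, and `td_send` / `td_receive` stand in for the client's own coroutines, passed in as parameters so the example is self-contained.

```python
import asyncio
import itertools
from typing import Any, Awaitable, Callable, Optional

_request_ids = itertools.count(1)


async def td_request(
    td_send: Callable[[dict[str, Any]], Awaitable[None]],
    td_receive: Callable[[], Awaitable[Optional[dict[str, Any]]]],
    request: dict[str, Any],
    timeout: float = 5.0,
) -> Optional[dict[str, Any]]:
    """Send `request` and return the object whose '@extra' matches, or None on timeout."""
    extra = f"req-{next(_request_ids)}"
    await td_send({**request, "@extra": extra})

    async def wait_for_reply() -> dict[str, Any]:
        while True:
            obj = await td_receive()
            if obj and obj.get("@extra") == extra:
                return obj
            # Unrelated updates are dropped in this sketch; a real client
            # would pass them to its normal update handler instead.
            await asyncio.sleep(0)

    try:
        return await asyncio.wait_for(wait_for_reply(), timeout)
    except asyncio.TimeoutError:
        return None
```

With a helper like this, each strategy would wait for the `foundChatMessages` or `messages` object carrying the matching `@extra` instead of giving up after three arbitrary objects. Whether this fully explains the missing captions is not something the sketch can confirm.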
