Retrieving full caption and entities for messagePhoto. #3265

Closed
@gramern

Hello, I'm building my own Telegram client in Python with TDLib, and I'm having trouble getting complete captions and entities for some messagePhoto messages received via updateNewMessage. As a result, some photo messages show apparently incomplete caption text compared to the native Telegram clients.

As mentioned in #3167 (although in the context of the Bot API): "Also, apps may need to get some data from TDLib to display, for example, messages or full user profile info." Following that, I implemented a few strategies to fetch the missing content while avoiding getMessages (#2605). I'm pasting them below:

# Try multiple strategies to get the complete message
target_message = None

# Retry budget used when polling td_receive() for a response
attempts = 0
max_attempts = 3

# Strategy 1:
if not target_message:
    logger.info("Strategy 1: Using searchChatMessages with empty query for recent photos")
    await self.td_send({
        '@type': 'searchChatMessages',
        'chat_id': chat_id,
        'query': '',  # Empty query to get recent messages
        'sender_id': None,
        'from_message_id': 0,
        'offset': 0,
        'limit': 20,  # Increased limit to find more recent photos
        'filter': {'@type': 'searchMessagesFilterPhoto'},
        'message_thread_id': 0
    })
    
    attempts = 0
    while attempts < max_attempts and not target_message:
        update = await self.td_receive()
        
        if update and update.get('@type') == 'foundChatMessages' and update.get('messages'):
            messages = update.get('messages')
            
            # First try exact ID match
            for msg in messages:
                if msg and msg.get('id') == message_id:
                    target_message = msg
                    logger.info("Found message using Strategy 1 (exact ID match)")
                    break
            
            # If there is no exact match, fall back to a photo whose caption shares a 20-character prefix
            if not target_message:
                # Look for messages with similar content
                for msg in messages:
                    if msg and msg.get('content', {}).get('@type') == 'messagePhoto':
                        caption = msg.get('content', {}).get('caption', {}).get('text', '')
                        # Check if captions are similar
                        if caption and caption_text and (caption.startswith(caption_text[:20]) or caption_text.startswith(caption[:20])):
                            target_message = msg
                            logger.info("Found message using Strategy 1 (content similarity)")
                            break
        
        attempts += 1
        if not target_message and attempts < max_attempts:
            await asyncio.sleep(0.3)

# Strategy 2: Use getChatHistory as last resort
if not target_message:
    logger.info("Strategy 2: Using getChatHistory to find recent messages")
    await self.td_send({
        '@type': 'getChatHistory',
        'chat_id': chat_id,
        'from_message_id': message_id,  # Get from the last message received from updates
        'offset': 0,
        'limit': 5,
        'only_local': False
    })
    
    attempts = 0
    while attempts < max_attempts and not target_message:
        update = await self.td_receive()
        
        if update and update.get('@type') == 'messages' and update.get('messages'):
            messages = update.get('messages')
            for msg in messages:
                if msg and msg.get('content', {}).get('@type') == 'messagePhoto':
                    # No exact-ID check here: take the first photo message returned (the newest in this batch)
                    target_message = msg
                    logger.info("Found photo message using Strategy 2 (newest messagePhoto)")
                    break
        
        attempts += 1
        if not target_message and attempts < max_attempts:
            await asyncio.sleep(0.3)
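
The polling loops above do not tie a response to the request that produced it: td_receive() can return any pending update, so an unrelated update uses up one of the three attempts. Below is a minimal sketch of request/response matching with the `@extra` field of TDLib's JSON interface, which is echoed back in the response to the request that carried it. The names `td_request`, `send`, `receive`, and `max_updates` are hypothetical; `send` and `receive` stand in for the td_send / td_receive methods used above, whose exact behavior is an assumption here.

```python
import asyncio
import itertools

# Hypothetical helper: a process-wide counter used to build unique request tags.
_request_ids = itertools.count(1)


async def td_request(send, receive, query, max_updates=50):
    """Send `query` and return the response that echoes the same '@extra' tag.

    `send` and `receive` are assumed to be async callables that wrap the
    client's td_send / td_receive. Returns None if no matching response
    shows up within `max_updates` received objects.
    """
    tag = f"req-{next(_request_ids)}"
    await send({**query, '@extra': tag})

    for _ in range(max_updates):
        update = await receive()
        # Unrelated updates (or None on timeout) are skipped here; a real
        # client would hand them to its normal update handler instead.
        if update and update.get('@extra') == tag:
            return update
    return None
```

With a helper like this, a `getChatHistory` or `searchChatMessages` call returns its own result rather than whatever object happens to arrive next, so the `@type` checks and the sleep-and-retry loop are no longer needed for matching.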

These strategies work only intermittently, so they don't fully solve the problem. I'm probably doing something wrong or missing something important. How can I reliably get the complete caption and entities for every new messagePhoto received via updateNewMessage?
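
For reference, here is a minimal sketch of reading the caption and its entities straight from the update, assuming the JSON shape `message.content.caption` with `text` and `entities` fields. TDLib documents entity `offset` and `length` in UTF-16 code units, so slicing a Python `str` by those numbers can yield shifted or shortened spans whenever the caption contains characters outside the Basic Multilingual Plane, such as most emoji. Whether that explains the incomplete captions described here is an assumption, not something confirmed. The function name `caption_with_entities` is hypothetical.

```python
def caption_with_entities(update):
    """Return (text, [(entity_type, substring), ...]) for a photo message update.

    Returns None if `update` is not an updateNewMessage carrying a messagePhoto.
    Entity offsets and lengths are interpreted as UTF-16 code units.
    """
    if update.get('@type') != 'updateNewMessage':
        return None
    content = update.get('message', {}).get('content', {})
    if content.get('@type') != 'messagePhoto':
        return None

    caption = content.get('caption') or {}
    text = caption.get('text', '')

    # Encode once; every UTF-16 code unit is exactly 2 bytes in UTF-16-LE.
    utf16 = text.encode('utf-16-le')

    spans = []
    for entity in caption.get('entities', []):
        start = 2 * entity['offset']
        end = start + 2 * entity['length']
        substring = utf16[start:end].decode('utf-16-le')
        spans.append((entity['type']['@type'], substring))
    return text, spans
```

Comparing the output of this function with what the native clients show for the same message may help narrow down whether the missing text is absent from the update itself or lost while rendering it.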
