Loop summary generation #11

Open
lebonq opened this issue Feb 18, 2025 · 4 comments

@lebonq

lebonq commented Feb 18, 2025

I've added miniflux-ai to the Docker Compose setup of my Miniflux instance. Everything works fine except the news summary: every 2–3 minutes, fetch_unread_entries.py is called and generates a summary for every unread entry. However, it runs multiple times on the same article, like this:

Image

miniflux-ai should be able to tell whether an article has already been summarized, to avoid wasting model calls and breaking the layout of the article.

@Qetesh
Owner

Qetesh commented Feb 18, 2025

Thanks for using the project and for the feedback.

The entry_entry method of entry_filter.py is used in the project to filter out articles that have already been summarized. It does so by checking for the title and style_block keywords from the config file.
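To make the idea concrete, here is a minimal sketch of marker-based filtering. It is not the project's actual code: the function name `already_summarized` and the marker list are hypothetical, and it assumes the summary is written into the entry content together with the agent's title text.

```python
def already_summarized(content: str, markers: list[str]) -> bool:
    """Return True if the entry content contains any non-empty marker."""
    return any(marker and marker in content for marker in markers)


# Hypothetical marker: the plain-text part of an agent title.
markers = ["AI summary:"]

print(already_summarized("<p>AI summary: ...</p><p>Article body</p>", markers))
print(already_summarized("<p>Article body</p>", markers))
```

An exact substring match like this only works if the stored content still contains the marker character for character.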

Could you provide the title setting from agents in your configuration file, along with the logs, so that we can identify the issue?

@lebonq
Author

lebonq commented Feb 18, 2025

I extracted entries.json to see what was wrong.

[
    {
        "datetime": "2025-02-18T09:54:13.595325+01:00",
        "category": "Actualités",
        "title": "Guerre en Ukraine : \"Les Européens sont en train de négocier avec les Américains\", assure Nicole Gnesotto, vice-présidente de l'Institut Jacques-Delors",
        "content": "Nicole Gnesotto assure que les Européens négocient avec les Américains. \nIl y a deux négociations : américano-russe et euro-américaine.\nL'Europe doit être associée aux négociations sur l'Ukraine."
    },
    {
        "datetime": "2025-02-18T09:54:13.595325+01:00",
        "category": "Actualités",
        "title": "Guerre en Ukraine : \"Les Européens sont en train de négocier avec les Américains\", assure Nicole Gnesotto, vice-présidente de l'Institut Jacques-Delors",
        "content": "Les Européens négocient avec les Américains sur l'Ukraine.\nL'Europe doit être associée aux négociations pour un accord de paix.\nIl y a deux négociations : américano-russe et euro-américaine."
    }
]
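The two records above share the same title and datetime. A small sketch for spotting such repeats in a larger file, assuming entries.json is a list of objects with a "title" key as in the extract (the helper name `duplicate_titles` is made up for illustration):

```python
import json
from collections import Counter


def duplicate_titles(entries: list[dict]) -> dict[str, int]:
    """Map each title that appears more than once to its occurrence count."""
    counts = Counter(entry["title"] for entry in entries)
    return {title: n for title, n in counts.items() if n > 1}


# Usage, assuming entries.json is in the current directory:
# with open("entries.json", encoding="utf-8") as f:
#     print(duplicate_titles(json.load(f)))
```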

And the title in agents is defined as follows:

 title: '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 17.777 14.283" width="17.777" height="14.283"> <style> path { fill: #4b4b4b; } @media (prefers-color-scheme: dark) { path { fill: #d0d0d0; } } </style> <g fill="#d0d0d0" fill-opacity="1" transform="translate(2.261,-1.754)"> <path d="M-2.261 3.194v6.404c0 1.549 0.957 4.009 4.328 4.188h9.224l0.061 1.315c0.04 0.882 0.663 1.222 1.205 0.666l2.694-2.356c0.353-0.349 0.353-0.971 0-1.331L12.518 10.047c-0.525-0.524-1.205-0.196-1.205 0.665v1.091H2.257c-0.198 0-2.546 0.221-2.546-2.911V3.194c0-0.884-0.362-1.44-0.99-1.44-1.106 0-0.956 1.439-0.982 1.44z"/> </g> <path d="M5.679 1.533h8.826c0.421 0 0.753-0.399 0.755-0.755 0.002-0.36-0.373-0.774-0.755-0.774H5.679c-0.536 0-0.781 0.4-0.781 0.764 0 0.418 0.289 0.764 0.781 0.764zm0 4.693h4.502c0.421 0 0.682-0.226 0.717-0.742 0.03-0.44-0.335-0.787-0.717-0.787H5.679c-0.402 0-0.763 0.214-0.781 0.71-0.019 0.535 0.379 0.818 0.781 0.818z" fill="#d0d0d0"/> </svg> AI summary:'

I checked the code and I think process_entry should also store the id field of each entry, so that entry_filter could filter on the unique ID of each entry instead of trying to match a string that represents the title.
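Roughly what I have in mind, as a sketch only: it assumes each entry dict carries an integer "id" field, and the names `should_summarize` and `processed_ids` are hypothetical, not taken from the project.

```python
def should_summarize(entry: dict, processed: set[int]) -> bool:
    """Return True the first time an entry ID is seen, False afterwards."""
    entry_id = entry["id"]
    if entry_id in processed:
        return False
    processed.add(entry_id)
    return True


processed_ids: set[int] = set()
for entry in [{"id": 101}, {"id": 101}, {"id": 102}]:
    if should_summarize(entry, processed_ids):
        print("summarizing entry", entry["id"])
```

An in-memory set like this is lost when the container restarts, and it marks an entry as processed before the summary has actually succeeded, so a real version would need more care.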

@Qetesh
Owner

Qetesh commented Feb 18, 2025

If a unique ID needs to be saved, data persistence and a database must be considered. Since this project acts as a hotfix on top of miniflux, there would then be two databases, so I will consider this solution for a future version.
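For reference, the persistence in question does not have to be a full database. A rough sketch of the lightest option, a JSON file of processed IDs, is below. The helper names and the file layout are assumptions for illustration, not something the project does today.

```python
import json
from pathlib import Path


def load_ids(path: Path) -> set[int]:
    """Read the set of processed entry IDs, or an empty set if no file exists."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_ids(path: Path, ids: set[int]) -> None:
    """Write the processed entry IDs as a sorted JSON list."""
    path.write_text(json.dumps(sorted(ids)), encoding="utf-8")
```

The file would still need to live on a mounted volume to survive container restarts, and it grows without bound unless old IDs are pruned.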

Based on the configuration file, there shouldn't be any issues.

Also, without a webhook callback configured, fetch_unread_entries.py normally runs once a minute, which does not match the 2–3 minute interval you mentioned.

We need the application's stdout logs to investigate this issue.

@lebonq
Author

lebonq commented Feb 18, 2025

I checked the logs closely, and it does happen every minute. Sorry for the imprecision.
