Handling quotes #35

Open
claell opened this issue Oct 20, 2019 · 8 comments

@claell

claell commented Oct 20, 2019

Currently, quotes seem to be read out just like normal text, which makes them hard to recognize as quotes. It would be nice to have a clearer indication, such as a short introduction announcing that a quote follows.

@petergtz
Owner

Hey @claell, it has always bothered me too that this information gets completely lost. Unfortunately, there is no built-in reading style or similar that signals to Alexa when a quote starts.

So the alternative would be to let her say "quote" (or "Zitat" in German) before the quote starts. I wasn't sure whether this is good English or German, so I have left it untouched so far.

Let me spend a little more time on this. In the meantime, I'll move this issue from "New issues" to backlog.

@claell
Author

claell commented Oct 21, 2019

I guess that, at least in German, saying "Zitat" is fine. Having some kind of indication is better than no hint at all.

@petergtz
Owner

I gave this a little more thought and it turns out (again) not to be a trivial issue :-). You can't just prefix every quote with "quote" or "Zitat". Check e.g. https://de.wikipedia.org/wiki/Baum and see how often quotation marks are used around single words. Now, of course we could build in a heuristic saying e.g. if there's just a single word within the quotes, do not prefix it, and so on. But it seems very error-prone when done naively and could, in a first attempt, actually make the reading much worse.

So I still agree with you that the current behavior is not good, but fixing it is probably much harder than I (or we?) thought :-). Leaving it in the backlog for now. If you have good ideas on how to make it work, I'd love to hear them.
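To make the heuristic above concrete, here is a minimal sketch of how it might look. It is not the skill's actual code: the function name `announce_quotes`, the `min_words` threshold, and the spoken markers ("Quote:" / "End of quote.", "Zitat:" / "Zitat Ende.") are all assumptions chosen for illustration. It leaves short quoted spans (such as single quoted words) untouched and only announces longer ones.

```python
import re

# Hypothetical spoken markers per language; not taken from the skill.
ANNOUNCE = {
    "en": ("Quote:", "End of quote."),
    "de": ("Zitat:", "Zitat Ende."),
}

# Matches text inside German („…“), straight ("…"), or English curly (“…”) marks.
QUOTED = re.compile(r'„([^“]*)“|"([^"]*)"|“([^”]*)”')


def announce_quotes(text, lang="en", min_words=4):
    """Replace the quotation marks around longer quoted spans with spoken markers.

    Spans with fewer than `min_words` words are returned unchanged, since
    they are usually quoted terms rather than real quotations.
    """
    start, end = ANNOUNCE[lang]

    def replace(match):
        # Exactly one of the three groups is set; it holds the quoted text.
        inner = next(g for g in match.groups() if g is not None)
        if len(inner.split()) < min_words:
            return match.group(0)  # keep short spans exactly as they were
        return f"{start} {inner} {end}"

    return QUOTED.sub(replace, text)
```

A word-count threshold is only one possible rule, and it will still misfire on long titles in quotation marks or on short genuine quotes, which is exactly the kind of error the comment above warns about.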

@claell
Author

claell commented Oct 22, 2019

I had this article in mind when creating this issue: https://de.wikipedia.org/wiki/Arthur_Schopenhauer

So I was only thinking about those indented quotes on Wikipedia.

I thought there would be some special "quote" format for them on Wikipedia that can then be detected. However, I am no longer sure whether such a special format exists, or whether it can be detected with the API you currently use. But maybe there is hope :)
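If the indented quotes turn out to be rendered as `<blockquote>` elements in the page HTML (an assumption this thread does not confirm), detecting them could look roughly like the sketch below. It uses only Python's standard `html.parser`; the class and function names are made up for illustration, and it works on an HTML string rather than on the plain-text extracts the skill currently uses.

```python
from html.parser import HTMLParser


class BlockquoteExtractor(HTMLParser):
    """Collects the text content of every top-level <blockquote> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <blockquote> elements we are currently inside
        self.parts = []   # text fragments of the quote being collected
        self.quotes = []  # finished quotes, whitespace-normalized

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join fragments and collapse runs of whitespace.
                self.quotes.append(" ".join("".join(self.parts).split()))
                self.parts = []

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def extract_blockquotes(html):
    """Return the text of each top-level <blockquote> in `html`, in order."""
    parser = BlockquoteExtractor()
    parser.feed(html)
    parser.close()
    return parser.quotes
```

Nested blockquotes are folded into their outermost quote here. Whether this is useful at all depends on getting HTML from the API instead of plain text.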

@petergtz
Owner

> I thought there would be some special "quote" format for them on Wikipedia that can then be detected.

Yes, that would be very nice. But I just noticed that you have actually revealed even more problems. Check out:

https://de.wikipedia.org/w/api.php?format=jsonfm&action=query&prop=extracts&titles=Arthur+Schopenhauer&redirects=true&formatversion=1&explaintext=true&exlimit=1

The indented quotes are not even part of the result of that API call. So currently, these kinds of quotes are completely broken in my skill. Hmmm...
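For anyone who wants to reproduce this check from a script, the query above can be assembled like this. The sketch only builds the URL and reads the `extract` field from an already decoded response; the helper names are invented for illustration, and the default `format=json` is an assumption (the link above uses `jsonfm`, the pretty-printed variant meant for viewing in a browser).

```python
from urllib.parse import urlencode

# Endpoint taken from the URL above (German Wikipedia).
API_ENDPOINT = "https://de.wikipedia.org/w/api.php"


def build_extract_url(title, fmt="json", endpoint=API_ENDPOINT):
    """Build a TextExtracts query URL with the same parameters as the link above."""
    params = {
        "format": fmt,
        "action": "query",
        "prop": "extracts",
        "titles": title,
        "redirects": "true",
        "formatversion": "1",
        "explaintext": "true",
        "exlimit": "1",
    }
    return f"{endpoint}?{urlencode(params)}"


def extracts_from_response(data):
    """Return the plain-text extracts from a decoded formatversion=1 response.

    With formatversion=1, `query.pages` is an object keyed by page ID.
    """
    pages = data.get("query", {}).get("pages", {})
    return [page.get("extract", "") for page in pages.values()]
```

Fetching the URL, decoding the JSON, and then testing whether a sentence known to be inside one of the indented quotes occurs in the returned extract would show whether the quote was dropped.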

@claell
Author

claell commented Oct 22, 2019

Oh, really? That would explain why the read-out text felt a bit wrong there.

Maybe another bug in the Wikipedia API.

@petergtz
Owner

Probably. Interestingly, while the other bugs are listed on the extension page itself, the missing quotes do not seem to be mentioned there: https://www.mediawiki.org/wiki/Extension:TextExtracts#Caveats

@claell
Author

claell commented Oct 23, 2019

I saw the same. So it is probably a new bug. I filed a report for it: https://phabricator.wikimedia.org/T236283, although they will probably not fix this one either.


2 participants