Broken links #4

pvgladkov · 2019-09-10T10:57:06Z

I see too many broken links in train_set.csv. Of 1,027,871 images, I was able to download only 565,002. I would like to use this dataset as a benchmark for comparing different approaches (including yours), but your evaluation method assumes that all images are present.
Could you provide the full dataset?

abby621 · 2019-09-10T14:58:06Z

Expedia seems to be in the process of changing their URL formats. We are going through to locate updated URLs for the broken images using the new URL format, and will post an updated train_set.csv as soon as it's ready. Apologies for the current broken images!

pvgladkov · 2019-09-10T15:00:05Z

Great! Thanks a lot!

bkj · 2019-10-17T20:02:28Z

Any updates on this? I'd like to download the dataset, but I'm hitting a large number of broken links as well.

Alternatively -- do you have a .tar.gz of the dataset that you'd be able to share?

Thanks!
~ Ben

av-savchenko · 2019-12-27T07:03:15Z

Thanks for gathering this dataset!
However, the issue with broken URLs still seems to be unresolved. I successfully downloaded only 250,463 images. Do you have any updates? Is it possible to share all images as suggested in the previous comment?

abby621 · 2019-12-27T15:32:20Z

Hi! For copyright reasons, we cannot release the specific images. We have been trying to determine if there is a new mapping for the broken images, but that does not seem to be the case. We will be releasing an updated dataset and report on results, and are working to see if we can get permission to share actual images rather than URLs.

Apologies for the delays; I got caught up in my first semester as a professor and this has taken longer for me to resolve than I had hoped/expected.

virginianegri · 2020-02-09T10:04:39Z

Hi! Are there any updates on this? Is there a projected date for the release of the updated dataset? I would like to use this as part of my master's thesis project.
Thank you!!

Pyzow · 2020-03-20T17:53:27Z

+1, curious about an update. Let me know if there's any way that I can assist.

abby621 · 2020-04-30T00:23:52Z

Hi all! Apologies for the delayed update.

The repository has been updated with valid, downloadable imagery (the specific updated files are the dataset files in input/dataset.tar.gz and the test image tarball, which has an updated link in the repository). Due to copyright issues, we still provide only links for the training imagery, which has to be downloaded (the download_train.py file has also been updated to support downloading the updated imagery). This means that there remains the possibility that the travel website imagery may move again in the future. We are working to see if we can work out a solution to this with the imagery providers, but in the meantime, we hope that we have a functional solution for the foreseeable future.

There were a small number of the hotels from the original test set that no longer had any valid gallery images (due to there no longer being any working travel website images). Those test images have been deleted from the test set. There were also a few hundred training hotels that no longer had valid imagery. We have replaced those with new classes, leaving the number of classes in the gallery at 50,000.

I will be posting updated retrieval and classification results in the coming weeks. My hypothesis is that they won't be hugely different from those reported in the paper, but we will make sure to include the results in the repository, both for the method described in the Hotels-50K AAAI paper and for the new state-of-the-art approach using Easy Positive Triplet Mining (presented at WACV 2020, https://arxiv.org/abs/1904.04370).
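
Since the training images are still distributed as links that may move again, it can help to audit how many URLs in the CSV actually ended up on disk after a download. The following is only a rough sketch, not part of the repository: the function name `download_coverage`, the position of the URL column, and the idea that each image is saved under the last path segment of its URL are all assumptions, so adjust them to match whatever download_train.py really writes.

```python
import csv
import io
from pathlib import Path


def download_coverage(csv_text, image_dir, url_column=2):
    """Return (found, total, missing_urls) for the URLs listed in csv_text.

    Assumptions (not taken from the repository): each row holds an image
    URL at index `url_column`, and every downloaded image is stored in
    `image_dir` under the last path segment of its URL.
    """
    image_dir = Path(image_dir)
    found, total, missing = 0, 0, []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= url_column:
            continue  # skip blank or malformed rows
        url = row[url_column]
        total += 1
        # Hypothetical naming scheme: file name = last URL path segment.
        local = image_dir / url.rsplit("/", 1)[-1]
        # Treat zero-byte files as failed downloads.
        if local.is_file() and local.stat().st_size > 0:
            found += 1
        else:
            missing.append(url)
    return found, total, missing
```

Called as `download_coverage(Path("train_set.csv").read_text(), "images/")` (both paths are placeholders), it returns the number of usable files, the number of URLs listed, and the URLs with no usable file. With the counts reported at the top of this thread, that would be 565,002 of 1,027,871, or roughly 55%.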
