Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

daw_json_link's internal iterators behave differently than standard iterators #432

Open
pkerichang opened this issue Jul 16, 2024 · 4 comments

Comments

@pkerichang
Copy link

pkerichang commented Jul 16, 2024

so I'm still trying to get my custom map data structure working as described in Issue 428, I have the modified simple example as below:

#include <daw/daw_ensure.h>
#include <daw/json/daw_json_link.h>

#include <iostream>
#include <iterator>
#include <type_traits>
#include <vector>

template <typename Iter> std::size_t my_distance(Iter first, Iter last) {
    std::size_t count = 0;
    while (first != last) {
        // *first;  // issue 1
        ++first;
        ++count;
        if (count > 1000) {
            throw std::runtime_error("oops! more than 1000 elements?!");
        }
    }
    return count;
}

template <class Key, class T> struct vector_map {
    using value_type = std::pair<Key, T>;
    using vector_type = std::vector<value_type>;
    using mapped_type = T;
    using key_type = Key;
    using reference = typename vector_type::reference;
    using const_reference = typename vector_type::const_reference;
    using iterator = typename vector_type::iterator;
    using const_iterator = typename vector_type::const_iterator;

  private:
    vector_type data_;

  public:
    vector_map() = default;

    template <typename Iter> vector_map(Iter first, Iter last) {
        data_.reserve(my_distance(first, last));  // issue 2
        while (first != last) {
            auto [k, v] = *first;
            emplace(std::move(k), std::move(v));
            ++first;
        }
    }
    template <typename Container,
              std::enable_if_t<not std::is_same_v<vector_map, daw::remove_cvref_t<Container>>,
                               std::nullptr_t> = nullptr>
    vector_map(Container &&values) : vector_map(std::begin(values), std::end(values)) {}

    bool operator==(vector_map const &rhs) const { return data_ == rhs.data_; }

    const_iterator begin() const { return data_.begin(); }

    const_iterator end() const { return data_.end(); }

    template <class... Args> std::pair<iterator, bool> emplace(Args &&...args) {
        data_.emplace_back(std::make_pair(std::forward<Args>(args)...));
        return std::make_pair(data_.end() - 1, true);
    }
};

namespace daw::json {
template <class Key, class T> struct json_data_contract<vector_map<Key, T>> {
    using class_type = vector_map<Key, T>;
    using type = json_type_alias<json_key_value_no_name<class_type>>;
};
} // namespace daw::json

void json_test(std::string_view data_view) {
    auto obj = daw::json::from_json<vector_map<std::string, int>>(data_view);
    auto obj_json = daw::json::to_json(
        obj, daw::json::options::output_flags<daw::json::options::SerializationFormat::Pretty>);

    std::cout << obj_json.c_str() << std::endl;
}

int main() {
    auto v = vector_map<std::string, int>{};
    v.emplace("Hello", 42);
    v.emplace("World", 24);
    auto json_doc = daw::json::to_json(v);
    json_test(json_doc);
    auto v2 = daw::json::from_json<vector_map<std::string, int>>(json_doc);
    daw_ensure(v == v2);
}

The only difference from the example you gave in Issue 428 is that I try to pre-allocate the underlying vector with the correct size, by calling my_distance(first, last), which is a modified dumb version of std::distance().

With this code, my_distance() raises a runtime error, as it goes in an infinite loop because it seems like ++first does nothing unless you evaluate the iterator by calling *first first. This is different than how iterators work by convention. In fact, this is why std::distance() will not work on these iterators.

If I uncomment the *first line, then this code will raise a daw::json::v3_24_0d::json_exception, with the error Unexpected end of data. This implies that even though the iterators are pass-by-value to the my_distance() function, incrementing the iterator copies impacted the original iterator. Again, this is different than how iterators work by convention.

In summary:

  • C++ iterators can be incremented without evaluation, but these iterators cannot.
  • C++ iterators can be copied and incremented/evaluated while still keeping the original iterators intact, but these iterators cannot.

Since this is very non-standard, if things have to stay this way, it should be clearly documented. Also, is there anyway I can find out the size of the iterator range, even if it takes linear time to do so?

@beached
Copy link
Owner

beached commented Jul 22, 2024

The documentation could be better here, but it has never been an issue either as it's intended to fill containers which will only call distance on forward iterators which they sometimes are. Right now, it allows much cheaper filling. I could add an option to fulfill input fully however, but in the mean time a distance like the following would work

std::ptrdiff_t distance( auto first, auto last ) {
  std::ptrdiff_t count = 0;
  while( first != last ) {
    *first;
    ++count;
    ++first; // so it doesn't look weird
  }
  return count;
}

It will generally not be best to reserve the vector based on the size here though, unless the iterator passed to your constructor is forward. It sometimes is when the size is known because that member was in a different order to that needed. The cost of parsing will probably be less than that of allocation. What is often fruitful is to guess, in the static size case that probably isn't needed though as there is no allocation.

@pkerichang
Copy link
Author

Ah, it's been quite a while since I looked at iterators in C++, for some reason I completely forgot about input iterators (iterators which only support a single pass). Sorry about the confusion. So the only difference between daw_json_link's iterator and conventional input iterators is that you must dereference the iterator before incrementing. Probably not a big deal, but also a good idea to mention in the documentation.

Does iterators in daw_json_link support iterator_traits/iterator_category, so I can use it to figure out when the iterator arguments are forward v.s. input, and thus I can preallocate memory if possible? Not a big deal, just curious.

@beached
Copy link
Owner

beached commented Jul 22, 2024

The iterator does not fullfill input either because of op++ and equality comparable. They do work in all 3 of the standard implementations however for constructing/inserting into containers. Adding this as an option isn’t much, it just hasn’t been needed.

@pkerichang
Copy link
Author

Sounds fine to me, should I close this unless you want to keep track of documentation update?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants