Introduce Extensions concept to object_store::GetOptions and object_store::PutOptions #17

Closed
apache/arrow-rs
#7170
@waynr

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

This problem is roughly described in apache/arrow-rs#7135. Essentially, we are looking for a way to pass arbitrary implementation-specific data (such as configuration or tracing spans) from a high-level query API down to ObjectStore implementations (e.g. caching or metrics-oriented wrappers).

Describe the solution you'd like

Here we propose extending the existing GetOptions and PutOptions types with a similarly extensible context/session type. @crepererum suggested something like the following:

use std::any::{Any, TypeId};
use std::collections::HashMap;

struct Extensions {
    inner: HashMap<TypeId, Box<dyn Extension>>,
}

impl Extensions {
    // Look up the extension stored under the type `T`, if any.
    pub fn get<T>(&self) -> Option<&T> where T: Extension {
        self.inner.get(&TypeId::of::<T>()).map(|ext| {
            ext.as_any().downcast_ref().expect("correct type IDs are enforced by the compiler")
        })
    }

    // Store `ext`, returning the previously stored extension of the same type, if any.
    pub fn set<T>(&mut self, ext: T) -> Option<Box<dyn Extension>> where T: Extension {
        self.inner.insert(TypeId::of::<T>(), Box::new(ext))
    }
}

impl PartialEq for Extensions {
    // ...
}

// `PartialEq<Self>` cannot be a supertrait of a trait used as `dyn Extension`,
// so comparison would need an object-safe method instead.
trait Extension: std::fmt::Debug + 'static {
    fn as_any(&self) -> &dyn Any;
}

// other module
pub struct GetOptions {
   // current stuff
   // ...

   extensions: Extensions,
}

One downside of this approach is that there is no way to pass GetOptions to methods like get_ranges and get_range, or PutOptions to put_multipart.
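
To make the proposal above more concrete, here is a minimal, self-contained sketch of a type-keyed extensions map and how a wrapper might read from it. It is an illustration only, not the object_store API: it stores values as `Box<dyn Any + Send + Sync>` instead of using the `Extension` trait from the suggestion, and the `GetOptions` stand-in, the `SpanName` payload, and the `span_name_or_default` helper are all hypothetical names invented for this example.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical type-keyed map; a sketch, not the actual object_store type.
#[derive(Default)]
pub struct Extensions {
    inner: HashMap<TypeId, Box<dyn Any + Send + Sync>>,
}

impl Extensions {
    /// Borrow the stored value of type `T`, if one was set.
    pub fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.inner
            .get(&TypeId::of::<T>())
            .and_then(|ext| ext.downcast_ref::<T>())
    }

    /// Store `ext`, returning the previously stored value of the same type.
    pub fn set<T: Any + Send + Sync>(&mut self, ext: T) -> Option<T> {
        self.inner
            .insert(TypeId::of::<T>(), Box::new(ext))
            .and_then(|old| old.downcast::<T>().ok())
            .map(|boxed| *boxed)
    }
}

/// Hypothetical stand-in for `GetOptions`; the real struct has other fields.
#[derive(Default)]
pub struct GetOptions {
    pub extensions: Extensions,
}

/// Hypothetical payload a caller might attach, e.g. the name of a tracing span.
#[derive(Debug, PartialEq)]
pub struct SpanName(pub String);

/// Hypothetical wrapper-side logic: use the span name if the caller attached one.
pub fn span_name_or_default(options: &GetOptions) -> String {
    options
        .extensions
        .get::<SpanName>()
        .map(|span| span.0.clone())
        .unwrap_or_else(|| "unknown".to_string())
}

fn main() {
    let mut options = GetOptions::default();
    // Nothing attached yet, so the wrapper falls back to its default.
    println!("{}", span_name_or_default(&options));

    // The high-level query API attaches context before calling the store.
    options.extensions.set(SpanName("query-42".to_string()));
    println!("{}", span_name_or_default(&options));
}
```

Because the map is keyed by `TypeId`, each extension type occupies a single slot, and wrappers that do not know about a given type simply never look it up.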

Describe alternatives you've considered

In apache/arrow-rs#7135 the proposal was to introduce new ObjectStore methods that take an extensible context/session type that could hold arbitrary data. This was considered too heavy in terms of the additional trait methods. This approach would have supported contextualizing get_ranges and get_range. From my point of view as a user of the ObjectStore API, this would be the ideal approach since it makes context passing an explicit and easily discoverable part of the ObjectStore API.

Another alternative that has been discussed would be to initialize ObjectStore wrappers with the additional context needed at the outset of a query call rather than adding the context at the point where ObjectStore methods themselves are called. For the purpose of something like propagating tracing spans, this approach is less than ideal due to the spans not being properly situated in the hierarchy of spans built in the course of setup, planning, and execution of queries.

Additional context
