Complete validation of the data instance #106

Open
gusmb opened this issue Jun 19, 2021 · 11 comments

Comments

@gusmb

gusmb commented Jun 19, 2021

This is a feature request. With the current implementation of the validation logic, the function raises an exception reporting the first validation error found. If there are multiple errors in the data instance, the only way to find them all is to correct them one by one and run the validation method again each time. It would be nice to have a validation option that performs a full validation and returns a list of all validation errors found in the instance.
Since the current method returns None or raises a single exception, maybe the best way would be to have a new method validate_all(), which would return a tuple (result, errors[]). Here result would be 0 for a successful validation and non-zero otherwise, and the error list would be empty when result=0 and populated with the validation errors otherwise.
I think this could be useful in cases where yangson is being used to validate data instances for custom service data models, where the error messages are also customized for YANG leaves that must follow certain patterns.
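
To make the proposed return contract concrete, here is a minimal sketch. Nothing in it is Yangson API: `validate_all`, `FakeValidationError`, and `FakeNode` are hypothetical stand-ins, and the sketch assumes each node exposes a `validate()` method that raises on error.

```python
# Hypothetical sketch of the proposed validate_all() contract.
# None of these names exist in Yangson; they only illustrate the idea.

class FakeValidationError(Exception):
    """Stand-in for a library validation exception."""


class FakeNode:
    """Stand-in for an instance node whose validate() may raise."""

    def __init__(self, name, valid=True):
        self.name = name
        self.valid = valid

    def validate(self):
        if not self.valid:
            raise FakeValidationError(f"{self.name}: invalid value")


def validate_all(nodes):
    """Validate every node and return (result, errors).

    result is 0 when no errors were found and non-zero otherwise;
    errors is the list of exception objects that were raised.
    """
    errors = []
    for node in nodes:
        try:
            node.validate()
        except FakeValidationError as exc:
            # Collect the error and keep going instead of stopping here.
            errors.append(exc)
    return (1 if errors else 0, errors)


result, errors = validate_all(
    [FakeNode("a"), FakeNode("b", valid=False), FakeNode("c", valid=False)]
)
print(result, [str(e) for e in errors])
```

The caller can then branch on `result` or simply check whether `errors` is empty.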

@christian-herber

I can relate to that. To get a list of all errors, I am currently iterating through the data tree, running validation at all levels, and eliminating duplicates at the end. It works, but a built-in way to do that would be great.

@llhotka
Member

llhotka commented Jun 22, 2021

I had been thinking about this, too. But what should be the result? A list of exception objects?

@christian-herber

> I had been thinking about this, too. But what should be the result? A list of exception objects?

From my point of view, exactly that.

@gusmb
Author

gusmb commented Jun 22, 2021

That is my point of view too. Empty result for successful validation, or a list of exception objects otherwise.

@gusmb
Author

gusmb commented Jun 22, 2021

OK, so it looks pretty straightforward:

    from yangson.exceptions import SchemaError, SemanticError, YangTypeError

    def _validate_subtree(instance_data, errors=None):
        """Validate data instance tree at all levels

        :param instance_data: instance data object
        :param errors: set collecting the errors found so far

        :return: a set of Validation exception objects
        """
        # Create the set once, up front, so that every recursive call
        # adds to the same object instead of a local one that is lost.
        if errors is None:
            errors = set()
        for item in instance_data._children():
            try:
                item.validate()
            except (YangTypeError, SchemaError, SemanticError) as validationError:
                errors.add(validationError)
            _validate_subtree(item, errors)
        return errors

With this function I obtain a set of exception objects that I can use.

@christian-herber

You can check my solution here: https://github.com/christian-herber/yanggui/blob/47e34458b5c4300e4375de29166f33d27270ce6e/yanggui/dsrepo.py#L117
It is pretty brute force, and you also end up with duplicate errors, which is why I have another function to remove all those duplicates.

@gusmb
Author

gusmb commented Jun 23, 2021

That's right about the duplicated exceptions. I even tried using a set instead of a list to collect the exception objects and still got duplicates, because the set stores different objects with the same message. The problem is using the existing validate method in the library: each call also validates everything below the node, so for large hierarchies finding all errors can be a slow and inefficient process. A better way would be to implement a validation filter that only considers the nodes in the current hierarchy level and not the children nodes. That way a simple recursive function could iterate over the data and validate each node only once.
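
The duplicates survive in a set because exception objects hash and compare by identity, so two distinct objects with the same message are both kept. A minimal sketch of deduplicating by exception type and message instead follows. `dedupe_errors` is a hypothetical helper, not part of Yangson, and plain `ValueError` objects stand in for validation exceptions; whether type plus message is the right key for real validation errors is an assumption.

```python
def dedupe_errors(errors):
    """Return errors with duplicates removed, keeping first occurrences.

    Two errors count as duplicates when they have the same type and
    the same string message.
    """
    seen = set()
    unique = []
    for err in errors:
        key = (type(err), str(err))
        if key not in seen:
            seen.add(key)
            unique.append(err)
    return unique


# Stand-in errors: two distinct objects share one message.
first = ValueError("bad value at /a/b")
second = ValueError("bad value at /a/b")
other = ValueError("bad value at /a/c")

print(len({first, second, other}))                 # 3: the set keeps both copies
print(len(dedupe_errors([first, second, other])))  # 2: one entry per message
```

This only hides the repeated reports; it does not avoid the cost of validating the same subtree several times.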

@gusmb
Author

gusmb commented Jun 23, 2021

As a workaround, I came up with this function, which validates each node only once:

    from yangson.exceptions import SchemaError, SemanticError, YangTypeError

    def validate_subtree(instance):
        """Validate data instance tree at all levels

        :param instance: Yangson instance node

        :return: a set of Validation exception objects
        """
        errors = set()
        if not instance._children():
            # No children: validate this node directly.
            try:
                instance.validate()
            except (YangTypeError, SchemaError, SemanticError) as validationError:
                errors.add(validationError)
        else:
            # Has children: recurse instead of validating the node itself.
            for item in instance._children():
                errors.update(validate_subtree(item))
        return errors

I then pass the root node to that function, and it iterates over the children. It only performs validation for nodes that have no children of their own; otherwise it calls itself again to go a level deeper.

Doing it this way, there are also no duplicates in the error objects.

@gusmb
Author

gusmb commented Jun 23, 2021

Sent a new method proposal for review via PR:
#108

The get_error_list() method in instance.py uses the validate() method to collect the validation errors at each hierarchy level and for all node types, except for a list of node types that should be validated as a block. Calling the ._children() method on those node types fails with the error AttributeError: '<NodeType>' object has no attribute '_default_value'.

@gusmb
Author

gusmb commented Nov 29, 2022

As per the comments on #108 (now closed), this won't be easy to implement properly: it requires dealing with schema patterns. See the PR comments. Keeping this issue open, as it is a desirable feature.

@gusmb
Author

gusmb commented Mar 4, 2024

Hello @llhotka, related to this issue, is there any plan to add the "lazy" validation capability to Yangson? Data validation against a YANG schema is one of the highlight features of this library, but the fact that it stops at the first error it encounters makes validation and correction iterations a lengthy and inefficient process. I recall the solution was based on dealing with schema patterns. Is this still relevant?
Thanks,
