This change includes several optimizations to the validation performed during encoding in the Ruby implementation. For a use case with a few levels of nesting and unions in several places within the schema, we saw a 5x improvement in encoding performance with these changes.
The main changes are:
1. Avoid the exhaustive validation of schemas in a union. Previously a datum was tested against all schemas in a union even though the failures were unused if a compatible schema was found. Now validation stops when the first compatible schema is found, but all failures are still available if there is no compatible type.
2. Avoid the repeated validation of nested schemas. Previously, the datum was recursively validated against the schema prior to encoding. Then during encoding, each complex field (record, array, map, union) was recursively validated again. Thus each field was validated a number of times equal to its level of nesting plus one. This change introduces an option for validation not to recurse. Since encoding proceeds recursively, validation is instead performed as each level is encoded.
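The union change in item 1 can be illustrated with a minimal sketch. This is a toy, not the code in this pull request: the `ToyUnion` module, its `validate_union` method, and the use of plain Ruby classes as "schemas" are all assumptions made for illustration.

```ruby
# Hypothetical toy validator, not the Avro gem's implementation.
# A "schema" here is just a Ruby class; a union is an array of such classes.
module ToyUnion
  # Returns [matched_branch, failures]. Stops at the first compatible branch,
  # so later branches are never tested once a match is found.
  def self.validate_union(branches, datum)
    failures = []
    branches.each do |branch|
      return [branch, []] if datum.is_a?(branch)
      failures << "expected #{branch}, got #{datum.class}"
    end
    # No branch matched: every failure collected along the way is reported.
    [nil, failures]
  end
end

# A match on the second branch means the third branch is never checked.
matched, failures = ToyUnion.validate_union([NilClass, String, Integer], "abc")

# No compatible branch: one failure per branch is returned.
unmatched, all_failures = ToyUnion.validate_union([NilClass, String], 42)
```

The failures gathered before a match are simply discarded, which mirrors the description above: they are only needed when no branch is compatible.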
Other minor improvements:
- delay creating error messages until they are required
- use explicit type checks instead of dynamic code (`&method(:is_a?)`)
- additional use of constants
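As a rough illustration of these three points, the sketch below defers building an error string until it is read, uses a direct `is_a?` call, and hoists a range into a constant. The names `LazyFailure`, `check_int`, and `INT_RANGE` are hypothetical and do not come from this pull request; the exact expressions that were replaced are not shown in this description.

```ruby
# Hypothetical sketch; names are illustrative, not taken from the Avro gem.

# Constant built once instead of on every validation call.
INT_RANGE = (-2**31..2**31 - 1).freeze

class LazyFailure
  def initialize(&message_builder)
    @message_builder = message_builder
  end

  # The string is built (and memoized) only when someone asks for it.
  def message
    @message ||= @message_builder.call
  end
end

# Returns nil when the datum is a valid 32-bit int, otherwise a LazyFailure.
def check_int(datum)
  # Explicit type check, rather than passing a Method object such as
  # datum.method(:is_a?) to an enumerable helper.
  return nil if datum.is_a?(Integer) && INT_RANGE.cover?(datum)
  LazyFailure.new { "expected int, got #{datum.inspect}" }
end

failure = check_int("x")
puts failure.message if failure
```

When a failure is discarded without being reported, as can happen for union branches that are tried before a match, the interpolation inside the block never runs.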
The only additional tests in this change demonstrate that validation without recursion returns the same results for "simple" fields and no validation errors for complex fields that would require recursion.
The updated methods for `Avro::Schema.validate` and `Avro::SchemaValidator.validate!` were implemented to take an options hash with the new `:recursive` option in anticipation of eventually being combined with logical type support (https://github.com/apache/avro/pull/116) which would specify whether the datum is already `:encoded`.
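The behaviour described for the `:recursive` option can be sketched with a toy validator that takes an options hash. This is not the signature or implementation of `Avro::Schema.validate`; the `ToyValidator` module and its symbol-and-hash schema format are assumptions made for illustration only.

```ruby
# Hypothetical toy schema format: :int, :string, or
# { record: { "field name" => schema } }. Not the Avro gem's API.
module ToyValidator
  def self.validate(schema, datum, options = {})
    recursive = options.fetch(:recursive, true)
    case schema
    when :int    then datum.is_a?(Integer)
    when :string then datum.is_a?(String)
    when Hash
      fields = schema.fetch(:record)
      return false unless datum.is_a?(Hash)
      # Without recursion only the container itself is checked; nested
      # fields are left for the encoder to validate as it descends.
      return true unless recursive
      fields.all? { |name, field_schema| validate(field_schema, datum[name], options) }
    else
      raise ArgumentError, "unknown schema: #{schema.inspect}"
    end
  end
end

schema = { record: { "id" => :int, "name" => :string } }
bad    = { "id" => "not an int", "name" => "x" }

full    = ToyValidator.validate(schema, bad)                    # checks nested fields
shallow = ToyValidator.validate(schema, bad, recursive: false)  # checks only the record itself
```

Here `full` is `false` because the nested `"id"` field fails, while `shallow` is `true` because the non-recursive call stops at the record. Simple schemas such as `:int` give the same result either way, which is the property the added tests check.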
These changes have been tested against:
You can merge this pull request into a Git repository by running:
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #230
Author: Tim Perkins <[hidden email]>
Ruby encoding performance improvements