[jira] [Commented] (AVRO-1933) SchemaCompatibility class could be more user-friendly about incompatibilities


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/AVRO-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867887#comment-15867887 ]

Elliot West commented on AVRO-1933:

I agree with the motivations behind this effort. There is a lot of room for improvement with regard to the reporting of incompatibilities within schemas. This is especially important as the end-user focus of Avro schema development shifts from developers to analysts, and I know of a few projects that are heading in this direction. For example, one can envisage systems built upon Avro that have web-based UI tooling for schema submission.

In these cases it would be very handy to model incompatibilities in a way that would later enable us to make statements such as:

* "_Field 'uuid' in record 'namespace.record_name' in the existing schema cannot be read by the new schema because it has changed to an incompatible type; was 'string', now 'long'_"
* "_Field 'address' in record 'namespace.record_name' in the new schema does not exist in the earlier schema and does not declare a default_"
* "_Field union branch 1 at 'namespace.record_name.uField' in the earlier schema does not exist in the new schema_"

Certainly these conditions should be generated as an accessible model and not simply returned as a string message, so that integrations with editors etc. will be possible.
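
As a rough illustration of what such a model might look like, here is a minimal Java sketch. Every name in it ({{IncompatibilityKind}}, {{Incompatibility}}, {{describe()}}) is hypothetical and is not taken from Avro or from the attached patch. It only aims to show the incompatibility kept as structured data, with the human-readable message derived from it on demand.

```java
// Hypothetical sketch: none of these types exist in Avro or in the attached patch.
enum IncompatibilityKind {
    TYPE_MISMATCH,
    READER_FIELD_MISSING_DEFAULT,
    MISSING_UNION_BRANCH
}

final class Incompatibility {
    final IncompatibilityKind kind;
    final String location;   // e.g. "namespace.record_name.uuid"
    final String writerType; // type in the earlier schema; null when not applicable
    final String readerType; // type in the new schema; null when not applicable

    Incompatibility(IncompatibilityKind kind, String location,
                    String writerType, String readerType) {
        this.kind = kind;
        this.location = location;
        this.writerType = writerType;
        this.readerType = readerType;
    }

    // The message is derived from the model, so tooling can ignore it
    // and work with the structured fields instead.
    String describe() {
        switch (kind) {
            case TYPE_MISMATCH:
                return "'" + location + "' has changed to an incompatible type; was '"
                        + writerType + "', now '" + readerType + "'";
            case READER_FIELD_MISSING_DEFAULT:
                return "'" + location
                        + "' does not exist in the earlier schema and does not declare a default";
            case MISSING_UNION_BRANCH:
                return "'" + location
                        + "' in the earlier schema does not exist in the new schema";
            default:
                throw new IllegalStateException("Unhandled kind: " + kind);
        }
    }
}
```

An editor integration could then switch on {{kind}} and {{location}} directly, while a command-line tool could simply print {{describe()}}.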

Although this patch is moving in the right direction, I think it could be further improved as follows:
# Instead of providing a separate implementation of the compatibility rules, extend the implementation provided in {{org.apache.avro.io.parsing.ResolvingGrammarGenerator}}, or refactor in such a way that they are declared only once.
# {{SchemaCompatibilityDetails}} should include a property that describes the fully qualified path to the incompatible node in the schema tree. Initially this would help users navigate to the problematic location in the schema. Later it would allow future tooling to highlight the relevant sections of the schema source code.
# Perhaps {{SchemaCompatibilityResult}} would be a better name for {{SchemaCompatibilityDetails}}.
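
To make the second point more concrete, the following is a minimal sketch of how a path could be tracked while walking the two schemas. It assumes a deliberately simplified, hypothetical schema model ({{Node}}) rather than Avro's own {{Schema}} class, and it treats any difference in type name as incompatible. The real resolution rules are richer (type promotion, defaults, unions), so this only shows the path bookkeeping, not the compatibility rules themselves.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified schema node: a type name plus named child fields.
final class Node {
    final String type; // e.g. "string", "long", "record"
    final Map<String, Node> fields = new LinkedHashMap<>();

    Node(String type) {
        this.type = type;
    }

    Node field(String name, Node child) {
        fields.put(name, child);
        return this;
    }
}

// Hypothetical checker that records the path to each mismatching node.
final class PathTrackingChecker {
    private final Deque<String> path = new ArrayDeque<>();
    private final List<String> problems = new ArrayList<>();

    static List<String> check(String rootName, Node writer, Node reader) {
        PathTrackingChecker checker = new PathTrackingChecker();
        checker.path.addLast(rootName);
        checker.visit(writer, reader);
        return checker.problems;
    }

    private void visit(Node writer, Node reader) {
        if (!writer.type.equals(reader.type)) {
            // Simplification: any type difference is reported as incompatible.
            problems.add(String.join("/", path) + ": was '" + writer.type
                    + "', now '" + reader.type + "'");
            return;
        }
        for (Map.Entry<String, Node> entry : reader.fields.entrySet()) {
            Node writerField = writer.fields.get(entry.getKey());
            if (writerField == null) {
                continue; // missing-field and default handling omitted in this sketch
            }
            path.addLast(entry.getKey()); // descend: extend the path
            visit(writerField, entry.getValue());
            path.removeLast();            // ascend: restore the path
        }
    }
}
```

Because the path is kept as a list of segments rather than a pre-formatted string, later tooling could map each segment back to a location in the schema source.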

> SchemaCompatibility class could be more user-friendly about incompatibilities
> -----------------------------------------------------------------------------
>                 Key: AVRO-1933
>                 URL: https://issues.apache.org/jira/browse/AVRO-1933
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.8.1
>         Environment: Any Java env
>            Reporter: Anders Sundelin
>            Priority: Minor
>             Fix For: 1.9.0
>         Attachments: AVRO-1933-compatible-with-AVRO-1931.patch, AVRO-1933.patch
>   Original Estimate: 1h
>  Remaining Estimate: 1h
> Today, the class SchemaCompatibility reports incompatibilities with very little detail. The whole reader schema and the whole writer schema are listed, with no particular detail about what was incompatible.
> The attached patch fixes this, introducing a new enum (SchemaIncompatibilityType), and more specific sub-schemas that were incompatible.
> The old, overall picture is still there; the new compatibility state is encapsulated in the SchemaCompatibilityDetails class.
> Lots of test cases have been added, and there has been refactoring done in the TestSchemaCompatibility and other test classes.
