Implementation of compatibility rules

Elliot West
Hi,

I've been attempting to understand the implementation of Avro schema compatibility rules and am slightly confused by the structure of the code. It seems that there are at least two possible entry points:
  • org.apache.avro.SchemaCompatibility.checkReaderWriterCompatibility(Schema, Schema)
  • org.apache.avro.SchemaValidatorBuilder
The code paths of these do not seem to intersect, with one implementing a static set of rule checks and the other seemingly delegating to a grammar-based approach. Does this imply that there are in fact two implementations of the compatibility rules?

Apologies if this is a naïve question.

Thanks,

Elliot.

Re: Implementation of compatibility rules

Elliot West
Further to this, is there any reason why, conceptually, the implementation of org.apache.avro.ValidateMutualRead.canRead(Schema, Schema) could not be changed from:

  static void canRead(Schema writtenWith, Schema readUsing)
      throws SchemaValidationException {
    boolean error;
    try {
      error = Symbol.hasErrors(new ResolvingGrammarGenerator().generate(
          writtenWith, readUsing));
    } catch (IOException e) {
      throw new SchemaValidationException(readUsing, writtenWith, e);
    }
    if (error) {
      throw new SchemaValidationException(readUsing, writtenWith);
    }
  }

to:

  static void canRead(Schema writtenWith, Schema readUsing)
      throws SchemaValidationException {
    SchemaCompatibilityType compatibilityType
      = SchemaCompatibility.checkReaderWriterCompatibility(readUsing, writtenWith).getType();
    if (compatibilityType != SchemaCompatibilityType.COMPATIBLE) {
      throw new SchemaValidationException(readUsing, writtenWith);
    }
  }

Or am I missing something fundamental?

Thanks,

Elliot.
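
To make the shape of this proposal concrete, here is a minimal, self-contained sketch of the delegation. It is not Avro code: Compatibility, ValidationException, check and the string-based type model are hypothetical stand-ins for SchemaCompatibilityType, SchemaValidationException and checkReaderWriterCompatibility. The point it tries to illustrate is the argument order: canRead takes (writtenWith, readUsing), while the check it delegates to takes (reader, writer).

```java
/** Hypothetical stand-ins only; none of these types are Avro classes. */
final class CanReadAdapterSketch {

  enum Compatibility { COMPATIBLE, INCOMPATIBLE }

  /** Checked exception playing the role of SchemaValidationException. */
  static final class ValidationException extends Exception {
    ValidationException(String reader, String writer) {
      super("Unable to read schema " + writer + " using schema " + reader);
    }
  }

  /**
   * Toy rule check taking (reader, writer). Schemas are modelled as plain
   * type names: a reader can read an identical type, and "long" can read "int".
   */
  static Compatibility check(String reader, String writer) {
    boolean ok = reader.equals(writer)
        || (reader.equals("long") && writer.equals("int"));
    return ok ? Compatibility.COMPATIBLE : Compatibility.INCOMPATIBLE;
  }

  /** Takes (writtenWith, readUsing), so the arguments are swapped on delegation. */
  static void canRead(String writtenWith, String readUsing) throws ValidationException {
    if (check(readUsing, writtenWith) != Compatibility.COMPATIBLE) {
      throw new ValidationException(readUsing, writtenWith);
    }
  }
}
```

With this toy rule set, canRead("int", "long") returns normally and canRead("long", "int") throws.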

Re: Implementation of compatibility rules

Elliot West
Update:

I had a go at modifying org.apache.avro.SchemaValidatorBuilder to use SchemaCompatibility and then ran the schema compatibility test suites from both the Avro project and Confluent's Schema Registry. Every tested case appears to continue to function correctly, with one exception: SchemaCompatibility appears to take aliases into account when performing name-based compatibility checks, whereas the implementation provided via SchemaValidatorBuilder is stricter and does not.

The specification makes no definitive judgement on the matter, simply stating that 'an implementation may optionally use aliases'. Should this perhaps be configurable in the aforementioned implementations, so that the user can decide and also has a chance of obtaining consistent behaviour?

Elliot.
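
As an illustration of what such a switch could look like, the sketch below models only the name comparison. NamedSchema, namesMatch and the useAliases flag are hypothetical and exist in neither SchemaCompatibility nor SchemaValidatorBuilder. It assumes that aliases are declared on the reader's schema and list the writer-side names the reader is willing to accept.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy named schema: a hypothetical type, not org.apache.avro.Schema. */
final class NamedSchema {
  final String name;
  final Set<String> aliases;

  NamedSchema(String name, String... aliases) {
    this.name = name;
    this.aliases = new HashSet<>(Arrays.asList(aliases));
  }
}

final class AliasAwareNameCheck {
  /**
   * Returns true if the reader and writer names are equal or, when useAliases
   * is set, if one of the reader's aliases equals the writer's name.
   */
  static boolean namesMatch(NamedSchema writer, NamedSchema reader, boolean useAliases) {
    if (reader.name.equals(writer.name)) {
      return true;
    }
    return useAliases && reader.aliases.contains(writer.name);
  }
}
```

For a writer named Customer and a reader named Client with the alias Customer, namesMatch returns false in strict mode and true with useAliases set, which mirrors the one difference observed between the two implementations.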

Re: Implementation of compatibility rules

Joseph P.
This change (considering aliases in schema compatibility) is very welcome and needed for our usage of it. So thanks a lot for this much-needed change (IMHO).

best,
joseph

Re: Implementation of compatibility rules

Doug Cutting-2
Support for aliases should be easy to add by calling
Schema#applyAliases before the compatibility check.

Whether aliases should be applied depends on whether the compatibility
check is meant to be valid only for implementations that support
aliases or also ones that do not.

Note that support for aliases might be implemented through a service.
A schema registry service could be extended to also apply aliases.  A
command to retrieve a writer's schema with a given ID could also be
provided the reader's schema, and its result would be the writer's
schema with the reader's aliases applied.

Doug
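
The rewrite-then-check idea can be sketched with a toy model in which a writer's schema is just a list of field names. ReaderField and strictCanRead are hypothetical, and applyAliases here is a simplified stand-in for the idea behind Schema#applyAliases, not its implementation. The strict check is an assumption for illustration and ignores types and default values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; none of this is Avro code. */
final class AliasRewriteSketch {

  /** Reader-side field: its name plus the older writer-side names it accepts. */
  static final class ReaderField {
    final String name;
    final List<String> aliases;

    ReaderField(String name, String... aliases) {
      this.name = name;
      this.aliases = Arrays.asList(aliases);
    }
  }

  /** Renames writer fields to the reader's names wherever a reader alias matches. */
  static List<String> applyAliases(List<String> writerFields, List<ReaderField> readerFields) {
    Map<String, String> renames = new HashMap<>();
    for (ReaderField field : readerFields) {
      for (String alias : field.aliases) {
        renames.put(alias, field.name);
      }
    }
    List<String> rewritten = new ArrayList<>();
    for (String writerField : writerFields) {
      rewritten.add(renames.getOrDefault(writerField, writerField));
    }
    return rewritten;
  }

  /** Alias-unaware check: every reader field must be present in the writer by name. */
  static boolean strictCanRead(List<String> writerFields, List<ReaderField> readerFields) {
    for (ReaderField field : readerFields) {
      if (!writerFields.contains(field.name)) {
        return false;
      }
    }
    return true;
  }
}
```

A writer with fields (id, fullname) fails the strict check against a reader with fields id and name (alias fullname), but passes once applyAliases has rewritten the writer's fields. The same rewrite could run inside a registry service, which would then hand back the writer's schema with the reader's aliases already applied.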

On Wed, Feb 22, 2017 at 8:47 AM, Joseph P. <[hidden email]> wrote:

> This change (considering aliases in schema compatibility) is very welcome
> and needed for our usage of it. So thanks a lot for this much-needed change
> (IMHO).
>
> best,
> joseph
>
> On Wed, Feb 22, 2017 at 4:55 PM, Elliot West <[hidden email]> wrote:
>>
>> Update:
>>
>> I had a go at modifying org.apache.avro.SchemaValidatorBuilder to use
>> SchemaCompatibility and then ran the schema compatibility test suites from
>> both the Avro project and Confluent's Schema Registry. Every tested case
>> appears to continue to function correctly, with one exception:
>> SchemaCompatibility appears to take aliases into account when performing
>> name-based compatibility checks, whereas the implementation provided via
>> SchemaValidatorBuilder is stricter and does not.
>>
>> The specification makes no definitive judgement on the matter, simply
>> stating that 'an implementation may optionally use aliases'. Should this
>> perhaps be configurable in the aforementioned implementations, so that the
>> user can decide and also has a chance of obtaining consistent behaviour?
>>
>> Elliot.
>>
>> On 22 February 2017 at 13:48, Elliot West <[hidden email]> wrote:
>>>
>>> Further to this, is there any reason why, conceptually, the implementation
>>> of org.apache.avro.ValidateMutualRead.canRead(Schema, Schema) could not be
>>> changed from:
>>>
>>>   static void canRead(Schema writtenWith, Schema readUsing)
>>>       throws SchemaValidationException {
>>>     boolean error;
>>>     try {
>>>       error = Symbol.hasErrors(new ResolvingGrammarGenerator().generate(
>>>           writtenWith, readUsing));
>>>     } catch (IOException e) {
>>>       throw new SchemaValidationException(readUsing, writtenWith, e);
>>>     }
>>>     if (error) {
>>>       throw new SchemaValidationException(readUsing, writtenWith);
>>>     }
>>>   }
>>>
>>> to:
>>>
>>>   static void canRead(Schema writtenWith, Schema readUsing)
>>>       throws SchemaValidationException {
>>>     SchemaCompatibilityType compatibilityType
>>>       = SchemaCompatibility.checkReaderWriterCompatibility(readUsing, writtenWith).getType();
>>>     if (compatibilityType != SchemaCompatibilityType.COMPATIBLE) {
>>>       throw new SchemaValidationException(readUsing, writtenWith);
>>>     }
>>>   }
>>>
>>> Or am I missing something fundamental?
>>>
>>> Thanks,
>>>
>>> Elliot.
>>>
>>> On 17 February 2017 at 12:27, Elliot West <[hidden email]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've been attempting to understand the implementation of Avro schema
>>>> compatibility rules and am slightly confused by the structure of the code.
>>>> It seems that there are at least two possible entry points:
>>>>
>>>>   • org.apache.avro.SchemaCompatibility.checkReaderWriterCompatibility(Schema, Schema)
>>>>   • org.apache.avro.SchemaValidatorBuilder
>>>>
>>>> The code paths of these do not seem to intersect, with one implementing
>>>> a static set of rule checks and the other seemingly delegating to a
>>>> grammar-based approach. Does this imply that there are in fact two
>>>> implementations of the compatibility rules?
>>>>
>>>> Apologies if this is a naïve question.
>>>>
>>>> Thanks,
>>>>
>>>> Elliot.