Avro serialization / deserialization with different objects

Cooper, Chris

Hello,

I’m using Avro’s reflection API to publish domain objects from Hadoop through messaging middleware. The subscribers of my data are interested in different subsets of the objects. Instead of having to use a version of my original object, is it possible for them to define a totally different object (different namespace/name) that has a small subset of the original object’s properties (matching names/primitive types) and then deserialize to that object?

So far I’ve been running into AvroTypeExceptions when I use a different reader object.

Thanks for your help!

-CC

Re: Avro serialization / deserialization with different objects

Doug Cutting
Cooper, Chris wrote:
> I’m using Avro’s reflection API to publish domain objects from
> Hadoop through messaging middleware. The subscribers of my data are
> interested in different subsets of the objects. Instead of having
> to use a version of my original object, is it possible for them to
> define a totally different object (different namespace/name) that has a
> small subset of the original object’s properties (matching
> names/primitive types) and then deserialize to that object?

Yes, this is possible.

The simplest way to do this would be to use the generic data
representation, and simply specify a subset of the original object's
schema, i.e., remove fields you're not interested in.
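
A minimal sketch of the generic approach, not a tested recipe: the Trade and TradeSummary schemas, their namespaces, and the field names are all hypothetical. The subscriber is assumed to have the writer's schema available, since Avro needs it to decode the bytes. Because the Avro specification's schema-resolution rules match records by name, the reader schema below lists the writer's full name under "aliases"; that detail is an assumption added here and is not part of the advice above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SubsetReadSketch {
    // Hypothetical publisher schema: the full domain object.
    public static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Trade\","
      + "\"namespace\":\"com.example.publisher\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"long\"},"
      + "{\"name\":\"symbol\",\"type\":\"string\"},"
      + "{\"name\":\"price\",\"type\":\"double\"},"
      + "{\"name\":\"trader\",\"type\":\"string\"}]}");

    // Hypothetical subscriber schema: a different name and namespace,
    // keeping only two fields. The alias points back at the writer's name.
    public static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"TradeSummary\","
      + "\"namespace\":\"com.example.subscriber\","
      + "\"aliases\":[\"com.example.publisher.Trade\"],\"fields\":["
      + "{\"name\":\"id\",\"type\":\"long\"},"
      + "{\"name\":\"price\",\"type\":\"double\"}]}");

    // Publisher side: serialize a full record with the writer schema.
    public static byte[] writeFull() throws IOException {
        GenericRecord trade = new GenericData.Record(WRITER);
        trade.put("id", 42L);
        trade.put("symbol", "ACME");
        trade.put("price", 12.5);
        trade.put("trader", "cc");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(trade, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Subscriber side: decode with (writer schema, reader schema) so that
    // fields missing from the reader schema are skipped during resolution.
    public static GenericRecord readSubset(byte[] bytes) throws IOException {
        Decoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return new GenericDatumReader<GenericRecord>(WRITER, READER)
            .read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        GenericRecord summary = readSubset(writeFull());
        System.out.println(summary.get("id") + " " + summary.get("price"));
    }
}
```

The result is a GenericRecord whose schema is the subset, so no subscriber-specific Java class is needed on this path.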

If you need to have a distinct Java class that corresponds to each
subset (rather than a GenericRecord instance) then you could do this
using the specific or reflect data representations, but it'll be more
work.  You'd need to subclass SpecificDatumReader or ReflectDatumReader
and override the #newRecord() method to create the class you want to use
to represent your subset schema.  This should work, but, since I don't
think anyone has tried it before, there may be ways we can improve Avro
to make this easier.  So please tell us how it goes.

Doug