You should be able to specify a reader schema with the namespace and the writer schema without it. See https://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/specific/SpecificData.html#createDatumReader(org.apache.avro.Schema, org.apache.avro.Schema)
On Thursday, August 6, 2015 3:31 PM, Mehrez Alachheb <[hidden email]> wrote:
I am working on a project in which I have to deserialize Avro files provided by an external company.
The problem is that the schema (example below) of the serialized Avro files doesn't contain a namespace, but I need to add a namespace to the Avro schema.
I can't serialize the Avro files with another schema because we get them from another company.
I created a new schema (example below) with a namespace and generated the associated Java classes.
How can I deserialize the Avro files with my new schema?
Can I guess? You're reading some Python/Pig AvroStorage output? Hate that.
I get the same error when the reader schema has a namespace but the writer has none. But only when a record is in a union.
Here's a pair of small runnable examples that show errors with reading and writing across namespaces.
For the sake of being complete, here's my question, and it looks like Vitaly Gordon ran into this issue as well, here.
IMHO this is a bug that hinders Avro's utility as a data interchange format. I don't think the technical issue is in trying to import a class from the default package (which succeeds outside of unions). Instead, I think it comes from trying to resolve a union reflectively when the writer schema's fullname doesn't match the class's fullname.
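To make the fullname mismatch concrete, here's a small sketch of how a fullname is derived from a name and a namespace, following the rule in the Avro specification (a dotted name is already a fullname; otherwise the namespace, if any, is prepended). AvroFullNames is a hypothetical helper written for this post, not part of the Avro library, and "User" and "com.example" are made-up names. It doesn't reproduce the union resolution itself, only the comparison that I believe fails.

```java
import java.util.Objects;

/** Hypothetical helper for illustration only; not part of the Avro library. */
final class AvroFullNames {
    private AvroFullNames() {}

    /** Derives a fullname from a namespace and a name, per the Avro spec's naming rule. */
    static String fullName(String namespace, String name) {
        Objects.requireNonNull(name, "name");
        // A name that already contains a dot is treated as a fullname;
        // a separately specified namespace is ignored in that case.
        if (name.indexOf('.') >= 0) {
            return name;
        }
        // A null or empty namespace means the null namespace: the fullname is just the name.
        if (namespace == null || namespace.isEmpty()) {
            return name;
        }
        return namespace + "." + name;
    }

    /** True when the writer's and reader's record names resolve to the same fullname. */
    static boolean sameFullName(String writerNs, String writerName,
                                String readerNs, String readerName) {
        return fullName(writerNs, writerName).equals(fullName(readerNs, readerName));
    }

    public static void main(String[] args) {
        // Writer schema without a namespace vs. reader schema with one (assumed names).
        System.out.println(fullName(null, "User"));          // prints "User"
        System.out.println(fullName("com.example", "User")); // prints "com.example.User"
        System.out.println(sameFullName(null, "User", "com.example", "User")); // prints "false"
    }
}
```

So a writer record named "User" with no namespace and a generated class com.example.User have different fullnames, which is the mismatch I suspect the union branch lookup trips over.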
The fix for now:
You could try using the Generic API instead, and then map the Generic Records to your Specific Records manually. Here's a start in Java:
GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>(schema);
DataFileReader<GenericRecord> fileReader = new DataFileReader<>(file, datumReader);
GenericRecord record = fileReader.next();