Java Example of writing a union

Java Example of writing a union

Sam Poole
CONTENTS DELETED
The author has deleted this message.

Re: Java Example of writing a union

Vyacheslav Zholudev
I'm assuming for now that you are using a specific writer and you have a union schema with two records FOO and BAR (you should get two classes FOO and BAR generated by avro tools):

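import java.io.ByteArrayOutputStream;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.avro.specific.SpecificRecord;

FOO fooObj = ....
BAR barObj = ....
BAR barObj2 = ....
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // yourSchema is the union schema; the generated FOO and BAR classes implement SpecificRecord
        DatumWriter<SpecificRecord> writer = new SpecificDatumWriter<SpecificRecord>(yourSchema);
        // passing null creates a new encoder instead of reusing an existing one
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(fooObj, encoder);
        writer.write(barObj, encoder);
        writer.write(barObj2, encoder);
        encoder.flush();
        out.close();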

Does it make sense?

Vyacheslav

On Aug 8, 2011, at 3:53 PM, Sam Poole wrote:

Does anybody have an example of writing a file that uses a union schema?  I
am having problems trying to write a file that uses a union schema because
once I set the schema, I can't add an individual datum because it is not
part of a union.  



--
View this message in context: http://apache-avro.679487.n3.nabble.com/Java-Example-of-writing-a-union-tp3235624p3235624.html
Sent from the Avro - Users mailing list archive at Nabble.com.


RE: Java Example of writing a union

Sam Poole


Thank you very much. Yes, this works well. I then took it one step further to get the schema written into the file and to apply a compression codec.

FOO fooObj = ....
BAR barObj = ....
BAR barObj2 = ....
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DatumWriter<SpecificRecord> writer = new SpecificDatumWriter<SpecificRecord>(yourSchema);
        DataFileWriter<SpecificRecord> filewriter = new DataFileWriter<SpecificRecord>(writer);

        // the codec has to be set before create() is called
        CodecFactory codec = CodecFactory.deflateCodec(9);
        filewriter.setCodec(codec);

        // create() writes the file header, including the schema, to out
        filewriter.create(yourSchema, out);

        // DataFileWriter encodes each datum itself, so no separate encoder is needed
        filewriter.append(fooObj);
        filewriter.append(barObj);
        filewriter.append(barObj2);

        // close() flushes the last block; without it, buffered records may never reach out
        filewriter.close();

        OutputStream outstream = new FileOutputStream("/somefolder/somefile.avro");
        out.writeTo(outstream);
        outstream.close();




Re: Java Example of writing a union

Scott Carey-2
In reply to this post by Sam Poole
FYI, deflateCodec(9) rarely improves compression over level 6, but it is much slower to write.

Also, unless you increase the block size in the file to over 256KB, it probably won't improve compression at all.  The primary thing that larger deflate/gzip compression levels do is increase the size of the lookback window for finding duplicate segments.

In short, try different compression levels and buffer sizes with your actual data and see what works best for you.  The best choice is almost never compression level 9.

I often end up with compression level 3 or 1 when I need the speed, and level 6 or 7 with larger blocks for 'archival' use.
A useful link comparing speed to compression ratio for gzip (gzip is deflate with a different header and CRC) is:

As you can see, compression level 9 is typically 2 to 3 times slower than level 6 and yields only a marginally better compression ratio.
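
The advice to measure with your own data can be tried without Avro at all. What follows is a minimal sketch, not Avro's implementation: it runs plain java.util.zip.Deflater over a buffer at levels 1, 3, 6 and 9 and prints the compressed size and elapsed time. The class name DeflateLevelDemo and the sampleData helper are made up for illustration; substitute a buffer of your real serialized records. In Avro itself the level is the argument to CodecFactory.deflateCodec, and, as far as I know, the block size is adjusted with DataFileWriter.setSyncInterval.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class DeflateLevelDemo {

    // Compresses data with java.util.zip.Deflater at the given level (1 = fastest, 9 = smallest).
    static byte[] compress(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Hypothetical stand-in for real records: repetitive text lines, which compress well.
    static byte[] sampleData(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("id=").append(i)
              .append(",type=").append(i % 2 == 0 ? "FOO" : "BAR")
              .append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleData(200_000);
        for (int level : new int[] {1, 3, 6, 9}) {
            long start = System.nanoTime();
            byte[] compressed = compress(data, level);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("level " + level + ": " + compressed.length
                    + " bytes in " + micros + " us (input " + data.length + " bytes)");
        }
    }
}
```

A single pass like this is only a rough indication, since JIT warm-up skews the first measurements; repeating the loop a few times and comparing the later runs gives steadier numbers.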

On 8/8/11 12:56 PM, "Poole, Samuel [USA]" <[hidden email]> wrote:

This code works, but now I have an issue with reading the file.

When I read the file, I can only see the first datum in the union.  I know that all of the datums were written to the file because of the size of the file, but I can't read all of them.

Here is my code to read the union file:

Schema yourSchema = Schema.parse(new File("/somefolder/someschema.avro"));
DatumReader<SpecificRecord> datumreader = new SpecificDatumReader<SpecificRecord>(yourSchema);
DataFileReader<SpecificRecord> reader = new DataFileReader<SpecificRecord>(new File("/somefolder/somefile.avro"), datumreader);

if (reader.hasNext()) {
    SpecificRecord result = reader.next();
    System.out.println(result.getClass());
}

I'm not sure whether the problem is in how I created the file or in how I am reading it.

Any ideas?
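
A likely cause is visible in the reading snippet above: an `if` around `reader.hasNext()` runs its body at most once, so only the first datum is ever fetched, however many the file holds. The sketch below shows the same loop written with `while`. It is an illustration rather than a confirmed fix from the thread: the class name ReadAllDemo and the countByClass helper are hypothetical, and a plain java.util.Iterator stands in for the reader so that the example runs without Avro. DataFileReader implements Iterator, so the same loop shape should apply to it.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class ReadAllDemo {

    // Drains the iterator and counts how many records of each runtime class it produced.
    static Map<String, Integer> countByClass(Iterator<?> records) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // 'while' keeps reading until the source is exhausted; 'if' would stop after one record
        while (records.hasNext()) {
            Object record = records.next();
            counts.merge(record.getClass().getName(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in data: one String and two Integers play the roles of one FOO and two BARs.
        Iterator<Object> standIn = Arrays.<Object>asList("foo", 1, 2).iterator();
        System.out.println(countByClass(standIn));
    }
}
```

With the real reader, the call would be countByClass(reader), followed by reader.close() once the loop is done.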
 
 

