Serializing nested records

yimca
How do I serialize a record containing a nested record? There doesn't seem to be any way to create a DataFileWriter without tying it to a single record type.

Here's the scenario: I've defined an Avro schema called TransactionStatic with a nested record called TransactionStaticComponent:

{
  "namespace": "com.wily.apm.blackjack",
  "type": "record",
  "name": "TransactionStatic",
  "fields":
   [
       {"name": "id", "type": "int"},
       {"name": "isIdLocal", "type": "boolean", "default": "true"},
       {"name": "contextPath",  "type": [{ "type": "array", "items": "string" },"null"] },
       {"name": "components", "type":
           [{ "type": "array", "items" :
               [{ "type": "record",
                   "name": "TransactionStaticComponent",
                   "fields":
                   [                      
                       {"name": "id", "type": "int"},
                       {"name": "isIdLocal", "type": "boolean", "default": "true"},
                       {"name": "contextPath", "type": [{ "type": "array", "items": "string" },"null"] },
                       {"name": "application", "type" : ["string","null"]},
                       {"name": "class", "type" :  ["string","null"]},
                       {"name": "method", "type" :  ["string","null"]},
                       {"name": "lineNumber", "type" :  ["int","null"]},
                       {"name": "payload", "type": [{"type": "map", "values": "string"},"null"] },
                       {"name": "components", "type": [{ "type": "array", "items": "TransactionStaticComponent" }], "default": "null" }
                   ]
               }]
           }]
       }
   ]
}

This compiles clean and I'm able to create data in the schema.  However, if I try to serialize a record:

DataFileWriter<TransactionStatic> staticWriter
               = new DataFileWriter<TransactionStatic>(new SpecificDatumWriter<TransactionStatic>(schema));
ByteArrayOutputStream staticOutputStream = new ByteArrayOutputStream(1024);
staticWriter.create(TransactionStatic.SCHEMA$, staticOutputStream);
staticWriter.append(servletA);
staticWriter.close();

I get an Avro exception reporting an unknown datum type for an array of TransactionStaticComponent:

Exception in thread "main" org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.AvroRuntimeException: Unknown datum type: [Lcom.wily.apm.blackjack.TransactionStaticComponent;@3ffa1b16
        at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)

How can I serialize a TransactionStatic?  Also, where did the "L" come from in "Lcom.wily..."?

Re: Serializing nested records

Doug Cutting
On Mon, Jan 21, 2013 at 6:14 AM, yimca <[hidden email]> wrote:
> How can I serialize a TransactionStatic?  Also, where did the "L" come from
> in "Lcom.wily..."?

The "[L" prefix means it's a Java array of objects, i.e.,
TransactionStaticComponent[].  Avro's specific representation uses
List to represent arrays.  (Look at the generated code.)  If you
instead use a ReflectDatumWriter then you should be able to write
TransactionStaticComponent[] too.

Doug
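
For anyone puzzled by the same message, the prefix can be reproduced with the JDK alone. The sketch below is only an illustration: the class ArrayNameDemo and its helper methods are hypothetical names, not part of Avro or of the code in this thread. It returns the runtime class name of an array and wraps an array in a List, which is the shape the specific representation expects for Avro arrays.

```java
import java.util.Arrays;
import java.util.List;

public class ArrayNameDemo {
    // Returns the JVM's runtime class name for any object.
    // For an array of reference types this starts with "[L".
    public static String runtimeName(Object datum) {
        return datum.getClass().getName();
    }

    // Wraps an array in a fixed-size List view backed by the array.
    public static <T> List<T> toList(T[] items) {
        return Arrays.asList(items);
    }

    public static void main(String[] args) {
        String[] asArray = {"a", "b"};

        // "[" marks an array, "L...;" names the element class.
        System.out.println(runtimeName(asArray));

        // Arrays inherit Object.toString(), so they print as name@hash,
        // which is the form seen in the "Unknown datum type" message.
        System.out.println(asArray.toString());

        // The same elements, now as a java.util.List.
        System.out.println(toList(asArray));
    }
}
```

With the generated classes, the same idea would presumably mean handing the components field a List<TransactionStaticComponent> (for example via Arrays.asList) rather than a TransactionStaticComponent[].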

Re: Serializing nested records

Rob Turner
In reply to this post by yimca
I think the problem is with the following union in your schema:

    "type": [{ "type": "array", "items": "TransactionStaticComponent" }], "default": "null"

The union only has one type, that of the array, and does not also include
"null", which is necessary for the default value of null to be valid. The
following might be worth a try:

    "type": [{ "type": "array", "items": "TransactionStaticComponent" }, "null"], "default": "null"

Re: Serializing nested records

yimca
Thanks very much, Doug and Rob, for the quick and accurate replies. Doug's suggestion helped make sense of the error, and Rob's suggestion got me most of the way to resolving it. It turned out there was a second error in the definition of the first "components" field. Instead of:

  {"name": "components", "type":
     [{ "type": "array", "items" :
          [{ "type": "record",

It needed to be:

  {"name": "components", "type":
    ["null",
      { "type": "array", "items":
        { "type": "record",

This, along with Doug's suggestion, resulted in the working schema:

{
  "namespace": "com.wily.apm.blackjack",
  "type": "record",
  "name": "TransactionStatic",
  "fields":
   [
       {"name": "id", "type": "int"},
       {"name": "isIdLocal", "type": "boolean", "default": "true"},
       {"name": "contextPath", "type": [{ "type": "array", "items": "string" },"null"] },
       {"name": "components", "type":
          ["null",
           { "type": "array", "items":
              { "type": "record",
                "name": "TransactionStaticComponent",
                "fields":
                [                      
                    {"name": "id", "type": "int"},
                    {"name": "isIdLocal", "type": "boolean", "default": "true"},
                    {"name": "contextPath", "type": [{ "type": "array", "items": "string" },"null"] },
                    {"name": "application", "type" : ["string","null"]},
                    {"name": "class", "type" :  ["string","null"]},
                    {"name": "method", "type" :  ["string","null"]},
                    {"name": "lineNumber", "type" :  ["int","null"]},
                    {"name": "payload", "type": [{"type": "map", "values": "string"},"null"] },
                    {"name": "components", "type": [{"type": "array", "items": "TransactionStaticComponent"},"null"], "default": "null"}
                ]
              }
           }]
        }
    ]
}