[jira] [Commented] (AVRO-3118) Namespace with empty string is not treated as null in Python API


Dave Cole (Jira)

    [ https://issues.apache.org/jira/browse/AVRO-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345935#comment-17345935 ]

Frank Mieves commented on AVRO-3118:

Hi Scott,

I have also noticed that fastavro does not support this kind of definition and reuse by name. When I use the AvroConsumer with Kafka, I get a warning and it falls back to the Apache Avro library.

I can't say for certain whether it works in Java, but I assume it does. Other projects that read the same Kafka topics through Java and C++ libraries haven't had the issues I see with Python.

I asked the team responsible for the producer to replace the empty string in the namespace with null or with the schema namespace, but they told me they have no control over it because it is set by their Kafka library.
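
For consumers in the same position, one possible client-side workaround is to strip the empty namespace from the schema text before it reaches the parser. The sketch below is only an illustration: `drop_empty_namespaces` is a hypothetical helper, not part of avro or fastavro, and it relies on the observation from the issue that the schema parses once the nested namespace is no longer an empty string. Note that removing the key makes the nested record inherit the enclosing namespace, so its full name would become `my_ns.row` rather than `row`; whether that is acceptable depends on the consumer.

```python
import json


def drop_empty_namespaces(node):
    """Return a copy of a decoded schema without any `"namespace": ""` entries.

    Hypothetical helper for illustration. It walks the whole JSON tree, so a
    default value or custom property that happens to use the key "namespace"
    with an empty string would be stripped as well.
    """
    if isinstance(node, dict):
        return {
            key: drop_empty_namespaces(value)
            for key, value in node.items()
            if not (key == "namespace" and value == "")
        }
    if isinstance(node, list):
        return [drop_empty_namespaces(item) for item in node]
    return node


# Schema from the issue description.
schema_str = """
{"type": "record", "name": "my_schema", "namespace": "my_ns",
 "fields": [{"name": "field_a", "type": {"type": "record", "name": "row", "namespace": "",
 "fields": [{"name": "subfield", "type": "string"}]}},
 {"name": "field_b", "type": "row"}
 ]}
"""

cleaned_str = json.dumps(drop_empty_namespaces(json.loads(schema_str)))
# cleaned_str can then be passed to avro.schema.parse(...) instead of schema_str.
```

This only helps where the schema string can be intercepted before parsing, which may not be the case when the consumer fetches it from the schema registry internally.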

> Namespace with empty string is not treated as null in Python API
> ----------------------------------------------------------------
>                 Key: AVRO-3118
>                 URL: https://issues.apache.org/jira/browse/AVRO-3118
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.10.2
>            Reporter: Frank Mieves
>            Assignee: Michael A. Smith
>            Priority: Blocker
> Hi,
> According to [the Avro documentation|https://avro.apache.org/docs/current/spec.html#names], an empty string may also be used as a namespace to indicate the null namespace.
> The Python package doesn't do that. The following example fails when parsing:
> {code:python}
> import avro.schema
> schema_str="""
> {"type": "record", "name": "my_schema", "namespace": "my_ns",
>  "fields": [{"name": "field_a", "type": {"type": "record", "name": "row", "namespace": "",
>  "fields": [{"name": "subfield", "type": "string"}]}},
>  {"name": "field_b", "type": "row"}
>  ]}
> """
> schema = avro.schema.parse(schema_str)
> {code}
> The raised exception is "SchemaParseException: Type property "row" not a valid Avro schema: Could not make an Avro Schema object from row."
> If I set the namespace in the nested record to null, it works.
> The problem for me is that I can't change the schema definition. The schema is stored in the Kafka schema registry, and the Kafka Avro consumer receives it from the schema registry server with an empty string as the namespace.
> I could fix this by adding a check in the parser source, schema.py:
> {code:python}
> ...
> return writer.type == self.type and (self.type == 'request' or self.check_props(writer, ['fullname']))
> def __init__(self, name, namespace, fields, names=None, schema_type='record', doc=None, other_props=None):
>     # Fixing empty namespace
>     if namespace == '':
>         namespace = None
>     # Ensure valid ctor args
>     if fields is None:
>         ...
> {code}
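
The naming rule from the specification that the issue refers to can be sketched in a few lines. This is a minimal illustration of the rule as written in the "Names" section of the spec, not the code used by any Avro implementation; the function name `avro_fullname` and its parameters are assumptions made for this example.

```python
from typing import Optional


def avro_fullname(
    name: str,
    namespace: Optional[str],
    enclosing_namespace: Optional[str] = None,
) -> str:
    """Derive a full name following the spec's rules for named types."""
    # A name containing a dot is already a full name; any namespace is ignored.
    if "." in name:
        return name
    # No namespace given at all: inherit from the enclosing schema.
    if namespace is None:
        namespace = enclosing_namespace
    # An empty string (or nothing to inherit) means the null namespace.
    if not namespace:
        return name
    return f"{namespace}.{name}"


outer = avro_fullname("my_schema", "my_ns")
explicit_empty = avro_fullname("row", "", enclosing_namespace="my_ns")
omitted = avro_fullname("row", None, enclosing_namespace="my_ns")
```

Under this reading, an explicit empty string places `row` in the null namespace, while omitting the namespace lets `row` inherit `my_ns`.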
