[jira] [Resolved] (AVRO-2046) avro-python3: Very restricted set of data types which are allowed in AvroSchemaFromJSONData

Dave Cole (Jira)

     [ https://issues.apache.org/jira/browse/AVRO-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael A. Smith resolved AVRO-2046.
------------------------------------
    Resolution: Won't Fix

The avro-python3 implementation is unmaintained and avro (the implementation in lang/py) fully supports Python 3. Please open a new ticket if there are issues with the lang/py implementation.

> avro-python3: Very restricted set of data types which are allowed in AvroSchemaFromJSONData
> -------------------------------------------------------------------------------------------
>
>                 Key: AVRO-2046
>                 URL: https://issues.apache.org/jira/browse/AVRO-2046
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.8.2
>         Environment: avro-python3 (1.8.2)
>            Reporter: Manvendra Singh
>            Priority: Major
>
> Hey, I come from the CWL project: https://github.com/common-workflow-language/cwltool and, as part of my GSoC project, I'm working on adding Python 3 compatibility to the *cwltool* codebase. We've been using avro-python2 for a long time now, and it has worked great for us in our projects: schema_salad and cwltool.
> In the process of porting cwltool, I'm facing issues with the avro-python3 library. I've found the following bug:
> Minimal reproducible example:
> {code:none}
> from collections import OrderedDict
> import avro.schema
> AvroSchemaFromJSONData = avro.schema.SchemaFromJSONData
> a = {
>   "fields": [
>     {
>       "name": "name",
>       "type": "string"
>     },
>     {
>       "name": "favorite_number",
>       "type": [
>         "int",
>         "null"
>       ]
>     },
>     {
>       "name": "favorite_color",
>       "type": [
>         "string",
>         "null"
>       ]
>     }
>   ],
>   "name": "User",
>   "namespace": "example.avro",
>   "type": "record"
> }
> b = OrderedDict(a)
> AvroSchemaFromJSONData(a)
> AvroSchemaFromJSONData(b)
> {code}
> Output:
> {code}
> ~/Desktop/test/venv3/lib/python3.5/site-packages/avro/schema.py in SchemaFromJSONData(json_data, names)
>    1252   if parser is None:
>    1253     raise SchemaParseException(
> -> 1254         'Invalid JSON descriptor for an Avro schema: %r.' % json_data)
>    1255   return parser(json_data, names=names)
>    1256
> SchemaParseException: Invalid JSON descriptor for an Avro schema: OrderedDict([('namespace', 'example.avro'), ('type', 'record'), ('name', 'User'), ('fields', [{'type': 'string', 'name': 'name'}, {'type': ['int', 'null'], 'name': 'favorite_number'}, {'type': ['string', 'null'], 'name': 'favorite_color'}])]).
> {code}
>  
> h5. The current implementation of this function does not accept *dict-like data types* such as OrderedDict; only a plain dict works. The same input works in avro-python2.
> Relevant line of code: https://github.com/apache/avro/blob/master/lang/py3/avro/schema.py#L1250
> Apart from this, I've tried running the ``2to3`` tool on avro-python2 and testing our project with the result, and it works perfectly. Thus, through this issue, I also want to motivate the following PR: https://github.com/apache/avro/pull/234
> I don't expect a unified codebase for avro python2 and python3 now or in the near future. This has been discussed before: https://github.com/apache/avro/pull/133
> But making avro-python2 cross-compatible with both py2 and py3 would be really helpful for our project and would let us complete our porting process. Thanks.
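
The failure reported above is consistent with a parser lookup keyed on the exact type of the input, which would reject dict subclasses such as OrderedDict. The following is a minimal sketch of that behavior and of one possible workaround; it is not the avro-python3 source. The names parse_exact, parse_duck, and to_plain are hypothetical and exist only for illustration, and the string return values stand in for real parser functions.

```python
from collections import OrderedDict
from collections.abc import Mapping, Sequence


def parse_exact(json_data):
    """Hypothetical dispatcher keyed on the exact type of the input.

    A dict subclass such as OrderedDict is not found in the table,
    so the lookup returns None and the call fails.
    """
    parsers = {str: "string", list: "array", dict: "object"}
    parser = parsers.get(type(json_data))
    if parser is None:
        raise ValueError(
            "Invalid JSON descriptor for an Avro schema: %r." % (json_data,))
    return parser


def parse_duck(json_data):
    """Hypothetical dispatcher using isinstance checks against ABCs.

    Any mapping (dict, OrderedDict, ...) is treated as a JSON object.
    """
    if isinstance(json_data, str):
        return "string"
    if isinstance(json_data, Mapping):
        return "object"
    if isinstance(json_data, Sequence):
        return "array"
    raise ValueError(
        "Invalid JSON descriptor for an Avro schema: %r." % (json_data,))


def to_plain(json_data):
    """Workaround sketch: recursively convert mappings to plain dicts
    and lists/tuples to plain lists before handing data to a strict parser."""
    if isinstance(json_data, Mapping):
        return {key: to_plain(value) for key, value in json_data.items()}
    if isinstance(json_data, (list, tuple)):
        return [to_plain(item) for item in json_data]
    return json_data


# A small schema built from OrderedDict, similar in shape to the report.
schema = OrderedDict([
    ("type", "record"),
    ("name", "User"),
    ("fields", [OrderedDict([("name", "name"), ("type", "string")])]),
])
```

Under these assumptions, parse_exact(schema) raises, while parse_duck(schema) and parse_exact(to_plain(schema)) both succeed. Converting with to_plain on the caller's side avoids patching the library, at the cost of copying the schema data.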



--
This message was sent by Atlassian Jira
(v8.3.4#803005)