Where are the rows in Trevni format?

sammefford
I read the Trevni specification: http://avro.apache.org/docs/1.7.4/trevni/spec.html
and I can't see where the row ids are stored for each value in each column.  Am I missing something obvious?  Is the spec incomplete on that point?

Also, to confirm, my understanding is that columnar formats are efficient because they store column values sorted and can thereby find specific values or ranges of values quickly.  While the spec mentions the benefits of sorting, I don't see a requirement that column values be sorted.  Can we depend on the blocks of column values being sorted?

Thanks,

Sam Mefford
Chief Architect-Big Data Solutions
Avalon Consulting, LLC.
801-706-9731

Re: Where are the rows in Trevni format?

Doug Cutting
Row numbers are not stored explicitly.  They are implicit in the
ordinal position of values in the file.

Values are not sorted but are in row order.  The primary performance
advantage of a columnar file is that, when only a subset of columns
are required, only a subset of the data need be read.

Doug
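
A minimal sketch of the two points in this reply, written in Python and not taken from Trevni itself. The in-memory table, its column names, and the read_row helper are all hypothetical; real Trevni files store each column as encoded blocks on disk. The sketch only shows that no row id needs to be stored (row i is position i in every column), that values sit in row order rather than sorted order, and that a reader can touch only the columns it wants.

```python
# Toy column-major table: each column is its own list, kept in row order.
# No row ids are stored anywhere; row i is simply position i in every column.
# Note that the "age" values are not sorted; they follow row order.
columns = {
    "name": ["carol", "alice", "bob"],
    "age": [41, 29, 35],
    "city": ["Oslo", "Lima", "Pune"],
}


def read_row(cols, i, wanted):
    """Rebuild row i from only the wanted columns, by ordinal position."""
    return {c: cols[c][i] for c in wanted}


# Only "name" and "age" are read; the "city" column is never touched.
print(read_row(columns, 1, ["name", "age"]))
```

In a real columnar file the untouched columns would correspond to bytes that are never read from disk, which is where the performance advantage comes from.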

On Thu, Mar 21, 2013 at 11:26 AM, sammefford <[hidden email]> wrote:

> I read the Trevni specification:
> http://avro.apache.org/docs/1.7.4/trevni/spec.html
> and I can't see where the row ids are stored for each value in each column.
> Am I missing something obvious?  Is the spec incomplete on that point?
>
> Also, to confirm, my understanding is that columnar formats are efficient because
> they store column values sorted and can thereby find specific values or
> ranges of values quickly.  While the spec mentions the benefits of sorting,
> I don't see a requirement that column values be sorted.  Can we depend on
> the blocks of column values being sorted?
>
> Thanks,
>
> Sam Mefford
> Chief Architect-Big Data Solutions
> Avalon Consulting, LLC.
> 801-706-9731
>
>
>
> --
> View this message in context: http://apache-avro.679487.n3.nabble.com/Where-are-the-rows-in-Trevni-format-tp4026663.html
> Sent from the Avro - Users mailing list archive at Nabble.com.

Re: Where are the rows in Trevni format?

TrevniUser
This post has NOT been accepted by the mailing list yet.
Is there a way to access these ordinal positions of values in a file?
For example, while scanning through the file, once my filtering criteria on a column are met, I would like to read the corresponding row values. Is this possible?

Thank you.
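
For illustration only, here is a rough sketch of the access pattern described in this question, assuming ordinal positions work as explained earlier in the thread. It is plain Python over in-memory lists, not the Trevni API; the table and the matching_positions and select helpers are hypothetical names. The idea is to scan the filter column once, remember the positions that match, and then read the other columns at those same positions.

```python
def matching_positions(column, predicate):
    """Yield the ordinal position of every value that satisfies predicate."""
    for pos, value in enumerate(column):
        if predicate(value):
            yield pos


def select(cols, filter_col, predicate, fetch_cols):
    """Filter on one column, then fetch other columns at the matching positions."""
    return [
        {c: cols[c][pos] for c in fetch_cols}
        for pos in matching_positions(cols[filter_col], predicate)
    ]


# Hypothetical table; every column lists its values in row order.
table = {
    "name": ["carol", "alice", "bob"],
    "age": [41, 29, 35],
    "city": ["Oslo", "Lima", "Pune"],
}

rows = select(table, "age", lambda a: a > 30, ["name", "city"])
print(rows)
```

Whether a given reader library exposes a way to jump to a row position in another column is a separate question that this sketch does not answer.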