Data orientation

Data orientation is the way tabular data is laid out in a linear memory model, such as on disk or in memory. The two most common representations are column-oriented (columnar format) and row-oriented (row format).[1][2]
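As an illustrative sketch only, the two layouts can be shown by flattening a small table into a one-dimensional sequence. The table contents and the function names `to_row_oriented` and `to_column_oriented` are hypothetical, chosen for this example; real formats add encoding, compression, and metadata on top of this basic ordering.

```python
# Hypothetical three-row table with columns (id, name, price).
rows = [
    (1, "apple", 0.5),
    (2, "pear", 0.75),
    (3, "plum", 0.25),
]


def to_row_oriented(table):
    """Flatten record by record: all fields of row 0, then row 1, and so on."""
    return [value for row in table for value in row]


def to_column_oriented(table):
    """Flatten field by field: all ids, then all names, then all prices."""
    return [value for column in zip(*table) for value in column]


row_layout = to_row_oriented(rows)
column_layout = to_column_oriented(rows)

print(row_layout)     # [1, 'apple', 0.5, 2, 'pear', 0.75, 3, 'plum', 0.25]
print(column_layout)  # [1, 2, 3, 'apple', 'pear', 'plum', 0.5, 0.75, 0.25]
```

Both sequences hold the same nine values; only the order in which they appear in linear memory differs.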

The choice of data orientation is a trade-off and an architectural decision in databases, query engines, and numerical simulations.[1] As a result of these trade-offs, row-oriented formats are more commonly used in online transaction processing (OLTP) and column-oriented formats are more commonly used in online analytical processing (OLAP).[2]
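One way to see the trade-off is to compare which linear offsets each access pattern touches. The sketch below is a simplification that assumes every cell has the same fixed width, so a cell's position can be computed from its row and column index; the table dimensions and the helper names `row_major_offset` and `column_major_offset` are hypothetical.

```python
# Assumed dimensions of a hypothetical table with fixed-width cells.
NUM_ROWS = 4
NUM_COLS = 3


def row_major_offset(row, col):
    """Position of a cell when records are stored one after another."""
    return row * NUM_COLS + col


def column_major_offset(row, col):
    """Position of a cell when columns are stored one after another."""
    return col * NUM_ROWS + row


# OLTP-style access: read every field of record 2.
record_in_row_format = [row_major_offset(2, c) for c in range(NUM_COLS)]
record_in_column_format = [column_major_offset(2, c) for c in range(NUM_COLS)]

# OLAP-style access: scan column 1 across all records.
scan_in_row_format = [row_major_offset(r, 1) for r in range(NUM_ROWS)]
scan_in_column_format = [column_major_offset(r, 1) for r in range(NUM_ROWS)]

print(record_in_row_format)     # [6, 7, 8]      contiguous
print(record_in_column_format)  # [2, 6, 10]     strided
print(scan_in_row_format)       # [1, 4, 7, 10]  strided
print(scan_in_column_format)    # [4, 5, 6, 7]   contiguous
```

Reading a whole record touches adjacent offsets in the row format, while scanning a single column touches adjacent offsets in the column format, which is consistent with the OLTP and OLAP preferences described above.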

Examples of column-oriented formats include Apache ORC,[3] Apache Parquet,[4] Apache Arrow,[5] and the formats used by BigQuery, Amazon Redshift, and Snowflake. Prominent examples of row-oriented formats include CSV, the formats used in most relational databases, the in-memory format of Apache Spark, and Apache Avro.[6]

  1. ^ a b Abadi, Daniel J.; Madden, Samuel R.; Hachem, Nabil (2008). "Column-stores vs. Row-stores: How different are they really?". Proceedings of the 2008 ACM SIGMOD international conference on Management of data. pp. 967–980. doi:10.1145/1376616.1376712. ISBN 978-1-60558-102-6.
  2. ^ a b Funke, Florian; Kemper, Alfons; Neumann, Thomas (2012). "Compacting Transactional Data in Hybrid OLTP&OLAP Databases". Proceedings of the VLDB Endowment. 5 (11): 1424–1435. doi:10.14778/2350229.2350258.
  3. ^ "Apache ORC". Retrieved 2024-05-21.
  4. ^ "Apache Parquet". Retrieved 2024-05-21.
  5. ^ "Apache Arrow". Retrieved 2024-05-21.
  6. ^ "Apache Avro". Retrieved 2024-05-21.
