Alluxio 2.3.0 Release

We are excited to announce the release of Alluxio 2.3.0! This is the first release on the Alluxio 2.3.X line. It contains a variety of bug fixes, performance improvements, and feature enhancements over the existing Alluxio 2.2.X line.

Highlights

Concurrent Metadata Synchronization

In Alluxio 2.3, users can synchronize their namespaces with less performance overhead. The new concurrent metadata synchronization algorithm improves performance by an order of magnitude or more, especially for large namespaces.
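To make the idea of concurrent synchronization concrete, here is a minimal sketch of walking a namespace with directory listings issued in parallel. It is an illustration only and not Alluxio's actual algorithm: the `UFS` dictionary, `list_children`, and `sync_concurrently` are hypothetical names standing in for an under file system and its listing call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for an under file system listing:
# each directory path maps to the names of its children.
UFS = {
    "/": ["a", "b", "f0"],
    "/a": ["f1", "f2"],
    "/b": ["c"],
    "/b/c": ["f3"],
}

def list_children(path):
    """Return the full child paths of a directory, or [] for a file."""
    prefix = path.rstrip("/")
    return [f"{prefix}/{name}" for name in UFS.get(path, [])]

def sync_concurrently(root, max_workers=4):
    """Walk the namespace level by level, listing directories in parallel."""
    seen = [root]
    frontier = [root]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # Listings within one level are independent of each other,
            # so the whole level can be in flight at the same time.
            results = pool.map(list_children, frontier)
            frontier = [child for children in results for child in children]
            seen.extend(frontier)
    return sorted(seen)

synced_paths = sync_concurrently("/")
```

With a real under file system each listing is a remote call, so overlapping them hides latency that a one-at-a-time traversal would pay for every directory.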

Alluxio Structured Data Services

Glue UDB Support

The Alluxio Catalog Service now supports connecting to AWS Glue as the metadata service. This enables Alluxio Structured Data Services for table metadata stored in AWS Glue, in addition to the existing support for the Hive Metastore.

ORC File Support

ORC is now a supported input type (in addition to CSV and Parquet) for transformations with the Alluxio Catalog Service.

Alluxio Worker Tiered Storage Management V2

Alluxio 2.3 enables users to use multiple storage tiers without performance degradation during high load. The multi-tier caching algorithm underwent a major change to improve reliability under peak load. Additionally, the Alluxio worker now manages block placement dynamically across tiers to optimize cache performance based on I/O activity.
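As a rough illustration of placing blocks across tiers according to I/O activity, the sketch below keeps the most frequently read blocks in a small top tier. It is a toy model under stated assumptions, not the worker's real implementation: the `TieredCache` class, its two-tier layout, and the read-count ranking are all hypothetical.

```python
from collections import Counter

class TieredCache:
    """Toy two-tier cache: hot blocks are kept in the small top tier."""

    def __init__(self, top_capacity):
        self.top_capacity = top_capacity  # number of blocks the top tier holds
        self.top = set()                  # e.g. a memory tier
        self.bottom = set()               # e.g. an SSD or HDD tier
        self.hits = Counter()             # per-block read counts

    def add(self, block_id):
        # New blocks land in the lower tier until they prove to be hot.
        self.bottom.add(block_id)

    def read(self, block_id):
        self.hits[block_id] += 1

    def rebalance(self):
        """Re-sort blocks across tiers so the most-read ones sit on top."""
        blocks = self.top | self.bottom
        # Rank by descending read count; ties are broken by block id.
        ranked = sorted(blocks, key=lambda b: (-self.hits[b], b))
        self.top = set(ranked[: self.top_capacity])
        self.bottom = set(ranked[self.top_capacity:])

cache = TieredCache(top_capacity=2)
for block in ("b1", "b2", "b3", "b4"):
    cache.add(block)
for block in ("b3", "b3", "b3", "b1", "b1", "b4"):
    cache.read(block)
cache.rebalance()
```

Separating `rebalance` from `read` reflects the general design idea that tier moves can happen outside the read path, so serving I/O does not have to wait on block movement.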

K8s Helm Chart

Alluxio 2.3 supports data locality on Kubernetes with ephemeral compute (e.g., Spark) without requiring host networking. This is important for secure computing environments with pod security policies that may disallow host networking. For similar reasons, hostPath volumes can now be replaced with local persistent volumes for short-circuit access as well as Alluxio worker tiered storage. We also migrated our Helm chart to Helm 3. This release also includes other stability and usability fixes, along with a newly published Alluxio Helm chart repository.

Apache Ozone UFS

Apache Ozone is now included in Alluxio as a new UFS module.

Environment Validation Tools

Alluxio 2.3 has three new validation tools to help users troubleshoot issues in their deployments. These tools are all part of the `bin/alluxio` command line:

`runHdfsMountTests` checks the configuration for mounting the target HDFS path to Alluxio.

`runUfsIOTest` measures the read/write I/O throughput from the Alluxio cluster to the target HDFS.

`runHmsTests` validates that the given configuration is sufficient to run Hive Metastore operations.

HDFS API Heuristics

Alluxio 2.3 intelligently optimizes read API calls to HDFS, significantly improving the performance of query engines (e.g., Presto, SparkSQL) by avoiding excessive seek and read operations. The heuristic takes into account the read pattern and the latency of the HDFS cluster, and has a particularly large impact in remote HDFS deployments.
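The sketch below shows one way a read-pattern heuristic of this kind could look: scattered requests are served with positioned reads, which need no separate seek, while a long run of back-to-back requests switches to a streaming read. This is an assumption-laden illustration rather than the heuristic Alluxio ships; the `ReadPlanner` class, its `sequential_threshold` parameter, and the mode names are hypothetical, and cluster latency is left out for brevity.

```python
class ReadPlanner:
    """Picks a read mode per request from the recent access pattern."""

    def __init__(self, sequential_threshold=3):
        self.sequential_threshold = sequential_threshold
        self.next_offset = None   # where a sequential reader would continue
        self.sequential_run = 0   # consecutive back-to-back reads seen so far

    def choose(self, offset, length):
        if offset == self.next_offset:
            self.sequential_run += 1
        else:
            self.sequential_run = 0
        self.next_offset = offset + length
        # Long sequential runs suit one streaming read; scattered reads
        # are served by positioned reads that carry their own offset.
        if self.sequential_run >= self.sequential_threshold:
            return "stream"
        return "pread"

planner = ReadPlanner(sequential_threshold=2)
requests = [(0, 10), (10, 10), (20, 10), (500, 10), (30, 10)]
modes = [planner.choose(offset, length) for offset, length in requests]
```

Columnar query engines tend to jump between offsets within a file, which is the access pattern where avoiding a seek before each read matters most.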

A list of changes can be found below.

Other Improvements

  • Improve UfsIOBench and HdfsValidationTool (2cd3916cb7)
  • Add validation tool jar to tarball (704a90e618)
  • Change backup delegation property defaults (d3e19196d6)
  • Improve hdfs tools to return error on invalid paths (20f889c877)
  • Improve SecureHdfsValidationTask (9f29a9f435)
  • Add a state-lock call-tracker for detecting interrupt-cycles (634da72916)
  • Add more traces for backup (db1418edcd)
  • Make validation tool general (4a71a65355)
  • Add more traces for state-lock management (f3e1c99902)
  • Reduce number of clients in DistributedLoadCommand (bc24833d3e)
  • Allow user properties to override generated properties in emr bootstrap (56f77586d0)
  • Bump master-network msg size to 100MB (f4c330923b)
  • Implement interrupt based locking protocol over state-lock (86767c7206)
  • Support active sync for given list in EMR bootstrap (058c063eac)
  • Add HADOOP_CONF into emr script (72629bfd96)
  • Update EMR script with async bootstrapping (2a7f811fb2)
  • Simplify lock list downgrades (d9df66bffa)
  • Add an HDFS validation tool (38745c9c85)
  • Add readonly check while delete recursively (e2431c20d4)
  • Prevent TTL from being set on synced inodes (4a31941239)
  • Add a root/nonroot switch to helm chart (658fd3132d)
  • Add alias option for tieredstore levels (af1273a1e3)
  • Optimize browse input width (fc69b13b86)
  • Add an HDFS throughput testing tool (63606ed6c8)
  • Implement block-info rehydration for files with missing block-infos (256e331ab3)
  • Add hive metastore connectivity and permission check tool (909c46f58c)
  • Refactor and fix block-transfer partitioner (f21772389f)
  • Improve incremental active sync (3227dafec7)
  • Implement non-intrusive locking for backups (0b5b5a4edd)
  • Add Ozone as an under file system of Alluxio (d73dc127bb)
  • Do minor refactoring for tier-management (73611076eb)
  • Improve metadata sync operations (6907d3b0ba)
  • Improve HdfsPositionedReadUnderFileInputStream with a heuristic (c4eba47067)
  • Support enabling active sync in Dataproc init action (a0efcd9b71)
  • Switch default to configure optional component for Presto (b385f896a2)
  • Make reserved space on directories only when required (1763bd81ab)
  • Add dataproc and emr artifacts (04daf953d3)
  • Improve multi-tier block management (3762d75eb6)
  • Upgrade helm command to support only helm 3 (ff59e2a998)
  • Add heuristic to determine if it is a local ufs read (683e1a57cb)
  • Use pread for all remote hdfs api read calls (56bd456ceb)
  • Add Glue UDB to Catalog Service (c3b22f914b)
  • Fix fetching max-msg-size for embedded-journal transport (ce36c332dc)
  • Fix recursive ufs listing (28319c031a)
  • Bump default connection count for streaming channels (0936da79de)
  • Improve Fuse start script (8608e673fd)
  • Release resources properly within Worker ReadHandler (55fbd6dd89)
  • Optimize gRPC managed resource handling (b2c10e03be)
  • Add ORC file input support (da0f0eb43a)
  • Make master inbound message size configurable (8f972432dc)
  • Apply K8s label matching to worker domain socket PVC (356dee5129)
  • Add option to use hostNetwork in K8s worker pod (6ef3adb49e)
  • Improve performance of sasl authentication (66abfcf658)
  • Make connection multiplexing bounded for streaming channels (2529b73c61)
  • Retry block streams with exponential back-off policy (87dcf894f0)
  • Fix inheritance for empty owner on createPath and sync (4260908fd0)
  • Keep state lock for duration of journal context (67c4e96803)

Bug Fixes

  • Update alluxio site-conf after default read type change (caa93c8b30)
  • Improve hdfs version test to parse CDH version (eecd42b149)
  • Disable hostPID by default (cf4bb10586)
  • Remove properties for journal formatting job in helm chart (b041a7b348)
  • Wait for alluxio master to come up before startSync (6bc812eeee)
  • Remove worker hostname from ALLUXIO_JAVA_OPTS in helmchart (0e3ed0d1ae)
  • Fix dataproc script name (2022d27c45)
  • Fix Fuse crash in kubernetes in helmchart (f7ca69928d)
  • Fix fetching partition for Glue UDB (313d198082)
  • Support emptyDir for volumes and infer dnsPolicy from hostNetwork (49b01d0fb0)
  • Handle escaping for single quotes in EMR script args (b1d5a4e3b8)
  • Remove unused variable from EMR script (70fc9a1682)
  • Fix emr script (4f76abd03e)
  • Add back LOCAL root_ufs_uri (c7cbeba904)
  • Fix emr script to use keyword “LOCAL” (1022de5a53)
  • Fix the bug that invalid field appears in alluxio helm chart (cd36985bea)
  • Change hms validation tool to use extension class loader (b3e29ac8b0)
  • Fix NPE associated with s3 account owner (6b5bf6ede7)
  • Clear temp file if Object Store upload failed (75703a9840)
  • Avoid casting timestamps and durations directly to Integer/Long (6393682446)
  • Fix fuse restart failed (eb0af7c42e)
  • Clean up state on failed PUT (421b86669b)
  • Stop worker heartbeat executor first on shutdown (7dd858d4cf)
  • Fix termination conditions for LRFU (489ea53ddd)
  • Prevent secondary UFS journal from modifying journal files (b358b1a6a3)
  • Fix typo in alluxio-fuse script (61b17c16c1)
  • Merge system properties in a thread-safe manner (ee3b9813c1)
  • Stop tracking connections in copycat client/server (9706dfeae7)
  • Update gRPC and netty version to address CVE (dcfa258dd5)

Thanks to all of the community members below who contributed to this release:

Adit Madan, alluxio-bot, Bin Fan, Bradley Yoo, buom, Calvin Jia, cheyang, David Zhu, Gene Pang, Göktürk Gezer, Hamza Bilal, happy2048, Jason Tieu, Jiacheng Liu, JySongWithZhangCe, LuQQiu, maobaolong, patrickofriel-wk, Rico Chiu, Shouwei Chen, tinkerrrr, Zac Blanco, 崔志萍
