Tech Talk Slide Deck

Accelerate Cloud Training with Alluxio

APACHECON 2021

Alluxio’s capabilities as a data orchestration framework have encouraged users to onboard more of their data-driven applications to an Alluxio-powered data access layer. Driven by strong interest from our open-source community, the Alluxio core team set out to redesign how users leverage data orchestration through the POSIX interface, making it efficient and transparent. This effort has made substantial progress in collaboration with engineers from Microsoft, Alibaba, and Tencent. In particular, we have introduced a new JNI-based FUSE implementation to support POSIX data access, created a more efficient way to integrate Alluxio with the FUSE service, and improved related data operations that are common in model training, such as a more efficient distributedLoad and optimizations for listing or calculating directories that contain a massive number of files. We will also share our engineering lessons and our roadmap for supporting machine learning applications in future releases.

Speakers:

Bin Fan is the founding engineer and VP of Open Source at Alluxio. Prior to Alluxio, he worked at Google, building the next-generation storage infrastructure. Bin received his Ph.D. in Computer Science from Carnegie Mellon University, where he worked on the design and implementation of distributed systems.

Lu Qiu has been involved in open source software for many years and is currently a software engineer at Alluxio. Lu develops easier ways to integrate Alluxio into public cloud environments and is mainly responsible for leader election, journal management, metrics management, and big data preparation for machine learning workloads. Lu received an M.S. degree in Data Science from George Washington University.

Slack with speakers, experts, and community members.
Join the Alluxio Global Online Meetup Group.
