Cool projects that are worth exploring. Contact me if you’re interested.
Get inspired by Richard Feynman!
- “What I cannot create, I do not understand.”
- “You’re unlikely to discover something new without a lot of practice on old stuff.”
1. Understand auto-vectorization in Arrow
Background
Vectorized execution processes multiple data elements with a single instruction (SIMD: single instruction, multiple data).
Hand-written vectorized code is typically hard to implement and maintain. Consequently, arrow-rs has removed all of its manual SIMD code and now relies on LLVM to auto-vectorize the scalar code.
jayzhan211 noted that not all code is auto-vectorized, while tustvold elaborated on common conditions for auto-vectorization.
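To make the difference concrete, here is a minimal sketch in Rust. It is not code from arrow-rs, and the function names are made up for illustration. The integer sum is the kind of loop LLVM can usually vectorize in an optimized build, because integer addition can be reordered freely. The plain f32 sum usually cannot be, because floating-point addition is not associative and the compiler has to preserve the order of the additions. Splitting the reduction into independent accumulators is one common way to make it vectorization-friendly, at the cost of a different rounding order. Whether any of these loops is actually vectorized depends on the compiler version, target, and flags, so the assembly still has to be checked.

```rust
/// Wrapping sum of i32 values (hypothetical helper, not an arrow-rs kernel).
/// Integer addition is associative, so the optimizer is free to spread the
/// additions across SIMD lanes.
pub fn sum_i32(values: &[i32]) -> i32 {
    values.iter().fold(0i32, |acc, &v| acc.wrapping_add(v))
}

/// Strict in-order f32 sum. Floating-point addition is not associative, so
/// the compiler generally has to keep this reduction sequential.
pub fn sum_f32_sequential(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// f32 sum with eight independent accumulators. The reordering is written
/// out explicitly in the source, so the inner loop maps naturally onto
/// packed adds. The lane count of 8 is an arbitrary choice for this sketch.
pub fn sum_f32_lanes(values: &[f32]) -> f32 {
    const LANES: usize = 8;
    let mut acc = [0.0f32; LANES];

    let chunks = values.chunks_exact(LANES);
    let tail = chunks.remainder();

    for chunk in chunks {
        // Fixed-size inner loop: each accumulator only depends on itself.
        for i in 0..LANES {
            acc[i] += chunk[i];
        }
    }

    // Combine the lanes, then add the leftover elements one by one.
    let mut total: f32 = acc.iter().sum();
    for &v in tail {
        total += v;
    }
    total
}
```

Building this in release mode and looking at the generated assembly for each function (for example with cargo-asm) should show whether packed instructions were emitted. Note that `sum_f32_lanes` can return a slightly different result from `sum_f32_sequential` on inputs where rounding matters.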
Project Goals
- Understand what is being auto-vectorized:
- Manually inspect the generated assembly using the cargo-asm tool, focusing on SIMD-friendly operations like sum, find, min/max, etc.
- List common coding mistakes that prevent LLVM from auto-vectorizing the code.
- Understand the benefits of auto-vectorization:
- Write manually vectorized code and compare its performance with LLVM-generated vectorization.
- Evaluate performance on Intel x86, AMD x86, Apple M chips, and other ARM chips on cloud providers.
- Improve the situation:
- Share findings with the world in a blog post or paper.
- (Extended goal) Develop a tool to detect common mistakes.
- (Extended goal) Develop a tool to highlight auto-vectorized regions of code, similar to code coverage.
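As a possible starting point for the extended goals, a crude "vectorization coverage" number can be computed directly from assembly text. The sketch below is only a heuristic under stated assumptions: it expects x86-64 assembly with one instruction per line, treats anything touching `ymm`/`zmm` registers as vectorized, and treats `xmm` instructions as packed unless the mnemonic looks scalar. The `SimdReport` type and both function names are hypothetical, and the heuristic will misclassify some instructions (scalar conversions, for instance).

```rust
/// Tally of instructions in a piece of assembly text (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct SimdReport {
    pub total_instructions: usize,
    pub simd_instructions: usize,
}

/// Returns true for lines that look like instructions, skipping blank lines,
/// labels (`foo:`), assembler directives (`.section`), and comments.
pub fn is_instruction(line: &str) -> bool {
    let t = line.trim();
    !(t.is_empty()
        || t.starts_with('.')
        || t.starts_with('#')
        || t.starts_with(';')
        || t.ends_with(':'))
}

/// Heuristic check for a packed SIMD instruction on x86-64.
/// Assumption: the first token is the mnemonic, the rest are operands.
pub fn is_packed_simd(line: &str) -> bool {
    let line = line.trim().to_ascii_lowercase();
    let mut parts = line.splitn(2, char::is_whitespace);
    let mnemonic = parts.next().unwrap_or("");
    let operands = parts.next().unwrap_or("");

    // 256-bit and 512-bit registers are only used by vector instructions.
    if operands.contains("ymm") || operands.contains("zmm") {
        return true;
    }

    // xmm registers are also used for scalar floating point, so filter out
    // mnemonics that look scalar (addss, mulsd, movd, ...).
    let looks_scalar = mnemonic.ends_with("ss")
        || mnemonic.ends_with("sd")
        || matches!(mnemonic, "movd" | "movq" | "vmovd" | "vmovq");
    operands.contains("xmm") && !looks_scalar
}

/// Counts instructions and packed SIMD instructions in `asm`.
pub fn simd_report(asm: &str) -> SimdReport {
    let mut report = SimdReport::default();
    for line in asm.lines().filter(|l| is_instruction(l)) {
        report.total_instructions += 1;
        if is_packed_simd(line) {
            report.simd_instructions += 1;
        }
    }
    report
}
```

Feeding the per-function assembly into `simd_report` gives a rough ratio of packed to total instructions. A real tool would more likely build on the compiler's own optimization remarks or on debug info to map instructions back to source lines, which this sketch does not attempt.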
Related readings: https://github.com/apache/arrow-rs/pull/6554 https://github.com/apache/datafusion/pull/12809#discussion_r1802871921
Related side-project: implement AVX512 encoding for FSST in Rust, as a way to understand what it takes to write SIMD intrinsics.
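For a first taste of what writing intrinsics involves, and as a baseline for the "manual vs. LLVM" comparison in the project goals, here is a minimal sketch of an i32 sum written with SSE2 intrinsics from `std::arch`. It uses SSE2 rather than AVX-512 only because SSE2 is part of the x86_64 baseline and needs no runtime feature detection. The function names and the `time_it` helper are hypothetical, and on other architectures `sum_simd` simply falls back to the scalar version.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Scalar reference: wrapping sum, left to LLVM to auto-vectorize.
pub fn sum_scalar(values: &[i32]) -> i32 {
    values.iter().fold(0i32, |acc, &v| acc.wrapping_add(v))
}

/// Hand-written SSE2 version: four i32 lanes per 128-bit register.
#[cfg(target_arch = "x86_64")]
pub fn sum_simd(values: &[i32]) -> i32 {
    use std::arch::x86_64::{
        __m128i, _mm_add_epi32, _mm_loadu_si128, _mm_setzero_si128, _mm_storeu_si128,
    };

    let chunks = values.chunks_exact(4);
    let tail = chunks.remainder();
    let mut lanes = [0i32; 4];

    // SAFETY: SSE2 is always available on x86_64. Every chunk holds exactly
    // four i32 values (16 bytes), `lanes` is 16 bytes, and only unaligned
    // load/store intrinsics are used.
    unsafe {
        let mut acc = _mm_setzero_si128();
        for chunk in chunks {
            let v = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
            acc = _mm_add_epi32(acc, v); // lane-wise wrapping add
        }
        _mm_storeu_si128(lanes.as_mut_ptr() as *mut __m128i, acc);
    }

    // Horizontal reduction of the four lanes, then the leftover elements.
    let vector_total = lanes.iter().fold(0i32, |a, &v| a.wrapping_add(v));
    tail.iter().fold(vector_total, |a, &v| a.wrapping_add(v))
}

/// Fallback for non-x86_64 targets (e.g. ARM): reuse the scalar code.
#[cfg(not(target_arch = "x86_64"))]
pub fn sum_simd(values: &[i32]) -> i32 {
    sum_scalar(values)
}

/// Very rough timing helper: runs `f` `iters` times and returns the last
/// result with the elapsed time. A proper benchmark harness is preferable.
pub fn time_it(f: fn(&[i32]) -> i32, data: &[i32], iters: u32) -> (i32, Duration) {
    let start = Instant::now();
    let mut last = 0;
    for _ in 0..iters {
        // black_box keeps the optimizer from hoisting or removing the call.
        last = f(black_box(data));
    }
    (black_box(last), start.elapsed())
}
```

Calling `time_it(sum_scalar, &data, n)` and `time_it(sum_simd, &data, n)` on the same input gives a first impression of the gap, if any. In a release build the two may well perform the same, since LLVM is likely to vectorize `sum_scalar` on its own, which is exactly the kind of result the project aims to document.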