Sail 0.2: Spark replacement in Rust, runs 4x faster, drop-in PySpark compatible (lakesail.com)
13 points by chenxi9649 13 hours ago
chenxi9649 13 hours ago
Previous discussion from September, before they had distributed processing: https://news.ycombinator.com/item?id=41496033
GitHub repo: https://github.com/lakehq/sail
A few interesting notes:
- Benchmarks show it running 4x faster than Spark on TPC-H, with a 94% cost reduction.
- Currently at 65.7% PySpark test compatibility (they talk about this in more detail in the post).
- Built in Rust, using the Tokio runtime and Arrow IPC for high performance.
- Already supports 79 of the 99 TPC-DS queries.

chenxi9649 13 hours ago
Also, some discussions on Reddit from yesterday/today:
https://www.reddit.com/r/dataengineering/comments/1gv840u/in...
https://www.reddit.com/r/rust/comments/1gwayz6/introducing_d...