Easy distributed joins with Pachyderm

medium.com posted by kenny 245 days ago  

Distributing your dataset or database joins can be a daunting task, to say the least. Not only do you need to think about how your data is sharded, indexed, etc., you also need to think about what types and sizes of resources to allocate for the scale of your data. These considerations cause many to retreat to brute-forcing their data transformations on increasingly beefy boxes.
