MR
Read the data from the file (disk access = 1)
Run the mappers
Write the map output (disk access = 2)
Run shuffle and sort, which reads the mappers' intermediate output (disk access = 3)
Write the shuffle and sort output (disk access = 4)
Run the reducers, which read the sorted data (disk access = 5)
Write the reducers' output (disk access = 6)
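The steps above can be tallied with a small toy model. This is only a sketch of the counting in this post: the step names and the `count_disk_accesses` helper are made up for illustration, and nothing here reads or writes real files.

```python
# Toy model of the MapReduce steps listed above; it touches no real files.
# Each entry is (step description, whether the step accesses the disk).
MR_STEPS = [
    ("read input file", True),
    ("run mappers", False),
    ("write map output", True),
    ("shuffle and sort: read map output", True),
    ("shuffle and sort: write sorted output", True),
    ("reducers read sorted data", True),
    ("reducers write final output", True),
]


def count_disk_accesses(steps):
    """Return how many steps in the pipeline access the disk."""
    return sum(1 for _name, touches_disk in steps if touches_disk)


if __name__ == "__main__":
    for name, touches_disk in MR_STEPS:
        print(f"{name:40s} disk={'yes' if touches_disk else 'no'}")
    print("total disk accesses:", count_disk_accesses(MR_STEPS))
```

Running the mappers is the only step that does not access the disk, which is how the list arrives at six accesses.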
TEZ
Regardless of the task, Tez first builds a DAG (Directed Acyclic Graph) of the work.
It is similar to Spark in that both execute a DAG of tasks.
It plans the whole job up front, so it does not need to go back to the disk between steps.
Once it is ready to run, it reads the data from disk, performs all the steps, and produces the output.
Pros: one read and one write.
It is efficient because it does not access the disk multiple times and keeps intermediate results in memory.
Vectorization can be enabled on top of it.
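The "build the DAG first, then one read and one write" idea can be sketched with Python's standard `graphlib` module. This is a simplified model of the description in this post, not Tez's actual scheduler: the vertex names in `PLAN` and the `run_plan` helper are hypothetical, and it assumes every intermediate result fits in memory.

```python
from graphlib import TopologicalSorter

# Hypothetical query plan: each vertex maps to the set of vertices it depends on.
PLAN = {
    "scan": set(),
    "map": {"scan"},
    "shuffle_sort": {"map"},
    "reduce": {"shuffle_sort"},
    "sink": {"reduce"},
}


def run_plan(plan):
    """Order the DAG, then count disk accesses under the in-memory assumption.

    Only source vertices (no dependencies) read from disk and only sink
    vertices (nothing depends on them) write to disk; everything in between
    is assumed to hand its result over in memory.
    """
    order = list(TopologicalSorter(plan).static_order())
    disk_accesses = 0
    for vertex in order:
        is_source = not plan[vertex]
        is_sink = all(vertex not in deps for deps in plan.values())
        if is_source:
            disk_accesses += 1  # single read of the input data
        if is_sink:
            disk_accesses += 1  # single write of the final output
    return order, disk_accesses


if __name__ == "__main__":
    order, accesses = run_plan(PLAN)
    print(" -> ".join(order))
    print("disk accesses:", accesses)
```

With this linear plan the model counts two disk accesses, compared with six in the MR walkthrough.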
Last but not least: if the table is partitioned and has delta files (from updates, for example), I think MR works but Tez does not. You may have to run compaction to convert the delta files into base files, after which Tez should work.
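A compaction request in Hive is an `ALTER TABLE ... COMPACT` statement. The helper below is a minimal sketch that only builds the statement text. The function name, the `sales` table, and the `dt` partition column are placeholders, and the string still has to be run through whatever Hive client you use.

```python
def compaction_statement(table, partition=None, kind="major"):
    """Build a Hive compaction statement as a string (nothing is executed).

    `partition` is an optional dict of partition column -> value.
    A 'major' compaction rewrites the delta files into a new base file.
    """
    if kind not in ("major", "minor"):
        raise ValueError("kind must be 'major' or 'minor'")
    clause = ""
    if partition:
        pairs = ", ".join(f"{col}='{val}'" for col, val in partition.items())
        clause = f" PARTITION ({pairs})"
    return f"ALTER TABLE {table}{clause} COMPACT '{kind}'"


if __name__ == "__main__":
    # Placeholder table and partition names, for illustration only.
    print(compaction_statement("sales", {"dt": "2024-01-01"}))
```

Compaction is requested per partition on a partitioned table, so the helper takes the partition spec as an argument.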