I'm curious which tools are the go-to choices these days for on-prem and cloud-based data platforms. From my research so far: for cloud, it seems to depend on the platform. For on-prem, Hive, Spark, NiFi, Airflow, and Kafka seem to be good choices. Which tools are in demand, in your opinion? It would also be nice to hear how Docker and Kubernetes connect with these data tools. I heard the new version of Spark supports both, along with YARN.