
Drafting Your Data Pipelines

With careful consideration and research into your market, the choices you need to make become narrower and clearer. I can now begin drafting my data ingestion/streaming pipeline without being overwhelmed.





For A Quick Recap

You can find the first blog post here, where I learned which tech is most in demand in Toronto: https://www.teamdatascience.com/post/dear-hiring-managers-i-m-here-to-help


The second blog post is here, where I learned which Toronto industries need data engineers the most: https://www.teamdatascience.com/post/toronto-data-engineering-market


The Pipeline Proposal



I'll be creating several pipelines in this project, but first things first: I need to ingest the data, process it, and store it.


I'll use Python and Spark because they are the two most requested skills in Toronto. Kafka, while not among the top five most in-demand skills, was still the most requested buffer technology, which makes it worth including. The remaining technologies (stages 3, 4, 7 and 8) are all AWS services.
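
To make the ingest, process, store flow a bit more concrete, here is a rough sketch in plain Python. It is not the actual pipeline: an in-memory deque stands in for the Kafka buffer, a simple generator stands in for the Spark processing step, and a list stands in for AWS storage. The function names and the "skill" field are made up for illustration.

```python
import json
from collections import deque


def ingest(raw_lines, buffer):
    """Ingest: push raw JSON strings into the buffer (stand-in for a Kafka topic)."""
    for line in raw_lines:
        buffer.append(line)


def process(buffer):
    """Process: drain the buffer, parse each record, and normalize it.

    Stand-in for the Spark step. Records that are not valid JSON objects are skipped.
    """
    while buffer:
        raw = buffer.popleft()
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # drop malformed input instead of failing the whole run
        if not isinstance(record, dict):
            continue
        # Hypothetical cleanup rule: trim and lowercase the "skill" field.
        record["skill"] = str(record.get("skill", "")).strip().lower()
        yield record


def store(records, sink):
    """Store: append processed records to the sink (stand-in for AWS storage)."""
    for record in records:
        sink.append(record)


def run_pipeline(raw_lines):
    """Wire the three stages together and return what was stored."""
    buffer = deque()
    sink = []
    ingest(raw_lines, buffer)
    store(process(buffer), sink)
    return sink


if __name__ == "__main__":
    sample = ['{"skill": " Python "}', "not json", '{"skill": "Spark"}']
    for row in run_pipeline(sample):
        print(row)
```

The point of the buffer between ingest and process is that the two stages do not have to run at the same speed, which is the role Kafka plays in the real design.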



What's Next

I'll be documenting how I build this setup in the AWS console (with screenshots).

You can find my LinkedIn here: https://www.linkedin.com/in/steven-aranibar-8891a2103/

And you can contact me at cloudengineertoronto@protonmail.com



© 2020 Team Data Science - Andreas Kretz
