How to create a Data Pipeline for AWS

Andreas Kretz
Apr 22, 2020

The first step in creating a data pipeline is to make a plan: select one tool for each of the five key areas (Connect, Buffer, Processing Framework, Store, and Visualize) and build on that selection.

For example, you can look at API Gateway for the key area Connect. Then you would look into Kinesis for Buffer.

You would look at Lambda for Processing Framework and at S3 or something like DynamoDB for Store. Lastly, you can look to Tableau for Visualize.

These tools are only examples of how you might proceed. What matters is that you choose tools that you like or that you already know.
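To make the idea of "one tool per category" concrete, here is a minimal sketch that writes the example plan down as a small Python dictionary and checks that no category was left empty. The names `CATEGORIES`, `plan`, `missing_categories`, and `describe` are made up for this illustration, and picking S3 for Store is just one of the options mentioned above.

```python
# The five key areas named in this article, in pipeline order.
CATEGORIES = ("Connect", "Buffer", "Processing Framework", "Store", "Visualize")

# Example plan built from the tools mentioned above (one tool per category).
# S3 is chosen for Store here; DynamoDB would be an equally valid entry.
plan = {
    "Connect": "API Gateway",
    "Buffer": "Kinesis",
    "Processing Framework": "Lambda",
    "Store": "S3",
    "Visualize": "Tableau",
}


def missing_categories(plan):
    """Return the categories that have no tool assigned yet."""
    return [category for category in CATEGORIES if not plan.get(category)]


def describe(plan):
    """Render the plan as one line, in pipeline order."""
    gaps = missing_categories(plan)
    if gaps:
        raise ValueError("No tool chosen for: " + ", ".join(gaps))
    return " -> ".join(f"{category}: {plan[category]}" for category in CATEGORIES)


print(describe(plan))
```

Writing the plan down like this is only a planning aid. It does not create any AWS resources, but it makes gaps visible before you start learning the individual tools.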

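It is also possible that you do not select Lambda in the Processing Framework category, for example, but instead use Elasticsearch. Accordingly, your plan would then be: API Gateway for Connect, Kinesis for Buffer, Elasticsearch for Processing Framework, S3 or DynamoDB for Store, and Tableau for Visualize.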

Possible tools for the individual categories can be found in my Data Engineering Cookbook. Have a look at all the categories and pick one tool per category that suits you. Check them out and then try to learn them.

When choosing the tools, it is important that you already know which direction you want to take later in your career.

For example, if you are looking for jobs that involve Google Cloud, choose tools from the Google Cloud ecosystem. If you are looking for jobs within AWS, choose an AWS tool in each category.
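As a rough sketch of that selection step, the snippet below filters a small candidate list by platform. The list itself is an assumption made for illustration: the AWS entries are the tools named in this article, while the Google Cloud entries (Pub/Sub, Dataflow, BigQuery) are examples I added and are not a complete or recommended catalog. `CANDIDATES` and `tools_for` are hypothetical names.

```python
# Hypothetical candidate list: (category, tool, platform).
# Illustrative only, not an exhaustive or recommended catalog.
CANDIDATES = [
    ("Connect", "API Gateway", "AWS"),
    ("Buffer", "Kinesis", "AWS"),
    ("Buffer", "Pub/Sub", "Google Cloud"),
    ("Processing Framework", "Lambda", "AWS"),
    ("Processing Framework", "Dataflow", "Google Cloud"),
    ("Store", "S3", "AWS"),
    ("Store", "DynamoDB", "AWS"),
    ("Store", "BigQuery", "Google Cloud"),
]


def tools_for(platform):
    """Group the candidate tools of one platform by category."""
    by_category = {}
    for category, tool, tool_platform in CANDIDATES:
        if tool_platform == platform:
            by_category.setdefault(category, []).append(tool)
    return by_category


# Show the shortlist for someone aiming at AWS jobs.
for category, tools in tools_for("AWS").items():
    print(f"{category}: {', '.join(tools)}")
```

From a shortlist like this you would still pick a single tool per category, as described at the start of the article.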

It's important that you choose the tools that will help you in your future career and that you focus on them rather than trying to learn everything that's available on the market.

So stay focused and share your thoughts or experiences in the comments. I look forward to it!

>> created by Mira Roth