Lots of people like notebooks, and so do I. Jupyter notebooks, for instance, are great for quickly exploring some data or trying something out. If you want to bring code into production, however, you should, or most likely have to, write standalone scripts.
If you want to build something for production and then run it in production, Jupyter notebooks are not ideal. They are nice for exploration, testing or prototyping.
If you really want to bring something into production, you have to write proper code, and in my experience you shouldn't do that in notebooks.
So if you have a notebook, how do you turn it into something else?
You run it, make a copy, convert it to a Python file and remove all the print statements. Then you build out the whole thing: write unit tests, make them pass, repeat the test-and-refactor cycle, and finally test the whole thing end to end.
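To make that a bit more concrete, here is a minimal sketch of what a single notebook cell might look like after the conversion. Everything in it is made up for illustration: the function name `drop_incomplete_rows`, the sample data and the notebook name `analysis.ipynb` are assumptions, not taken from a real project. For the conversion step itself, one option is `jupyter nbconvert --to script analysis.ipynb`, which writes the notebook's code cells into a plain script that you can then clean up by hand.

```python
# Hypothetical module extracted from a notebook cell; all names are illustrative.


def drop_incomplete_rows(rows):
    """Return only the rows (dicts) in which every value is a non-empty string."""
    return [
        row
        for row in rows
        if all(isinstance(value, str) and value.strip() for value in row.values())
    ]


def test_drop_incomplete_rows():
    # In the notebook you would print the result and eyeball it.
    # Here the expectation is written down as an assertion instead.
    rows = [
        {"name": "a", "city": "Berlin"},
        {"name": "b", "city": ""},       # empty value: dropped
        {"name": " ", "city": "Paris"},  # whitespace only: dropped
    ]
    assert drop_incomplete_rows(rows) == [{"name": "a", "city": "Berlin"}]
```

The point is that the print statements are gone and the check lives in a test function, so a test runner such as pytest can run it automatically every time you refactor.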
Misconception: "If you can do stuff in notebooks, you're ready for production"
Contrary to this misconception, being able to work in notebooks does not make you ready for production, because notebooks are not how you automate things.
Nevertheless, notebooks are good for exploration, for trying things out, for a master's thesis or a PhD or whatever. But if you want to run your code in production, on the systems that are available there, then you have to create a standalone script, for example in Python or whatever.
So this is the normal cycle!
Starting with notebooks still makes sense!
Nevertheless, it makes sense to start with a notebook. It's easy and you get direct feedback, which is why I like Zeppelin for Spark so much. You can try things out very, very easily. You can load data, manipulate it and look at the results!
But I would not use it in production. I mean, maybe for a quick batch job, to explore some data and try out some stuff. But not otherwise.
What are your experiences with Jupyter notebooks? Let me know in the comments!
>> created by Mira Roth
Check out my full video on YouTube!