[Internal dialogue]
“Is this your first Datathon hike? The Melbourne Datathon 2018 will be my second.”
“No, I’m a newbie to Data Science but an oldie in Computer Science, so it balances out. Anyway, I always learn more when I don’t have too many prejudices and expectations.”
“Stop talking. Let’s get on our way and climb to the summit.”
[Full disclosure: Picture isn’t mine]
Chapter 1: Climbing to the summit
Good company in a journey makes the way seem shorter. — Izaak Walton
I’ve tried to climb to the summit before, but time got the better of me. One or the other, time or energy, always runs out. So I’ve learnt a few things from the past: keep the body light, the pace steady, and find company that keeps up (or tells you to keep up).
So I’ve set out for Base Camp, where I have provisioned some useful tools for the trip:
- Google Cloud. While nothing replaces bare metal/plastic for responsiveness, nothing scales to more data the way the cloud does. Not caring which machine has my data and how it’s configured is a load off my mind. I’m not vendor-biased, though: I used Azure last year, and next year I’ll use Amazon (I would have used them this year if I had known about the credits :-( ).
- Datalab instance. Google’s Data Science Virtual Machine instance deserves a separate point because it provides a standard Jupyter notebook environment for data hacking in Python and communicates with the rest of the ecosystem.
- PANDAS. Python’s go-to library for data hacking. Before this datathon, I only knew the black and white variety.
- Hackday events. Locked away on my calendar as milestones on the journey.
- Reading widely. Broaden the material to inspire your research.
- Coffee. Enough said.