ETL Using Python and Pandas
Sep 26, 2017
[Dec 26, 2020: A follow-on article has been published: ETL Using Python and Pandas: Part 2]
I was working on a CRM deployment and needed to migrate data from the old system to the new one.
The dataset had 50k rows and fewer than a dozen columns, and was straightforward by any measure. The file was smaller than 10MB. Sadly, that was enough to choke Excel on a modern-day ThinkPad with 20GB RAM.
Whipping up a Pandas script was simpler. This is a quick summary. The Jupyter (IPython) version is also available.
The sample data files are published on GitHub.
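For a flavour of what such a script can look like, here is a minimal extract-transform-load sketch in Pandas. It is not the actual migration code: the column names, the sample rows, and the `transform` helper are all made up for illustration, and the in-memory CSV stands in for a real export file from the old system.

```python
import io

import pandas as pd

# Hypothetical stand-in for an export from the old CRM.
# The column names and rows are assumptions, not the real data.
OLD_CRM_CSV = """Full Name,Email,Signup Date
Ada Lovelace, ADA@EXAMPLE.COM ,2016-03-01
Alan Turing,alan@example.com,2016-04-15
Alan Turing,alan@example.com,2016-04-15
"""


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, tidy up values, and drop duplicate rows."""
    out = df.rename(
        columns={
            "Full Name": "name",
            "Email": "email",
            "Signup Date": "signup_date",
        }
    )
    # Normalise emails so the same address always compares equal.
    out["email"] = out["email"].str.strip().str.lower()
    # Parse dates into a proper datetime column.
    out["signup_date"] = pd.to_datetime(out["signup_date"])
    return out.drop_duplicates().reset_index(drop=True)


# Extract: read the old system's export.
raw = pd.read_csv(io.StringIO(OLD_CRM_CSV))

# Transform: clean it into the shape the new system expects.
clean = transform(raw)

# Load: write a CSV that the new system could import.
buffer = io.StringIO()
clean.to_csv(buffer, index=False)
print(buffer.getvalue())
```

With a real export, the `io.StringIO` objects would be replaced by file paths passed to `pd.read_csv` and `DataFrame.to_csv`. Keeping the cleanup in one function makes it easy to rerun the whole migration whenever the source export changes.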
Originally published at Kenneth Lo, PMP.