ETL Using Python and Pandas

Kenneth Lo, PMP
Sep 26, 2017

[Dec 26, 2020: A follow-on article has been published: ETL Using Python and Pandas: Part 2]

I was working on a CRM deployment and needed to migrate data from the old system to the new one.

The dataset had about 50k rows and fewer than a dozen columns, and was straightforward by any measure. The file was smaller than 10 MB. Sadly, that was enough to choke Excel on a modern-day ThinkPad with 20 GB of RAM.

Whipping up a Pandas script was simpler. This is a quick summary. A Jupyter (IPython) version is also available.

The sample data files are published on GitHub.
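
For readers who have not seen this kind of script before, here is a minimal sketch of an extract, transform, load pass in Pandas. It is not the script from the migration: the inline CSV, the column names (`id`, `first_name`, `last_name`, `email`), and the helper functions `extract`, `transform`, and `load` are all made up for illustration, and the cleanup steps are only examples of what such a pass might do.

```python
import io

import pandas as pd

# Hypothetical export from the old system. The real sample files and
# their column names will differ; this data is invented for the sketch.
OLD_CRM_CSV = """id,first_name,last_name,email
1, Ann ,Lee,ANN@EXAMPLE.COM
2,Bob,Ng,bob@example.com
2,Bob,Ng,bob@example.com
"""


def extract(source):
    """Read the export, keeping every column as text so IDs are not coerced."""
    return pd.read_csv(source, dtype=str)


def transform(df):
    """Apply a few example cleanup steps (assumed, not the author's rules)."""
    out = df.copy()
    # Trim stray whitespace in every column.
    for col in out.columns:
        out[col] = out[col].str.strip()
    # Normalize email case and build a combined name field.
    out["email"] = out["email"].str.lower()
    out["full_name"] = out["first_name"] + " " + out["last_name"]
    # Drop exact duplicate rows and keep only the columns the target needs.
    out = out.drop_duplicates().reset_index(drop=True)
    return out[["id", "full_name", "email"]]


def load(df, target):
    """Write the cleaned rows as CSV, ready for import into the new system."""
    df.to_csv(target, index=False)


raw = extract(io.StringIO(OLD_CRM_CSV))
clean = transform(raw)

buffer = io.StringIO()
load(clean, buffer)
result_csv = buffer.getvalue()
print(result_csv)
```

In practice `extract` and `load` would take file paths instead of in-memory buffers; `pd.read_csv` and `DataFrame.to_csv` accept either.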

Originally published at Kenneth Lo, PMP.
