Can't stop the signal, Mal.
I've gotten antsy about not doing some sort of large project, a side hustle to either push my understanding or create a new bullet point on my resume. I have plenty of work and other interests, but a weak siren's call keeps rattling around in my brain. So I'll start a project and hopefully it pans out.
The first step is to pull a large dataset. For this I'm using the PPP loan data, which you can get here: [https://data.sba.gov/dataset/ppp-foia]
It's several very large files: first the data dictionary, then twelve data files of 400MB+ each. There are 10.5 million records in total that'll have to be processed.
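Before loading anything, a quick sanity check on that record count could look something like the sketch below. It assumes the downloaded data files are CSVs with a header row, all sitting in one local folder; the function name `count_records` and the folder name are made up for illustration. It streams each file row by row, so the 400MB+ files never have to fit in memory.

```python
import csv
from pathlib import Path


def count_records(data_dir, pattern="*.csv"):
    """Return {filename: data_row_count} for each CSV in data_dir.

    Assumes every file has exactly one header row, which is not counted.
    """
    counts = {}
    for path in sorted(Path(data_dir).glob(pattern)):
        # newline="" lets the csv module handle quoted fields that contain line breaks
        with path.open(newline="", encoding="utf-8", errors="replace") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row, if the file has one
            counts[path.name] = sum(1 for _ in reader)
    return counts
```

Calling `count_records("ppp_data")` (a hypothetical folder name) and summing the values should land near the 10.5 million figure if the download is complete.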
Next up is getting this data into a tool for use.