This weekend, our group made a lot of progress on the data front. I helped look for a way to get the Zoobooks PDF files into an Excel spreadsheet, which led me to learn how to read and write CSV files in Python. When Dominic found a Python library that organized the data nicely, I used my newfound Python skills to write a program that parsed the 1970s Zoobook data into a CSV file. My program did a fairly good job of pulling the data, but I still had to go through the output in OpenRefine, which pushed me to learn a lot more about that tool. Once all the Zoobook data was in a CSV file, I helped clean it in OpenRefine. As the weekend went on, I became much more comfortable with both OpenRefine and Python, and I was able to finish each task more quickly.
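For anyone curious what the CSV step looks like, here is a minimal sketch using Python's built-in `csv` module. It is not the actual program from this weekend: the rows, the column names (`issue`, `animal`, `region`), and the helper functions `write_rows` and `read_rows` are all made-up placeholders, and the sketch assumes the records have already been pulled out of the PDFs as a list of dictionaries.

```python
import csv
import io

# Placeholder rows standing in for records already extracted from the PDFs.
# The column names and values are invented for illustration only.
rows = [
    {"issue": "issue-1", "animal": "animal-a", "region": "region-x"},
    {"issue": "issue-2", "animal": "animal-b", "region": "region-y"},
]


def write_rows(handle, records):
    """Write a list of dicts to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)


def read_rows(handle):
    """Read CSV text from an open handle back into a list of dicts."""
    return list(csv.DictReader(handle))


# An in-memory buffer keeps the sketch self-contained; a real script would
# use open("some_file.csv", "w", newline="") instead.
buffer = io.StringIO(newline="")
write_rows(buffer, rows)

buffer.seek(0)
round_trip = read_rows(buffer)
```

Reading the file back right after writing it is a quick way to check that nothing was lost before moving on to cleaning in OpenRefine.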
The next step for me personally is to help with the data visualization side of things. Once we georeference the data, I will explore it in ArcGIS and Flourish to get ideas about what types of visualizations would be useful. I do not have much experience with these tools, so this work will give me an opportunity to develop new skills.