This isn’t part of my product journal series (if you’re interested, feel free to check it out), but I wanted to share my journey of leveraging ChatGPT to create data visualizations. I teach data visualization at The New School, and common feedback I receive from my students and colleagues is:
Data visualization is cool, but at the same time it’s a bit daunting that I need to know so many tech stacks to actually implement it.
I totally agree. Even when I was studying data visualization, I spent a substantial amount of time learning how to code, handle web hosting, and work with Python, SQL, and more, all while absorbing knowledge of information visualization.
Thankfully, we no longer need to fight our way past every technical gatekeeper in this field. This doesn’t mean that technical knowledge is not valuable, but rather that we no longer need to be intimidated by technology because AI can spoon-feed us knowledge and do the heavy lifting for us. Are you excited? Let’s get started!
I’m going to rebuild a data visualization that one of my students posted in a weekly write-up assignment.
1. Find data source
Even finding data can be pleasant with ChatGPT.
It’s always crucial to credit the original source, and I’m glad my student did so for the original publisher of the data visualization.
If you follow the link, you will see documentation on how the visualization was built, as well as a link to the original source of the data (NYC Open Data). Great!
You can download the data from NYC Open Data by exporting it in CSV format, as shown below.
Once you download and open the data, you will be able to see how the table is structured.
The latitude and longitude coordinates sit together in a single column, which makes them a bit hard to use, so we need to massage the table. How? You guessed it: ChatGPT.
2. Data processing
Fear not, we have ChatGPT which will guide us. Let me walk you through the process assuming you know nothing about it.
Ask ChatGPT the right question
What I’ve noticed is that you have to be extremely specific about what you’re looking for. ChatGPT isn’t a mind reader, so you shouldn’t expect it to understand vague or unclear questions. The less specific you are with your query, the more follow-up work you’ll need to do, and that’s not ideal. I want ChatGPT to do the work for me and provide me with the final answer right away.
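To give a feel for the kind of answer a specific prompt can produce, here is a minimal sketch of splitting a combined coordinate column into separate latitude and longitude fields. It is an illustration, not the code ChatGPT actually returned: the column name `Location`, the "(lat, lon)" text format, and the sample rows are all assumptions, and it sticks to Python's standard library so it runs without installing anything.

```python
import csv
import io
import re

# Assumption: the combined column is named "Location" and holds text such as
# "(40.7128, -74.0060)". Both the name and the format are hypothetical.
COORD_PATTERN = re.compile(
    r"\(?\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)?"
)


def split_coordinates(rows, column="Location"):
    """Return copies of the rows with separate 'lat' and 'lon' float fields.

    Rows whose coordinate cell is empty or malformed are skipped.
    """
    cleaned = []
    for row in rows:
        match = COORD_PATTERN.fullmatch((row.get(column) or "").strip())
        if match is None:
            continue
        new_row = dict(row)
        new_row["lat"] = float(match.group(1))
        new_row["lon"] = float(match.group(2))
        cleaned.append(new_row)
    return cleaned


# A tiny in-memory stand-in for the downloaded CSV file.
sample_csv = io.StringIO(
    "Name,Location\n"
    'Site A,"(40.7128, -74.0060)"\n'
    "Site B,\n"
    'Site C,"(40.6782, -73.9442)"\n'
)
rows = split_coordinates(csv.DictReader(sample_csv))
for row in rows:
    print(row["Name"], row["lat"], row["lon"])
```

With a real download, you would swap the in-memory sample for the opened CSV file and adjust the column name to match your table.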
Boom, you’ve got the code! But that’s not the end of the story. Do you know how to run it? If you’re not sure, don’t worry, ChatGPT can help you figure it out!
Don’t be shy. You can just drop a question as below.
Great, now you know how to run Python code! Personally, I prefer Jupyter Notebook, which has a pretty name. Let’s keep asking some more dummy questions!
Cool! Let’s get Anaconda by clicking on the link generated by ChatGPT, or simply searching for ‘Anaconda’ on Google. Once you’ve installed and opened Anaconda, you won’t have any trouble finding the Jupyter Notebook icon — you just need to have decent eyesight for this task!
Now we are SO READY to write code. Let me correct myself: now we are SO READY to copy and paste the code written by ChatGPT.
Click on the ‘New’ button and select ‘Python 3’. Python 3 is the current major version of the Python programming language, and in a default Anaconda setup it is usually the only option listed. Paste your code into the notebook and hit the play button to run it.
Ouch! It doesn’t work. How do I fix it? You guessed it, ChatGPT. Just paste the error message you got.
It looks like the Python code is in a separate file, and your CSV file isn’t in the same location. Let’s move the CSV file to the same folder as the Python file. First, let’s save the Python code. After saving the code with the name ‘MyCode’, you should see the file saved in the following screen. It appears that the file was saved in the outermost folder on your Mac. (By the way, ‘ipynb’ is just the file extension for a Jupyter Notebook file — in this case, it’s your code.)
Based on ChatGPT’s advice, let’s move the CSV file to the same location as the Python file. That way, we’ll be able to work with both files more easily.
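If you are curious why moving the file helps, this minimal sketch shows the idea: a notebook looks up a plain file name relative to the folder it is running in. The file name `my_data.csv` and the helper `locate_csv` are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def locate_csv(filename, folder=None):
    """Return the full path of `filename` inside `folder`, or None if absent.

    `folder` defaults to the current working directory, which is where a
    relative file name is looked up when a notebook opens it.
    """
    base = Path(folder) if folder is not None else Path.cwd()
    candidate = base / filename
    return candidate.resolve() if candidate.is_file() else None


# "my_data.csv" is a hypothetical file name; substitute your own download.
print("Notebook is running in:", Path.cwd())
found = locate_csv("my_data.csv")
if found is None:
    print("my_data.csv is not here; move it into the folder shown above.")
else:
    print("Found the file at:", found)
```

Running this in a notebook cell before loading the data tells you which folder the CSV needs to be in.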
Let’s run the code again, since I’m now confident that it will work.
Again? I was overconfident. Let’s just copy and paste the error message into ChatGPT.
ChatGPT explained something, but it was a bit too much and I didn’t want to dig into it, so I just typed in… “Please write a new code to solve the problems you mentioned”.
Voilà! Done. Let’s download the result. How? You guessed it: ChatGPT.
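For reference, saving processed rows back to CSV can look roughly like the sketch below. The rows and field names are made up, and `write_rows` is a hypothetical helper, not the code ChatGPT produced.

```python
import csv
import io


def write_rows(rows, handle):
    """Write a list of dict rows to an open text handle as CSV.

    Returns the number of data rows written.
    """
    if not rows:
        return 0
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Hypothetical processed rows with separate coordinate fields.
processed = [
    {"Name": "Site A", "lat": 40.7128, "lon": -74.006},
    {"Name": "Site C", "lat": 40.6782, "lon": -73.9442},
]

# Writing to an in-memory buffer keeps the example side-effect free.
buffer = io.StringIO()
count = write_rows(processed, buffer)
print(buffer.getvalue())
```

To produce a real file, pass a handle from `open("processed.csv", "w", newline="")` instead of the buffer; the file then appears next to your notebook, where you can download it.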
Life can’t be this easy, but that’s exactly what it is with ChatGPT. Now, how do I get borough information for each set of coordinates? Once again, the answer is ChatGPT!
ChatGPT provided me with a link to download the necessary data, and I downloaded the GeoJSON file as it recommended.
I got an error and felt too lazy to figure it out.
I honestly didn’t expect to get this far, but with ChatGPT’s help, I finally got the result I was looking for! It’s worth noting that I encountered several different errors along the way, but all I had to do was copy and paste the error messages and ask ChatGPT to regenerate the code to fix the problem.
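For the curious, the borough lookup boils down to a point-in-polygon test: for each coordinate, find the boundary shape that contains it. Below is a minimal sketch using the classic ray-casting method and only the standard library. It is not the code ChatGPT generated (a library such as GeoPandas is a common alternative), the property name `boro_name` is an assumption about the boundary file, and the two toy squares merely stand in for real borough shapes.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside a closed ring of [lon, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # Toggle for every edge that a rightward horizontal ray crosses.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def point_in_polygon(lon, lat, polygon):
    """A GeoJSON polygon is an outer ring followed by optional holes."""
    outer, holes = polygon[0], polygon[1:]
    if not point_in_ring(lon, lat, outer):
        return False
    return not any(point_in_ring(lon, lat, hole) for hole in holes)


def find_borough(lon, lat, feature_collection, name_key="boro_name"):
    """Return the name of the first feature containing the point, else None."""
    for feature in feature_collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] == "Polygon":
            polygons = [geometry["coordinates"]]
        elif geometry["type"] == "MultiPolygon":
            polygons = geometry["coordinates"]
        else:
            continue
        if any(point_in_polygon(lon, lat, p) for p in polygons):
            return feature["properties"].get(name_key)
    return None


# Toy boundaries standing in for a real borough file (hypothetical names).
toy_boundaries = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"boro_name": "West Box"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
        {"type": "Feature", "properties": {"boro_name": "East Box"},
         "geometry": {"type": "MultiPolygon",
                      "coordinates": [[[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]]]}},
    ],
}

print(find_borough(0.5, 0.5, toy_boundaries))
print(find_borough(1.5, 0.5, toy_boundaries))
print(find_borough(5.0, 5.0, toy_boundaries))
```

Note that GeoJSON stores each point as longitude first, then latitude, which is an easy place to slip up when the source table lists latitude first. With a real file, you would load the boundaries with `json.load` and call `find_borough` once per row.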
3. Decide visualization method
Now only the visualization part is left. Shall we write the code? No.
Let’s ask, then copy and paste. Let’s start with the fundamental question.
Seeking a consulting solution from ChatGPT
Now I can start to ask several questions based on the answers.
- A map API and D3.js
- A CSV file, which I already have
- A code editor
- A browser, which I already have
4. Implementation of design
Ask ChatGPT to generate the backbone HTML file for the data visualization.
The program indicates that I need a Mapbox access token (‘mapboxgl.accessToken’) to proceed. Since this is private information, I cannot share it here. However, you can generate one by signing up for Mapbox.
Data Visualization solely made by ChatGPT
Take a look at the final result below. Isn’t it amazing to have a custom data visualization piece without having to write a single line of code?
If you’ve made it this far with me, you should be proud of yourself! Before I wrap up this article, let’s summarize the workflow and compare what it would have been like without ChatGPT.
Find data source
Without ChatGPT, finding the data you need can be a painful process. You have to try different search words and sift through the search results that Google presents to you. With ChatGPT, it’s a different game altogether. You simply type in what you want and ask for the link.
Data processing
This is often the most intimidating part for those of us without a background in Python or other data processing platforms. In most cases, the data you find online won’t be in the exact format you need. That’s where tools like Python and SQL come in handy: you can use them to process the data and extract the information you need.
Decide visualization method
After preparing the data, the next step is to decide how you want to display it and what tools you want to use to accomplish that. This can require a decent amount of technical knowledge and familiarity. However, with ChatGPT, you can get clear guidelines and an implementation plan to help you navigate this process.
Implementation of design
This can be the biggest hurdle to overcome and I would call it the final gatekeeper of the world of data visualization. However, you can easily tackle this final boss by leveraging the power of AI and move forward with ease.
At the very beginning of this post, I casually mentioned:
This doesn’t mean that technical knowledge is not valuable, but rather that we no longer need to be intimidated by technology because AI can spoon-feed us knowledge and do the heavy lifting for us.
Ironically, this is the most important lesson I took from the entire process of creating a visualization without writing the code myself.
While you can create stunning visualizations with ChatGPT, knowing how to code opens up even more possibilities. When you encounter an error, the debugging process can be exhausting if ChatGPT is unable to identify the cause immediately. Additionally, ChatGPT does not retain memory of previous conversations, so it is ideal to pair ChatGPT with your own coding knowledge to streamline the process.
I want to encourage my audience to use this as a starting point to become more interested in data visualization and coding.
NOTE: A version of this article was originally published on Medium.
Soonk Paik is a Senior Consultant at Deloitte and also teaches Data Visualization at Parsons School of Design at The New School. Data Visualization is not only his profession, but also his passion.