Task 1: Code conversion
First up, I took a recent real-life project to see if AI tools could have provided time savings. My team had decommissioned a legacy analytics platform. The platform previously used Scala data cleaning scripts for experience study analytics. Now I needed to translate more than 1,000 lines of Scala code into SQL for the new analytics platform.
I turned that first task over to ChatGPT and asked it to provide the SQL equivalent of the Scala code. ChatGPT’s response was impressive. It provided the SQL code and even included some helpful explanations about minor adjustments the user might need to make. ChatGPT did omit an important “create table” command. Otherwise, the output provided exactly what I needed.
To take advantage of ChatGPT’s generative strengths, I dropped in the next chunk of Scala code in my process and asked ChatGPT to convert this snippet by building off the prior step. Again, the chatbot performed well. Surprisingly, it referenced the code snippet from my first prompt and built a nested subquery to consolidate the two pieces of code into one. Sadly, the “create table” command was again left out.
I wrapped up the exercise by asking ChatGPT to rebuild the last query as a common table expression (CTE). The bot demonstrated an understanding of a CTE, but its CTE script was poorly written and would make the code less efficient. And once again, it dropped the “create table” command.
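For readers who have not met the two query shapes mentioned above, here is a minimal sketch of a nested subquery and its CTE equivalent, each wrapped in the kind of “create table” statement the chatbot kept leaving out. It runs against an in-memory SQLite database from Python. The table names, column names, and cleaning rule are hypothetical stand-ins, not the project's actual code.

```python
import sqlite3

# In-memory database with a small, made-up input table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_claims (policy_id TEXT, amount REAL);
    INSERT INTO raw_claims VALUES ('P1', 100.0), ('P1', -5.0), ('P2', 40.0);
""")

# Version 1: the cleaning step is a nested subquery inside the aggregation.
conn.execute("""
    CREATE TABLE totals_nested AS
    SELECT policy_id, SUM(amount) AS total
    FROM (SELECT policy_id, amount FROM raw_claims WHERE amount > 0)
    GROUP BY policy_id
""")

# Version 2: the same cleaning step, named up front as a CTE.
conn.execute("""
    CREATE TABLE totals_cte AS
    WITH cleaned AS (
        SELECT policy_id, amount FROM raw_claims WHERE amount > 0
    )
    SELECT policy_id, SUM(amount) AS total
    FROM cleaned
    GROUP BY policy_id
""")

nested_rows = conn.execute(
    "SELECT policy_id, total FROM totals_nested ORDER BY policy_id"
).fetchall()
cte_rows = conn.execute(
    "SELECT policy_id, total FROM totals_cte ORDER BY policy_id"
).fetchall()
```

Both tables hold the same rows. Without the CREATE TABLE ... AS wrapper, each query would return results but leave nothing behind for the next step to read.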
Our other test subject, GitHub Copilot, is built for exactly this type of code-intensive problem. It handled both the code translation and the CTE perfectly. Additionally, the convenience of having the translation occur within the coding platform made this tool ideal for improving the efficiency of my project.
Final grades for Task 1: ChatGPT: B+ | GitHub Copilot: A
Task 2: Computing life expectancy in R on a dataset
Next, I asked the AI tools to tackle a basic actuarial calculation: converting mortality rates into predicted life expectancies. This is a function our team uses to evaluate the mortality curves generated by our models and test them for reasonableness.
Here was my prompt for ChatGPT: “Create a data frame in R with given parameters. Then create a function that calculates the life expectancy from that data frame.” ChatGPT generated code that produced a data frame, but one of its formulas contained too many values. The code could not run. Furthermore, its calculation of life expectancy was erroneous.
I then provided ChatGPT with some supplemental information that clearly spelled out my definition of “life expectancy”: “Given the estimated life expectancy is the sum of the cumulative probability of survival at each age plus a half a year, update the function accordingly.” That extra bit of information helped. ChatGPT correctly calculated the life expectancy. In doing so, it included some unnecessary code, which I flagged and removed. The bot’s response also used an interesting function in R called “rev,” which was new to me.
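The definition in that follow-up prompt can be sketched in a few lines. The example below is in Python rather than the R used in the test, and it is an illustration of the stated definition, not the code ChatGPT produced. It assumes the input is a list of annual mortality rates ordered by age, starting at the current age; the function name and sample rates are made up.

```python
from itertools import accumulate
from operator import mul


def life_expectancy(mortality_rates):
    """Sum of cumulative survival probabilities plus half a year.

    mortality_rates: annual probabilities of death, ordered by age,
    starting at the current age (an assumption of this sketch).
    """
    # Probability of surviving each individual year.
    annual_survival = [1.0 - q for q in mortality_rates]
    # Running product: probability of surviving through the end of each year.
    cumulative_survival = accumulate(annual_survival, mul)
    # The extra half year approximates deaths occurring mid-year on average.
    return sum(cumulative_survival) + 0.5


# Made-up rates: 50% chance of dying in year one, certain death in year two.
example = life_expectancy([0.5, 1.0])
```

With these made-up rates, the cumulative survival probabilities are 0.5 and 0.0, so the function returns 0.5 + 0.5 = 1.0 years.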
Finally, I asked ChatGPT to update the function to calculate life expectancies for each unique ID within the data frame. After some minor edits, the bot’s code output worked well. Again, its answer gave me the opportunity to learn a new function in R, called “by,” which substantially improved the runtime of my calculations.
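The per-ID step amounts to splitting the rows by ID and applying the same calculation to each group, which is the role R's “by” plays. A rough Python equivalent is sketched below; the row layout (ID, age, mortality rate) and the function names are assumptions for illustration, not the structure of the actual data frame.

```python
from collections import defaultdict
from itertools import accumulate
from operator import mul


def life_expectancy(mortality_rates):
    """Sum of cumulative survival probabilities plus half a year."""
    cumulative_survival = accumulate((1.0 - q for q in mortality_rates), mul)
    return sum(cumulative_survival) + 0.5


def life_expectancy_by_id(rows):
    """rows: iterable of (record_id, age, mortality_rate) tuples (assumed layout)."""
    grouped = defaultdict(list)
    for record_id, age, rate in rows:
        grouped[record_id].append((age, rate))
    # Sort each group by age so the survival product runs in the right order.
    return {
        record_id: life_expectancy([rate for _, rate in sorted(pairs)])
        for record_id, pairs in grouped.items()
    }


# Made-up rows, deliberately out of age order for ID "A".
rows = [("A", 91, 1.0), ("A", 90, 0.5), ("B", 90, 0.0), ("B", 91, 1.0)]
result = life_expectancy_by_id(rows)
```

Here `result` maps "A" to 1.0 and "B" to 1.5, one life expectancy per unique ID.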
GitHub Copilot also stumbled on its first attempt at this project. Its output code was incorrect, and, surprisingly, incorrect in a different way than ChatGPT’s first attempt. After looking more closely at Copilot’s output, I figured out that the bot was assuming mortality rates are constant over time and across all ages. If that assumption were true, then Copilot’s output would have been correct.
I updated Copilot’s code to accurately calculate life expectancy. But the next output calculated the probability of survival for a given year, instead of cumulative survival. Once again, I had to jump in to correct the code. In the time it took me to correct Copilot’s approach, I could have coded this project myself.
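The distinction between survival for a given year and cumulative survival is easy to miss, so here is a small Python sketch of the two quantities. The mortality rates are made up, and this is not a reconstruction of Copilot's code.

```python
from itertools import accumulate
from operator import mul

# Made-up annual mortality rates for three consecutive ages.
mortality_rates = [0.25, 0.5, 0.5]

# Survival for a given year: chance of living through that one year,
# conditional on being alive at its start.
annual_survival = [1.0 - q for q in mortality_rates]

# Cumulative survival: chance of living through every year up to and
# including that one, i.e. the running product of the annual values.
cumulative_survival = list(accumulate(annual_survival, mul))

sum_of_annual = sum(annual_survival)
sum_of_cumulative = sum(cumulative_survival)
```

With these rates, the annual values are 0.75, 0.5, and 0.5, while the cumulative values are 0.75, 0.375, and 0.1875. Summing the annual values gives 1.75 against 1.3125 for the cumulative ones, so a life expectancy built on the annual figures would be overstated.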
Final grades for Task 2: ChatGPT: B | GitHub Copilot: C