How to Use ChatGPT as a Data Scientist?

Introduction
Are you a data scientist looking for an exciting and informative read? Look no further, because I’ve got a treat for you! My latest blog post is packed with fun and innovative experiments that I performed with ChatGPT over the weekend. In these experiments, I put ChatGPT to the test and challenged it to generate the solution to a Data Science problem automatically. Join me as we dive into the nitty-gritty of how we created the prompts to achieve our desired outcome, and see for yourself just how accurate the solutions were. Come, let’s learn how to use ChatGPT prompts as a Data Scientist.
From code to completion, ChatGPT makes Data Science projects a breeze!
Overview of the Experiments
I’ll run through 2 different experiments. In the first experiment, I want to see if ChatGPT can help me with the code for building a machine learning model on a specific dataset. We will also run the code in a Jupyter notebook to see if it’s accurate or not. And in the second experiment, we will take the learnings of experiment 1 and redesign the prompts for the desired outcomes. Broadly, we will evaluate the following points-
- Can ChatGPT create spam-free and flawless AI content?
- Want to automate your coding with ChatGPT’s dataset-specific code generation?
- Understand how to master the art of ChatGPT, with tips to achieve the desired results with precise prompts.
Experiment 1: ChatGPT for Data Science!
Let’s start the first experiment now.
I’ll consider the Black Friday Sales dataset. The dataset is publicly available for download. It contains the customer transactions of a retail store, including customer demographics, product details, and total purchase amount. The company wants to understand customer purchase behavior for personalization. So, the ask is to build a machine learning model to predict the purchase amount based on the customer demographics and past products purchased.
In the first prompt, I’m going to tell ChatGPT about the dataset and what it contains.
Prompt 1
You are provided with the dataset of the retail store containing customer transactions. Each row contains customer demographics, product details, and the total purchase amount from last month. The sample dataset is given below.
Now, ChatGPT responds by requesting the dataset. In the next prompt, I’ll provide a sample of the Black Friday sales dataset.
Note: You can neither upload datasets directly to ChatGPT nor copy-paste the entire dataset.
So, we will copy and paste around 100-150 rows from the dataset.
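One way to prepare such a sample is to cut the CSV down to its header plus a limited number of rows before pasting it into the prompt. The following is a minimal sketch using only the Python standard library. The helper name sample_csv_rows and the tiny inline CSV are hypothetical stand-ins for the real file, not something taken from the experiment itself.

```python
import csv
import io
import random


def sample_csv_rows(csv_text: str, n_rows: int, seed: int = 0) -> str:
    """Return the header plus up to n_rows randomly chosen data rows as CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    picked = rng.sample(data, k=min(n_rows, len(data)))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(picked)
    return out.getvalue()


# Hypothetical stand-in for the full dataset file (a few columns only).
full_csv = (
    "User_ID,Product_ID,Gender,Age,Purchase\n"
    "1005915,P00372445,M,18-25,371\n"
    "1005916,P00370853,M,51-55,24\n"
    "1005918,P00370853,M,26-35,12\n"
    "1005919,P00370853,M,18-25,48\n"
)

# For the real dataset, n_rows would be somewhere between 100 and 150.
prompt_sample = sample_csv_rows(full_csv, n_rows=2)
print(prompt_sample)
```

The printed text can be pasted as-is into the next prompt. Keeping the header row matters, because ChatGPT only learns the column names from it.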
Prompt 2
User_ID,Product_ID,Gender,Age,Occupation,City_Category,Stay_In_Current_City_Years,Marital_Status,Product_Category_1,Product_Category_2,Product_Category_3,Purchase
1005915,P00372445,M,18-25,4,C,0,0,20,,,371
1005916,P00370853,M,51-55,20,B,1,1,19,,,24
1005918,P00370853,M,26-35,12,A,3,1,19,,,12
1005919,P00370853,M,18-25,0,C,0,0,19,,,48
1005920,P00375436,F,26-35,1,C,2,0,20,,,244
1005922,P00370853,M,55+,3,C,3,0,19,,,12
1005923,P00371644,M,26-35,7,C,1,1,20,,,129
1005924,P00370293,M,36-45,0,B,0,1,19,,,49
1005925,P00371644,F,26-35,0,C,1,1,20,,,592
1005927,P00372445,M,36-45,14,B,4+,1,20,,,358
1005929,P00370853,F,36-45,0,C,2,0,19,,,50
1005931,P00372445,F,18-25,7,A,3,0,20,,,129
1005932,P00371644,M,18-25,14,C,3,0,20,,,131
1005933,P00375436,M,26-35,2,C,3,1,20,,,364
Now, let’s ask ChatGPT to write code for building a model to predict the target variable “Purchase”.
Prompt 3
I want you to act as a data scientist and write code for me. Please build a machine learning model to predict the Purchase variable from the above dataset.
ChatGPT provided us with the code for building the machine learning model. We will run the code in the Jupyter notebook and see if it works.
The code throws an error.
ChatGPT missed a few data preprocessing steps-
- There are categorical variables in the dataset. ChatGPT didn’t include the code for dealing with them.
- ChatGPT didn’t handle the missing values present in the dataset.
- ChatGPT didn’t drop the unnecessary columns like User ID and Product ID.
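For reference, the three missing steps can be sketched in a few lines of pandas. This is an illustrative sketch and not the code ChatGPT produced. Filling the empty product categories with 0 and one-hot encoding the text columns are assumptions, and other choices would be equally valid.

```python
import io

import pandas as pd

# A few rows from the sample above, inlined so the sketch is self-contained.
sample_csv = """User_ID,Product_ID,Gender,Age,Occupation,City_Category,Stay_In_Current_City_Years,Marital_Status,Product_Category_1,Product_Category_2,Product_Category_3,Purchase
1005915,P00372445,M,18-25,4,C,0,0,20,,,371
1005916,P00370853,M,51-55,20,B,1,1,19,,,24
1005918,P00370853,M,26-35,12,A,3,1,19,,,12
1005927,P00372445,M,36-45,14,B,4+,1,20,,,358
1005929,P00370853,F,36-45,0,C,2,0,19,,,50
"""
df = pd.read_csv(io.StringIO(sample_csv))

# 1. Drop the identifier columns.
df = df.drop(columns=["User_ID", "Product_ID"])

# 2. Fill missing product categories with 0, read as "no category" (an assumption).
missing_cols = ["Product_Category_2", "Product_Category_3"]
df[missing_cols] = df[missing_cols].fillna(0)

# 3. One-hot encode the text-valued columns.
categorical_cols = ["Gender", "Age", "City_Category", "Stay_In_Current_City_Years"]
df = pd.get_dummies(df, columns=categorical_cols)

# Separate the features from the target variable.
X = df.drop(columns=["Purchase"])
y = df["Purchase"]
print(X.dtypes)
```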
Now, in the next prompt, let me ask ChatGPT to update the data preprocessing steps in the code without explicitly mentioning which steps to perform. Let’s find out if it can do it.
Prompt 4
The above code is incomplete. Update the above code with the required data preprocessing steps depending on the provided dataset.
The updated code still throws an error.
As expected, it included the code for missing value imputation and handling categorical variables, but it missed encoding the product ID and user ID columns.
Let’s ask ChatGPT to encode the product ID and user ID columns in the next prompt.
Prompt 5
The above code gives an error. You missed encoding the user id and product id columns.
The code throws an error again. It encoded the product ID and user ID into new columns but didn’t drop the original columns. This is an example of the glitchy content generated by ChatGPT.
Let’s prompt ChatGPT to revise the code.
Prompt 6
You are wrong. The above code still throws an error.
ChatGPT responds by asking for the error. Let’s copy and paste the error we got while running the code. This will be our next prompt.
Prompt 7
ValueError: could not convert string to float: 'P00233842'.
Is anything wrong with the code? ChatGPT has now missed encoding the rest of the categorical columns. This is glitchy and flawed content. It was expected to keep encoding the rest of the categorical columns, since it had encoded them earlier. While fixing the encoding of the product ID and user ID, it left out the other columns.
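The error itself is easy to reproduce outside the model: any raw ID string that reaches a numeric conversion fails in the same way. The short sketch below shows the failure and one possible remedy, a simple integer (label) encoding. The list of IDs is made up for illustration.

```python
# Made-up list of raw product IDs, standing in for the Product_ID column.
product_ids = ["P00233842", "P00370853", "P00233842"]

# Reproduce the failure: a product ID string cannot be parsed as a float.
try:
    float(product_ids[0])
except ValueError as err:
    error_message = str(err)
    print(error_message)

# One possible fix: map each distinct ID to an integer code (label encoding).
codes = {pid: i for i, pid in enumerate(sorted(set(product_ids)))}
encoded = [codes[pid] for pid in product_ids]
print(encoded)
```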
Now, let’s ask ChatGPT to encode the rest of the categorical variables.
Prompt 8
You missed encoding the rest of the categorical columns. Update the code.
This time, it provided me with all the data preprocessing steps required. Let’s run it in the notebook. It still throws an error. Let’s ask ChatGPT to fix it. Hopefully this is our last prompt.
Prompt 9
Update the code. The code throws TypeError: Feature names are only supported if all input features have string names, but your input has ['int', 'str'] as feature name / column name types
Finally, we got error-free code.
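The TypeError from Prompt 9 typically appears when a DataFrame's column labels mix integers and strings. A common remedy is to cast every label to a string before fitting. The sketch below uses a hypothetical three-column frame and is not the fix ChatGPT returned.

```python
import pandas as pd

# Hypothetical frame whose column labels mix integers and strings, as can
# happen after joining encoder output with the original columns.
features = pd.DataFrame({0: [1.0, 0.0], 1: [0.0, 1.0], "Occupation": [4, 20]})
print([type(c).__name__ for c in features.columns])

# Casting every label to a string gives uniform feature names.
features.columns = features.columns.astype(str)
print([type(c).__name__ for c in features.columns])
```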
Experiment 2: Data Science Prompts for ChatGPT
A couple of learnings from the first experiment are that
- Always provide detailed prompts to achieve the desired results.
- Tell ChatGPT to fix the code if it’s wrong. It can fix its own code.
Now, we will start experiment 2 with our learnings.
Prompt 1
You are provided with the dataset of the retail store containing customer transactions. Each row contains customer demographics, product details, and the total purchase amount from last month. The sample dataset is given below.
Prompt 2
User_ID,Product_ID,Gender,Age,Occupation,City_Category,Stay_In_Current_City_Years,Marital_Status,Product_Category_1,Product_Category_2,Product_Category_3,Purchase
1005915,P00372445,M,18-25,4,C,0,0,20,,,371
1005916,P00370853,M,51-55,20,B,1,1,19,,,24
1005918,P00370853,M,26-35,12,A,3,1,19,,,12
1005919,P00370853,M,18-25,0,C,0,0,19,,,48
1005920,P00375436,F,26-35,1,C,2,0,20,,,244
1005922,P00370853,M,55+,3,C,3,0,19,,,12
1005923,P00371644,M,26-35,7,C,1,1,20,,,129
1005924,P00370293,M,36-45,0,B,0,1,19,,,49
1005925,P00371644,F,26-35,0,C,1,1,20,,,592
1005927,P00372445,M,36-45,14,B,4+,1,20,,,358
1005929,P00370853,F,36-45,0,C,2,0,19,,,50
1005931,P00372445,F,18-25,7,A,3,0,20,,,129
1005932,P00371644,M,18-25,14,C,3,0,20,,,131
1005933,P00375436,M,26-35,2,C,3,1,20,,,364
Prompt 3
I want you to act as a data scientist and write code for me. Please build a machine learning model to predict the Purchase variable from the above dataset. Include data preprocessing steps like dropping unnecessary ID columns, encoding categorical variables, handling missing values, etc.
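A response to a prompt like this might resemble the following scikit-learn pipeline. It is a sketch of one reasonable shape for the answer, not ChatGPT's actual output. The random forest, the zero fill for missing categories, and the one-hot encoding are all assumptions.

```python
import io

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# A few rows from the sample dataset, inlined so the sketch is self-contained.
sample_csv = """User_ID,Product_ID,Gender,Age,Occupation,City_Category,Stay_In_Current_City_Years,Marital_Status,Product_Category_1,Product_Category_2,Product_Category_3,Purchase
1005915,P00372445,M,18-25,4,C,0,0,20,,,371
1005916,P00370853,M,51-55,20,B,1,1,19,,,24
1005918,P00370853,M,26-35,12,A,3,1,19,,,12
1005920,P00375436,F,26-35,1,C,2,0,20,,,244
1005927,P00372445,M,36-45,14,B,4+,1,20,,,358
1005929,P00370853,F,36-45,0,C,2,0,19,,,50
"""
df = pd.read_csv(io.StringIO(sample_csv))

# Drop the ID columns and separate the target.
X = df.drop(columns=["User_ID", "Product_ID", "Purchase"])
y = df["Purchase"]

categorical = ["Gender", "Age", "City_Category", "Stay_In_Current_City_Years"]
X[categorical] = X[categorical].astype(str)  # treat these columns as labels
X = X.fillna(0)  # missing product categories become 0 (an assumption)

# One-hot encode the categorical columns; pass numeric columns through as-is.
preprocess = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",
)
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
])
model.fit(X, y)
predictions = model.predict(X)
print(predictions)
```

Bundling the encoder and the model in one Pipeline means the same preprocessing is applied automatically at prediction time.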
Prompt 4
Update the code to include model evaluation.
Another inappropriate and glitchy response from ChatGPT! It generated code for a classification problem, although this dataset poses a regression problem.
Prompt 5
The above code is incorrect. The given dataset is a regression problem.
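The difference matters because a regression target such as Purchase calls for error-based metrics rather than accuracy or a confusion matrix. Below is a minimal sketch of regression evaluation. The data is synthetic and generated inline as an assumption, so the numbers say nothing about the Black Friday dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption): 200 rows, 4 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 100 + 50 * X[:, 0] - 20 * X[:, 1] + rng.normal(scale=5, size=200)

# Hold out a quarter of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Regression metrics: root mean squared error, mean absolute error, R^2.
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
mae = float(mean_absolute_error(y_test, y_pred))
r2 = float(r2_score(y_test, y_pred))
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.3f}")
```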
Prompt 6
Update the code to include feature engineering. Keep the rest of the steps the same.
Prompt 7
Write code to tune the hyperparameters of the random forest. Use the smartest hyperparameter tuning technique to achieve the best results in less time.
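The prompt leaves the tuning technique open. Randomized search is one common way to trade a little accuracy for much less search time, and it is the technique sketched here. That choice is an assumption rather than what ChatGPT necessarily picked. The data is again synthetic and the parameter ranges are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = 100 + 50 * X[:, 0] - 20 * X[:, 1] + rng.normal(scale=5, size=150)

# Illustrative search space; real ranges depend on the dataset.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 5],
    "max_features": [0.5, 1.0],
}

# Try 5 random combinations with 3-fold cross-validation instead of the full grid.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
print(-search.best_score_)  # cross-validated RMSE of the best combination
```

The full grid here has 54 combinations, so sampling 5 of them cuts the number of model fits by roughly a factor of ten.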
Prompt 8
Write code to visualize the most important features.
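For a random forest, the usual starting point is the model's built-in feature_importances_ attribute. The sketch below ranks the features on synthetic data. The column names are borrowed from the dataset, but the values and the target relationship are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (an assumption); only the column names are real.
rng = np.random.default_rng(0)
feature_names = ["Occupation", "Marital_Status", "Product_Category_1", "Gender_M"]
X = pd.DataFrame(rng.normal(size=(200, 4)), columns=feature_names)

# In this invented target, Product_Category_1 dominates and Occupation adds a little.
y = 80 * X["Product_Category_1"] + 10 * X["Occupation"] + rng.normal(size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Rank features by the forest's impurity-based importance scores.
importances = pd.Series(model.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)
print(importances)
```

If matplotlib is installed, calling importances.plot.barh() turns the ranked Series into a horizontal bar chart.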
Prompt 9
I would like to explain the model results. Please write code to interpret the model results.
Prompt 10
Please write code to interpret the model results using lime.
Incredible! No programming is required. Coding just got a whole lot easier with ChatGPT.
Conclusion
In this article, we have seen how to use ChatGPT for Data Science. You can automate your dataset-specific coding with ChatGPT. But sometimes, ChatGPT can produce glitchy and flawed AI content. These are the times when you need to explicitly tell ChatGPT to fix and regenerate the content. It can correct its own mistakes when they are pointed out.
Finally, we understood the importance of the right prompts for getting the desired results from ChatGPT as a data scientist. We’ve also seen some of the most useful Data Science prompts.
That’s all for today. See you in the next blog.