Prologue

AI will probably most likely lead to the end of the world, but in the meantime, there’ll
be great companies.

(Altman, 2023)

Sam Altman is the chief executive and a co-founder of OpenAI, the company behind ChatGPT. Although he claimed to be joking in the above quote, a sense of foreboding about Artificial Intelligence is widespread across the population.

Identification of general or initial idea

ChatGPT is the most prominent of all Artificial Intelligence bots, and since its launch on 30th November 2022 it has both amazed and concerned educators around the globe. In an open letter to the Times, a group of headteachers led by Sir Anthony Seldon – head of Epsom College – said:

Schools are bewildered by the very fast rate of change in AI and seek secure guidance on the best way forward, but whose advice can we trust? We have no confidence that the large digital companies will be capable of regulating themselves in the interests of students, staff, and schools and in the past the government has not shown itself capable or willing to do so.

(TGN, 2023)

Key statistics

(Brandl, 2023)

  • Within one week of launch ChatGPT had one million users
  • The ChatGPT website has 1.6 billion visitors per month
  • The tool set a record for having the fastest-growing user base in history for a consumer application, gaining 1 million users in just 5 days.
  • 12% of the most used websites globally are already banning ChatGPT
  • Revenue predictions for ChatGPT are currently estimated at $200 million by the end of 2023 and $1 billion by the end of 2024.
  • Demographics of users – 62.52% are between 18 and 34, 66% male, 34% female.
  • Probably the most concerning statistic for educators: an average of 53% of people can’t tell that ChatGPT content was generated by an AI.

A Google search (14.10.23) for “concerns about AI in education” returns 271,000,000 results. Access to ChatGPT is banned in North Korea, Iran, China, Cuba and Syria, although the initial ban in Italy has now been lifted, and there are ever-growing calls by educational institutions to ban access.

However, our job as educators is to prepare our students for the workplace.

I am a 55-year-old man who sat my O Grades in 1985 and remember not being allowed to use a calculator in maths exams. When our students progress into the workplace, ChatGPT and other AI bots will be even more prevalent than they are now.

Lewin’s Action Research Cycle in this project

Planning

The project will investigate the pros and cons of ChatGPT with respect to plagiarism.

Action

Attempt to design a prototype AI plagiarism checker.

Evaluate

Evaluate the success or otherwise of the plagiarism checker.

With the above in mind, the research question we should like to attempt is:

data education in Colleges

Analysing the research

Mixed methods research combines elements of quantitative research and qualitative research … can help you gain a more complete picture than a standalone quantitative or qualitative study, as it integrates benefits of both methods. (George, 2021)

As discussed earlier, the benefit of a mixed methodology in research is a ‘best of both worlds’ analysis, and as such the research for this project will be a balance of both.

The research will be gathered as follows:

  1. Literature review
  2. Anonymous survey on social media with a target of 1000 responses
  3. In depth interviews with educators

How will the research be analysed

Literature review

Although ChatGPT was released on 30th November 2022 (Marr, 2023), less than a year ago, a quick Google search for concerns over ChatGPT returns a vast number of results, which is growing daily.

  • ChatGPT Google search 31.10.23 – 291,000,000 results
  • ChatGPT Google search 11.11.23 – 462,000,000 results
  • ChatGPT Google search 25.11.23 – 826,000,000 results

The fact that the number of results almost doubled in two weeks suggests real global apprehension over artificial intelligence. The Google searches will be continued on a fortnightly basis, with the results plotted on a time-series line graph (see analysis summary below). Further literature reviews will be completed, with the results and references detailed in the supporting document.
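The growth between searches can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, using only the three result counts listed above; the function name `growth_factors` is an illustrative choice and not part of the project's tooling.

```python
from datetime import date

# Result counts for the three ChatGPT Google searches listed above.
observations = [
    (date(2023, 10, 31), 291_000_000),
    (date(2023, 11, 11), 462_000_000),
    (date(2023, 11, 25), 826_000_000),
]


def growth_factors(points):
    """Return (days_elapsed, ratio) for each consecutive pair of observations."""
    factors = []
    for (d0, n0), (d1, n1) in zip(points, points[1:]):
        factors.append(((d1 - d0).days, n1 / n0))
    return factors


for days, ratio in growth_factors(observations):
    print(f"{days} days: x{ratio:.2f}")
# prints:
# 11 days: x1.59
# 14 days: x1.79
```

The second interval (14 days, a factor of about 1.79) is the one described as almost doubling. The same list of date and count pairs could be extended with each fortnightly search and passed to whichever charting tool is used for the line graph.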

Survey results

The large-scale survey will have a total of 9 questions, with a balance of quantitative and qualitative questions. Target responses – 1000.

Survey questions – Analysis type / visualisation

Data Types in Analysis

Nominal vs Ordinal vs Discrete vs Continuous data

With data classified into the two main areas of qualitative and quantitative, these are then further broken down into Nominal and Ordinal (qualitative) and Discrete and Continuous (quantitative). (Team, 2021)
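As a rough illustration of this classification, the four data types and their two parent groups could be represented as follows. This is a sketch only: the field names in `example_fields` are hypothetical and are not the project's actual survey questions.

```python
from enum import Enum


class DataType(Enum):
    NOMINAL = "nominal"        # categories with no natural order
    ORDINAL = "ordinal"        # categories with a natural order
    DISCRETE = "discrete"      # countable numeric values
    CONTINUOUS = "continuous"  # measured values on a continuum

    @property
    def family(self):
        """Return the parent group: qualitative or quantitative."""
        if self in (DataType.NOMINAL, DataType.ORDINAL):
            return "qualitative"
        return "quantitative"


# Hypothetical survey fields, for illustration only.
example_fields = {
    "vocational_area": DataType.NOMINAL,
    "agreement_rating": DataType.ORDINAL,
    "chatgpt_uses_per_week": DataType.DISCRETE,
    "hours_spent_marking": DataType.CONTINUOUS,
}

for name, dtype in example_fields.items():
    print(f"{name}: {dtype.value} ({dtype.family})")
```

Tagging each survey question with its data type at the design stage makes the later choice of visualisation a simple lookup rather than a case-by-case decision.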

Plotting the analysis

The survey responses will be visualised in real time. By using a combination of applications within Microsoft 365 (summary below), the data can be collected instantly, in a GDPR-compliant manner, and visualised within a website. This will allow survey respondents to see the overall results as they come in and to compare their own responses with them. Sharing the website, which will contain a link to the survey, over social media via LinkedIn, Twitter (X) and Facebook will encourage further respondents to take part.

Interviews

Four interviews are to be arranged:

  • Lecturer – Edinburgh College
  • Line manager – Edinburgh College
  • Vice Principal – Edinburgh College
  • Industry expert – owner digital marketing company

The interviews will be structured identically, i.e. with the same questions asked of each interviewee. This will allow their results to be compared, with the results added to a further dashboard within the Power BI report.

Analysis summary / choice of visualisations

  • Google searches will be plotted on a line graph, as this is the most appropriate way to plot continuous data over time (Oetting, 2020)
  • Discrete data should be plotted as distinct values using either a bar or column chart (Oetting, 2020)
  • Continuous data should be grouped into ranges and plotted as a histogram (i.e. a column chart with the spacings removed). The term histogram was first coined by statistician Karl Pearson in 1892 (Ioannidis, 2003)
  • Nominal data can be plotted using either a WordCloud or Packed Bubble (Emery, 2014)
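The choices above can be sketched as a simple lookup, together with the "grouped into ranges" step that a histogram requires. This is a minimal illustration in Python: the names `CHART_FOR` and `bin_values` and the sample values are assumptions made for the example, and the project itself plans to use Power BI rather than code for its charts.

```python
from collections import Counter

# Chart choices as listed in the summary above.
CHART_FOR = {
    "time series": "line graph",
    "discrete": "bar or column chart",
    "continuous": "histogram",
    "nominal": "word cloud or packed bubble",
}


def bin_values(values, width):
    """Group continuous values into equal-width ranges for a histogram.

    Returns a dict mapping (lower, upper) bounds to counts, ordered by
    lower bound. Each range includes its lower bound and excludes its
    upper bound.
    """
    counts = Counter((v // width) * width for v in values)
    return {(low, low + width): counts[low] for low in sorted(counts)}


# Hypothetical sample values, for illustration only.
sample = [3.5, 7.0, 12.2, 14.9, 18.0, 21.4]
print(CHART_FOR["continuous"], bin_values(sample, 10))
```

With the sample above, two values fall in the range 0 to 10, three in 10 to 20 and one in 20 to 30; these counts are the column heights of the histogram.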

Ethical considerations

The project research will take 3 forms:

  1. Literature review
  2. Anonymous survey on social media with a target of 1000 responses
  3. In depth interviews with educators

The 3-pronged approach to research will require ethical considerations throughout to ensure the minimisation of bias from beginning to end.

Any data collection project follows the stages below, and should ideally be a cyclical process (Anna, 2022).

The process of acquiring domain knowledge, gathering the data, cleaning said data, identifying KPIs, data modelling, and then visualisation should, in an ideal world, be repeated ad infinitum.

Though bias can never be completely eradicated, we can take steps to diminish its effects (Namit and Seiden, 2018).
Any project, whether a data collection exercise or not, has both time and budgetary constraints. In an ‘ideal’ world, as noted above, this cyclical process would be repeated ad infinitum, with bias further minimised at each iteration.
In this case, however, the biases will need to be minimised wherever possible within the time constraints of the project.

Literature review – bias minimisation and ethical considerations

Tim Berners-Lee developed the first version of HTML (Hyper-Text Markup Language), enabling the first website to be published on August 6th, 1991 (Nix, 2016). There are now over 2 billion live websites, with 576,000 new sites going live every day (Wise, 2022). The wealth of available research for any topic is overwhelming, particularly considering the prevalence of unreliable information: the most recent investigations suggest that 62% of the information on the internet is unreliable (Yaqub, 2022). Care must therefore be taken to corroborate the literature review by triangulating the research across reliable sources.

Anonymous survey – bias minimisation and ethical considerations

Step 2 in the research will involve an anonymised survey on various social media platforms, such as LinkedIn, Twitter (X) and Facebook.

One primary ethical concern in data collection is obtaining informed consent from individuals whose data is being collected. … Failure to obtain informed consent raises serious ethical questions about autonomy and privacy. (Emanuel et al., 2000)

It is normally expected that participants’ voluntary informed consent … will be obtained at the start of the study (BERA, 2018).

Privacy, anonymity, and confidentiality concerns will be managed by ensuring that the survey, created using Microsoft Forms, is anonymised. This can be done within the Forms settings (see image below).


In addition, in accordance with GDPR’s seven guiding principles (Information Commissioner’s Office, 2023), no personally identifying information will be gathered, either implicitly or explicitly.

The originally planned survey questions have been updated following a professional discussion with David Hiddleston (Curriculum Portfolio Manager – Data Education in Colleges) on November 15th, 2023 (Appendix). Following this, the 9 survey questions were to be a hybrid of qualitative and quantitative questions (Appendix). The quantitative questions are to ascertain the age, vocational area, and use of ChatGPT, with the qualitative investigating the respondents’ views on the pros and cons of the above.

In depth interviews with educators / industry

For comparison, the 4 planned interviews will be conducted identically.

  • Lecturer – Edinburgh College
  • Line manager – Edinburgh College
  • Vice Principal – Edinburgh College
  • Industry expert – owner digital marketing company

Prior to each interview, the interviewee will be asked to read and sign the Interview Consent Form (Appendix). A further meeting has been arranged with David Hiddleston (04.12.23) to discuss the planned questions and ensure bias minimisation by removing leading questions. The interview subjects have been chosen to provide views across the educational hierarchy and the workplace. The subjects are concentrated within education, since this is the main area of the project’s focus. However, the ‘net’ for the social media survey (Step 2) will be thrown as wide as possible, so including the views of an industry expert will provide a broader perspective on the use of, and concerns about, ChatGPT in general.

Reference List

Altman, S. (2023). Can you safely build something that may kill you? [online] Vox. Available at: https://www.vox.com/future-perfect/2023/5/24/23735698/openai-sam-altman-ai-safety-legislation-risks-development-regulation [Accessed 1 Nov. 2023].

Anna, H. (2022). Life Cycle of Data Science Project. [online] www.linkedin.com. Available at: https://www.linkedin.com/pulse/life-cycle-data-science-project-hareesh-anna/ [Accessed 24 Nov. 2023].

BERA (2018). Ethical Guidelines for Educational Research, Fourth Edition (2018). [online] Bera.ac.uk. Available at: https://www.bera.ac.uk/publication/ethical-guidelines-for-educational-research-2018-online [Accessed 19 Nov. 2023].

Brandl, R. (2023). ChatGPT Statistics and User Numbers 2023 – OpenAI Chatbot. [online] Tooltester. Available at: https://www.tooltester.com/en/blog/chatgpt-statistics/.

Adelman, C. (1993). Kurt Lewin and the Origins of Action Research. Educational Action Research, 1(1), pp.7-24. doi:https://doi.org/10.1080/0965079930010102.

Corn, H. (2023). Twitter-Sentiment-Analysis-about-ChatGPT. [online] Twitter-Sentiment-Analysis-about-ChatGPT. Available at: https://hxycorn.github.io/Twitter-Sentiment-Analysis-about-ChatGPT/ [Accessed 4 Nov. 2023].

Coursera (2023). Structured vs. Unstructured Data: What’s the Difference? [online] Coursera. Available at: https://www.coursera.org/articles/structured-vs-unstructured-data [Accessed 4 Nov. 2023].

Emanuel, E.J., Wendler, D. and Grady, C. (2000). What Makes Clinical Research Ethical? JAMA, [online] 283(20), p.2701. doi:https://doi.org/10.1001/jama.283.20.2701.

Emery, A.K. (2014). Depict Data Studio. [online] Depict Data Studio. Available at: https://depictdatastudio.com/how-to-visualize-qualitative-data/ [Accessed 10 Nov. 2023].

George, T. (2021). An Introduction to Mixed Methods Research. [online] Scribbr. Available at: https://www.scribbr.com/methodology/mixed-methods-research/ [Accessed 11 Nov. 2023].

Information Commissioner’s Office (2023). A guide to the data protection principles. [online] ico.org.uk. Available at: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/ [Accessed 24 Nov. 2023].

Ioannidis, Y. (2003). The History of Histograms (abridged). [online] Available at: https://www.vldb.org/conf/2003/papers/S02P01.pdf [Accessed 10 Nov. 2023].

Joshi, A., Kale, S., Chandel, S. and Pal, D.K. (2015). (PDF) Likert Scale: Explored and Explained. [online] ResearchGate. Available at: https://www.researchgate.net/publication/276394797_Likert_Scale_Explored_and_Explained [Accessed 4 Nov. 2023].

Kudrat (2015). Mixed Methods Research. [online] Academike. Available at: https://www.lawctopus.com/academike/mixed-methods-research/ [Accessed 3 Nov. 2023].

Marr, B. (2023). A Short History Of ChatGPT: How We Got To Where We Are Today. [online] Forbes. Available at: https://www.forbes.com/sites/bernardmarr/2023/05/19/a-short-history-of-chatgpt-how-we-got-to-where-we-are-today/ [Accessed 10 Nov. 2023].

Namit, K. and Seiden, J. (2018). Reducing data collection bias in education research. [online] blogs.worldbank.org. Available at: https://blogs.worldbank.org/education/reducing-data-collection-bias-education-research#:~:text=Though%20human%20bias%20can%20never%20be%20completely%20eradicated%2C [Accessed 20 Nov. 2023].

Nix, E. (2016). The World’s First Web Site. [online] HISTORY. Available at: https://www.history.com/news/the-worlds-first-web-site [Accessed 18 Nov. 2023].

Oetting, J. (2020). Data visualization 101: How to choose the right chart or graph for your data. [online] HubSpot. Available at: https://blog.hubspot.com/marketing/types-of-graphs-for-data-visualization [Accessed 11 Nov. 2023].

Read, M. (2019). Comparison Pros and Cons of Qualitative and Quantitative Research. [online] theintactone. Available at: https://theintactone.com/2019/03/03/brm-u2-topic-3-comparison-pros-and-cons-of-qualitative-and-quantitative-research/#:~:text=Comparison%20Pros%20and%20Cons%20of%20Qualitative%20and%20Quantitative [Accessed 4 Nov. 2023].

Stevens, E. (2022). Quantitative vs. Qualitative Data: What’s the Difference? [online] careerfoundry.com. Available at: https://careerfoundry.com/en/blog/data-analytics/difference-between-quantitative-and-qualitative-data/ [Accessed 3 Nov. 2023].

Team, G.L. (2021). 4 Types Of Data – Nominal, Ordinal, Discrete and Continuous – Great Learning. [online] GreatLearning Blog: Free Resources what Matters to shape your Career! Available at: https://www.mygreatlearning.com/blog/types-of-data/ [Accessed 8 Nov. 2023].

TGN, S. (2023). UK schools ‘bewildered’ by AI and do not trust tech firms, headteachers say. [online] Top Globe News. Available at: https://topglobenews.com/education/uk-schools-bewildered-by-ai-and-do-not-trust-tech-firms-headteachers-say/ [Accessed 2 Nov. 2023].

Wise, J. (2022). How Many Websites Are There in 2022? – EarthWeb. [online] EarthWeb. Available at: https://earthweb.com/how-many-websites-are-there/ [Accessed 23 Nov. 2023].

Yaqub, M. (2022). 62% Percent of unreliable information on the Internet in 2022. [online] BusinessDIT. Available at: https://www.businessdit.com/fake-news-statistics/ [Accessed 24 Nov. 2023].

Appendix