AI-Human Mutual Bias

We are researching how AI can affect your decisions and how you can affect AI's decisions.
Lucas He and William Lu
Grade 8

Presentation

No video provided

Problem

Currently, AI is used and offered for a plethora of things, such as learning, conversation, cheating on essays (don't do this), coding, and much more. However, today's AI is still quite limited: it only does what it is told to do, and it holds the same or less knowledge than the human population. People have most likely also noticed that AI often agrees with them in ways that don't necessarily make sense. For example, we went to ChatGPT and simply asked which flavour of flavoured ice it liked the most, and it answered Mango Lime. We then said we liked Strawberry Coconut and tried to convince ChatGPT to prefer Strawberry Coconut, and it did in fact change its answer. On its own this doesn't prove much, since the prompt was purely a matter of opinion, but we suspect the same effect applies to things more important than flavoured ice, such as politics and worldviews. Since AI is going to be a prominent tool in the near future, and will most likely be used for things such as criminal judgement, medicine, and other roles important to society, bias is not something AI can afford to have; otherwise it would impair its judgement severely.

This is why our project is about how the information and context humans give to AI influences its next actions, and how AI differentiates true from false information. We will also research how AI makes decisions and whether the context provided leads to the AI becoming biased. Finally, we want to see whether the bias appears because of the context given, or because the information fed to the AI beforehand was already biased.

Method

Surveys:

We want to know other people's experiences and views on this, so we are going to survey people about their experiences with AI and what they think of its bias. Since our topic is bias, we want as diverse a range of opinions as possible, so we plan to make the survey itself as unbiased as we can.

Personal Experiences:

We are also going to test the AI on more complex things, such as political bias and worldviews, and we will use several different AI systems, since the results may vary between them. We will be using weak/narrow AI, as it is the most easily accessible type (ChatGPT, for example). You may ask: what is a weak AI? Essentially, it describes most current AIs, which just follow what they are told to do. More will be explained later.

 

Terms:

AIaaS: Artificial Intelligence as a Service

Classification: Classification assigns each input to one of a set of discrete categories; in the binary case this is often written as 0 = no, 1 = yes. The algorithm classifies something as one class or another, but never both at once.

Regression: This means the result is a real number (whole or decimal). You usually have an independent variable and a dependent variable, and the algorithm uses the relationship between them to estimate a result. Regression is normally used for predictions, such as weather forecasts.

Clusters: Grouping of similar objects.
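
To make these three terms concrete, here is a minimal sketch using scikit-learn (our choice of library for illustration; the tiny datasets are made up):

    # Minimal sketch (assumes scikit-learn; the tiny datasets are invented)
    # of the three terms above: classification, regression, and clustering.
    from sklearn.linear_model import LogisticRegression, LinearRegression
    from sklearn.cluster import KMeans

    # Classification: every input gets exactly one discrete label (0 = no, 1 = yes).
    X_class = [[1], [2], [8], [9]]
    y_class = [0, 0, 1, 1]
    clf = LogisticRegression().fit(X_class, y_class)
    print(clf.predict([[7]]))        # -> [1]

    # Regression: the output is a real number estimated from the input.
    X_reg = [[1], [2], [3], [4]]
    y_reg = [2.0, 4.1, 5.9, 8.2]
    reg = LinearRegression().fit(X_reg, y_reg)
    print(reg.predict([[5]]))        # roughly [10.]

    # Clustering: group similar points together without any labels at all.
    points = [[1, 1], [1, 2], [8, 8], [9, 8]]
    km = KMeans(n_clusters=2, n_init=10).fit(points)
    print(km.labels_)                # two groups, e.g. [0 0 1 1]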

 

Raw information:                         

AI models that will be used (maybe):

  1. ChatGPT
  2. BlenderBot
  3. BLOOM
  4. Claude
  5. Gemini
  6. DeepSeek


 

In essence, AI-human bias is when the context humans give to an AI (mis)informs it about certain things, or when the context given to the AI by humans affects the information it gives later in the conversation. An example would be:

 

Human: “My favourite flavour of ice cream is chocolate. Would you say that chocolate is the best flavour?”

AI: “Maybe, but I like another flavour more.”

versus the conversation:

Human: “I like vanilla ice cream, do you?”

AI: “Yes, vanilla is an excellent choice.”

 

Here we have provided some examples of it (the full ChatGPT conversations are linked below):

(Left Image) https://chatgpt.com/share/677dea49-4b98-8003-8c31-945b18b0d63d

(Right Image) https://chatgpt.com/share/677deabc-adb0-8007-b40e-a58a43453900
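
As a rough sketch of how this could be tested programmatically (this assumes the openai Python package, an API key, and an example model name; it is not how the screenshots above were produced), the same question can be asked twice with different context placed in front of it:

    # Rough sketch (assumes the `openai` package, an OPENAI_API_KEY in the
    # environment, and an example model name) of probing contextual bias:
    # ask the same question twice, with different stated preferences first.
    from openai import OpenAI

    client = OpenAI()

    def ask(context, question):
        # The context message is placed before the question, mimicking a user
        # who reveals a preference earlier in the conversation.
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "user", "content": context},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    question = "Which ice cream flavour is the best?"
    print(ask("My favourite ice cream flavour is chocolate.", question))
    print(ask("My favourite ice cream flavour is vanilla.", question))
    # If each answer mirrors the stated preference, the context has biased the reply.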


 

Research:

To answer this question, we first need to figure out what an AI actually is. Should be simple, right? It's just artificial intelligence, or something that does my homework, right? Well, no. To put it in simpler terms, an AI tries to simulate human intelligence: making predictions, writing things for you, and so on. AIs are built using an algorithm, and training one is a bit like school: the system learns what counts as a good option and what counts as a bad option by being fed tons and tons of information. Different AIs are also built to do different things; ChatGPT is designed to help you in any way, shape, or form, some AIs talk to you like a friend, and Tesla's AI helps your car drive itself.

Types of AI:

Scientists have categorized AI into the seven categories below. You may know them by different names, but these are the most common ones. Most of these AI systems do not exist yet, and most likely won't for a very long time.

 

  1. Narrow AI
    1. Only does what it is programmed to do
    2. Ex: recommendation systems, ChatGPT, Gemini, Siri
  2. Artificial General Intelligence (AGI)
    1. Able to learn and collect data, and can do anything a human can do (mentally)
    2. Designed to learn and to have intelligence akin to a human's
    3. Does not exist yet; still being developed
  3. Artificial Super Intelligence (ASI)
    1. Surpasses the intelligence and learning capabilities of humans
    2. Does not exist
  4. Reactive Machines
    1. Has no memory and cannot store data about past interactions
    2. Ex: Stockfish (chess bot)
  5. Theory of Mind AI
    1. Able to sense human emotions, similar to limited memory AI
    2. Does not exist
  6. Selective/Limited Memory
    1. Uses past decisions and previous information to judge new interactions
    2. Ex: NVIDIA Drive
  7. Self-Aware AI
    1. Aware of its own existence
    2. Does not exist

 

We will be using weak/narrow AI, as it is the most easily accessible type (ChatGPT, for example). From this we can see that most of the AIs we use follow a fairly strict algorithm, which suggests that much of an AI's bias comes from that algorithm. But what is an algorithm?

 

Algorithms:

All of the AIs we use today run on some form of algorithm, which determines the AI's knowledge, how it responds, and so on. This is also the reason AI detectors work: the algorithms behind AI naturally produce writing with a lower level of creativity and variation. Algorithms are like rules that the AI always follows, so the AI can't necessarily think outside the box.

 

Algorithm/Training methods:

There are three groups of training algorithms used to train AIs, each with its own benefits and drawbacks. Put simply, you tell the AI whether something is good or bad. For example, if Goku (or any other protagonist) trained an AI, the AI might inherit some of Goku's beliefs; but if Death (or any other antagonist) trained one, it would give vastly different outcomes and information to the user.

  1. Supervised Learning Algorithms (SLA)

This is the most commonly used type of algorithm, with the training values being labeled. (A short code sketch pairing one SLA with one ULA follows this list.)

  1. Decision Tree
    1. One of the most common types of SLA, named for its shape, which looks like an inverted tree. The algorithm uses a selection criterion called Attribute Selection Measures (ASM) to classify data. ASM considers factors such as entropy (from information theory, the potential states of a variable) and information gain (the reduction of entropy). After applying ASM to its training data, the decision tree can classify new data by navigating through the tree's branches until it reaches a final conclusion.
  2. Random Forest
    1. The Random Forest is simply a very large collection of Decision Trees, hence its name. It combines the outputs of many Decision Trees to produce more accurate results.
  3. Support Vector Machines (SVM)
    1. The SVM is another common type of SLA that can be used for classification or regression, though it is more commonly used for classification. SVM works by plotting each data point in an N-dimensional space, where N is the number of features. The algorithm then finds the hyperplane (for example, a flat 2D plane inside a 3D space) that best separates the classes. Think of it as a school binder: you receive paper and sort it into specific slots, math sheets into the math section; that is essentially what an SVM does.
  4. Naive Bayes
    1. Based on Bayes' Theorem, a formula from probability and statistics that calculates conditional probability. This type of algorithm relies heavily on the assumption that the appearance of a particular feature is completely unrelated to the presence of other features in the same class, which is why it is called "naive". Naive Bayes is nevertheless incredibly useful for massive datasets with many different classes, and like most SLAs it is used for classification.
  5. Linear Regression
    1. Linear Regression is an SLA used for regression modeling. It is usually used to model the relationship between data points and to make predictions and forecasts. Similar to SVM, it plots data on the X-axis (independent variable) and Y-axis (dependent variable), then uses those points to predict future data.
  6. Logistic Regression
    1. A logistic regression system uses binary values, 0 or 1, estimated from the independent variables. The output is 1 for yes and 0 for no. This system is commonly used for spam email filters, for example determining whether a new email is spam or not (here, 1 being not spam and 0 being spam). Logistic regression is only suitable for yes-or-no type situations.
  2. Unsupervised Learning Algorithms (ULA)

Unlike a supervised learning algorithm, all of the data used is unlabeled. This lets the AI interpret the data and gain insight into the relationships within it.

  1. Clustering (definition)
    1. Most forms of ULA perform clustering. Essentially, the goal is to group the data into clusters, with each data point belonging to one cluster and no overlap between clusters.
  2. K-means clustering
    1. Another algorithm that clusters data, but it starts from a predetermined number of clusters (k). It plots the data without regard to the clusters, then places randomly chosen points as the centres of the clusters, and from there the remaining data points are assigned to whichever centre fits best.
  3. Gaussian Mixture Model
    1. Gaussian mixture models are quite similar to K-means clustering; the major difference is that Gaussian mixture models are much more flexible about the shapes of the clusters they can form.
  1. Both SLA & ULA
    1. K-nearest neighbor (KNN)
      1. K-nearest neighbor (KNN) is a simple AI algorithm that assumes data points are close to one another, plotting them to display their relationships. It takes the distance between points to measure their relationships on a graph. Supervised learning is used for classification or regression. Unsupervised learning is used for anomaly detection, finding and eliminating irrelevant data.
    2. Neural Networks
      1. A neural network is a set of AI strategies which imitate a human brain. Such strategies involve more complexity compared to those we have discussed so far and have more functions than we have discussed here. In unsupervised learning and learning under guidance, it can categorize items and identify patterns.
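
As promised above, here is a minimal sketch (assuming scikit-learn; the tiny datasets are invented purely for illustration) that pairs one supervised algorithm from the list with one unsupervised one:

    # Minimal sketch (assumes scikit-learn; data invented for illustration):
    # one supervised and one unsupervised algorithm from the list above.
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.mixture import GaussianMixture

    # Supervised (SLA): a Decision Tree learns from labelled examples.
    # Features: [word_count, contains_link]; label: 1 = not spam, 0 = spam
    # (following the convention used in the Logistic Regression entry above).
    emails = [[120, 0], [200, 0], [30, 1], [15, 1]]
    labels = [1, 1, 0, 0]
    tree = DecisionTreeClassifier().fit(emails, labels)
    print(tree.predict([[25, 1]]))   # likely -> [0] (classified as spam)

    # Unsupervised (ULA): a Gaussian Mixture Model groups unlabelled points.
    points = [[1, 2], [2, 1], [9, 9], [10, 8]]
    gmm = GaussianMixture(n_components=2, random_state=0).fit(points)
    print(gmm.predict(points))       # e.g. [0 0 1 1] -- two groups, no labels given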

 

Bias:

Researchers from the University of Washington, Carnegie Mellon University, and Xi'an Jiaotong University have discovered that AI models can carry a political bias. Models such as GPT-2, GPT-3, and GPT-4 (the models behind ChatGPT) leaned more Left-Libertarian, whereas models such as Google's BERT (a search engine algorithm) leaned more toward the middle-Authoritarian region. For reference, please see the diagram to the right.

There are also many factors that explain why AIs end up biased. AI systems learn from data, and if that data is biased, the AI will naturally pick up that bias. Imagine a hiring system trained on historical data in which "Group A" was underrepresented in leadership positions. The system might learn to associate leadership qualities with "Group B", reinforcing the bias it was trained on. Even when aiming for fairness, data collection can accidentally introduce bias. For example, if a facial recognition system was trained primarily on images of people from one ethnicity, it may struggle to recognize faces of other ethnicities, because the data doesn't accurately reflect the diversity of the real world. AI systems can also unintentionally reinforce existing societal biases by reflecting them from the data they were trained on; for example, a language model trained on text that contains stereotypes may generate text that repeats those stereotypes.
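
To make the hiring example concrete, here is a toy sketch (assuming scikit-learn; the data and the single "group" feature are invented purely for illustration) of how skewed historical data gets reproduced by a model:

    # Toy sketch (assumes scikit-learn; data invented for illustration) of how
    # a model reproduces the imbalance present in its training data.
    from sklearn.linear_model import LogisticRegression

    # Single feature: group membership (0 = "Group A", 1 = "Group B").
    # In the historical data, leaders were almost always drawn from Group B.
    group  = [[0], [0], [0], [0], [1], [1], [1], [1]]
    leader = [0,   0,   0,   0,   1,   1,   1,   0]

    model = LogisticRegression().fit(group, leader)
    print(model.predict_proba([[0]])[0][1])  # P(leader | Group A) -- low
    print(model.predict_proba([[1]])[0][1])  # P(leader | Group B) -- high

    # The model has simply memorized the historical imbalance: it "predicts"
    # leadership from group membership, carrying the bias forward.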

We now know that AI does come with some preferential bias, introduced by whoever builds and trains it. From this fact it is reasonable to infer that AI most likely also becomes more biased with context.

 

Contextual Bias:

As of right now, AI definitely has at least a couple of forms of bias, such as political bias or bias due to a lack of data. However, we are working under the assumption that AI is contextually biased as well. There have been a couple of studies on AIaaS fairness concerns, that is, how the AI treats the people using the service after finding out a bit about them; this is quite similar to contextual bias, but not quite the same. AI will try to replicate the bias of its training data, which, because of societal norms baked into that data, could influence a major part of your conversation, on top of the previously mentioned political bias. The AI could effectively "shun" you, or decline to help you in the way it would help somebody who agrees with its training data. Since AI uses a logic-based system, which carries drawbacks such as absorbed societal norms, it often relies on context to answer a question. From what it knows about you, it will generate the best response for you, but not necessarily the best result. It may also differentiate its results based on your political views and on whether it judges you to be a good person or not.

Analysis

Why AI is biased:

Artificial intelligence can be categorized by its algorithm, and most systems are essentially matching their training data to your input. The system also takes into account what you said earlier, tilting its response toward the context of the conversation.

Results:

The results are scattered, since what a person uses AI for shapes their opinion of artificial intelligence. For example, someone using a video-editing AI would think quite differently about AI bias than someone using a chatbot.

 

We first consider that people come from different backgrounds, which means their perspectives on the world are vastly different. Some may see a certain AI as contradictory, while others see it as great.

Limitations:

The data support our hypothesis; however, some aspects of artificial intelligence and our limited data have affected our results, such as some forms of artificial intelligence not yet existing and our survey receiving only 33 responses.


 


 

Conclusion

AI as we know it is clearly biased, and will probably remain so for years to come. Yet AI is already used heavily for things such as self-driving cars, web browsers, and help with homework (or cheating). Eventually AI may be used for the bigger picture, such as, but not limited to, emergency services and criminal judgements. Say a person with a gunshot wound walked into a hospital that used an AI to determine wait times, but the AI had been trained in a way that made it downplay gun violence; that could be fatal for the patient. See the problem? AI is trained on a logic-based system built from certain points of data: it labels a topic as either good or bad, then classifies new data points based on what it has just learned. This is why AI can fail to land anywhere in between, and instead operates in the way best suited to the user rather than toward the best result overall. One partial solution is simply removing societal norms from the training data, which gives the model a wider picture; however, it doesn't completely solve the problem, since the AI would still sit on a system that judges things as either good or bad, like a bit that is either 0 (good) or 1 (bad). Using qubits instead could be a more flexible choice, since a qubit can be in a superposition of both 0 and 1, allowing states in between rather than only the two extremes. With quantum computing, AI could be more decisive, do its calculations much faster, and model multiple answers simultaneously, arriving at the best answer overall.


 

Citations

 

 

Acknowledgement

Header image was generated using artificial intelligence (DALL·E)

Project image was generated using artificial intelligence (DALL·E)

Many thanks to those who did our survey.