Unit - 4
Uncertainty
We previously studied knowledge representation using first-order logic and propositional logic with certainty, which means we were sure about the predicates. With this knowledge representation, we could write A→B, which means that if A is true, then B is true. However, if we are not sure whether A is true, we cannot make this assertion; this situation is known as uncertainty.
So we require uncertain reasoning or probabilistic reasoning to describe uncertain information, when we aren't confident about the predicates.
Causes of uncertainty:
In the real world, the following are some of the most common sources of uncertainty.
● Information obtained from unreliable sources.
● Experimental Errors
● Equipment fault
● Temperature variation
● Climate change.
Probabilistic reasoning is a knowledge representation method that uses the concept of probability to indicate uncertainty in knowledge. We employ probabilistic reasoning, which integrates probability theory and logic, to deal with uncertainty.
Probability is employed in probabilistic reasoning because it helps us to deal with uncertainty caused by someone's indifference or ignorance.
There are many situations in the real world when nothing is certain, such as "it will rain today," "behavior of someone in specific surroundings," and "a competition between two teams or two players." These are probable sentences for which we can presume that something will happen but are unsure, thus we utilize probabilistic reasoning in this case.
Probability: Probability is a measure of the likelihood that an uncertain event will occur. It is a numerical value that always lies between 0 and 1, where 0 means the event is impossible and 1 means the event is certain.
0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
P(A) = 0 indicates that event A is impossible (it will certainly not occur).
P(A) = 1 indicates total certainty that event A will occur.
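For equally likely outcomes, we can compute the probability of an uncertain event using the formula below.
Probability of occurrence = number of favorable outcomes / total number of outcomes
P(¬A) = probability of event A not happening.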
P(¬A) + P(A) = 1.
Event: An event is a possible outcome, or a set of possible outcomes, of a variable.
Sample space: The sample space is the collection of all possible outcomes.
Random variables: Random variables are used to represent real-world events and objects.
Prior probability: The prior probability of an occurrence is the probability calculated before fresh information is observed.
Posterior Probability: The probability computed after all evidence or information has been taken into account. It combines the prior probability with the new information.
Conditional probability: The likelihood of an event occurring when another event has already occurred is known as conditional probability.
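The definitions above can be illustrated with a small sketch. It assumes a fair six-sided die as the sample space (an illustrative choice, not part of the original notes); the helper names `prob` and `cond_prob` are hypothetical.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) for equally likely outcomes: favorable / total."""
    return Fraction(len(event & sample_space), len(sample_space))

def cond_prob(a, b):
    """Conditional probability P(A | B) = P(A and B) / P(B)."""
    return prob(a & b) / prob(b)

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

p_even = prob(even)                                  # prior probability: 1/2
p_not_even = 1 - p_even                              # complement rule: 1/2
p_even_given_gt3 = cond_prob(even, greater_than_3)   # 2/3
```

Learning that the roll is greater than 3 raises the probability of "even" from 1/2 to 2/3, which is the sense in which new evidence turns a prior into a posterior.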
Key takeaway
- Probabilistic reasoning is a method of knowledge representation in which the concept of probability is used to convey the uncertainty in knowledge.
- Probability is used in probabilistic reasoning because it allows us to deal with the uncertainty that arises from someone's laziness or ignorance.
It is critical to first comprehend what rational choice is, and to do so, one must first comprehend the meanings of the terms rational and choice (Green and Shapiro, 1994; Friedman, 1996). The Google dictionary defines rational as "based on or in accordance with reason or logic," while choice is described as "the act of selecting between two or more options."
The process of making rational judgments based on relevant information in a logical, timely, and optimum manner is known as rational choice.
Let's say a man named Thendo wants to figure out how much coffee he should drink today, so he phones his sister Denga to find out what color shoes she is wearing and uses that knowledge to decide. This is an irrational decision because Thendo is deciding how much coffee to drink based on irrelevant information (the color of the shoes his sister Denga is wearing).
On the other hand, if Thendo decides that every time he takes a sip of coffee he must walk 1 km, we will conclude that he is acting irrationally because he is wasting energy by walking 1 km for each sip. This waste of energy is irrational, and in mathematical terms it is referred to as an un-optimized solution. If Thendo decides to pour the coffee from one glass to another every time he drinks, for no reason other than to complete the task of drinking coffee, he is taking an irrational course of action because it is not only illogical but also inefficient.
Fig 1 shows an illustration of a rational decision-making framework. The diagram depicts how, in a rational decision-making process, one first studies the surroundings, or more technically, the decision-making space. The next step is to identify the information that is relevant to the decision. This information is then fed to a logical and consistent decision engine, which examines all choices and their associated utilities before selecting the decision with the highest utility. Any flaw in this framework, such as the lack of complete knowledge, imprecise and defective information, or the inability to analyze all options, limits the theory of rational choice, and it becomes a bounded rational choice, as described by Herbert Simon (1991).
Classical economics is predicated on the notion that economic agents make decisions based on the principle of rational choice. Because these agents are human, many researchers have studied how humans actually make decisions, a field now known as behavioral economics. In his book Thinking, Fast and Slow, Kahneman explores behavioral economics in depth (Kahneman, 2011). Today, decisions are increasingly made by artificially intelligent machines. The assumption of rationality is stronger when decisions are made by artificially intelligent machines than when they are made by humans.
In this respect, machine-made decisions are not as irrational as those made by humans; however, they are not fully rational either, and so remain subject to Herbert Simon's theory of bounded rationality.
Fig 1: Steps for rational decision making
What are the characteristics of an isolated economic system in which all decision-making agents are human beings, for example? Behavioral economics will describe that economy. If, on the other hand, all of the decision-making agents are artificially intelligent machines that make rational-choice decisions, neoclassical economics will most likely apply. If half of these decision-making agents are people and the other half are machines, a half-neoclassical, half-behavioral economic system will emerge.
● Sample space S with events $A_i$ and probabilities $P(A_i)$; union $A \cup B$, intersection $AB$, and complement $A^c$.
● Axioms: $0 \le P(A) \le 1$; $P(S) = 1$; for mutually exclusive $A_i$, $P(\bigcup_i A_i) = \sum_i P(A_i)$.
● Conditional probability: $P(A \mid B) = P(AB)/P(B)$; total probability: $P(A) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c)$.
● Random variables (RVs) X; the cumulative distribution function (cdf) $F(x) = P\{X \le x\}$;
For a discrete RV, probability mass function (pmf) $f(x) = P\{X = x\}$
For a continuous RV, probability density function (pdf) $f(x) = dF(x)/dx$, so that $F(x) = \int_{-\infty}^{x} f(t)\,dt$
● Generalizations for more than one variable, e.g.
Two RVs X and Y: joint cdf $F(x, y) = P\{X \le x, Y \le y\}$;
pmf $f(x, y) = P\{X = x, Y = y\}$; or
pdf $f(x, y)$, with $F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du$
Independent X and Y iff $f(x, y) = f_X(x)\,f_Y(y)$
● Expected value or mean: for RV X, $\mu = E[X]$; discrete RVs: $\mu = \sum_x x\,f(x)$
Continuous RVs: $\mu = \int_{-\infty}^{\infty} x\,f(x)\,dx$
Probability theory provides the formal techniques and principles for manipulating probabilistically represented propositions. The following are the three axioms of probability theory:
● 0 <= P(A=a) <= 1 for all a in sample space of A
● P(True)=1, P(False)=0
● P(A v B) = P(A) + P(B) - P(A ^ B)
The following properties can be deduced from these axioms:
● P(~A) = 1 - P(A)
● P(A) = P(A ^ B) + P(A ^ ~B)
● Sum{P(A=a)} = 1, where the sum is over all possible values a in the sample space of A
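The axioms and derived properties above can be checked numerically. The sketch below uses a hypothetical joint distribution over two Boolean variables A and B (the four numbers are made up for illustration); the helper `p` is likewise an assumed name.

```python
# Hypothetical joint distribution P(A=a, B=b) over two Boolean variables.
joint = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}

def p(predicate):
    """Sum the probabilities of all (a, b) worlds where predicate(a, b) holds."""
    return sum(pr for (a, b), pr in joint.items() if predicate(a, b))

p_a = p(lambda a, b: a)
p_b = p(lambda a, b: b)
p_a_and_b = p(lambda a, b: a and b)
p_a_or_b = p(lambda a, b: a or b)
p_not_a = p(lambda a, b: not a)
p_a_and_not_b = p(lambda a, b: a and not b)

# P(A v B) = P(A) + P(B) - P(A ^ B)
inclusion_exclusion = p_a + p_b - p_a_and_b
```

Here `p_a_or_b` and `inclusion_exclusion` agree (both are 0.6 up to floating-point rounding), and `p_a` equals `p_a_and_b + p_a_and_not_b`, matching the derived properties.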
Bayes' theorem is a mathematical formula for determining the probability of an event from uncertain knowledge. It is also known as Bayes' rule or Bayes' law, and it is the basis of Bayesian reasoning.
In probability theory, it relates the conditional and marginal probabilities of two random events.
Bayes' theorem is named after the British mathematician Thomas Bayes. Bayesian inference, which is at the heart of Bayesian statistics, is an application of Bayes' theorem.
It's a method for calculating the value of P(B|A) using P(A|B) knowledge.
Bayes' theorem can update the probability forecast of an occurrence based on new information from the real world.
Example: If cancer is linked to one's age, we can apply Bayes' theorem to more precisely forecast cancer risk based on age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A given known event B.
From the product rule, we can write:
$P(A \land B) = P(A \mid B)\,P(B)$
Similarly, using the probability of event B given known event A:
$P(A \land B) = P(B \mid A)\,P(A)$
Equating the right-hand sides of both equations and dividing by P(B), we get:
$P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$ ... (a)
Equation (a) is known as Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference.
It displays the simple relationship between joint and conditional probabilities. Here,
P(A|B) is the posterior, which we want to compute: the probability of hypothesis A given that evidence B has occurred.
P(B|A) is the likelihood: the probability of the evidence, assuming that the hypothesis is true.
P(A) is the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is the marginal probability: the probability of the evidence alone.
In general, by the law of total probability, we can write $P(B) = \sum_i P(A_i)\,P(B \mid A_i)$ in equation (a), so Bayes' rule can be expressed as:
$P(A_i \mid B) = \dfrac{P(A_i)\,P(B \mid A_i)}{\sum_{k=1}^{n} P(A_k)\,P(B \mid A_k)}$
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
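As a rough illustration of the general form of Bayes' rule, the sketch below computes the posterior over a set of mutually exclusive and exhaustive hypotheses. The function name `posterior` and the disease-test numbers are assumptions made for this example only.

```python
def posterior(priors, likelihoods):
    """Return [P(A_i | B)] given priors[i] = P(A_i) and likelihoods[i] = P(B | A_i)."""
    # Law of total probability: P(B) = sum_i P(A_i) * P(B | A_i)
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Hypothetical numbers: A_1 = "has disease", A_2 = "no disease",
# B = "test is positive".
priors = [0.01, 0.99]        # P(A_1), P(A_2)
likelihoods = [0.90, 0.05]   # P(B | A_1), P(B | A_2)

post = posterior(priors, likelihoods)   # post[0] is about 0.154
```

Even with a fairly accurate test, the posterior probability of disease stays low (about 0.154) because the prior is small; this is the kind of update from prior to posterior that the rule describes.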
- Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets, are a data structure for capturing all of the information in a domain's entire joint probability distribution. The Bayesian Net can compute every value in the entire joint probability distribution of a collection of random variables.
- It represents all direct causal relationships between variables.
- To construct a Bayesian net for a given set of variables, draw arcs from cause variables to immediate effects.
- It saves space by taking advantage of the fact that variable dependencies are generally local in many real-world problem domains, resulting in a high number of conditionally independent variables.
- The relationships between variables are captured in both qualitative and quantitative terms.
- It can be used for reasoning.
- Predictive reasoning (sometimes referred to as causal reasoning) proceeds top-down, from causes to effects.
- Diagnostic reasoning proceeds bottom-up, from effects to causes.
- A Bayesian Net is a directed acyclic graph (DAG) in which each random variable has its own node, and there is a directed arc from A to B whenever A has a direct causal influence on B. The nodes therefore represent states of affairs, whereas the arcs represent causal relationships. When A is present, B is more likely to occur; conversely, when B is observed, A becomes more likely. This backward influence is referred to as "diagnostic" or "evidential" support for A due to the presence of B.
- Each node A in a network is conditionally independent of any subset of nodes that are not descendants of A, given its parents.
A Bayesian network that depicts and solves decision issues under uncertain knowledge is known as an influence diagram.
Fig 2: Bayesian network graph
- Each node represents a random variable, which can be continuous or discrete.
- Arcs (directed arrows) represent the causal relationships or conditional probabilities between random variables. Each directed link connects a pair of nodes in the graph.
These links indicate that one node has a direct influence on another; if there is no directed link between two nodes, neither has a direct influence on the other.
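One possible way to store such a network is a dictionary of nodes, each with its parents and a conditional probability table (CPT). The sketch below is only an illustration: the three-node Rain/Sprinkler/WetGrass network, its numbers, and the names `node_prob` and `joint_prob` are all assumptions, not taken from the notes.

```python
# Hypothetical network: Rain -> WetGrass <- Sprinkler.
# Each CPT maps a tuple of parent values to P(node = True | parents).
# Nodes are listed in topological order (parents before children).
network = {
    "Rain":      {"parents": [], "cpt": {(): 0.2}},
    "Sprinkler": {"parents": [], "cpt": {(): 0.1}},
    "WetGrass":  {"parents": ["Rain", "Sprinkler"],
                  "cpt": {(True, True): 0.99, (True, False): 0.9,
                          (False, True): 0.8, (False, False): 0.0}},
}

def node_prob(name, value, assignment):
    """P(name = value | values of its parents taken from assignment)."""
    node = network[name]
    key = tuple(assignment[parent] for parent in node["parents"])
    p_true = node["cpt"][key]
    return p_true if value else 1.0 - p_true

def joint_prob(assignment):
    """Chain rule: the joint is the product of P(node | parents) over all nodes."""
    result = 1.0
    for name in network:
        result *= node_prob(name, assignment[name], assignment)
    return result

# P(Rain=T, Sprinkler=F, WetGrass=T) = 0.2 * 0.9 * 0.9 = 0.162
p = joint_prob({"Rain": True, "Sprinkler": False, "WetGrass": True})
```

The network stores 6 numbers instead of the 7 needed for a full joint table over three Boolean variables; the saving grows quickly as more variables with few parents are added.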
Key takeaway
Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets, are a data structure for capturing all of the information in the full joint probability distribution of the set of random variables that define a domain.
● In general, Bayes Net inference is an NP-hard problem (exponential in the size of the graph).
● There are linear time techniques based on belief propagation for singly-connected networks or polytrees with no undirected loops.
● Local evidence messages are sent by each node to their children and parents.
● Based on incoming messages from its neighbors, each node updates its belief in each of its possible values and propagates evidence to its neighbors.
● Inference for general networks can be approximated using loopy belief propagation, which iteratively refines the probabilities; it often converges to a good approximation, although convergence to the exact values is not guaranteed.
Temporal model
- Monitoring or filtering
- Prediction
Approximate inference
Direct sampling method
The most basic sort of random sampling process for Bayesian networks generates events from a network with no evidence associated with them. The idea is that each variable is sampled in topological order. The value is drawn from a probability distribution based on the values that have previously been assigned to the variable's parents.
function PRIOR-SAMPLE(bn) returns an event sampled from the prior specified by bn
  inputs: bn, a Bayesian network specifying the joint distribution P(X1, ..., Xn)
  x ← an event with n elements
  for each variable Xi in X1, ..., Xn do
    x[i] ← a random sample from P(Xi | parents(Xi))
  return x
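The PRIOR-SAMPLE procedure above might be sketched in Python as follows. The small Rain/Sprinkler/WetGrass network and its numbers are hypothetical, and the dictionary layout (parents plus a CPT giving P(node = True | parents)) is just one assumed representation.

```python
import random

# Hypothetical network, listed in topological order (parents first).
network = {
    "Rain":      {"parents": [], "cpt": {(): 0.2}},
    "Sprinkler": {"parents": [], "cpt": {(): 0.1}},
    "WetGrass":  {"parents": ["Rain", "Sprinkler"],
                  "cpt": {(True, True): 0.99, (True, False): 0.9,
                          (False, True): 0.8, (False, False): 0.0}},
}

def prior_sample(bn, rng):
    """Sample every variable in topological order, conditioned on its parents."""
    event = {}
    for name, node in bn.items():  # dicts keep insertion order
        key = tuple(event[parent] for parent in node["parents"])
        event[name] = rng.random() < node["cpt"][key]
    return event

rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [prior_sample(network, rng) for _ in range(10000)]

# The fraction of samples with Rain=True should be close to P(Rain) = 0.2.
rain_freq = sum(s["Rain"] for s in samples) / len(samples)
```

Because parents are always sampled before their children, each lookup into a child's CPT finds the parent values already present in the partial event.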
Likelihood weighting
Likelihood weighting eliminates the inefficiencies of rejection sampling by creating only events that are consistent with the evidence. It's a tailored version of the general statistical approach of importance sampling for Bayesian network inference.
function LIKELIHOOD-WEIGHTING(X, e, bn, N) returns an estimate of P(X | e)
  inputs: X, the query variable
          e, observed values for variables E
          bn, a Bayesian network specifying the joint distribution P(X1, ..., Xn)
          N, the total number of samples to be generated
  local variables: W, a vector of weighted counts for each value of X, initially zero
  for j = 1 to N do
    x, w ← WEIGHTED-SAMPLE(bn, e)
    W[x] ← W[x] + w, where x is the value of X in x
  return NORMALIZE(W)
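A minimal Python sketch of likelihood weighting is given below, under the same assumptions as before: a hypothetical Rain/Sprinkler/WetGrass network stored as a dictionary of parents and CPTs. Evidence variables are fixed rather than sampled, and each sample is weighted by the probability of the evidence given its parents.

```python
import random

# Hypothetical network, listed in topological order (parents first).
network = {
    "Rain":      {"parents": [], "cpt": {(): 0.2}},
    "Sprinkler": {"parents": [], "cpt": {(): 0.1}},
    "WetGrass":  {"parents": ["Rain", "Sprinkler"],
                  "cpt": {(True, True): 0.99, (True, False): 0.9,
                          (False, True): 0.8, (False, False): 0.0}},
}

def weighted_sample(bn, evidence, rng):
    """Return one event consistent with the evidence, plus its weight."""
    event, weight = {}, 1.0
    for name, node in bn.items():
        key = tuple(event[parent] for parent in node["parents"])
        p_true = node["cpt"][key]
        if name in evidence:
            # Evidence is fixed; the weight accumulates its likelihood.
            event[name] = evidence[name]
            weight *= p_true if evidence[name] else 1.0 - p_true
        else:
            event[name] = rng.random() < p_true
    return event, weight

def likelihood_weighting(query, evidence, bn, n, rng):
    """Estimate P(query | evidence) from n weighted samples."""
    counts = {True: 0.0, False: 0.0}
    for _ in range(n):
        event, weight = weighted_sample(bn, evidence, rng)
        counts[event[query]] += weight
    total = counts[True] + counts[False]
    return {value: count / total for value, count in counts.items()}

rng = random.Random(0)
estimate = likelihood_weighting("Rain", {"WetGrass": True}, network, 20000, rng)
```

For this network the exact answer is P(Rain | WetGrass = true) = 0.1818 / 0.2458, roughly 0.74, so `estimate[True]` should land near that value.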
Inference by Markov chain simulation
Markov chain Monte Carlo (MCMC) methods work differently from rejection sampling and likelihood weighting. Rather than generating each sample from scratch, MCMC methods create each sample by making a random change to the preceding sample. It is helpful to think of an MCMC algorithm as being in a current state, in which each variable has a value, and then randomly modifying that state to get a new state.
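The text above does not name a specific MCMC algorithm; the sketch below uses Gibbs sampling, one common choice, purely as an assumption. It reuses a hypothetical Rain/Sprinkler/WetGrass network and resamples one non-evidence variable at a time, conditioned on the current values of all the others.

```python
import random

# Hypothetical network, listed in topological order (parents first).
network = {
    "Rain":      {"parents": [], "cpt": {(): 0.2}},
    "Sprinkler": {"parents": [], "cpt": {(): 0.1}},
    "WetGrass":  {"parents": ["Rain", "Sprinkler"],
                  "cpt": {(True, True): 0.99, (True, False): 0.9,
                          (False, True): 0.8, (False, False): 0.0}},
}

def joint_prob(bn, assignment):
    """Product of P(node | parents) over all nodes for a full assignment."""
    result = 1.0
    for name, node in bn.items():
        key = tuple(assignment[parent] for parent in node["parents"])
        p_true = node["cpt"][key]
        result *= p_true if assignment[name] else 1.0 - p_true
    return result

def gibbs_ask(query, evidence, bn, n, rng):
    """Estimate P(query | evidence) by repeatedly modifying the current state."""
    state = dict(evidence)
    hidden = [name for name in bn if name not in evidence]
    for name in hidden:
        state[name] = rng.random() < 0.5      # arbitrary starting state
    counts = {True: 0, False: 0}
    for _ in range(n):
        for name in hidden:
            # P(name | all other variables) is proportional to the joint.
            p_true = joint_prob(bn, {**state, name: True})
            p_false = joint_prob(bn, {**state, name: False})
            state[name] = rng.random() < p_true / (p_true + p_false)
            counts[state[query]] += 1
    total = counts[True] + counts[False]
    return {value: count / total for value, count in counts.items()}

rng = random.Random(0)
estimate = gibbs_ask("Rain", {"WetGrass": True}, network, 20000, rng)
```

Each new state differs from the previous one in at most one variable, which is the "random change to the preceding sample" described above. Using the full joint for the conditional is simple but wasteful; only the factors in the variable's Markov blanket actually matter.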
Key takeaway
- For Bayesian networks, the most basic type of random sampling procedure generates events from a network that has no evidence associated with it.
- Likelihood weighting is a variant of the general statistical technique of importance sampling, tailored for Bayesian network inference.
- Markov chain Monte Carlo (MCMC) methods work differently from rejection sampling and likelihood weighting.
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all the intermediate possibilities between the digital values YES and NO.
The conventional logic block that a computer understands takes precise input and produces a definite output of TRUE or FALSE, which is equivalent to a human's YES or NO.
The inventor of fuzzy logic, Lotfi Zadeh, observed that, unlike computers, human decision making includes a range of possibilities between YES and NO, for example −
● CERTAINLY YES
● POSSIBLY YES
● CANNOT SAY
● POSSIBLY NO
● CERTAINLY NO
Fuzzy logic works on the levels of possibility of the input to achieve a definite output.
Implementation
● It can be implemented in systems of various sizes and capabilities, ranging from small microcontrollers to large, networked, workstation-based control systems.
● It can be implemented in hardware, software, or a combination of both.
Why Fuzzy Logic?
Fuzzy logic is useful for commercial and practical purposes.
● It can control machines and consumer products.
● It may not give accurate reasoning, but it gives acceptable reasoning.
● Fuzzy logic helps to deal with uncertainty in engineering.
Fuzzy Logic Systems Architecture
It has four main parts as shown below −
● Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy sets. It splits the input signal into five levels, for example −
LP | x is Large Positive |
MP | x is Medium Positive |
S | x is Small |
MN | x is Medium Negative |
LN | x is Large Negative |
● Knowledge Base − It stores the IF-THEN rules provided by experts.
● Inference Engine − It simulates the human reasoning process by making fuzzy inferences on the inputs and the IF-THEN rules.
● Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp value.
The membership functions work on fuzzy sets of variables.
Membership Function
Membership functions allow you to quantify a linguistic term and represent a fuzzy set graphically. A membership function for a fuzzy set A on the universe of discourse X is defined as μA: X → [0,1].
Here, each element of X is mapped to a value between 0 and 1, called the membership value or degree of membership. It quantifies the degree of membership of the element of X in the fuzzy set A.
● The x axis represents the universe of discourse.
● The y axis represents the degrees of membership in the [0, 1] interval.
There can be multiple membership functions applicable to fuzzifying a numerical value. Simple membership functions are used because more complex functions do not add precision to the output.
All membership functions for LP, MP, S, MN, and LN are shown as below −
The triangular membership function shape is the most common among the various membership function shapes, such as trapezoidal, singleton, and Gaussian.
Here, the input to the 5-level fuzzifier varies from -10 volts to +10 volts. Hence the corresponding output also changes.
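A triangular membership function and a 5-level fuzzifier can be sketched as below. The breakpoints chosen for LN, MN, S, MP, and LP are assumptions made for illustration (the original figure is not reproduced here), as are the names `triangular` and `fuzzify`.

```python
def triangular(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set, in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Assumed (left, peak, right) breakpoints for a -10 V to +10 V input.
levels = {
    "LN": (-15.0, -10.0, -5.0),
    "MN": (-10.0, -5.0, 0.0),
    "S":  (-5.0, 0.0, 5.0),
    "MP": (0.0, 5.0, 10.0),
    "LP": (5.0, 10.0, 15.0),
}

def fuzzify(x):
    """Map a crisp input to its degree of membership in each level."""
    return {name: triangular(x, *abc) for name, abc in levels.items()}

memberships = fuzzify(2.5)
# {'LN': 0.0, 'MN': 0.0, 'S': 0.5, 'MP': 0.5, 'LP': 0.0}
```

An input of 2.5 volts is therefore "Small" to degree 0.5 and "Medium Positive" to degree 0.5, rather than belonging to exactly one class.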
Example of a Fuzzy Logic System
Let us consider an air conditioning system with a 5-level fuzzy logic system. This system adjusts the temperature of the air conditioner by comparing the room temperature with the target temperature value.
Fig 3: Example
Algorithm
● Define linguistic Variables and terms (start)
● Construct membership functions for them. (start)
● Construct knowledge base of rules (start)
● Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
● Evaluate rules in the rule base. (Inference Engine)
● Combine results from each rule. (Inference Engine)
● Convert output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are input and output variables in the form of simple words or sentences. For room temperature, cold, warm, hot, etc., are linguistic terms.
Temperature (t) = {very-cold, cold, warm, very-warm, hot}
Every member of this set is a linguistic term, and it can cover some portion of the overall range of temperature values.
Step 2 − Construct membership functions for them
The membership functions of temperature variable are as shown −
Step 3 − Construct knowledge base rules
Create a matrix of room temperature values versus target temperature values that an air conditioning system is expected to provide.
RoomTemp. /Target | Very_Cold | Cold | Warm | Hot | Very_Hot |
Very_Cold | No_Change | Heat | Heat | Heat | Heat |
Cold | Cool | No_Change | Heat | Heat | Heat |
Warm | Cool | Cool | No_Change | Heat | Heat |
Hot | Cool | Cool | Cool | No_Change | Heat |
Very_Hot | Cool | Cool | Cool | Cool | No_Change |
Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.
Sr. No. | Condition | Action |
1 | IF temperature=(Cold OR Very_Cold) AND target=Warm THEN | Heat |
2 | IF temperature=(Hot OR Very_Hot) AND target=Warm THEN | Cool |
3 | IF (temperature=Warm) AND (target=Warm) THEN | No_Change |
Step 4 − Obtain fuzzy value
Fuzzy set operations are used to evaluate the rules. The operations used for OR and AND are Max and Min, respectively. Combine all the results of the evaluation to form a final result. This result is a fuzzy value.
Step 5 − Perform defuzzification
Defuzzification is then performed according to the membership function for the output variable.
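Steps 4 and 5 can be sketched for the three rules in the table above. The fuzzified input values are made up, and the defuzzification shown is a simplified weighted average over assumed singleton outputs (+1 for Heat, -1 for Cool, 0 for No_Change), not the membership-function-based method the notes refer to.

```python
# Hypothetical fuzzified inputs: degrees of membership in [0, 1].
temperature = {"Very_Cold": 0.0, "Cold": 0.7, "Warm": 0.3,
               "Hot": 0.0, "Very_Hot": 0.0}
target = {"Warm": 1.0}

# Step 4: evaluate the rules, with OR -> max and AND -> min.
rule_strengths = {
    # Rule 1: IF temperature=(Cold OR Very_Cold) AND target=Warm THEN Heat
    "Heat": min(max(temperature["Cold"], temperature["Very_Cold"]),
                target["Warm"]),
    # Rule 2: IF temperature=(Hot OR Very_Hot) AND target=Warm THEN Cool
    "Cool": min(max(temperature["Hot"], temperature["Very_Hot"]),
                target["Warm"]),
    # Rule 3: IF temperature=Warm AND target=Warm THEN No_Change
    "No_Change": min(temperature["Warm"], target["Warm"]),
}

# Step 5 (simplified): weighted average of assumed singleton outputs.
output_values = {"Heat": 1.0, "Cool": -1.0, "No_Change": 0.0}
total = sum(rule_strengths.values())
crisp = sum(rule_strengths[a] * output_values[a] for a in rule_strengths) / total
```

With these inputs the rule strengths are Heat 0.7, Cool 0.0, and No_Change 0.3, so the crisp output is about 0.7: the controller heats, but not at full power.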
Application Areas of Fuzzy Logic
The key application areas of fuzzy logic are as follows −
Automotive Systems
● Automatic Gearboxes
● Four-Wheel Steering
● Vehicle environment control
Consumer Electronic Goods
● Hi-Fi Systems
● Photocopiers
● Still and Video Cameras
● Television
Domestic Goods
● Microwave Ovens
● Refrigerators
● Toasters
● Vacuum Cleaners
● Washing Machines
Environment Control
● Air Conditioners/Dryers/Heaters
● Humidifiers
Advantages of FLSs
● The mathematical concepts within fuzzy reasoning are very simple.
● You can modify an FLS by just adding or deleting rules, thanks to the flexibility of fuzzy logic.
● Fuzzy logic systems can take imprecise, distorted, noisy input information.
● FLSs are easy to construct and understand.
● Fuzzy logic is a solution to complex problems in all fields of life, including medicine, as it resembles human reasoning and decision making.
Disadvantages of FLSs
● There is no systematic approach to fuzzy system design.
● They are understandable only when they are simple.
● They are suitable for problems which do not need very high accuracy.
Key takeaway
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all the intermediate possibilities between the digital values YES and NO.
An AI system can be defined as the study of the rational agent and its environment. Agents use sensors to sense their surroundings and actuators to act on them. An AI agent can possess mental properties such as knowledge, belief, and intention.
An agent is anything that uses sensors to observe its environment and actuators to act on that environment. The cycle of perceiving, thinking, and action is followed by an Agent. An agent can be any of the following:
● Human-Agent - A human agent has sensors in the form of eyes, ears, and other organs, and actuators in the form of hands, legs, and vocal tract.
● Robotic Agent - A robotic agent can have cameras, infrared range finders, and NLP as sensors, and various motors as actuators.
● Software Agent - A software agent can receive sensory input like keystrokes and file contents, act on those inputs, and show output on the screen.
As a result, the world around us is full of agents, including thermostats, cellphones, cameras, and even ourselves.
Before we proceed, we must first understand sensors, effectors, and actuators.
● Sensor - A sensor is an electronic device that detects changes in the environment and transmits the data to other electronic devices. Sensors allow an agent to observe its surroundings.
● Actuators - Actuators are machine components that convert energy to motion. The actuators' sole function is to move and control a system. An actuator can be anything from an electric motor to gears to rails.
● Effectors - Effectors are devices that have an impact on the environment. Legs, wheels, arms, fingers, wings, fins, and a display screen are examples of effectors.
An intelligent agent is an autonomous entity that uses sensors and actuators to interact with its surroundings in order to achieve its goals. An intelligent agent may learn from its environment to achieve those goals. A thermostat is an example of an intelligent agent.
The four main rules for an AI agent are as follows:
● Rule 1: An AI agent must be able to perceive its surroundings.
● Rule 2: Decisions must be made based on the observation.
● Rule 3: A decision must be followed by action.
● Rule 4: An AI agent's actions must be reasonable.
A rational agent is one who has defined preferences, models uncertainty, and acts in such a way that its performance measure is maximized using all available actions.
A rational agent is said to do the right thing. AI is concerned with creating rational agents for use in game theory and decision theory across a variety of real-world scenarios.
Rational action matters most for an AI agent because, in a reinforcement learning algorithm, the agent receives a positive reward for each best possible action and a negative reward for each wrong action.
Structure
AI's objective is to create an agent programme that performs the agent function. The architecture and agent programme combine to form the framework of an intelligent agent. It can be summed up as follows:
Agent = Architecture + Agent program
The following are the three most important terms in the structure of an AI agent:
● Architecture - Architecture is the machinery on which an AI agent operates.
● Agent Function - The agent function maps a percept sequence to an action.
f:P* → A
● Agent program - An agent programme is a programme that performs the function of an agent. To produce function f, an agent programme runs on the physical architecture.
Simple reflex agents behave solely on the basis of the present percept, disregarding the remainder of the percept history. The history of all that an agent has perceived up to this point is known as percept history. The condition-action rule is used to create the agent function.
A condition-action rule is a rule that connects a state (or condition) to a response (or action). If the condition is met, the action is executed; otherwise, it is not. This agent function succeeds only when the environment is fully observable. For simple reflex agents operating in partially observable environments, infinite loops are often unavoidable. If the agent can randomize its actions, it may be possible to escape from infinite loops.
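A simple reflex agent can be sketched as a list of condition-action rules checked against the current percept only. The thermostat setting, the temperature thresholds, and the action names below are hypothetical.

```python
# Condition-action rules for a hypothetical thermostat agent.
# Each rule pairs a condition on the current percept with an action.
rules = [
    (lambda percept: percept["temperature"] < 18.0, "heat"),
    (lambda percept: percept["temperature"] > 24.0, "cool"),
]

def simple_reflex_agent(percept):
    """Return the action of the first rule whose condition matches."""
    for condition, action in rules:
        if condition(percept):
            return action
    return "no_op"  # no rule fired

action_cold = simple_reflex_agent({"temperature": 15.0})   # "heat"
action_ok = simple_reflex_agent({"temperature": 21.0})     # "no_op"
action_hot = simple_reflex_agent({"temperature": 30.0})    # "cool"
```

The agent keeps no memory: the same percept always produces the same action, regardless of what was perceived before.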
Simple reflex agents have the following drawbacks:
● Intelligence is very restricted.
● They have no knowledge of the non-perceptual parts of the state.
● The rule set is usually too large to generate and store.
● If the environment changes, the collection of rules must be updated.
Fig 4: Reflex agent
Key takeaway
Simple reflex agents behave solely on the basis of the present percept, disregarding the remainder of the percept history. The history of all that an agent has perceived up to this point is known as percept history.
A model-based agent operates by finding a rule whose condition matches the current situation. It can handle partially observable environments by using a model of the world. The agent must keep track of an internal state, which is updated with each percept and depends on the percept history. The current state is stored inside the agent, which maintains some kind of structure describing the parts of the environment that cannot be seen.
Updating the state necessitates knowledge of:
● how the world evolves without the intervention of the agent, and
● what effect the agent's activities have on the world
There are two crucial components in a model-based agent:
● Model: Knowledge about "how things happen in the world"; this is why the agent is called a model-based agent.
● Internal State: A representation of the current state based on the percept history.
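The role of the internal state can be sketched with a two-square vacuum world, a standard textbook setting used here as an assumption. The agent perceives only its own square, so it remembers what it has seen; its model assumes that a cleaned square stays clean.

```python
class ModelBasedVacuumAgent:
    """Hypothetical agent for a world with two squares, "A" and "B"."""

    def __init__(self):
        # Internal state: last known status of each square (None = unknown).
        self.known = {"A": None, "B": None}

    def act(self, percept):
        location, status = percept
        self.known[location] = status          # update state from the percept
        if status == "dirty":
            # Model of the agent's own action: sucking leaves the square clean.
            self.known[location] = "clean"
            return "suck"
        if all(s == "clean" for s in self.known.values()):
            return "no_op"                     # model says everything is clean
        return "right" if location == "A" else "left"

agent = ModelBasedVacuumAgent()
actions = [
    agent.act(("A", "dirty")),   # "suck"
    agent.act(("A", "clean")),   # "right": B is still unknown
    agent.act(("B", "clean")),   # "no_op": both squares known to be clean
]
```

A simple reflex agent in the same world would keep moving back and forth forever, since the current percept alone never tells it that the other square is already clean.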
Fig 5: Model based agent
Key takeaway
A model-based agent can deal with partially observable surroundings by using a world model. The agent must keep track of its internal state, which is updated with each percept and depends on the percept history.
Goal-based agents make decisions based on how far they currently are from their goal (a description of desirable situations). Every action they take is aimed at reducing the distance to the goal. This gives the agent the ability to choose among several possibilities, selecting the one that leads to a goal state.
The knowledge that underpins its judgments is explicitly represented and modifiable, allowing these agents to be more adaptable. They usually necessitate some searching and planning. The behaviour of the goal-based agent can be simply modified.
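The searching and planning mentioned above can be sketched with breadth-first search, used here as one simple choice of search method. The room map and the function name `plan_to_goal` are hypothetical.

```python
from collections import deque

def plan_to_goal(start, goal, neighbors):
    """Breadth-first search for a shortest sequence of moves from start to goal."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for nxt in neighbors.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None  # goal cannot be reached

# Hypothetical map: which rooms can be entered from which.
rooms = {
    "hall": ["kitchen", "study"],
    "kitchen": ["pantry"],
    "study": ["lab"],
    "pantry": [],
    "lab": [],
}

plan = plan_to_goal("hall", "lab", rooms)   # ["study", "lab"]
```

Changing the agent's behaviour only requires passing a different goal; the map and the search code stay the same, which reflects the flexibility described above.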
Fig 6: Goal based agent
Utility-based agents are built around a measure of how useful each outcome is. They are used when there are several possible alternatives and the agent must decide which is best. They choose actions based on a preference (utility) for each state. Sometimes achieving the desired goal is not enough: to reach a destination, we may look for a quicker, safer, or cheaper trip.
Agent happiness should be taken into consideration. Utility describes how "happy" the agent is. Because of the uncertainty in the world, a utility agent chooses the action that maximizes the expected utility. A utility function maps a state onto a real number that describes the associated degree of happiness.
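Choosing the action with the highest expected utility can be sketched as below. The two routes, their outcome probabilities, and their utility values are made-up numbers for illustration.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def choose_action(actions):
    """Pick the action whose expected utility is highest."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

# Hypothetical routes to a destination.
routes = {
    "highway":   [(0.8, 10.0), (0.2, -20.0)],  # fast, but may hit a traffic jam
    "back_road": [(1.0, 5.0)],                 # slower but reliable
}

best = choose_action(routes)   # "back_road"
```

The highway has the better best case, but its expected utility is 0.8 × 10 + 0.2 × (-20) = 4, which is lower than the back road's 5, so the agent prefers the reliable route.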
Fig 7: Utility based agent
Key takeaway
Utility-based agents are built around a measure of how useful each outcome is. They are used when there are several possible alternatives and the agent must decide which is best.
Some programmes work in a completely artificial world, with just keyboard input, databases, computer file systems, and character output on a screen.
Some software agents (software robots or softbots), on the other hand, exist in rich, unlimited softbot domains. The simulator has a very detailed, complex environment, and the software agent must choose from a large number of actions in real time. A softbot that scans a customer's online preferences and shows interesting items to the customer works in a real as well as an artificial environment.
The most famous artificial environment is the Turing Test environment, in which one real agent and other artificial agents are tested on equal ground.
Turing Test
The Turing Test can be used to assess the success of a system's intelligent behaviour.
The test involves two people and a machine that will be evaluated. One of the two people takes on the job of tester. They are all in different rooms. The tester has no idea who is a human and who is a machine. He types the queries and sends them to both intelligences, who respond to him with typed responses.
The goal of this test is to deceive the tester. The machine is deemed to be intelligent if the tester is unable to distinguish the machine's response from the human response.
Properties of Environment
The environment has a variety of characteristics.
● Discrete / Continuous - If the environment has a limited number of distinct, clearly defined states (for example, chess), it is discrete; otherwise, it is continuous (for example, driving).
● Observable / Partially Observable - If the whole state of the environment can be determined from the percepts at each time point, it is observable; otherwise, it is only partially observable.
● Static / Dynamic - The environment is static if it does not change while an agent is acting; otherwise, it is dynamic.
● Single agent / Multiple agents - The environment may contain other agents, which may be of the same or a different kind as the agent. If it does, it is a multi-agent environment; otherwise, it is a single-agent environment.
● Accessible / Inaccessible - If an agent's sensory apparatus can access the entire state of the environment, the environment is accessible to that agent.
● Deterministic / Non-deterministic - If the current state of the environment and the actions of the agent completely determine the next state of the environment, the environment is deterministic; otherwise, it is non-deterministic.
● Episodic / Non-episodic - In an episodic environment, each episode consists of the agent perceiving and then acting. The quality of its action depends only on the episode itself. Subsequent episodes do not depend on the actions taken in previous episodes.
Unit - 4
Uncertainty
We previously learned knowledge representation using first-order logic and propositional logic with certainty, meaning that we were sure about the predicates. With this knowledge representation we could write A → B, which means that if A is true then B is true. But if we are not sure whether A is true, we cannot make this assertion; this situation is known as uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
In the real world, the following are some of the most common sources of uncertainty.
● Information obtained from unreliable sources.
● Experimental Errors
● Equipment fault
● Temperature variation
● Climate change.
Probabilistic reasoning is a knowledge representation method that uses the concept of probability to indicate uncertainty in knowledge. We employ probabilistic reasoning, which integrates probability theory and logic, to deal with uncertainty.
Probability is used in probabilistic reasoning because it gives us a way to handle the uncertainty that results from someone's laziness or ignorance.
There are many situations in the real world when nothing is certain, such as "it will rain today," "behavior of someone in specific surroundings," and "a competition between two teams or two players." These are probable sentences for which we can presume that something will happen but are unsure, thus we utilize probabilistic reasoning in this case.
Probability: Probability is a measure of the likelihood that an uncertain event will occur. It is a numerical value that always lies between 0 and 1, where the two extremes represent impossibility and certainty.
0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
P(A) = 0 indicates that event A is certain not to occur.
P(A) = 1 indicates that event A is certain to occur.
We can compute the probability of an uncertain event using the formula below.
$$P(A) = \frac{\text{number of outcomes in which } A \text{ occurs}}{\text{total number of equally likely outcomes}}$$
P(¬A) = probability of event A not happening.
P(¬A) + P(A) = 1.
Event: Each possible outcome of a variable is called an event.
Sample space: The collection of all possible events is called the sample space.
Random variables: Random variables are used to represent events and objects in the real world.
Prior probability: The prior probability of an event is the probability computed before new information is observed.
Posterior probability: The probability computed after all evidence or information has been taken into account. It combines the prior probability with the new information.
Conditional probability: Conditional probability is the probability of an event occurring given that another event has already occurred.
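The definitions above can be tried out on a small sample space. The following is a minimal sketch, assuming one roll of a fair six-sided die; the events `even` and `greater_than_3` and the helper names `prob` and `conditional` are illustrative choices, not part of the original notes.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (an assumed example).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event, space=sample_space):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event & space), len(space))

def conditional(a, b, space=sample_space):
    """P(A|B) = P(A and B) / P(B); undefined when P(B) = 0."""
    p_b = prob(b, space)
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return prob(a & b, space) / p_b

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

p_even = prob(even)                      # prior probability of "even"
p_not_even = prob(sample_space - even)   # probability of the complement
p_even_given_gt3 = conditional(even, greater_than_3)

print(p_even, p_not_even, p_even_given_gt3)  # prints: 1/2 1/2 2/3
```

Knowing that the roll is greater than 3 raises the probability of an even result from 1/2 to 2/3, which is the difference between a prior and a conditional probability.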
Key takeaway
- Probabilistic reasoning is a method of knowledge representation in which the concept of probability is used to convey the uncertainty in knowledge.
- Probability is used in probabilistic reasoning because it allows us to deal with the uncertainty that arises from someone's laziness or ignorance.
It is important to first understand what rational choice is, and to do so one must understand the meanings of the words rational and choice (Green and Shapiro, 1994; Friedman, 1996). The Google dictionary defines rational as "based on or in accordance with reason or logic," and choice as "the act of selecting between two or more options."
Rational choice is the process of making decisions based on relevant information, in a logical, timely, and optimized manner.
Suppose a man named Thendo wants to decide how much coffee to drink today, so he phones his sister Denga to find out what color shoes she is wearing and uses that information to decide. This is an irrational decision, because Thendo is basing his choice on irrelevant information (the color of his sister's shoes).
Similarly, if Thendo decides that he must walk 1 km every time he takes a sip of coffee, we conclude that he is acting irrationally, because he wastes energy by walking 1 km for each sip. In mathematical terms, this waste of energy is called an un-optimized solution. And if Thendo pours the coffee from one glass into another every time he takes a sip, for no reason other than to complete the task of drinking coffee, he is following an irrational course of action, because it is both illogical and inefficient.
Fig 1 illustrates a rational decision-making framework. In a rational decision-making process, one first studies the environment or, more technically, the decision-making space. The next step is to identify the information relevant to the decision. This information is then fed to a logical and consistent decision engine, which evaluates all options and their associated utilities and selects the option with the highest utility. Any weakness in this framework, such as incomplete information, imprecise or imperfect information, or the inability to evaluate all options, limits the theory of rational choice, which then becomes bounded rational choice, as described by Herbert Simon (1991).
Classical economics is based on the assumption that economic agents make decisions according to the principle of rational choice. Because such an agent is a human being, many researchers have studied how humans actually make decisions, a field now known as behavioral economics. In his book Thinking, Fast and Slow, Kahneman explores behavioral economics in depth (Kahneman, 2011). Today, decisions are increasingly made by artificially intelligent machines. The assumption of rationality is stronger when decisions are made by artificially intelligent machines than when they are made by humans.
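The decision engine described above can be sketched in a few lines. This is only an illustration under stated assumptions: the options are numbers of cups of coffee, and the utility values are made-up numbers chosen for the example, not data from the cited works.

```python
def rational_choice(options, utility):
    """Evaluate every option and return the one with the highest utility."""
    if not options:
        raise ValueError("no options to choose from")
    return max(options, key=utility)

# Hypothetical utilities for drinking 0 to 4 cups of coffee today.
coffee_utility = {0: 0.2, 1: 0.7, 2: 0.9, 3: 0.5, 4: 0.1}

best = rational_choice(list(coffee_utility), coffee_utility.get)
print(best)  # prints: 2
```

If the engine could only examine some of the options, or the utilities were noisy, the same code would return a merely satisfactory choice, which is the bounded rationality case.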
In this aspect, machine-made decisions are not as irrational as those produced by humans; but, they are not totally rational, and so are still susceptible to Herbert Simon's bounded rationality theory.
Fig 1: Steps for rational decision making
For example, what are the characteristics of an isolated economic system in which all decision-making agents are human beings? Behavioral economics will describe that economy. If, on the other hand, all of the decision-making agents are artificially intelligent machines that make decisions by rational choice, neoclassical economics will most likely apply. If half of the decision-making agents are humans and the other half are machines, a half-neoclassical, half-behavioral economic system will emerge.
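● Sample space $S$ with events $A_i$ and probabilities $P(A_i)$; union $A \cup B$, intersection $AB$, and complement $A^c$.
● Axioms: $0 \le P(A) \le 1$; $P(S) = 1$; for mutually exclusive $A_i$, $P(\cup_i A_i) = \sum_i P(A_i)$.
● Conditional probability: $P(A|B) = P(AB)/P(B)$; total probability: $P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)$.
● Random variables (RVs) $X$; the cumulative distribution function (cdf) $F(x) = P\{X \le x\}$;
For a discrete RV, the probability mass function (pmf) $f(x) = P\{X = x\}$
For a continuous RV, the probability density function (pdf) $f(x) = dF(x)/dx$, so that $P\{a \le X \le b\} = \int_a^b f(x)\,dx$
● Generalizations for more than one variable, e.g.
Two RVs $X$ and $Y$: joint cdf $F(x, y) = P\{X \le x, Y \le y\}$;
pmf $f(x, y) = P\{X = x, Y = y\}$; or
pdf $f(x, y)$, with $F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du$
$X$ and $Y$ are independent iff $f(x, y) = f_X(x)\,f_Y(y)$
● Expected value or mean: for RV $X$, $\mu = E[X]$; for discrete RVs, $E[X] = \sum_x x\,f(x)$
For continuous RVs, $E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx$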
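For the discrete case, the pmf, cdf, expected value, and independence condition listed above can be checked numerically. The sketch below assumes a fair six-sided die as the random variable; `die_pmf`, `cdf`, `expected_value`, and `marginal_x` are names introduced only for this illustration.

```python
from fractions import Fraction

# pmf of a fair six-sided die: f(x) = P{X = x} (an assumed example).
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(pmf, x):
    """F(x) = P{X <= x}, obtained by summing the pmf up to x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

def expected_value(pmf):
    """E[X] = sum over x of x * f(x)."""
    return sum(value * p for value, p in pmf.items())

# Joint pmf of two independent dice: f(x, y) = fX(x) * fY(y).
joint_pmf = {(x, y): die_pmf[x] * die_pmf[y] for x in die_pmf for y in die_pmf}

def marginal_x(joint, x):
    """fX(x), recovered by summing the joint pmf over all y."""
    return sum(p for (a, _), p in joint.items() if a == x)

print(cdf(die_pmf, 3), expected_value(die_pmf), marginal_x(joint_pmf, 2))
# prints: 1/2 7/2 1/6
```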
Probability theory provides the formal techniques and principles for manipulating propositions that are represented probabilistically. The following are the three axioms of probability theory:
● 0 <= P(A=a) <= 1 for all a in sample space of A
● P(True)=1, P(False)=0
● P(A v B) = P(A) + P(B) - P(A ^ B)
The following properties can be derived from these axioms:
● P(~A) = 1 - P(A)
● P(A) = P(A ^ B) + P(A ^ ~B)
● Sum{P(A=a)} = 1, where the sum is over all possible values a in the sample space of A
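These properties can be verified on any concrete joint distribution. The following sketch assumes a made-up joint distribution over two Boolean variables A and B; the numbers are hypothetical and serve only to check the identities.

```python
from fractions import Fraction

# Hypothetical joint distribution over two Boolean variables (A, B).
joint = {
    (True, True): Fraction(2, 10),
    (True, False): Fraction(3, 10),
    (False, True): Fraction(1, 10),
    (False, False): Fraction(4, 10),
}

def prob(predicate):
    """Sum the joint probabilities of the worlds where predicate(a, b) holds."""
    return sum(p for (a, b), p in joint.items() if predicate(a, b))

p_a = prob(lambda a, b: a)
p_b = prob(lambda a, b: b)
p_a_and_b = prob(lambda a, b: a and b)
p_a_or_b = prob(lambda a, b: a or b)
p_not_a = prob(lambda a, b: not a)
p_a_and_not_b = prob(lambda a, b: a and not b)

print(p_a_or_b == p_a + p_b - p_a_and_b)   # P(A v B) = P(A) + P(B) - P(A ^ B)
print(p_not_a == 1 - p_a)                  # P(~A) = 1 - P(A)
print(p_a == p_a_and_b + p_a_and_not_b)    # P(A) = P(A ^ B) + P(A ^ ~B)
```

All three comparisons print True for this distribution, as they would for any distribution that sums to 1.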
Bayes' theorem is a mathematical formula for determining the probability of an event under uncertain knowledge. It is also known as Bayes' rule or Bayes' law, and it is the foundation of Bayesian reasoning.
In probability theory, it relates the conditional and marginal probabilities of two random events.
Bayes' theorem is named after the British mathematician Thomas Bayes. Bayesian inference is an application of Bayes' theorem and is at the heart of Bayesian statistics.
It is a way to calculate the value of P(B|A) when P(A|B) is known.
Bayes' theorem allows us to update the predicted probability of an event as new information from the real world is observed.
Example: If cancer is related to a person's age, then using Bayes' theorem we can estimate the probability of cancer more accurately once the age is known.
Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B.
From the product rule we can write:
$$P(A \land B) = P(A|B)\,P(B)$$
Similarly, for the probability of event B with known event A:
$$P(A \land B) = P(B|A)\,P(A)$$
Equating the right-hand sides of both equations, we get:
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \qquad \text{(a)}$$
Equation (a) is known as Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is the posterior, which we need to calculate: the probability of hypothesis A given that evidence B has occurred.
P(B|A) is the likelihood: the probability of the evidence, assuming that the hypothesis is true.
P(A) is the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is the marginal probability: the probability of the evidence alone.
In general, we can write $P(B) = \sum_i P(A_i)\,P(B|A_i)$ in equation (a), so Bayes' rule can be expressed as:
$$P(A_i|B) = \frac{P(A_i)\,P(B|A_i)}{\sum_{k=1}^{n} P(A_k)\,P(B|A_k)}$$
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
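A short worked example may make the general form of Bayes' rule concrete. The sketch below assumes a hypothetical medical test: the prior of 1%, the 90% detection rate, and the 5% false-positive rate are invented numbers, and `bayes_posteriors` is an illustrative helper name.

```python
from fractions import Fraction

def bayes_posteriors(priors, likelihoods):
    """Return P(Ai | B) for every hypothesis Ai.

    priors[h] is P(Ai) and likelihoods[h] is P(B | Ai). The hypotheses must
    be mutually exclusive and exhaustive, so the priors must sum to 1.
    """
    if sum(priors.values()) != 1:
        raise ValueError("priors must sum to 1")
    # Marginal probability of the evidence: P(B) = sum_i P(Ai) P(B | Ai).
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    if evidence == 0:
        raise ValueError("the evidence has zero probability")
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

# Hypothetical numbers: B is "the test is positive".
priors = {"disease": Fraction(1, 100), "healthy": Fraction(99, 100)}
likelihoods = {"disease": Fraction(9, 10), "healthy": Fraction(1, 20)}

posterior = bayes_posteriors(priors, likelihoods)
print(posterior["disease"])  # prints: 2/13
```

Even with a positive test, the posterior probability of disease is only 2/13 (about 15%), because the prior is small and healthy people produce most of the positive results.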
- Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets, are a data structure for encoding all of the information in the full joint probability distribution for the set of random variables defining a domain. From the Bayesian Net, any value in the full joint probability distribution can be computed.
- It represents all of the direct causal relationships between variables.
- To construct a Bayesian net for a given set of variables, draw arcs from cause variables to their immediate effects.
- It saves space by exploiting the fact that in many real-world problem domains the dependencies between variables are generally local, so there are many conditionally independent variables.
- It captures both the qualitative and the quantitative relationships between variables.
- It can be used to reason in two directions:
- Predictive reasoning (also called causal reasoning) proceeds forward, top-down, from causes to effects.
- Diagnostic reasoning proceeds backward, bottom-up, from effects to causes.
- Formally, a Bayesian Net is a directed acyclic graph (DAG) with a node for each random variable and a directed arc from A to B whenever A has a direct causal influence on B. Thus the nodes represent states of affairs and the arcs represent causal relationships. The occurrence of A provides support for B, and vice versa. The backward influence is called "diagnostic" or "evidential" support for A due to the occurrence of B.
- Each node A in a net is conditionally independent of any subset of nodes that are not descendants of A, given the parents of A.
The generalized form of a Bayesian network that represents and solves decision problems under uncertain knowledge is known as an influence diagram.
Fig 2: Bayesian network graph
- Each node corresponds to a random variable, which can be continuous or discrete.
- Arcs, or directed arrows, represent the causal relationships or conditional probabilities between random variables. Each directed link connects a pair of nodes in the graph.
These links mean that one node directly influences the other; if there is no directed link between two nodes, neither has a direct influence on the other.
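To illustrate how a Bayesian network encodes the full joint distribution, here is a minimal sketch. The three-node network (Rain and Sprinkler as causes of WetGrass) and all of its probabilities are assumptions made for this example; the dictionary layout is one possible representation, not a standard library format.

```python
from itertools import product

# Each node maps to (parents, CPT). A CPT entry gives P(node = True | parents).
# Nodes are listed in topological order (parents before children).
network = {
    "Rain": ((), {(): 0.2}),
    "Sprinkler": ((), {(): 0.1}),
    "WetGrass": (("Rain", "Sprinkler"), {
        (True, True): 0.99, (True, False): 0.9,
        (False, True): 0.8, (False, False): 0.0,
    }),
}

def node_probability(node, value, assignment):
    """P(node = value | values of its parents in the assignment)."""
    parents, cpt = network[node]
    p_true = cpt[tuple(assignment[p] for p in parents)]
    return p_true if value else 1.0 - p_true

def joint_probability(assignment):
    """Chain rule for Bayesian networks: product of P(Xi | parents(Xi))."""
    result = 1.0
    for node in network:
        result *= node_probability(node, assignment[node], assignment)
    return result

example = {"Rain": True, "Sprinkler": False, "WetGrass": True}
print(joint_probability(example))  # 0.2 * 0.9 * 0.9, up to rounding

# Summing over all eight assignments should give 1 (up to rounding).
total = sum(joint_probability(dict(zip(network, values)))
            for values in product([True, False], repeat=len(network)))
print(total)
```

The network stores 6 numbers instead of the 7 independent entries of the full joint table; the saving grows quickly as more locally connected variables are added.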
Key takeaway
Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets, are a data structure for capturing all of the information in the full joint probability distribution for the set of random variables that define a domain.
● In general, Bayes Net inference is an NP-hard problem (exponential in the size of the graph).
● For singly-connected networks, or polytrees, which have no undirected loops, there are linear-time algorithms based on belief propagation.
● Each node sends local evidence messages to its children and parents.
● Each node updates its belief in each of its possible values based on incoming messages from its neighbors, and propagates evidence on to its neighbors.
● Inference for general networks can be approximated using loopy belief propagation, which iteratively refines the probabilities; these often converge to accurate values, although neither convergence nor exactness is guaranteed.
Temporal model
- Monitoring or filtering
- Prediction
Approximate inference
Direct sampling method
The most basic kind of random sampling process for Bayesian networks generates events from a network that has no evidence associated with it. The idea is to sample each variable in turn, in topological order. The probability distribution from which the value is sampled is conditioned on the values already assigned to the variable's parents.
function PRIOR-SAMPLE(bn) returns an event sampled from the prior specified by bn
    inputs: bn, a Bayesian network specifying joint distribution P(X1, ..., Xn)
    x ← an event with n elements
    for each variable Xi in X1, ..., Xn do
        x[i] ← a random sample from P(Xi | parents(Xi))
    return x
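The PRIOR-SAMPLE procedure can be sketched in Python as follows. The network, its probabilities, and the dictionary representation are assumptions made for this example; the seeded `random.Random` instance is used only so that runs are repeatable.

```python
import random

# Hypothetical network: node -> (parents, CPT of P(node = True | parents)).
# Nodes are listed in topological order, which the sampler relies on.
network = {
    "Rain": ((), {(): 0.2}),
    "Sprinkler": ((), {(): 0.1}),
    "WetGrass": (("Rain", "Sprinkler"), {
        (True, True): 0.99, (True, False): 0.9,
        (False, True): 0.8, (False, False): 0.0,
    }),
}

def prior_sample(bn, rng):
    """Sample each variable in topological order, given its sampled parents."""
    event = {}
    for node, (parents, cpt) in bn.items():
        p_true = cpt[tuple(event[p] for p in parents)]
        event[node] = rng.random() < p_true
    return event

rng = random.Random(0)
samples = [prior_sample(network, rng) for _ in range(20000)]

# The fraction of samples with Rain = True should be close to the prior 0.2.
rain_frequency = sum(s["Rain"] for s in samples) / len(samples)
print(rain_frequency)
```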
Likelihood weighting
Rejection sampling generates samples from the prior and discards those that are inconsistent with the evidence. Likelihood weighting avoids this inefficiency by generating only events that are consistent with the evidence. It is a version of the general statistical technique of importance sampling, tailored for inference in Bayesian networks.
function LIKELIHOOD-WEIGHTING(X, e, bn, N) returns an estimate of P(X | e)
    inputs: X, the query variable
        e, observed values for variables E
        bn, a Bayesian network specifying joint distribution P(X1, ..., Xn)
        N, the total number of samples to be generated
    local variables: W, a vector of weighted counts for each value of X, initially zero
    for j = 1 to N do
        x, w ← WEIGHTED-SAMPLE(bn, e)
        W[x] ← W[x] + w, where x is the value of X in x
    return NORMALIZE(W)
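A runnable sketch of likelihood weighting is given below. It assumes the same kind of hypothetical Rain / Sprinkler / WetGrass network as a stand-in for bn, with Boolean variables only; the function names mirror the pseudocode but the data layout is an assumption of this example.

```python
import random

# Hypothetical network: node -> (parents, CPT of P(node = True | parents)).
network = {
    "Rain": ((), {(): 0.2}),
    "Sprinkler": ((), {(): 0.1}),
    "WetGrass": (("Rain", "Sprinkler"), {
        (True, True): 0.99, (True, False): 0.9,
        (False, True): 0.8, (False, False): 0.0,
    }),
}

def weighted_sample(bn, evidence, rng):
    """Fix evidence variables, sample the rest, and weight by the evidence likelihood."""
    event, weight = {}, 1.0
    for node, (parents, cpt) in bn.items():
        p_true = cpt[tuple(event[p] for p in parents)]
        if node in evidence:
            event[node] = evidence[node]
            weight *= p_true if evidence[node] else 1.0 - p_true
        else:
            event[node] = rng.random() < p_true
    return event, weight

def likelihood_weighting(query, evidence, bn, n, rng):
    """Estimate P(query | evidence) from n weighted samples."""
    counts = {True: 0.0, False: 0.0}
    for _ in range(n):
        event, weight = weighted_sample(bn, evidence, rng)
        counts[event[query]] += weight
    total = counts[True] + counts[False]
    if total == 0:
        raise ValueError("all samples had zero weight")
    return {value: w / total for value, w in counts.items()}  # NORMALIZE(W)

estimate = likelihood_weighting("Rain", {"WetGrass": True}, network,
                                20000, random.Random(0))
print(estimate[True])  # should be near the exact value 0.1818 / 0.2458
```

For this network the exact answer is P(Rain | WetGrass) = 0.1818 / 0.2458, roughly 0.74, so the estimate can be compared against it.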
Inference by Markov chain simulation
Markov chain Monte Carlo (MCMC) algorithms work quite differently from rejection sampling and likelihood weighting. Instead of generating each sample from scratch, MCMC algorithms generate each sample by making a random change to the preceding sample. It is helpful to think of an MCMC algorithm as being in a particular current state, which specifies a value for every variable, and generating a next state by making random changes to the current state.
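One common MCMC method is Gibbs sampling, which changes one non-evidence variable at a time by resampling it conditioned on all the others. The sketch below uses Gibbs sampling as an assumed concrete instance of the idea above; the network and its numbers are hypothetical, and no burn-in period is used, which a careful implementation would normally add.

```python
import random

# Hypothetical network: node -> (parents, CPT of P(node = True | parents)).
network = {
    "Rain": ((), {(): 0.2}),
    "Sprinkler": ((), {(): 0.1}),
    "WetGrass": (("Rain", "Sprinkler"), {
        (True, True): 0.99, (True, False): 0.9,
        (False, True): 0.8, (False, False): 0.0,
    }),
}

def joint_probability(bn, event):
    """Product of P(Xi | parents(Xi)) over all nodes."""
    result = 1.0
    for node, (parents, cpt) in bn.items():
        p_true = cpt[tuple(event[p] for p in parents)]
        result *= p_true if event[node] else 1.0 - p_true
    return result

def gibbs_ask(query, evidence, bn, n, rng):
    """Estimate P(query = True | evidence) from n Gibbs sweeps."""
    hidden = [node for node in bn if node not in evidence]
    state = dict(evidence)
    for node in hidden:
        state[node] = rng.random() < 0.5  # arbitrary starting state
    true_count = 0
    for _ in range(n):
        for node in hidden:
            # P(node | everything else) is proportional to the joint
            # probability with node set to True versus False.
            p_true = joint_probability(bn, {**state, node: True})
            p_false = joint_probability(bn, {**state, node: False})
            if p_true + p_false > 0:
                state[node] = rng.random() < p_true / (p_true + p_false)
        true_count += state[query]
    return true_count / n

estimate = gibbs_ask("Rain", {"WetGrass": True}, network, 30000, random.Random(0))
print(estimate)  # should be near the exact value 0.1818 / 0.2458
```

Because consecutive states are correlated, more sweeps are needed than independent samples would require for the same accuracy.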
Key takeaway
- For Bayesian networks, the most basic type of random sampling process generates events from a network that has no evidence associated with it.
- Likelihood weighting is a version of the general statistical technique of importance sampling, tailored for inference in Bayesian networks.
- Markov chain Monte Carlo (MCMC) methods work differently from rejection sampling and likelihood weighting: each sample is generated by making a random change to the preceding one.
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all the intermediate possibilities between the digital values YES and NO.
The conventional logic block that a computer can understand takes precise input and produces a definite output of TRUE or FALSE, which is equivalent to a human's YES or NO.
The inventor of fuzzy logic, Lotfi Zadeh, observed that, unlike computers, human decision making includes a range of possibilities between YES and NO, such as −
● CERTAINLY YES
● POSSIBLY YES
● CANNOT SAY
● POSSIBLY NO
● CERTAINLY NO
Fuzzy logic works on the levels of possibility of the input to achieve a definite output.
Implementation
● It can be implemented in systems of various sizes and capabilities, ranging from small microcontrollers to large, networked, workstation-based control systems.
● It can be implemented in hardware, software, or a combination of both.
Why Fuzzy Logic?
Fuzzy logic is useful for commercial and practical purposes.
● It can control machines and consumer products.
● It may not give accurate reasoning, but it gives acceptable reasoning.
● Fuzzy logic helps to deal with uncertainty in engineering.
Fuzzy Logic Systems Architecture
It has four main parts as shown below −
● Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy sets. It splits the input signal into five steps, such as −
LP | x is Large Positive |
MP | x is Medium Positive |
S | x is Small |
MN | x is Medium Negative |
LN | x is Large Negative |
● Knowledge Base − It stores the IF-THEN rules provided by experts.
● Inference Engine − It simulates the human reasoning process by making fuzzy inferences on the inputs and the IF-THEN rules.
● Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp value.
The membership functions work on fuzzy sets of variables.
Membership Function
Membership functions allow you to quantify a linguistic term and represent a fuzzy set graphically. A membership function for a fuzzy set A on the universe of discourse X is defined as μA:X → [0,1].
Here, each element of X is mapped to a value between 0 and 1. This value is called the membership value or degree of membership. It quantifies the degree of membership of the element in X to the fuzzy set A.
● The x axis represents the universe of discourse.
● The y axis represents the degrees of membership in the [0, 1] interval.
There can be multiple membership functions applicable to fuzzifying a numerical value. Simple membership functions are used, because the use of complex functions does not add more precision to the output.
All membership functions for LP, MP, S, MN, and LN are shown below −
Triangular membership functions are the most common; other shapes include trapezoidal, singleton, and Gaussian.
Here, the input to the 5-level fuzzifier varies from -10 volts to +10 volts. Hence the corresponding output also changes.
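A 5-level fuzzifier with triangular membership functions can be sketched as follows. Since the original figure is not reproduced here, the placement of the triangles (evenly spaced peaks at -10, -5, 0, +5, and +10 volts) is an assumption of this example, as are the names `triangular`, `FUZZY_SETS`, and `fuzzify`.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 at left and right, rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Assumed layout: five evenly spaced triangles over the -10 V to +10 V range.
FUZZY_SETS = {
    "LN": (-15.0, -10.0, -5.0),
    "MN": (-10.0, -5.0, 0.0),
    "S": (-5.0, 0.0, 5.0),
    "MP": (0.0, 5.0, 10.0),
    "LP": (5.0, 10.0, 15.0),
}

def fuzzify(voltage):
    """Map a crisp voltage to its degree of membership in each fuzzy set."""
    return {name: triangular(voltage, *abc) for name, abc in FUZZY_SETS.items()}

print(fuzzify(2.5))
# prints: {'LN': 0.0, 'MN': 0.0, 'S': 0.5, 'MP': 0.5, 'LP': 0.0}
```

An input of 2.5 V is half "Small" and half "Medium Positive", which is the kind of intermediate value that crisp TRUE / FALSE logic cannot express.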
Example of a Fuzzy Logic System
Let us consider an air conditioning system with a 5-level fuzzy logic system. This system adjusts the temperature of the air conditioner by comparing the room temperature with the target temperature value.
Fig 3: Example
Algorithm
● Define linguistic Variables and terms (start)
● Construct membership functions for them. (start)
● Construct knowledge base of rules (start)
● Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
● Evaluate rules in the rule base. (Inference Engine)
● Combine results from each rule. (Inference Engine)
● Convert output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are input and output variables whose values are simple words or sentences. For room temperature, cold, warm, hot, etc., are the linguistic terms.
Temperature (t) = {very-cold, cold, warm, hot, very-hot}
Every member of this set is a linguistic term, and it can cover some portion of the overall range of temperature values.
Step 2 − Construct membership functions for them
The membership functions of temperature variable are as shown −
Step 3 − Construct knowledge base rules
Create a matrix of room temperature values versus target temperature values that an air conditioning system is expected to provide.
RoomTemp. /Target | Very_Cold | Cold | Warm | Hot | Very_Hot |
Very_Cold | No_Change | Heat | Heat | Heat | Heat |
Cold | Cool | No_Change | Heat | Heat | Heat |
Warm | Cool | Cool | No_Change | Heat | Heat |
Hot | Cool | Cool | Cool | No_Change | Heat |
Very_Hot | Cool | Cool | Cool | Cool | No_Change |
Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.
Sr. No. | Condition | Action |
1 | IF temperature=(Cold OR Very_Cold) AND target=Warm THEN | Heat |
2 | IF temperature=(Hot OR Very_Hot) AND target=Warm THEN | Cool |
3 | IF (temperature=Warm) AND (target=Warm) THEN | No_Change |
Step 4 − Obtain fuzzy value
Fuzzy set operations perform the evaluation of the rules. The operations used for OR and AND are Max and Min respectively. All the results of the evaluation are combined to form a final result. This result is a fuzzy value.
Step 5 − Perform defuzzification
Defuzzification is then performed according to the membership function for the output variable.
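The five steps can be put together in a compact sketch of the air conditioner controller. Several details are assumptions made for this example: the membership functions are triangles centred at 10, 16, 22, 28, and 34 °C, the rule matrix is encoded by comparing term positions (heat when the target term is hotter than the room term, cool when it is colder), and defuzzification is a weighted average of the singleton outputs -1 (Cool), 0 (No_Change), and +1 (Heat) rather than a full centroid calculation.

```python
TERMS = ["Very_Cold", "Cold", "Warm", "Hot", "Very_Hot"]
# Assumed peak temperature (°C) of each linguistic term.
CENTERS = {"Very_Cold": 10.0, "Cold": 16.0, "Warm": 22.0, "Hot": 28.0, "Very_Hot": 34.0}
WIDTH = 6.0  # assumed half-width of every triangle
ACTION_VALUE = {"Cool": -1.0, "No_Change": 0.0, "Heat": 1.0}

def fuzzify(temperature):
    """Step 4a: degree of membership of a crisp temperature in each term."""
    t = min(max(temperature, 10.0), 34.0)  # saturate outside the covered range
    return {term: max(0.0, 1.0 - abs(t - CENTERS[term]) / WIDTH) for term in TERMS}

def rule_action(room_term, target_term):
    """Step 3 rule matrix: heat if the target is hotter, cool if it is colder."""
    room, target = TERMS.index(room_term), TERMS.index(target_term)
    if target > room:
        return "Heat"
    if target < room:
        return "Cool"
    return "No_Change"

def control(room_temp, target_temp):
    """Return a crisp command in [-1, 1]: negative cools, positive heats."""
    room, target = fuzzify(room_temp), fuzzify(target_temp)
    strength = {action: 0.0 for action in ACTION_VALUE}
    for room_term, mu_room in room.items():
        for target_term, mu_target in target.items():
            action = rule_action(room_term, target_term)
            fired = min(mu_room, mu_target)                  # AND -> Min
            strength[action] = max(strength[action], fired)  # combine -> Max
    # Step 5: defuzzify as the weighted average of the singleton outputs.
    total = sum(strength.values())
    return sum(strength[a] * ACTION_VALUE[a] for a in strength) / total

print(control(18.0, 22.0))  # positive value: heat
print(control(30.0, 22.0))  # negative value: cool
print(control(22.0, 22.0))  # zero: no change
```

A room at 18 °C is partly "Cold" and partly "Warm", so with a target of 22 °C the Heat rule and the No_Change rule both fire, and the output lies between 0 and 1.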
Application Areas of Fuzzy Logic
The key application areas of fuzzy logic are as follows −
Automotive Systems
● Automatic Gearboxes
● Four-Wheel Steering
● Vehicle environment control
Consumer Electronic Goods
● Hi-Fi Systems
● Photocopiers
● Still and Video Cameras
● Television
Domestic Goods
● Microwave Ovens
● Refrigerators
● Toasters
● Vacuum Cleaners
● Washing Machines
Environment Control
● Air Conditioners/Dryers/Heaters
● Humidifiers
Advantages of FLSs
● The mathematical concepts within fuzzy reasoning are very simple.
● You can modify an FLS by just adding or deleting rules, due to the flexibility of fuzzy logic.
● Fuzzy logic systems can take imprecise, distorted, or noisy input information.
● FLSs are easy to construct and understand.
● Fuzzy logic offers a solution to complex problems in many fields of life, including medicine, because it resembles human reasoning and decision making.
Disadvantages of FLSs
● There is no systematic approach to fuzzy system design.
● They are understandable only when they are simple.
● They are suitable for problems which do not need very high accuracy.
Key takeaway
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all the intermediate possibilities between the digital values YES and NO.
An AI system can be defined as the study of the rational agent and its environment. Agents sense their environment through sensors and act on it through actuators. An AI agent can have mental properties such as knowledge, belief, and intention.
An agent is anything that perceives its environment through sensors and acts on that environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting. An agent can be any of the following:
● Human Agent - A human agent has sensors in the form of eyes, ears, and other organs, and actuators in the form of hands, legs, and the vocal tract.
● Robotic Agent - A robotic agent can have cameras, infrared range finders, and NLP for sensors, and various motors for actuators.
● Software Agent - A software agent can receive sensory input such as keystrokes and file contents, act on those inputs, and show output on the screen.
As a result, the world around us is full of agents, including thermostats, cellphones, cameras, and even ourselves.
Before we proceed, we must first understand sensors, effectors, and actuators.
● Sensor - A sensor is an electronic device that detects changes in the environment and transmits the data to other electronic devices. Sensors allow an agent to observe its surroundings.
● Actuators - Actuators are machine components that convert energy to motion. The actuators' sole function is to move and control a system. An actuator can be anything from an electric motor to gears to rails.
● Effectors - Effectors are devices that have an impact on the environment. Legs, wheels, arms, fingers, wings, fins, and a display screen are examples of effectors.
An intelligent agent is an autonomous entity that uses sensors and actuators to interact with its environment in order to achieve its goals. An intelligent agent may learn from its environment to achieve those goals. A thermostat is an example of an intelligent agent.
The four main rules for an AI agent are as follows:
● Rule 1: An AI agent must be able to perceive its surroundings.
● Rule 2: Decisions must be made based on the observation.
● Rule 3: A decision must be followed by action.
● Rule 4: An AI agent's actions must be rational.
A rational agent is one that has clear preferences, models uncertainty, and acts so as to maximize its performance measure over all possible actions.
A rational agent is said to do the right thing. AI is concerned with creating rational agents for use in game theory and decision theory in a variety of real-world scenarios.
Rational action matters most for an AI agent because, in reinforcement learning, an agent receives a positive reward for each best possible action and a negative reward for each wrong action.
Structure
The task of AI is to design an agent program that implements the agent function. The structure of an intelligent agent is a combination of the architecture and the agent program. It can be summed up as follows:
Agent = Architecture + Agent program
The following are the three most important terms in the structure of an AI agent:
● Architecture - The architecture is the machinery on which an AI agent runs.
● Agent Function - The agent function maps a percept sequence to an action.
f:P* → A
● Agent program - An agent program is an implementation of the agent function. The agent program runs on the physical architecture to produce the function f.
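The distinction between the agent function and the agent program can be illustrated with a table-driven agent. The sketch assumes a hypothetical two-square vacuum world in which a percept is a (location, status) pair; the table entries and the class name are invented for this example.

```python
# The table plays the role of the agent function f: P* -> A, mapping
# percept sequences (tuples of percepts) to actions.
TABLE = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("B", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

class TableDrivenAgent:
    """Agent program: stores the percept sequence and looks up the action."""

    def __init__(self, table):
        self.table = table
        self.percepts = []

    def __call__(self, percept):
        self.percepts.append(percept)
        # Fall back to doing nothing for sequences missing from the table.
        return self.table.get(tuple(self.percepts), "NoOp")

agent = TableDrivenAgent(TABLE)
actions = [agent(("A", "Dirty")), agent(("A", "Clean"))]
print(actions)  # prints: ['Suck', 'Right']
```

The table grows with every possible percept sequence, which is why practical agent programs compute the action instead of looking it up.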
Simple reflex agents act solely on the basis of the current percept and ignore the rest of the percept history. The percept history is the history of everything the agent has perceived up to this point. The agent function is based on condition-action rules.
A condition-action rule maps a state (the condition) to an action. If the condition is true, the action is taken; otherwise it is not. This agent function succeeds only when the environment is fully observable. For simple reflex agents operating in partially observable environments, infinite loops are often unavoidable. The agent may be able to escape from an infinite loop if it can randomize its actions.
Simple reflex agents have the following drawbacks:
● Intelligence is very restricted.
● There is no knowledge of the non-perceptual parts of the state.
● The set of rules is usually too large to generate and store.
● If the environment changes, the collection of rules must be updated.
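A simple reflex agent can be sketched as an ordered list of condition-action rules applied to the current percept only. The two-square vacuum world and its rules are assumptions made for this example.

```python
# Condition-action rules for a hypothetical two-square vacuum world.
# Each rule is (condition on the current percept, action); first match wins.
RULES = [
    (lambda location, status: status == "Dirty", "Suck"),
    (lambda location, status: location == "A", "Right"),
    (lambda location, status: location == "B", "Left"),
]

def simple_reflex_agent(percept):
    """Choose an action from the current percept alone (no percept history)."""
    location, status = percept
    for condition, action in RULES:
        if condition(location, status):
            return action
    return "NoOp"

print(simple_reflex_agent(("A", "Dirty")))  # prints: Suck
print(simple_reflex_agent(("A", "Clean")))  # prints: Right
print(simple_reflex_agent(("B", "Clean")))  # prints: Left
```

Because the agent has no memory, it keeps moving between clean squares forever, which illustrates the limited intelligence noted in the list above.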
Fig 4: Reflex agent
Key takeaway
Simple reflex agents behave solely on the basis of the present percept, disregarding the remainder of the percept history. The history of all that an agent has perceived up to this point is known as percept history.
A model-based agent works by finding a rule whose condition matches the current situation. It can handle partially observable environments by using a model of the world. The agent must keep track of its internal state, which is adjusted by each percept and depends on the percept history. The current state is stored inside the agent, which maintains some kind of structure describing the part of the world that cannot be seen.
Updating the state necessitates knowledge of:
● how the world evolves without the intervention of the agent, and
● what effect the agent's activities have on the world
There are two crucial components in a model-based agent:
● Model: Knowledge about "how things happen in the world"; this is why the agent is called model-based.
● Internal State: A representation of the current state based on the percept history.
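The role of the model and the internal state can be shown with a small extension of the vacuum-world idea. Everything in this sketch is an assumption for illustration: the two squares A and B, the rule that sucking leaves a square clean, and the class name.

```python
class ModelBasedVacuumAgent:
    """Keeps an internal state so it can stop once both squares are clean."""

    def __init__(self):
        # Internal state: last known status of each square (None = unknown).
        self.state = {"A": None, "B": None}

    def __call__(self, percept):
        location, status = percept
        self.state[location] = status        # update the state from the percept
        if status == "Dirty":
            # Model of the agent's own action: sucking leaves the square clean.
            self.state[location] = "Clean"
            return "Suck"
        if all(s == "Clean" for s in self.state.values()):
            return "NoOp"                    # nothing left to do
        return "Right" if location == "A" else "Left"

agent = ModelBasedVacuumAgent()
actions = [agent(("A", "Dirty")), agent(("A", "Clean")), agent(("B", "Clean"))]
print(actions)  # prints: ['Suck', 'Right', 'NoOp']
```

Unlike a simple reflex agent, this agent remembers that square A is already clean while it is standing on B, so it can decide to stop.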
Fig 5: Model based agent
Key takeaway
A model-based agent can handle partially observable environments by using a model of the world. The agent must keep track of its internal state, which is adjusted by each percept and depends on the percept history.
Goal-based agents make decisions based on how far they currently are from their goal (a description of desirable situations). Every action they take is intended to reduce the distance to the goal. This allows the agent to choose among multiple possibilities, selecting the one that reaches a goal state.
The knowledge that supports their decisions is represented explicitly and can be modified, which makes these agents more flexible. They usually require search and planning. The behaviour of a goal-based agent can easily be changed.
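The search and planning step of a goal-based agent can be sketched with breadth-first search. The road map below is a hypothetical example, and breadth-first search is only one of many search methods such an agent could use.

```python
from collections import deque

# Hypothetical road map: each place lists the places reachable in one step.
ROADS = {
    "Home": ["Market", "School"],
    "Market": ["Office"],
    "School": ["Park"],
    "Park": ["Office"],
    "Office": [],
}

def plan_route(start, goal, roads):
    """Breadth-first search: return a shortest path as a list, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for neighbour in roads[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append(path + [neighbour])
    return None

def goal_based_agent(location, goal, roads):
    """Return the next move toward the goal, or None if there is no move."""
    path = plan_route(location, goal, roads)
    if path is None or len(path) < 2:
        return None
    return path[1]

print(plan_route("Home", "Office", ROADS))        # prints: ['Home', 'Market', 'Office']
print(goal_based_agent("Home", "Office", ROADS))  # prints: Market
```

Changing the goal argument changes the behaviour without rewriting any rules, which is the flexibility described above.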
Fig 6: Goal based agent
Utility-based agents are built around a measure of how useful each outcome is. When there are several possible alternatives, utility-based agents are used to decide which one is best. They choose actions based on a preference (utility) for each state. Sometimes achieving the desired goal is not enough: to reach a destination, we might look for a quicker, safer, or cheaper trip.
Agent happiness should therefore be taken into account. Utility describes how "happy" the agent is. Because of the uncertainty in the world, a utility-based agent chooses the action that maximizes the expected utility. A utility function maps a state to a real number that describes the associated degree of happiness.
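Choosing the action with the highest expected utility can be sketched as follows. The routes, outcome probabilities, and utility values are hypothetical numbers invented for this example.

```python
# Hypothetical actions: each maps to a list of (probability, utility) outcomes.
ROUTES = {
    "highway": [(0.8, 10.0), (0.2, -20.0)],   # fast, but risk of a traffic jam
    "city_streets": [(1.0, 3.0)],             # slow but predictable
    "toll_road": [(0.9, 8.0), (0.1, -5.0)],   # fairly fast, small risk
}

def expected_utility(outcomes):
    """Sum of probability * utility over the possible outcomes of an action."""
    return sum(p * u for p, u in outcomes)

def utility_based_agent(actions):
    """Return the action whose expected utility is highest."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

best = utility_based_agent(ROUTES)
print(best)  # prints: toll_road
```

With these numbers the expected utilities are 4.0 for the highway, 3.0 for city streets, and 6.7 for the toll road, so the agent picks the toll road even though the highway has the best single outcome.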
Fig 7: Utility based agent
Key takeaway
Utility-based agents are built around a measure of how useful each outcome is. When there are several possible alternatives, utility-based agents are used to decide which one is best.
Some programmes work in a completely artificial world, with just keyboard input, databases, computer file systems, and character output on a screen.
Some software agents (software robots or softbots) exist in rich, limitless softbots domains, on the other hand. The environment in the simulator is extremely detailed and complicated. In real time, the software agent must choose from a large number of options. A softbot that scans a customer's internet preferences and displays intriguing goods to them operates in both a real and an artificial environment.
The Turing Test environment is the most well-known artificial setting, in which one real and other artificial agents are tested on an equal footing.
Turing Test
The Turing Test can be used to assess the success of a system's intelligent behaviour.
The test involves two people and a machine that will be evaluated. One of the two people takes on the job of tester. They are all in different rooms. The tester has no idea who is a human and who is a machine. He types the queries and sends them to both intelligences, who respond to him with typed responses.
The goal of this test is to deceive the tester. The machine is deemed to be intelligent if the tester is unable to distinguish the machine's response from the human response.
Properties of Environment
The environment has a variety of characteristics.
● Discrete / Continuous - If the environment has a limited number of distinct, clearly defined states (for example, chess), it is discrete; otherwise, it is continuous (for example, driving).
● Observable / Partially Observable - If the whole state of the environment can be determined from the percepts at each point in time, it is observable; otherwise, it is only partially observable.
● Static / Dynamic - The environment is static if it does not change while an agent is acting; otherwise, it is dynamic.
● Single agent / Multiple agents - The environment may contain only the agent itself, or it may also contain other agents of the same or a different kind.
● Accessible / Inaccessible - If an agent's sensory apparatus can access the entire state of the environment, the environment is accessible to that agent.
● Deterministic / Non-deterministic - If the current state of the environment and the actions of the agent completely determine the next state of the environment, the environment is deterministic; otherwise, it is non-deterministic.
● Episodic / Non-episodic - In an episodic environment, each episode consists of the agent perceiving and then acting. The quality of its action depends only on the episode itself; subsequent episodes do not depend on the actions taken in previous episodes.
Unit - 4
Uncertainty
Previously, we represented knowledge using first-order logic and propositional logic with certainty, which means we were sure about the predicates. With this knowledge representation we could write A→B, meaning that if A is true, then B is true. However, if we are not sure whether A is true, we cannot make this statement; this situation is known as uncertainty.
So we require uncertain reasoning or probabilistic reasoning to describe uncertain information, when we aren't confident about the predicates.
Causes of uncertainty:
In the real world, the following are some of the most common sources of uncertainty.
● Information obtained from unreliable sources.
● Experimental Errors
● Equipment fault
● Temperature variation
● Climate change.
Probabilistic reasoning is a knowledge representation method that uses the concept of probability to indicate uncertainty in knowledge. We employ probabilistic reasoning, which integrates probability theory and logic, to deal with uncertainty.
Probability is employed in probabilistic reasoning because it helps us to deal with uncertainty caused by someone's laziness or ignorance.
There are many situations in the real world when nothing is certain, such as "it will rain today," "behavior of someone in specific surroundings," and "a competition between two teams or two players." These are probable sentences for which we can presume that something will happen but are unsure, thus we utilize probabilistic reasoning in this case.
Probability: Probability is a measure of the likelihood that an uncertain event will occur. It is a numerical value that always lies between 0 and 1, where the two extremes represent complete certainty.
0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
P(A) = 0 indicates that event A is certain not to occur.
P(A) = 1 indicates that event A is certain to occur.
When all outcomes are equally likely, we can compute the probability of an uncertain event using the formula below.
P(A) = (number of outcomes in which A occurs) / (total number of outcomes)
P(¬A) = probability of event A not occurring.
P(¬A) + P(A) = 1.
Event: Each possible outcome of a variable is called an event.
Sample space: The collection of all possible events is called the sample space.
Random variables: Random variables are used to represent real-world events and objects.
Prior probability: The prior probability of an event is the probability computed before new information is observed.
Posterior probability: The probability computed after all evidence or information has been taken into account. It combines the prior probability with the new information.
Conditional probability: The probability of an event occurring given that another event has already occurred.
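The definitions above can be made concrete with a small sketch. It assumes a fair six-sided die as the sample space (an example chosen here for illustration, not taken from the text) and uses exact fractions so that the results are easy to check by hand.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}


def prob(event):
    """P(event) = favourable outcomes / total outcomes (equally likely outcomes)."""
    return Fraction(len(event & sample_space), len(sample_space))


def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B); only defined when P(B) > 0."""
    return prob(a & b) / prob(b)


even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

p_even = prob(even)                                  # prior probability of "even"
p_not_even = 1 - p_even                              # complement rule
p_even_given_gt3 = cond_prob(even, greater_than_3)   # updated after observing "> 3"

print(p_even, p_not_even, p_even_given_gt3)
```

Learning that the roll is greater than 3 raises the probability of an even result from 1/2 to 2/3, which is the difference between a prior and a conditional probability.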
Key takeaway
- Probabilistic reasoning is a method of knowledge representation in which the concept of probability is used to convey the uncertainty in knowledge.
- Probability is used in probabilistic reasoning because it allows us to deal with the uncertainty that arises from someone's laziness or ignorance.
It is critical to first understand what rational choice is, and to do so, one must first understand the meanings of the terms rational and choice (Green and Shapiro, 1994; Friedman, 1996). The Google dictionary defines rational as "based on or in accordance with reason or logic," while choice is described as "the act of selecting between two or more options."
The process of making rational judgments based on relevant information in a logical, timely, and optimum manner is known as rational choice.
Let's say a man named Thendo wants to figure out how much coffee he should drink today, so he phones his sister Denga to find out what color shoes she's wearing and uses that knowledge to figure out how much coffee he should drink that day. This will be an irrational decision because Thendo is deciding how much coffee he will drink based on irrelevant information (the color of shoes his sister Denga is wearing).
On the other hand, if Thendo determines that every time he takes a sip of that coffee, he must walk 1 km, we will infer that he is acting irrationally because he is wasting energy by walking 1 km to take a sip of coffee. This waste of energy is irrational, and in mathematical terms it is referred to as an un-optimized solution. If Thendo decides to pour the coffee from one glass to another every time he takes a sip, for no other reason than to complete the chore of drinking coffee, he is taking an irrational course of action because it is not only illogical but also inefficient.
Fig 1 shows an illustration of a rational decision-making framework. This diagram depicts how one studies the surroundings, or more technically, a decision-making space, when engaging in a rational decision-making process. The next step is to find the information that is relevant to the decision. This information is then passed to a logical and consistent decision engine, which examines all options and their associated utilities before selecting the decision with the highest utility. Any flaw in this framework, such as incomplete knowledge, imprecise or defective information, or the inability to analyze all options, limits the theory of rational choice, and it becomes bounded rational choice, as described by Herbert Simon (1991).
Classical economics is predicated on the notion that economic agents make decisions based on the principle of rational choice. Because this agent is a human being, many researchers have studied how humans actually make decisions, a field now known as behavioral economics. In his book Thinking, Fast and Slow, Kahneman delves deeply into behavioral economics (Kahneman, 2011). Artificially intelligent machines are increasingly making decisions today. The assumption of rationality is stronger when decisions are made by artificially intelligent machines than when decisions are made by humans.
In this aspect, machine-made decisions are not as irrational as those produced by humans; but, they are not totally rational, and so are still susceptible to Herbert Simon's bounded rationality theory.
Fig 1: Steps for rational decision making
What are the characteristics of an isolated economic system in which all decision-making agents are human beings, for example? Behavioral economics will describe that economy. If, on the other hand, all of the decision-making agents are artificially intelligent machines that make rational-choice decisions, neoclassical economics will most likely apply. If half of these decision-making agents are people and the other half are machines, a half-neoclassical, half-behavioral economic system will emerge.
● Sample space S with events Ai and probabilities P(Ai); union A ∪ B, intersection AB, and complement Ac.
● Axioms: 0 ≤ P(A) ≤ 1; P(S) = 1; for mutually exclusive Ai, P(∪i Ai) = Σi P(Ai).
● Conditional probability: P(A|B) = P(AB)/P(B); total probability: P(A) = P(A|B)P(B) + P(A|Bc)P(Bc).
● Random variables (RVs) X; the cumulative distribution function (cdf) F(x) = P{X ≤ x};
For a discrete RV, probability mass function (pmf) f(x) = P{X = x};
For a continuous RV, probability density function (pdf) f(x), with F(x) = ∫ f(t) dt taken over t ≤ x.
● Generalizations for more than one variable, e.g.
Two RVs X and Y: joint cdf F(x, y) = P{X ≤ x, Y ≤ y};
pmf f(x, y) = P{X = x, Y = y}; or
pdf f(x, y), with F(x, y) = ∫∫ f(u, v) du dv taken over u ≤ x and v ≤ y.
X and Y are independent iff f(x, y) = fX(x) fY(y).
● Expected value or mean: for RV X, µ = E[X]; for discrete RVs, E[X] = Σx x f(x);
for continuous RVs, E[X] = ∫ x f(x) dx.
Probability theory provides the formal techniques and principles for manipulating probabilistically represented propositions. The following are the three axioms of probability theory:
● 0 <= P(A=a) <= 1 for all a in sample space of A
● P(True)=1, P(False)=0
● P(A v B) = P(A) + P(B) - P(A ^ B)
The following properties can be deduced from these axioms:
● P(~A) = 1 - P(A)
● P(A) = P(A ^ B) + P(A ^ ~B)
● Sum{P(A=a)} = 1, where the sum is over all possible values a in the sample space of A
Bayes' theorem is a mathematical formula for calculating the probability of an event under uncertain knowledge. It is also known as Bayes' rule, Bayes' law, or Bayesian reasoning.
In probability theory, it relates the conditional and marginal probabilities of two random events.
Bayes' theorem is named after the British mathematician Thomas Bayes. Bayesian inference is an application of Bayes' theorem and is at the heart of Bayesian statistics.
It is a method for calculating the value of P(B|A) from knowledge of P(A|B).
Bayes' theorem can update the probability forecast of an occurrence based on new information from the real world.
Example: If cancer is linked to one's age, we can apply Bayes' theorem to more precisely forecast cancer risk based on age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A given a known event B.
From the product rule, we can write:
P(A ⋀ B) = P(A|B) P(B)
In the same way, using the probability of event B given a known event A:
P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B) ... (a)
The above equation (a) is known as Bayes' rule or Bayes' theorem. This equation is the starting point for most current AI systems for probabilistic inference.
It displays the simple relationship between joint and conditional probabilities. Here,
P(A|B) is the posterior, which we must compute: the probability of hypothesis A given that evidence B has been observed.
P(B|A) is the likelihood: the probability of the evidence, assuming that the hypothesis is true.
P(A) is the prior probability: the probability of the hypothesis before the evidence is taken into account.
P(B) is the marginal probability: the probability of the evidence on its own.
In general, we can write P(B) = Σi P(Ai) P(B|Ai) in equation (a), so Bayes' rule can be expressed as:
P(Ai|B) = P(Ai) P(B|Ai) / Σk P(Ak) P(B|Ak)
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
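As a worked illustration of the general form of Bayes' rule, the sketch below computes the posterior over a set of mutually exclusive and exhaustive hypotheses. The diagnostic-test numbers (1% prior, 90% true-positive rate, 5% false-positive rate) are made up for the example and are not from the text.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: P(Ai|B) = P(Ai) P(B|Ai) / sum_k P(Ak) P(B|Ak).

    priors:      dict mapping each hypothesis Ai to P(Ai)
    likelihoods: dict mapping each hypothesis Ai to P(B|Ai)
    """
    evidence = sum(priors[h] * likelihoods[h] for h in priors)  # P(B)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


# Hypothetical numbers: B is "the test is positive".
priors = {"disease": 0.01, "healthy": 0.99}
likelihoods = {"disease": 0.90, "healthy": 0.05}

post = posterior(priors, likelihoods)
print(post)
```

Even with a positive test, the posterior probability of disease stays at roughly 15% in this example, because the prior is so small. This is the kind of belief update from prior to posterior that the rule describes.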
- Bayesian networks, also known as Bayes nets, belief nets, causal nets, and probability nets, are a data structure for capturing all of the information in the full joint probability distribution of the set of random variables that define a domain. Every value in that full joint distribution can be computed from the Bayesian net.
- A Bayesian net represents all of the direct causal relationships between variables.
- To construct a Bayesian net for a given set of variables, draw arcs from cause variables to their immediate effects.
- It saves space by exploiting the fact that, in many real-world problem domains, dependencies between variables are generally local, so a large number of variables are conditionally independent.
- It captures the relationship between variables in both qualitative and quantitative terms.
- It can be used for reasoning:
- Predictive reasoning (sometimes referred to as causal reasoning) proceeds top-down, from causes to effects.
- Diagnostic reasoning proceeds bottom-up, from effects to causes.
- A Bayesian net is a directed acyclic graph (DAG) in which each random variable has its own node and there is a directed arc from A to B whenever A has a direct causal influence on B. The nodes thus represent states of affairs, whereas the arcs represent causal relationships. When A is present, B is more likely to occur; conversely, observing B makes A more likely. This backward influence is referred to as "diagnostic" or "evidential" support for A due to the presence of B.
- Each node A in a network is conditionally independent of any subset of nodes that are not descendants of A, given its parents.
The generalization of a Bayesian network that represents and solves decision problems under uncertain knowledge is known as an influence diagram.
Fig 2: Bayesian network graph
- Each node represents a random variable, which can be continuous or discrete.
- Arcs, or directed arrows, represent the causal relationships or conditional probabilities between random variables. These directed links connect pairs of nodes in the graph.
A link indicates that one node directly influences the other; if there is no directed link between two nodes, neither has a direct influence on the other.
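To show how a Bayesian network encodes the full joint distribution, here is a minimal sketch for a hypothetical three-node network (Rain → WetGrass ← Sprinkler). The variable names and all probabilities are assumptions made for the example. Each joint entry is the product of every node's probability given its parents, and a diagnostic query is answered by summing joint entries.

```python
from itertools import product

# Hypothetical network: Rain and Sprinkler are root causes of WetGrass.
P_RAIN = 0.2
P_SPRINKLER = 0.1
# P(WetGrass = True | Rain, Sprinkler)
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.80, (False, False): 0.00}


def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) as a product of each node given its parents."""
    p = P_RAIN if rain else 1 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)


# The eight joint entries must sum to 1.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))

# Diagnostic reasoning (effect to cause): P(Rain | WetGrass = True).
p_wet = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))
p_rain_and_wet = sum(joint(True, s, True) for s in [True, False])
p_rain_given_wet = p_rain_and_wet / p_wet

print(total, p_rain_given_wet)
```

The network stores only 1 + 1 + 4 = 6 numbers rather than the 8 entries of the full joint table. In larger networks with local dependencies the saving grows quickly.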
Key takeaway
Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets, are a data structure for capturing all of the information in the full joint probability distribution for the collection of random variables that define a domain.
● In general, Bayes Net inference is an NP-hard problem (exponential in the size of the graph).
● There are linear time techniques based on belief propagation for singly-connected networks or polytrees with no undirected loops.
● Local evidence messages are sent by each node to their children and parents.
● Based on incoming messages from its neighbors, each node updates its belief in each of its possible values and propagates evidence to its neighbors.
● Inference for general networks can be approximated using loopy belief propagation, which iteratively refines the probabilities; these often converge to accurate values, but neither convergence nor exactness is guaranteed.
Temporal model
- Monitoring or filtering
- Prediction
Approximate inference
Direct sampling method
The most basic sort of random sampling process for Bayesian networks generates events from a network with no evidence associated with them. The idea is that each variable is sampled in topological order. The value is drawn from a probability distribution based on the values that have previously been assigned to the variable's parents.
Function PRIOR-SAMPLE(bn) returns an event sampled from the prior specified by bn
Inputs: bn, a Bayesian network specifying joint distribution P(X1, ..., Xn)
x ← an event with n elements
For each variable Xi in X1, ..., Xn do
x[i] ← a random sample from P(Xi | parents(Xi))
Return x
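The direct sampling procedure can be sketched in Python as follows. The three-variable network (Cloudy → Rain → WetGrass), its probabilities, and the list-of-tuples representation are all assumptions made for this illustration; only Boolean variables are handled.

```python
import random

# Hypothetical network, listed in topological order: (variable, parents, CPT).
# Each CPT maps a tuple of parent values to P(variable = True | parents).
NETWORK = [
    ("Cloudy", (), {(): 0.5}),
    ("Rain", ("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    ("WetGrass", ("Rain",), {(True,): 0.9, (False,): 0.1}),
]


def prior_sample(network, rng):
    """Sample every variable in topological order, conditioned on its parents."""
    event = {}
    for var, parents, cpt in network:
        p_true = cpt[tuple(event[p] for p in parents)]
        event[var] = rng.random() < p_true
    return event


rng = random.Random(0)  # fixed seed so that runs are repeatable
samples = [prior_sample(NETWORK, rng) for _ in range(10_000)]
rain_freq = sum(s["Rain"] for s in samples) / len(samples)
print(rain_freq)
```

With enough samples, the fraction of events in which Rain is true approaches the exact prior P(Rain) = 0.5 × 0.8 + 0.5 × 0.2 = 0.5.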
Likelihood weighting
Likelihood weighting eliminates the inefficiencies of rejection sampling by creating only events that are consistent with the evidence. It's a tailored version of the general statistical approach of importance sampling for Bayesian network inference.
Function LIKELIHOOD-WEIGHTING(X, e, bn, N) returns an estimate of P(X | e)
Inputs: X, the query variable
e, observed values for variables E
bn, a Bayesian network specifying joint distribution P(X1, ..., Xn)
N, the total number of samples to be generated
Local variables: W, a vector of weighted counts for each value of X, initially zero
For j = 1 to N do
x, w ← WEIGHTED-SAMPLE(bn, e)
W[x] ← W[x] + w, where x is the value of X in x
Return NORMALIZE(W)
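A minimal Python sketch of likelihood weighting is given below. It assumes a hypothetical Boolean network (Cloudy → Rain → WetGrass) with made-up probabilities. Evidence variables are fixed rather than sampled, and each sample is weighted by the probability of the evidence given its parents.

```python
import random

# Hypothetical network in topological order: (variable, parents, CPT),
# where each CPT maps parent values to P(variable = True | parents).
NETWORK = [
    ("Cloudy", (), {(): 0.5}),
    ("Rain", ("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    ("WetGrass", ("Rain",), {(True,): 0.9, (False,): 0.1}),
]


def weighted_sample(network, evidence, rng):
    """Return one event consistent with the evidence, plus its weight."""
    event, weight = {}, 1.0
    for var, parents, cpt in network:
        p_true = cpt[tuple(event[p] for p in parents)]
        if var in evidence:
            event[var] = evidence[var]          # fix the evidence value
            weight *= p_true if evidence[var] else 1.0 - p_true
        else:
            event[var] = rng.random() < p_true  # sample as in prior sampling
    return event, weight


def likelihood_weighting(query, evidence, network, n, rng):
    """Estimate P(query | evidence) from n weighted samples."""
    totals = {True: 0.0, False: 0.0}
    for _ in range(n):
        event, weight = weighted_sample(network, evidence, rng)
        totals[event[query]] += weight
    norm = sum(totals.values())
    return {value: w / norm for value, w in totals.items()}


estimate = likelihood_weighting("Rain", {"WetGrass": True}, NETWORK,
                                20_000, random.Random(0))
print(estimate)
```

For this network the exact answer is P(Rain | WetGrass = true) = 0.45 / (0.45 + 0.05) = 0.9, so the estimate should be close to 0.9. No sample is thrown away, which is the advantage over rejection sampling.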
Inference by Markov chain simulation
Markov chain Monte Carlo (MCMC) methods work differently from rejection sampling and likelihood weighting. Rather than generating each sample from scratch, MCMC methods create each sample by making a random change to the preceding sample. It is helpful to think of an MCMC algorithm as being in a particular current state, in which every variable has a value, and generating the next state by making random changes to the current state.
Key takeaway
- For Bayesian networks, the most basic type of random sampling procedure generates events from a network with no evidence associated with it.
- Likelihood weighting is a variant of the general statistical technique of importance sampling, tailored to Bayesian network inference.
- Markov chain Monte Carlo (MCMC) methods work differently from rejection sampling and likelihood weighting: each sample is a random change to the preceding one.
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all the intermediate possibilities between the digital values YES and NO.
The conventional logic block that a computer understands takes precise input and produces a definite output, TRUE or FALSE, which is equivalent to a human's YES or NO.
The inventor of fuzzy logic, Lotfi Zadeh, observed that, unlike computers, human decision making includes a range of possibilities between YES and NO, for example −
● CERTAINLY YES
● POSSIBLY YES
● CANNOT SAY
● POSSIBLY NO
● CERTAINLY NO
Fuzzy logic works on the levels of possibility of the input to achieve a definite output.
Implementation
● It can be implemented in systems of various sizes and capabilities, ranging from small microcontrollers to large, networked, workstation-based control systems.
● It can be implemented in hardware, software, or a combination of both.
Why Fuzzy Logic?
Fuzzy logic is useful for commercial and practical purposes.
● It can control machines and consumer products.
● It may not give accurate reasoning, but it gives acceptable reasoning.
● Fuzzy logic helps to deal with uncertainty in engineering.
Fuzzy Logic Systems Architecture
It has four main parts as shown below −
● Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy sets. It splits the input signal into five levels, for example −
LP | x is Large Positive |
MP | x is Medium Positive |
S | x is Small |
MN | x is Medium Negative |
LN | x is Large Negative |
● Knowledge Base − It stores the IF-THEN rules provided by experts.
● Inference Engine − It simulates the human reasoning process by making fuzzy inferences on the inputs and the IF-THEN rules.
● Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp value.
The membership functions work on fuzzy sets of variables.
Membership Function
Membership functions allow you to quantify a linguistic term and represent a fuzzy set graphically. A membership function for a fuzzy set A on the universe of discourse X is defined as μA: X → [0, 1].
Here, each element of X is mapped to a value between 0 and 1, called the membership value or degree of membership. It quantifies the degree to which the element of X belongs to the fuzzy set A.
● The x axis represents the universe of discourse.
● The y axis represents the degrees of membership in the [0, 1] interval.
Multiple membership functions can be used to fuzzify a numerical value. Simple membership functions are used because more complex functions do not add precision to the output.
The membership functions for LP, MP, S, MN, and LN are shown below −
Triangular membership function shapes are the most common; other shapes include trapezoidal, singleton, and Gaussian.
Here, the input to the 5-level fuzzifier varies from -10 volts to +10 volts, and the corresponding output changes accordingly.
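A 5-level fuzzifier with triangular membership functions can be sketched as below. The breakpoints (peaks at -10, -5, 0, 5 and 10 volts) are an assumption made for this illustration, since the figure with the actual shapes is not reproduced in the text.

```python
def triangular(x, left, peak, right):
    """Degree of membership in [0, 1] for a triangle with the given corners."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed breakpoints (left, peak, right) in volts for the five levels.
LEVELS = {
    "LN": (-15.0, -10.0, -5.0),
    "MN": (-10.0, -5.0, 0.0),
    "S": (-5.0, 0.0, 5.0),
    "MP": (0.0, 5.0, 10.0),
    "LP": (5.0, 10.0, 15.0),
}


def fuzzify(x):
    """Map a crisp input to its degree of membership in each fuzzy set."""
    return {name: triangular(x, *corners) for name, corners in LEVELS.items()}


print(fuzzify(2.5))    # partly Small, partly Medium Positive
print(fuzzify(-10.0))  # fully Large Negative
```

An input of 2.5 volts belongs to S and MP with degree 0.5 each and to the other sets with degree 0, which is how a single crisp value comes to be described by several linguistic terms at once.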
Example of a Fuzzy Logic System
Let us consider an air conditioning system with a 5-level fuzzy logic system. This system adjusts the temperature of the air conditioner by comparing the room temperature with the target temperature value.
Fig 3: Example
Algorithm
● Define linguistic Variables and terms (start)
● Construct membership functions for them. (start)
● Construct knowledge base of rules (start)
● Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
● Evaluate rules in the rule base. (Inference Engine)
● Combine results from each rule. (Inference Engine)
● Convert output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are input and output variables in the form of simple words or sentences. For room temperature, cold, warm, hot, etc., are linguistic terms.
Temperature (t) = {very-cold, cold, warm, hot, very-hot}
Every member of this set is a linguistic term, and it covers some portion of the overall range of temperature values.
Step 2 − Construct membership functions for them
The membership functions of the temperature variable are as shown −
Step 3 − Construct knowledge base rules
Create a matrix of room temperature values versus target temperature values that an air conditioning system is expected to provide.
RoomTemp. /Target | Very_Cold | Cold | Warm | Hot | Very_Hot |
Very_Cold | No_Change | Heat | Heat | Heat | Heat |
Cold | Cool | No_Change | Heat | Heat | Heat |
Warm | Cool | Cool | No_Change | Heat | Heat |
Hot | Cool | Cool | Cool | No_Change | Heat |
Very_Hot | Cool | Cool | Cool | Cool | No_Change |
Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.
Sr. No. | Condition | Action |
1 | IF temperature=(Cold OR Very_Cold) AND target=Warm THEN | Heat |
2 | IF temperature=(Hot OR Very_Hot) AND target=Warm THEN | Cool |
3 | IF (temperature=Warm) AND (target=Warm) THEN | No_Change |
Step 4 − Obtain fuzzy value
Fuzzy set operations are used to evaluate the rules. The operations used for OR and AND are Max and Min respectively. Combine all of the results of the evaluation to form a final result. This result is a fuzzy value.
Step 5 − Perform defuzzification
Defuzzification is then performed according to the membership function of the output variable.
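Steps 4 and 5 can be sketched for the three rules in the table above. The rule evaluation uses Max for OR and Min for AND as described. The defuzzification step is a simplification assumed here: each action is given a hypothetical crisp level (Heat = +1, No_Change = 0, Cool = -1) and the output is the strength-weighted average, rather than the centroid of a full output membership function.

```python
def fuzzy_or(*degrees):
    return max(degrees)


def fuzzy_and(*degrees):
    return min(degrees)


def evaluate_rules(temp, target):
    """Firing strength of each action, given membership degrees of the inputs."""
    return {
        "Heat": fuzzy_and(fuzzy_or(temp["Cold"], temp["Very_Cold"]), target["Warm"]),
        "Cool": fuzzy_and(fuzzy_or(temp["Hot"], temp["Very_Hot"]), target["Warm"]),
        "No_Change": fuzzy_and(temp["Warm"], target["Warm"]),
    }


# Assumed crisp level for each action (illustrative only).
OUTPUT_LEVEL = {"Heat": 1.0, "No_Change": 0.0, "Cool": -1.0}


def defuzzify(strengths):
    """Weighted average of the action levels; 0.0 if no rule fires."""
    total = sum(strengths.values())
    if total == 0:
        return 0.0
    return sum(OUTPUT_LEVEL[a] * s for a, s in strengths.items()) / total


# Hypothetical fuzzified inputs: the room is mostly Warm but slightly Cold.
room = {"Very_Cold": 0.0, "Cold": 0.25, "Warm": 0.75, "Hot": 0.0, "Very_Hot": 0.0}
target = {"Warm": 1.0}

strengths = evaluate_rules(room, target)
command = defuzzify(strengths)
print(strengths, command)
```

The crisp command of 0.25 can be read as "heat a little": the Heat rule fires weakly (0.25) while the No_Change rule fires strongly (0.75).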
Application Areas of Fuzzy Logic
The key application areas of fuzzy logic are as follows −
Automotive Systems
● Automatic Gearboxes
● Four-Wheel Steering
● Vehicle environment control
Consumer Electronic Goods
● Hi-Fi Systems
● Photocopiers
● Still and Video Cameras
● Television
Domestic Goods
● Microwave Ovens
● Refrigerators
● Toasters
● Vacuum Cleaners
● Washing Machines
Environment Control
● Air Conditioners/Dryers/Heaters
● Humidifiers
Advantages of FLSs
● The mathematical concepts within fuzzy reasoning are very simple.
● You can modify an FLS by just adding or deleting rules, thanks to the flexibility of fuzzy logic.
● Fuzzy logic systems can take imprecise, distorted, or noisy input information.
● FLSs are easy to construct and to understand.
● Fuzzy logic is a solution to complex problems in all fields of life, including medicine, as it resembles human reasoning and decision making.
Disadvantages of FLSs
● There is no systematic approach to fuzzy system design.
● They are understandable only when they are simple.
● They are suitable only for problems that do not need very high accuracy.
Key takeaway
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all the intermediate possibilities between the digital values YES and NO.
An AI system can be defined as the study of the rational agent and its environment. Agents use sensors to sense their surroundings and actuators to act on them. An AI agent can possess mental properties such as knowledge, belief, and intention.
An agent is anything that uses sensors to observe its environment and actuators to act on that environment. An agent runs in a cycle of perceiving, thinking, and acting. An agent can be any of the following:
● Human-Agent - A human agent has sensors in the form of eyes, ears, and other organs, and actuators in the form of hands, legs, and vocal tract.
● Robotic Agent - A robotic agent can have cameras, infrared range finders, and natural language processing for sensors, and various motors for actuators.
● Software Agent - A software agent can receive sensory input like keystrokes and file contents, act on those inputs, and show output on the screen.
As a result, the world around us is full of agents, including thermostats, cellphones, cameras, and even ourselves.
Before we proceed, we must first understand sensors, effectors, and actuators.
● Sensor - A sensor is an electronic device that detects changes in the environment and transmits the data to other electronic devices. Sensors allow an agent to observe its surroundings.
● Actuators - Actuators are machine components that convert energy to motion. The actuators' sole function is to move and control a system. An actuator can be anything from an electric motor to gears to rails.
● Effectors - Effectors are devices that have an impact on the environment. Legs, wheels, arms, fingers, wings, fins, and a display screen are examples of effectors.
An intelligent agent is an autonomous entity that uses sensors and actuators to interact with its surroundings in order to achieve its objectives. An intelligent agent may learn from its surroundings to attain those objectives. A thermostat is an example of an intelligent agent.
The four main rules for an AI agent are as follows:
● Rule 1: An AI agent must be able to perceive its surroundings.
● Rule 2: Decisions must be made based on the observation.
● Rule 3: A decision must be followed by action.
● Rule 4: An AI agent's actions must be reasonable.
A rational agent is one that has clear preferences, models uncertainty, and acts so as to maximize its performance measure over all available actions.
A rational agent is said to do the right thing. AI is concerned with the development of rational agents for application in game theory and decision theory in a variety of real-world contexts.
Rational action matters most for an AI agent because, in reinforcement learning, an agent receives a positive reward for each best possible action and a negative reward for each wrong action.
Structure
AI's objective is to create an agent programme that performs the agent function. The architecture and agent programme combine to form the framework of an intelligent agent. It can be summed up as follows:
Agent = Architecture + Agent program
The following are the three most important terms in the structure of an AI agent:
● Architecture - Architecture is the machinery on which an AI agent operates.
● Agent Function - The agent function maps a percept sequence to an action.
f:P* → A
● Agent program - An agent program is an implementation of the agent function. It runs on the physical architecture to produce the function f.
Simple reflex agents behave solely on the basis of the present percept, disregarding the remainder of the percept history. The history of all that an agent has perceived up to this point is known as percept history. The condition-action rule is used to create the agent function.
A condition-action rule is a rule that connects a state (or condition) to a response (or action). If the condition is met, the action is executed; otherwise, it is not. When the environment is fully observable, this agent function succeeds. Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments. If the agent can randomize its activities, it may be feasible to break out of infinite loops.
Simple reflex agents have the following drawbacks:
● Their intelligence is very limited.
● They have no knowledge of the non-perceptual parts of the current state.
● The rule set is typically too large to generate and store.
● If the environment changes, the collection of rules must be updated.
Fig 4: Reflex agent
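A simple reflex agent driven by condition-action rules can be sketched as follows. The two-square vacuum world (squares "A" and "B") and the action names are assumptions borrowed for illustration. The agent looks only at the current percept and keeps no history.

```python
# Condition-action rules, checked in order: (condition on the percept, action).
# The percept format {"location": ..., "status": ...} is hypothetical.
RULES = [
    (lambda percept: percept["status"] == "dirty", "suck"),
    (lambda percept: percept["location"] == "A", "move_right"),
    (lambda percept: percept["location"] == "B", "move_left"),
]


def simple_reflex_agent(percept):
    """Return the action of the first rule whose condition matches the percept."""
    for condition, action in RULES:
        if condition(percept):
            return action
    return "no_op"


print(simple_reflex_agent({"location": "A", "status": "dirty"}))
print(simple_reflex_agent({"location": "A", "status": "clean"}))
```

Because the agent has no memory, it keeps moving back and forth between clean squares forever, which illustrates the infinite-loop problem described above.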
Key takeaway
Simple reflex agents behave solely on the basis of the present percept, disregarding the remainder of the percept history. The history of all that an agent has perceived up to this point is known as percept history.
A model-based reflex agent operates by looking for a rule whose condition matches the current situation. It can deal with partially observable environments by using a model of the world. The agent must keep track of its internal state, which is updated with each percept and depends on the percept history. The current state is stored inside the agent as some form of structure describing the parts of the environment that cannot be seen.
Updating the state necessitates knowledge of:
● how the world evolves without the intervention of the agent, and
● what effect the agent's activities have on the world
There are two crucial components in a model-based agent:
● Model: Knowledge about "how things happen in the world"; an agent that uses such knowledge is called a model-based agent.
● Internal State: A representation of the current state that is based on the percept history.
Fig 5: Model based agent
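The role of the internal state can be sketched with a small model-based agent for a hypothetical two-square vacuum world. The class name, percept format, and actions are assumptions made for the example. The agent remembers what it has learned about each square and applies a simple model of the effect of its own actions.

```python
class ModelBasedVacuumAgent:
    """Keeps an internal state describing squares it cannot currently see."""

    def __init__(self):
        # None means the status of the square is still unknown.
        self.state = {"A": None, "B": None}

    def update_state(self, percept):
        """Fold the current percept into the internal state."""
        self.state[percept["location"]] = percept["status"]

    def act(self, percept):
        self.update_state(percept)
        if percept["status"] == "dirty":
            # Model of the agent's own action: sucking leaves the square clean.
            self.state[percept["location"]] = "clean"
            return "suck"
        if all(status == "clean" for status in self.state.values()):
            return "no_op"  # everything is known to be clean, so stop
        return "move_right" if percept["location"] == "A" else "move_left"


agent = ModelBasedVacuumAgent()
actions = [
    agent.act({"location": "A", "status": "dirty"}),
    agent.act({"location": "A", "status": "clean"}),
    agent.act({"location": "B", "status": "clean"}),
]
print(actions)
```

Unlike a simple reflex agent, this agent stops once its internal state says both squares are clean, even though it can only perceive one square at a time.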
Key takeaway
A model-based agent can deal with partially observable surroundings by using a world model. The agent must keep track of its internal state, which is updated with each percept and depends on the percept history.
Goal-based agents make decisions based on how far they currently are from their goal (a description of desirable situations). Every action they take is aimed at reducing the distance to the goal. This gives the agent the ability to choose among several possibilities, selecting the one that leads to a goal state.
The knowledge that underpins their decisions is explicitly represented and can be modified, which makes these agents more flexible. They usually require searching and planning. The behaviour of a goal-based agent can be changed easily.
Fig 6: Goal based agent
Utility-based agents are those that are created with their end usage as building blocks in mind. When there are several viable options, utility-based agents are employed to determine which is the best. They choose actions for each state based on a preference (utility). Sometimes getting the desired outcome is insufficient. To get to a place, we might look for a faster, safer, and less expensive option.
The happiness of the agents should be considered. The agent's "happiness" is described by utility. A utility agent chooses the behaviour that maximizes the predicted utility due to the uncertainty in the world. A utility function converts a state into a real number that represents the degree of happiness associated with it.
Fig 7: Utility based agent
Key takeaway
Utility-based agents are those that are created with their end usage as building blocks in mind. When there are several viable options, utility-based agents are employed to determine which is the best.
Some programmes work in a completely artificial world, with just keyboard input, databases, computer file systems, and character output on a screen.
On the other hand, some software agents (software robots or softbots) exist in rich, unlimited softbot domains. The simulator has a very detailed and complex environment. The software agent must choose from a large number of actions in real time. A softbot that scans a customer's online preferences and shows interesting items to the customer works in a real as well as an artificial environment.
The most famous artificial environment is the Turing Test environment, in which one real agent and other artificial agents are tested on an equal footing.
Turing Test
The Turing Test can be used to assess the success of a system's intelligent behaviour.
The test involves two people and a machine that is to be evaluated. One of the two people plays the role of the tester. Each of them sits in a different room. The tester does not know which respondent is the human and which is the machine. The tester types the questions and sends them to both intelligences, and receives typed responses from each.
The machine's goal in this test is to fool the tester. If the tester cannot distinguish the machine's responses from the human's, the machine is deemed intelligent.
Properties of Environment
The environment has a variety of characteristics.
● Discrete / Continuous: If the environment has a restricted number of different, clearly defined states (for example, chess), it is discrete; otherwise, it is continuous (For example, driving).
● Observable / Partially Observable - If the whole state of the environment can be determined from the percepts at each time point, it is observable; otherwise, it is only partially observable.
● Static / Dynamic - The environment is static if it does not change while an agent is acting; otherwise, it is dynamic.
● Multiple agents / single agent - The environment may contain other agents of the same or other kind as the agent.
● Accessible / Inaccessible − If an agent's sensory apparatus can access the entire state of the environment, the environment is accessible to that agent.
● Deterministic / Non-deterministic − If the current state of the environment and the actions of the agent totally determine the next state of the environment, the environment is deterministic; otherwise, it is non-deterministic.
● Episodic / Non-episodic - Each episode in an episodic environment consists of the agent seeing and then acting. The quality of its action is solely determined by the episode. The actions in prior episodes have no bearing on subsequent episodes.
Unit - 4
Uncertainty
We previously studied knowledge representation using first-order logic and propositional logic with certainty, meaning that we were sure about the predicates. With this knowledge representation we could write A → B, which means that if A is true then B is true. However, if we are not sure whether A is true, we cannot make this assertion; this situation is known as uncertainty.
So, to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
In the real world, the following are some of the most common sources of uncertainty.
● Information obtained from unreliable sources.
● Experimental Errors
● Equipment fault
● Temperature variation
● Climate change.
Probabilistic reasoning is a knowledge representation method that uses the concept of probability to indicate uncertainty in knowledge. We employ probabilistic reasoning, which integrates probability theory and logic, to deal with uncertainty.
Probability is used in probabilistic reasoning because it gives us a way to handle the uncertainty that results from someone's laziness or ignorance.
There are many situations in the real world when nothing is certain, such as "it will rain today," "behavior of someone in specific surroundings," and "a competition between two teams or two players." These are probable sentences for which we can presume that something will happen but are unsure, thus we utilize probabilistic reasoning in this case.
Probability: Probability is a measure of the likelihood of an uncertain event occurring. It is a numerical representation of how likely something is to happen. The probability value always lies between 0 and 1, where the two extremes represent impossibility and certainty.
0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
P(A) = 0 indicates that event A is impossible (it will certainly not occur).
P(A) = 1 indicates total certainty that event A occurs.
We can compute the probability of an uncertain event using the formula below.
Probability of occurrence = Number of desired outcomes / Total number of outcomes
P(¬A) = probability of event A not happening.
P(¬A) + P(A) = 1.
Event: An event is each possible outcome of a variable.
Sample space: The sample space is the set of all possible outcomes.
Random variables: Random variables are used to represent real-world events and objects.
Prior probability: The prior probability of an event is the probability computed before new information is observed.
Posterior probability: The probability computed after all evidence or information has been taken into account. It combines the prior probability with the new information.
Conditional probability: The probability of an event occurring given that another event has already occurred.
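The definitions above can be sketched in a few lines of Python. This is only an illustration: the fair six-sided die, the events `even` and `greater_than_3`, and the helper names `prob` and `cond_prob` are assumptions made for the example, not part of the notes.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) = desired outcomes / total outcomes, for equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def cond_prob(a, b):
    """Conditional probability P(a | b) = P(a and b) / P(b)."""
    return prob(a & b) / prob(b)

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

p_even = prob(even)                                  # prior probability of "even"
p_not_even = 1 - p_even                              # complement: P(not A) = 1 - P(A)
p_even_given_gt3 = cond_prob(even, greater_than_3)   # probability after observing "> 3"
```

Here the prior P(even) is 1/2, and observing that the roll is greater than 3 changes it to 2/3.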
Key takeaway
- Probabilistic reasoning is a method of knowledge representation in which the concept of probability is used to convey the uncertainty in knowledge.
- Probability is used in probabilistic reasoning because it allows us to deal with the uncertainty that arises from someone's laziness or ignorance.
To understand rational choice, one must first understand the meanings of the terms rational and choice (Green and Shapiro, 1994; Friedman, 1996). The Google dictionary defines rational as "based on or in accordance with reason or logic," while choice is described as "the act of selecting between two or more options."
The process of making rational judgments based on relevant information in a logical, timely, and optimum manner is known as rational choice.
Let's say a man named Thendo wants to decide how much coffee he should drink today, so he phones his sister Denga to find out what color shoes she is wearing and uses that information to decide. This is an irrational decision because Thendo is basing how much coffee he drinks on irrelevant information (the color of the shoes his sister Denga is wearing).
Similarly, if Thendo decides that every time he takes a sip of coffee he must walk 1 km, we conclude that he is acting irrationally because he is wasting energy by walking 1 km for each sip. This waste of energy is irrational; in mathematical terms it is an un-optimized solution. If Thendo decides to pour the coffee from one glass to another every time he drinks, for no reason other than to complete the task of drinking coffee, he is taking an irrational course of action because it is not only illogical but also inefficient.
Fig 1 illustrates a rational decision-making framework. In a rational decision-making process, one first studies the environment or, more technically, the decision-making space. The next step is to identify the information that is relevant to the decision. This information is then fed to a logical and consistent decision engine, which evaluates all options and their associated utilities and selects the decision with the highest utility. Any flaw in this framework, such as incomplete knowledge, imprecise or defective information, or the inability to evaluate all options, limits the theory of rational choice, and it becomes bounded rational choice, as described by Herbert Simon (1991).
Classical economics is predicated on the notion that economic agents make decisions based on the principle of rational choice. Because this agent is a human being, many researchers have studied how humans actually make decisions, a field now known as behavioral economics. In his book Thinking, Fast and Slow, Kahneman explores behavioral economics in depth (Kahneman, 2011). Today, decisions are increasingly made by artificially intelligent machines. The assumption of rationality is stronger when decisions are made by artificially intelligent machines than when they are made by humans.
In this respect, machine-made decisions are not as irrational as those made by humans; however, they are not fully rational either, and so are still subject to Herbert Simon's theory of bounded rationality.
Fig 1: Steps for rational decision making
Consider, for example, an isolated economic system in which all decision-making agents are human beings. Behavioral economics will describe that economy. If, on the other hand, all of the decision-making agents are artificially intelligent machines that make rational-choice decisions, neoclassical economics will most likely apply. If half of these decision-making agents are people and the other half are machines, a half-neoclassical, half-behavioral economic system will emerge.
● Sample space S with events Ai and probabilities P(Ai); union A ∪ B, intersection AB, and complement Ac.
● Axioms: 0 ≤ P(A) ≤ 1; P(S) = 1; for mutually exclusive Ai, P(∪i Ai) = Σi P(Ai).
● Conditional probability: P(A|B) = P(AB)/P(B); total probability: P(A) = P(A|B)P(B) + P(A|Bc)P(Bc).
● Random variables (RVs) X; the cumulative distribution function (cdf) F(x) = P{X ≤ x}.
For a discrete RV, the probability mass function (pmf) is f(x) = P{X = x}.
For a continuous RV, the probability density function (pdf) is f(x) = dF(x)/dx, so that P{a ≤ X ≤ b} = ∫ f(x) dx taken over [a, b].
● Generalizations to more than one variable, e.g.
Two RVs X and Y: joint cdf F(x, y) = P{X ≤ x, Y ≤ y};
pmf f(x, y) = P{X = x, Y = y}; or
pdf f(x, y), with F(x, y) = ∫∫ f(u, v) dv du taken over u ≤ x and v ≤ y.
X and Y are independent iff f(x, y) = fX(x) fY(y).
● Expected value or mean: for RV X, µ = E[X]; for discrete RVs, E[X] = Σx x f(x).
For continuous RVs, E[X] = ∫ x f(x) dx.
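As a small illustration of the pmf, cdf, and expected value of a discrete random variable, the sketch below again assumes a fair six-sided die; the names `pmf`, `cdf`, and `expected_value` are chosen for the example only.

```python
from fractions import Fraction

# pmf of a fair six-sided die: f(x) = P{X = x} (illustrative assumption).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """F(x) = P{X <= x}, obtained by summing the pmf up to x."""
    return sum(p for value, p in pmf.items() if value <= x)

def expected_value(pmf):
    """E[X] = sum over x of x * f(x) for a discrete random variable."""
    return sum(x * p for x, p in pmf.items())

mean = expected_value(pmf)   # 7/2 for a fair die
```

Exact fractions are used so that the pmf sums to exactly 1 and the mean is not affected by floating-point rounding.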
Probability theory provides the formal techniques and principles for manipulating probabilistically represented propositions. The following are the three axioms of probability theory:
● 0 <= P(A=a) <= 1 for all a in sample space of A
● P(True)=1, P(False)=0
● P(A v B) = P(A) + P(B) - P(A ^ B)
The following properties can be deduced from these axioms:
● P(~A) = 1 - P(A)
● P(A) = P(A ^ B) + P(A ^ ~B)
● Sum{P(A=a)} = 1, where the sum is over all possible values a in the sample space of A
Bayes' theorem is a mathematical formula for determining the probability of an event under uncertain knowledge. It is also known as Bayes' rule, Bayes' law, or Bayesian reasoning.
In probability theory, it relates the conditional and marginal probabilities of two random events.
Bayes' theorem is named after the British mathematician Thomas Bayes. Bayesian inference is an application of Bayes' theorem and is at the heart of Bayesian statistics.
It is a way to calculate the value of P(B|A) from knowledge of P(A|B).
Bayes' theorem can update the probability forecast of an occurrence based on new information from the real world.
Example: If cancer is linked to one's age, we can apply Bayes' theorem to more precisely forecast cancer risk based on age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A given event B.
From the product rule, we can write:
P(A ⋀ B) = P(A|B) P(B)
Similarly, using the conditional probability of event B given event A:
P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B) ... (a)
Equation (a) is known as Bayes' rule or Bayes' theorem. It is the basis of most modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is the posterior, which we need to compute: the probability of hypothesis A given that evidence B has occurred.
P(B|A) is the likelihood: the probability of the evidence, assuming that the hypothesis is true.
P(A) is the prior probability: the probability of the hypothesis before the evidence is considered.
P(B) is the marginal probability: the probability of the evidence alone.
In general, we can write P(B) = Σi P(Ai) P(B|Ai) in equation (a), so Bayes' rule can be expressed as:
P(Ai|B) = P(Ai) P(B|Ai) / Σk P(Ak) P(B|Ak)
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
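The general form of Bayes' rule can be sketched as follows. The diagnostic-test numbers (1% prior, 90% true-positive rate, 5% false-positive rate) and the function name `bayes_posterior` are hypothetical and serve only to show the arithmetic.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(A_i | B) for every hypothesis A_i.

    priors[i] = P(A_i) and likelihoods[i] = P(B | A_i); the A_i are assumed
    to be mutually exclusive and exhaustive.
    """
    # Marginal probability of the evidence: P(B) = sum_i P(A_i) P(B | A_i).
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Hypothetical numbers, not taken from the notes.
priors = [0.01, 0.99]         # P(disease), P(no disease)
likelihoods = [0.90, 0.05]    # P(positive | disease), P(positive | no disease)
posterior = bayes_posterior(priors, likelihoods)
```

With these assumed numbers, `posterior[0]` is roughly 0.15: even after a positive test the hypothesis remains unlikely, because the prior is so small.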
- Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets, are a data structure for capturing all of the information in a domain's full joint probability distribution. From a Bayesian Net, every value in the full joint probability distribution of a set of random variables can be computed.
- The network represents all direct causal relationships between variables.
- To construct a Bayesian net for a given set of variables, draw arcs from cause variables to their immediate effects.
- It saves space by exploiting the fact that, in many real-world problem domains, dependencies between variables are generally local, so a large number of variables are conditionally independent.
- It captures the relationships between variables in both qualitative and quantitative terms.
- It can be used for reasoning in two directions.
- Predictive reasoning (also called causal reasoning) proceeds top-down, from causes to effects.
- Diagnostic reasoning proceeds bottom-up, from effects to causes.
- A Bayesian Net is a directed acyclic graph (DAG) in which each random variable has its own node and there is a directed arc from A to B whenever A has a direct causal influence on B. The nodes thus represent states of affairs, whereas the arcs represent causal relationships. When A is present, B is more likely to occur, and vice versa. The backward influence is referred to as "diagnostic" or "evidential" support for A due to the presence of B.
- Each node A in a network is conditionally independent of any subset of nodes that are not descendants of A, given its parents.
A Bayesian network that depicts and solves decision issues under uncertain knowledge is known as an influence diagram.
Fig 2: Bayesian network graph
- Each node represents a random variable, which can be continuous or discrete.
- Arcs or directed arrows represent the causal relationships or conditional probabilities between random variables. These directed links connect pairs of nodes in the graph.
These links indicate that one node directly influences the other; if there is no directed link between two nodes, neither directly influences the other.
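To make the idea concrete, the sketch below encodes a tiny three-node network, Rain → WetGrass ← Sprinkler, and computes joint and diagnostic probabilities from it. The network, its probability values, and all names are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical conditional probability tables (CPTs).
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {  # P(WetGrass = True | Rain, Sprinkler)
    (True, True): 0.99, (True, False): 0.90,
    (False, True): 0.80, (False, False): 0.05,
}

def bernoulli(p_true, value):
    """Probability that a Boolean variable takes `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's CPT entry given its parents."""
    return (bernoulli(P_RAIN, rain)
            * bernoulli(P_SPRINKLER, sprinkler)
            * bernoulli(P_WET[(rain, sprinkler)], wet))

def rain_given_wet():
    """Diagnostic query P(Rain = True | WetGrass = True) by enumeration."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    denominator = sum(joint(r, s, True)
                      for r, s in product((True, False), repeat=2))
    return numerator / denominator

total = sum(joint(r, s, w) for r, s, w in product((True, False), repeat=3))
```

Only 1 + 1 + 4 numbers are stored, yet all 8 entries of the joint distribution can be recovered from them, which is the space saving the text describes. Enumeration like this is exponential in the number of variables, so it is practical only for small networks.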
Key takeaway
Bayesian Networks are also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets.
Bayesian Networks are a data structure for capturing all of the information in the whole joint probability distribution for the collection of random variables that define a domain.
● In general, Bayes Net inference is an NP-hard problem (exponential in the size of the graph).
● There are linear time techniques based on belief propagation for singly-connected networks or polytrees with no undirected loops.
● Each node sends local evidence messages to its children and parents.
● Based on incoming messages from its neighbors, each node updates its belief in each of its possible values and propagates evidence to its neighbors.
● Inference for general networks can be approximated using loopy belief propagation, which iteratively refines probabilities that often, though not always, converge to accurate values.
Temporal model
- Monitoring or filtering
- Prediction
Approximate inference
Direct sampling method
The most basic sort of random sampling process for Bayesian networks generates events from a network with no evidence associated with them. The idea is that each variable is sampled in topological order. The value is drawn from a probability distribution based on the values that have previously been assigned to the variable's parents.
function PRIOR-SAMPLE(bn) returns an event sampled from the prior specified by bn
  inputs: bn, a Bayesian network specifying joint distribution P(X1, ..., Xn)
  x ← an event with n elements
  for each variable Xi in X1, ..., Xn do
    x[i] ← a random sample from P(Xi | parents(Xi))
  return x
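A minimal Python sketch of prior sampling is given below. It assumes a hypothetical three-node network (Rain, Sprinkler, WetGrass) stored as a list in topological order; the representation and the names `NETWORK` and `prior_sample` are illustrative choices, not a standard API.

```python
import random

# Hypothetical network in topological order. Each entry is
# (variable, parents, CPT mapping parent values -> P(variable = True)).
NETWORK = [
    ("Rain", (), {(): 0.2}),
    ("Sprinkler", (), {(): 0.1}),
    ("WetGrass", ("Rain", "Sprinkler"), {
        (True, True): 0.99, (True, False): 0.90,
        (False, True): 0.80, (False, False): 0.05,
    }),
]

def prior_sample(network, rng):
    """Sample each variable in topological order, given its already-sampled parents."""
    event = {}
    for name, parents, cpt in network:
        p_true = cpt[tuple(event[p] for p in parents)]
        event[name] = rng.random() < p_true
    return event

rng = random.Random(0)   # fixed seed so the run is repeatable
samples = [prior_sample(NETWORK, rng) for _ in range(20000)]
rain_frequency = sum(s["Rain"] for s in samples) / len(samples)
```

With enough samples, `rain_frequency` should approach the prior P(Rain = True) = 0.2 assumed in the table.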
Likelihood weighting
Likelihood weighting eliminates the inefficiencies of rejection sampling by creating only events that are consistent with the evidence. It's a tailored version of the general statistical approach of importance sampling for Bayesian network inference.
function LIKELIHOOD-WEIGHTING(X, e, bn, N) returns an estimate of P(X | e)
  inputs: X, the query variable
          e, observed values for variables E
          bn, a Bayesian network specifying joint distribution P(X1, ..., Xn)
          N, the total number of samples to be generated
  local variables: W, a vector of weighted counts for each value of X, initially zero
  for j = 1 to N do
    x, w ← WEIGHTED-SAMPLE(bn, e)
    W[x] ← W[x] + w, where x is the value of X in x
  return NORMALIZE(W)
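The pseudocode above can be sketched in Python as follows, again on a hypothetical Rain/Sprinkler/WetGrass network. The data layout and function names are assumptions made for this example.

```python
import random

# Hypothetical network in topological order:
# (variable, parents, CPT mapping parent values -> P(variable = True)).
NETWORK = [
    ("Rain", (), {(): 0.2}),
    ("Sprinkler", (), {(): 0.1}),
    ("WetGrass", ("Rain", "Sprinkler"), {
        (True, True): 0.99, (True, False): 0.90,
        (False, True): 0.80, (False, False): 0.05,
    }),
]

def weighted_sample(network, evidence, rng):
    """Fix evidence variables, sample the rest, and weight by the evidence likelihood."""
    event, weight = {}, 1.0
    for name, parents, cpt in network:
        p_true = cpt[tuple(event[p] for p in parents)]
        if name in evidence:
            event[name] = evidence[name]
            weight *= p_true if evidence[name] else 1.0 - p_true
        else:
            event[name] = rng.random() < p_true
    return event, weight

def likelihood_weighting(query, evidence, network, n, rng):
    """Estimate P(query | evidence) from n weighted samples."""
    totals = {True: 0.0, False: 0.0}
    for _ in range(n):
        event, weight = weighted_sample(network, evidence, rng)
        totals[event[query]] += weight
    norm = totals[True] + totals[False]
    return {value: w / norm for value, w in totals.items()}

estimate = likelihood_weighting(
    "Rain", {"WetGrass": True}, NETWORK, 50000, random.Random(1))
```

Every generated event agrees with the evidence, so no sample is thrown away; the weight compensates for the fact that the evidence variable was fixed rather than sampled.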
Inference by Markov chain simulation
Markov chain Monte Carlo (MCMC) algorithms work quite differently from rejection sampling and likelihood weighting. Rather than generating each sample from scratch, MCMC algorithms create each sample by making a random change to the preceding sample. It is therefore helpful to think of an MCMC algorithm as being in a particular current state, in which each variable has a value, and generating a next state by making random changes to the current state.
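One common MCMC method is Gibbs sampling, which resamples one non-evidence variable at a time given the current values of the others. The sketch below is a hand-written special case for a hypothetical Rain → WetGrass ← Sprinkler network with evidence WetGrass = True; the probabilities and names are assumptions for illustration.

```python
import random

# Hypothetical CPTs for Rain -> WetGrass <- Sprinkler.
P_RAIN, P_SPRINKLER = 0.2, 0.1
P_WET = {  # P(WetGrass = True | Rain, Sprinkler)
    (True, True): 0.99, (True, False): 0.90,
    (False, True): 0.80, (False, False): 0.05,
}

def conditional(p_prior, wet_if_true, wet_if_false):
    """P(variable = True | other parent, WetGrass = True)."""
    t = p_prior * wet_if_true
    f = (1.0 - p_prior) * wet_if_false
    return t / (t + f)

def gibbs_rain_given_wet(n, rng):
    """Estimate P(Rain = True | WetGrass = True) from n Gibbs steps."""
    rain, sprinkler = True, False   # arbitrary initial state
    rain_count = 0
    for _ in range(n):
        # Resample Rain given the current Sprinkler value and the evidence.
        rain = rng.random() < conditional(
            P_RAIN, P_WET[(True, sprinkler)], P_WET[(False, sprinkler)])
        # Resample Sprinkler given the current Rain value and the evidence.
        sprinkler = rng.random() < conditional(
            P_SPRINKLER, P_WET[(rain, True)], P_WET[(rain, False)])
        rain_count += rain
    return rain_count / n

estimate = gibbs_rain_given_wet(50000, random.Random(2))
```

Each new state differs from the previous one only through these local random changes, which is the behaviour described above. Consecutive samples are correlated, so more steps are needed than with independent samples for the same accuracy.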
Key takeaway
- For Bayesian networks, the most basic type of random sampling procedure generates events from a network with no evidence connected with them.
- Likelihood weighting is a variant of the general statistical technique of importance sampling, tailored for Bayesian network inference.
- Markov chain Monte Carlo (MCMC) methods work differently from rejection sampling and likelihood weighting: each sample is created by randomly changing the preceding sample.
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all intermediate possibilities between the digital values YES and NO.
The conventional logic block that a computer can understand takes precise input and produces a definite output of TRUE or FALSE, which is equivalent to a human's YES or NO.
The inventor of fuzzy logic, Lotfi Zadeh, observed that, unlike computers, human decision making includes a range of possibilities between YES and NO, for example −
CERTAINLY YES |
POSSIBLY YES |
CANNOT SAY |
POSSIBLY NO |
CERTAINLY NO |
Fuzzy logic works on the levels of possibility of the input to achieve a definite output.
Implementation
● It can be implemented in systems of various sizes and capabilities, ranging from small microcontrollers to large, networked, workstation-based control systems.
● It can be implemented in hardware, software, or a combination of both.
Why Fuzzy Logic?
Fuzzy logic is useful for commercial and practical purposes.
● It can control machines and consumer products.
● It may not give accurate reasoning, but it gives acceptable reasoning.
● Fuzzy logic helps to deal with uncertainty in engineering.
Fuzzy Logic Systems Architecture
It has four main parts, as shown below −
● Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy sets. It splits the input signal into five steps, such as −
LP | x is Large Positive |
MP | x is Medium Positive |
S | x is Small |
MN | x is Medium Negative |
LN | x is Large Negative |
● Knowledge Base − It stores IF-THEN rules provided by experts.
● Inference Engine − It simulates the human reasoning process by making fuzzy inferences on the inputs and the IF-THEN rules.
● Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp value.
The membership functions work on fuzzy sets of variables.
Membership Function
Membership functions allow you to quantify a linguistic term and represent a fuzzy set graphically. A membership function for a fuzzy set A on the universe of discourse X is defined as μA:X → [0,1].
Here, each element of X is mapped to a value between 0 and 1. This value is called the membership value or degree of membership. It quantifies the degree of membership of the element of X in the fuzzy set A.
● The x axis represents the universe of discourse.
● The y axis represents the degrees of membership in the [0, 1] interval.
Several membership functions can be applicable for fuzzifying a numerical value. Simple membership functions are used because complex functions do not add more precision to the output.
All membership functions for LP, MP, S, MN, and LN are shown as below −
Triangular membership function shapes are the most common among the various membership function shapes, such as trapezoidal, singleton, and Gaussian.
Here, the input to the 5-level fuzzifier varies from -10 volts to +10 volts. Hence the corresponding output also changes.
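A triangular membership function and a 5-level fuzzifier over the -10 V to +10 V range can be sketched as below. The breakpoints of the five triangles are an assumption (evenly spaced, with neighbouring sets overlapping), since the original figure is not reproduced here; the names `triangular`, `FUZZY_SETS`, and `fuzzify` are illustrative.

```python
def triangular(x, left, peak, right):
    """Degree of membership in [0, 1] for a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Assumed evenly spaced triangles (left, peak, right) in volts.
FUZZY_SETS = {
    "LN": (-15.0, -10.0, -5.0),
    "MN": (-10.0, -5.0, 0.0),
    "S": (-5.0, 0.0, 5.0),
    "MP": (0.0, 5.0, 10.0),
    "LP": (5.0, 10.0, 15.0),
}

def fuzzify(x):
    """Map a crisp input to its degree of membership in each fuzzy set."""
    return {name: triangular(x, *abc) for name, abc in FUZZY_SETS.items()}

memberships = fuzzify(2.5)
```

Under these assumed breakpoints, an input of 2.5 V belongs to S and MP with degree 0.5 each and to the other three sets with degree 0.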
Example of a Fuzzy Logic System
Let us consider an air conditioning system with a 5-level fuzzy logic system. This system adjusts the temperature of the air conditioner by comparing the room temperature with the target temperature value.
Fig 3: Example
Algorithm
● Define linguistic Variables and terms (start)
● Construct membership functions for them. (start)
● Construct knowledge base of rules (start)
● Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
● Evaluate rules in the rule base. (Inference Engine)
● Combine results from each rule. (Inference Engine)
● Convert output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are the input and output variables of the system, expressed as simple words or sentences. For room temperature, cold, warm, hot, etc., are linguistic terms.
Temperature (t) = {very-cold, cold, warm, very-warm, hot}
Every member of this set is a linguistic term, and it can cover some portion of the overall temperature values.
Step 2 − Construct membership functions for them
The membership functions of temperature variable are as shown −
Step3 − Construct knowledge base rules
Create a matrix of room temperature values versus target temperature values that an air conditioning system is expected to provide.
RoomTemp. /Target | Very_Cold | Cold | Warm | Hot | Very_Hot |
Very_Cold | No_Change | Heat | Heat | Heat | Heat |
Cold | Cool | No_Change | Heat | Heat | Heat |
Warm | Cool | Cool | No_Change | Heat | Heat |
Hot | Cool | Cool | Cool | No_Change | Heat |
Very_Hot | Cool | Cool | Cool | Cool | No_Change |
Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.
Sr. No. | Condition | Action |
1 | IF temperature=(Cold OR Very_Cold) AND target=Warm THEN | Heat |
2 | IF temperature=(Hot OR Very_Hot) AND target=Warm THEN | Cool |
3 | IF (temperature=Warm) AND (target=Warm) THEN | No_Change |
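The matrix above follows a simple pattern: heat when the room is colder than the target, cool when it is warmer, and make no change when the two match. A short sketch of that pattern is shown below; the names `LEVELS`, `action`, and `rule_matrix` are introduced for the example only.

```python
# Temperature levels in increasing order, as in the matrix header.
LEVELS = ["Very_Cold", "Cold", "Warm", "Hot", "Very_Hot"]

def action(room, target):
    """Action the rule matrix assigns to a (room, target) pair."""
    r, t = LEVELS.index(room), LEVELS.index(target)
    if r < t:
        return "Heat"        # room is colder than the target
    if r > t:
        return "Cool"        # room is warmer than the target
    return "No_Change"

# Rebuild the whole 5 x 5 matrix as a dictionary keyed by (room, target).
rule_matrix = {(room, target): action(room, target)
               for room in LEVELS for target in LEVELS}
```

For instance, `rule_matrix[("Cold", "Warm")]` is "Heat" and `rule_matrix[("Hot", "Warm")]` is "Cool", matching rules 1 and 2 in the table.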
Step 4 − Obtain fuzzy value
Fuzzy set operations perform the evaluation of rules. The operations used for OR and AND are Max and Min, respectively. Combine all the results of the evaluation to form a final result. This result is a fuzzy value.
Step 5 − Perform defuzzification
Defuzzification is then performed according to the membership function for the output variable.
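The five steps can be put together in a small sketch. Several things here are assumptions made for illustration: the triangular membership functions in degrees Celsius, the fixed target of Warm, the output levels +1 (Heat), 0 (No_Change), and -1 (Cool), and defuzzification by a weighted average of those levels.

```python
def triangular(x, left, peak, right):
    """Degree of membership in [0, 1] for a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Assumed membership functions for room temperature (degrees Celsius).
TEMP_SETS = {
    "Very_Cold": (-10.0, 0.0, 10.0),
    "Cold": (0.0, 10.0, 20.0),
    "Warm": (10.0, 20.0, 30.0),
    "Hot": (20.0, 30.0, 40.0),
    "Very_Hot": (30.0, 40.0, 50.0),
}
# Assumed crisp output level for each action.
ACTION_LEVEL = {"Heat": 1.0, "No_Change": 0.0, "Cool": -1.0}

def control(room_temp, target_is_warm=1.0):
    """Return a crisp command in [-1, 1] for the three rules with target = Warm."""
    # Fuzzification of the crisp room temperature.
    mu = {name: triangular(room_temp, *abc) for name, abc in TEMP_SETS.items()}
    # Rule evaluation: OR -> max, AND -> min.
    strength = {
        "Heat": min(max(mu["Cold"], mu["Very_Cold"]), target_is_warm),
        "Cool": min(max(mu["Hot"], mu["Very_Hot"]), target_is_warm),
        "No_Change": min(mu["Warm"], target_is_warm),
    }
    total = sum(strength.values())
    if total == 0.0:
        return 0.0   # no rule fired
    # Defuzzification: weighted average of the output levels.
    return sum(ACTION_LEVEL[a] * s for a, s in strength.items()) / total
```

Under these assumptions, a room at 15 degrees is half Cold and half Warm, so `control(15.0)` returns 0.5 (moderate heating), while `control(20.0)` returns 0.0 (no change).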
Application Areas of Fuzzy Logic
The key application areas of fuzzy logic are as follows −
Automotive Systems
● Automatic Gearboxes
● Four-Wheel Steering
● Vehicle environment control
Consumer Electronic Goods
● Hi-Fi Systems
● Photocopiers
● Still and Video Cameras
● Television
Domestic Goods
● Microwave Ovens
● Refrigerators
● Toasters
● Vacuum Cleaners
● Washing Machines
Environment Control
● Air Conditioners/Dryers/Heaters
● Humidifiers
Advantages of FLSs
● The mathematical concepts within fuzzy reasoning are very simple.
● You can modify an FLS by just adding or deleting rules, thanks to the flexibility of fuzzy logic.
● Fuzzy logic systems can take imprecise, distorted, and noisy input information.
● FLSs are easy to construct and understand.
● Fuzzy logic is a solution to complex problems in all fields of life, including medicine, as it resembles human reasoning and decision making.
Disadvantages of FLSs
● There is no systematic approach to fuzzy system design.
● They are understandable only when they are simple.
● They are suitable for problems which do not need very high accuracy.
Key takeaway
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The FL approach imitates the way humans make decisions, which involves all intermediate possibilities between the digital values YES and NO.
An AI system can be defined as the study of the rational agent and its environment. Agents use sensors to sense their surroundings and actuators to act on them. An AI agent can have mental properties such as knowledge, belief, and intention.
An agent is anything that uses sensors to observe its environment and actuators to act on that environment. The cycle of perceiving, thinking, and action is followed by an Agent. An agent can be any of the following:
● Human-Agent - A human agent has sensors in the form of eyes, ears, and other organs, and actuators in the form of hands, legs, and vocal tract.
● Robotic Agent - Cameras, infrared range finders, NLP for sensors, and various motors for actuators can all be found in a robotic agent.
● Software Agent - A software agent can receive sensory input like keystrokes and file contents, act on those inputs, and show output on the screen.
As a result, the world around us is full of agents, including thermostats, cellphones, cameras, and even ourselves.
Before we proceed, we must first understand sensors, effectors, and actuators.
● Sensor - A sensor is an electronic device that detects changes in the environment and transmits the data to other electronic devices. Sensors allow an agent to observe its surroundings.
● Actuators - Actuators are machine components that convert energy to motion. The actuators' sole function is to move and control a system. An actuator can be anything from an electric motor to gears to rails.
● Effectors - Effectors are devices that have an impact on the environment. Legs, wheels, arms, fingers, wings, fins, and a display screen are examples of effectors.
An intelligent agent is an autonomous entity that uses sensors and actuators to interact with its surroundings in order to achieve its objectives. An intelligent agent can learn from its surroundings to attain those objectives. A thermostat is an example of an intelligent agent.
The four main rules for an AI agent are as follows:
● Rule 1: An AI agent must be able to perceive its surroundings.
● Rule 2: Decisions must be made based on the observation.
● Rule 3: A decision must be followed by action.
● Rule 4: An AI agent's actions must be reasonable.
A rational agent is one that has clear preferences, models uncertainty, and acts in a way that maximizes its performance measure over all available actions.
A rational agent is said to do the right thing. AI is concerned with building rational agents for use in game theory and decision theory in a variety of real-world contexts.
Rational action matters most for an AI agent because, in a reinforcement learning algorithm, the agent receives a positive reward for each best possible action and a negative reward for each wrong action.
Structure
AI's objective is to create an agent programme that performs the agent function. The architecture and agent programme combine to form the framework of an intelligent agent. It can be summed up as follows:
Agent = Architecture + Agent program
The following are the three most important terms in the structure of an AI agent:
● Architecture - Architecture is the machinery on which an AI agent operates.
● Agent Function - The agent function maps a percept sequence to an action.
f:P* → A
● Agent program - An agent programme is a programme that performs the function of an agent. To produce function f, an agent programme runs on the physical architecture.
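As a rough sketch of how an agent program can realize the agent function f: P* → A, the table-driven agent below looks up its whole percept history in a table. The percepts, actions, and table entries are hypothetical and serve only as an illustration.

```python
class TableDrivenAgent:
    """Agent program that implements f: P* -> A by table lookup."""

    def __init__(self, table, default_action):
        self.table = table
        self.default_action = default_action
        self.percepts = []   # percept sequence observed so far (P*)

    def __call__(self, percept):
        self.percepts.append(percept)
        # The action depends on the whole percept sequence, not just the last percept.
        return self.table.get(tuple(self.percepts), self.default_action)

# Hypothetical table covering two percept sequences.
table = {
    ("dirty",): "suck",
    ("dirty", "clean"): "move",
}
agent = TableDrivenAgent(table, default_action="wait")
actions = [agent(p) for p in ["dirty", "clean", "clean"]]
```

Here `actions` is `["suck", "move", "wait"]`. The table grows with every possible percept sequence, which is why practical agent programs compute the function instead of storing it.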
Simple reflex agents behave solely on the basis of the present percept, disregarding the remainder of the percept history. The history of all that an agent has perceived up to this point is known as percept history. The condition-action rule is used to create the agent function.
A condition-action rule is a rule that connects a state (or condition) to a response (or action). If the condition is met, the action is executed; otherwise, it is not. When the environment is fully observable, this agent function succeeds. Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments. If the agent can randomize its activities, it may be feasible to break out of infinite loops.
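A simple reflex agent can be sketched as a function of the current percept alone. The two-square vacuum world used here (locations "A" and "B", statuses "Dirty" and "Clean") is an assumed toy environment, not one defined in these notes.

```python
def simple_reflex_vacuum_agent(percept):
    """Condition-action rules applied to the current percept only."""
    location, status = percept
    if status == "Dirty":      # rule: if the square is dirty, clean it
        return "Suck"
    if location == "A":        # rule: if at A and clean, move right
        return "Right"
    return "Left"              # rule: if at B and clean, move left

action = simple_reflex_vacuum_agent(("A", "Dirty"))
```

Because the function keeps no memory, it will move back and forth forever once both squares are clean, which illustrates the infinite-loop problem mentioned above.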
Simple reflex agents have the following drawbacks:
● Intelligence is very restricted.
● There is no understanding of the state's non-perceptual components.
● The set of rules is typically too large to generate and store.
● If the environment changes, the collection of rules must be updated.
Fig 4: Reflex agent
Key takeaway
Simple reflex agents behave solely on the basis of the present percept, disregarding the remainder of the percept history. The history of all that an agent has perceived up to this point is known as percept history.
A model-based reflex agent operates by finding a rule whose condition matches the current situation. A model-based agent can deal with partially observable environments by using a model of the world. The agent must keep track of its internal state, which is updated with each percept and depends on the percept history. The current state is stored inside the agent, which maintains some kind of structure describing the parts of the world that cannot be seen.
Updating the state necessitates knowledge of:
● how the world evolves without the intervention of the agent, and
● what effect the agent's activities have on the world
There are two crucial components in a model-based agent:
● Model: Knowledge about "how things happen in the world"; an agent that uses such knowledge is called a model-based agent.
● Internal State: A representation of the current state based on the percept history.
Fig 5: Model based agent
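The two components can be sketched for an assumed two-square vacuum world (locations "A" and "B"). The class name, the percept format, and the rule that sucking leaves the square clean are illustrative assumptions.

```python
class ModelBasedVacuumAgent:
    """Reflex agent that keeps an internal state of the squares it cannot see."""

    def __init__(self):
        # Internal state: last known status of each square (None = never observed).
        self.state = {"A": None, "B": None}

    def __call__(self, percept):
        location, status = percept
        self.state[location] = status            # update the state from the percept
        if status == "Dirty":
            # Model of the agent's own action: sucking leaves the square clean.
            self.state[location] = "Clean"
            return "Suck"
        if all(s == "Clean" for s in self.state.values()):
            return "NoOp"                        # the model says nothing is left to do
        return "Right" if location == "A" else "Left"

agent = ModelBasedVacuumAgent()
actions = [agent(p) for p in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]]
```

Here `actions` is `["Suck", "Right", "NoOp"]`: the agent stops once its internal state says both squares are clean, even though it can perceive only one square at a time.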
Key takeaway
A model-based agent can deal with partially observable environments by using a model of the world. The agent must keep track of its internal state, which is updated with each percept and depends on the percept history.
Goal-based agents make decisions based on how far they currently are from their goal (a description of desirable situations). Every action they take is aimed at reducing the distance between them and the goal. This gives the agent the ability to choose among several possibilities, selecting the one that leads to the desired outcome.
The knowledge that underpins its judgments is explicitly represented and modifiable, allowing these agents to be more adaptable. They usually necessitate some searching and planning. The behaviour of the goal-based agent can be simply modified.
Fig 6: Goal based agent
Utility-based agents are designed around their end uses, that is, around how useful each outcome is to the agent. When there are several viable options, utility-based agents are used to determine which is the best. They choose actions for each state based on a preference (utility). Sometimes reaching the goal is not enough: to get to a place, we might look for a faster, safer, and less expensive option.
The agent's happiness should therefore be taken into account. Utility describes how "happy" the agent is. Because of the uncertainty in the world, a utility-based agent chooses the action that maximizes the expected utility. A utility function maps a state to a real number that represents the degree of happiness associated with it.
Fig 7: Utility based agent
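A minimal sketch of expected-utility maximization is given below. The route names, the probabilities, and the particular utility function (minutes plus twice the cost, negated) are made-up values for illustration, not part of any standard.

```python
def expected_utility(outcomes, utility):
    """outcomes is a list of (probability, state) pairs for one action."""
    return sum(p * utility(state) for p, state in outcomes)


def choose_action(actions, utility):
    """Return the action whose expected utility is highest."""
    return max(actions, key=lambda a: expected_utility(actions[a], utility))


def trip_utility(state):
    """Hypothetical utility function: maps a state (minutes, cost) to a
    real number. Shorter and cheaper trips get a higher value."""
    minutes, cost = state
    return -(minutes + 2 * cost)


# Hypothetical routes: each action leads to possible states with probabilities.
ROUTES = {
    "highway": [(0.8, (30, 5)), (0.2, (60, 5))],  # fast unless there is a jam
    "back_road": [(1.0, (45, 0))],                # slower but free and reliable
    "toll_road": [(1.0, (25, 12))],               # fastest but expensive
}

for name, outcomes in ROUTES.items():
    print(name, expected_utility(outcomes, trip_utility))
print("chosen:", choose_action(ROUTES, trip_utility))
```

All three routes reach the destination, so a goal-based agent would treat them as equally good. The utility function is what lets the agent prefer one of them.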
Key takeaway
Utility-based agents are built around the usefulness (utility) of the outcomes they can reach. When there are several viable alternatives, utility-based agents are used to determine which one is best.
Some programmes work in a completely artificial world, with just keyboard input, databases, computer file systems, and character output on a screen.
On the other hand, some software agents (software robots or softbots) exist in rich, unlimited softbot domains. The simulated environment is extremely detailed and complex, and the software agent must choose from a large number of actions in real time. A softbot that scans a customer's online preferences and shows them interesting items operates in both a real and an artificial environment.
The most well-known artificial environment is the Turing Test environment, in which one real and other artificial agents are tested on an equal footing.
Turing Test
The Turing Test can be used to assess the success of a system's intelligent behaviour.
The test involves two people and a machine that is to be evaluated. One of the two people takes on the role of tester. All three are in different rooms. The tester does not know which respondent is the human and which is the machine. The tester types questions and sends them to both, and each replies with typed responses.
The machine's goal in this test is to fool the tester. If the tester cannot tell the machine's responses from the human's, the machine is deemed to be intelligent.
Properties of Environment
The environment has a variety of characteristics.
● Discrete / Continuous - If the environment has a limited number of distinct, clearly defined states (for example, chess), it is discrete; otherwise, it is continuous (for example, driving).
● Observable / Partially Observable - If the whole state of the environment can be determined from the percepts at each point in time, it is observable; otherwise, it is only partially observable.
● Static / Dynamic - The environment is static if it does not change while an agent is acting; otherwise, it is dynamic.
● Single agent / Multiple agents - The environment may contain other agents, which may be of the same kind as the agent or of a different kind.
● Accessible / Inaccessible - If an agent's sensory apparatus can access the entire state of the environment, the environment is accessible to that agent.
● Deterministic / Non-deterministic - If the current state of the environment and the actions of the agent completely determine the next state of the environment, the environment is deterministic; otherwise, it is non-deterministic.
● Episodic / Non-episodic - In an episodic environment, each episode consists of the agent perceiving and then acting. The quality of its action depends only on the episode itself. The actions in prior episodes have no bearing on subsequent episodes.
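The properties above can be recorded as a simple data structure, as in the sketch below. The class EnvironmentProfile and the helper challenges are hypothetical names. Only the discrete/continuous labels for chess and driving come from the text above; the other labels follow the usual textbook classification (chess played without a clock) and should be read as assumptions. Accessible / Inaccessible is not given its own field because it describes the same idea as observability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentProfile:
    """One boolean per property; True is the 'easier' side of each pair."""
    name: str
    discrete: bool
    fully_observable: bool
    static: bool
    single_agent: bool
    deterministic: bool
    episodic: bool


def challenges(env):
    """List the harder side of each property that applies to env."""
    labels = [
        (env.discrete, "continuous"),
        (env.fully_observable, "partially observable"),
        (env.static, "dynamic"),
        (env.single_agent, "multiple agents"),
        (env.deterministic, "non-deterministic"),
        (env.episodic, "non-episodic"),
    ]
    return [label for easy, label in labels if not easy]


# Assumed classifications, used only for illustration.
CHESS = EnvironmentProfile("chess", discrete=True, fully_observable=True,
                           static=True, single_agent=False,
                           deterministic=True, episodic=False)
DRIVING = EnvironmentProfile("driving", discrete=False, fully_observable=False,
                             static=False, single_agent=False,
                             deterministic=False, episodic=False)

print(CHESS.name, challenges(CHESS))
print(DRIVING.name, challenges(DRIVING))
```

Describing an environment this way makes it easier to see why driving is a much harder task for an agent than chess: every property falls on the harder side.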
References:
- Elaine Rich, Kevin Knight and Shivashankar B Nair, “Artificial Intelligence”, McGraw Hill Publication, 2009.
- Dan W. Patterson, “Introduction to Artificial Intelligence and Expert Systems”, Pearson Publication, 2015.
- Saroj Kaushik, “Artificial Intelligence”, Cengage Learning, 2011.