Rough set
In computer science, a rough set is a way to describe groups or collections of items that helps us understand uncertainty and incomplete information. It was first described by Polish computer scientist Zdzisław I. Pawlak. A rough set works by creating two sets around the original group: a lower approximation, which contains items we are sure belong to the group, and an upper approximation, which contains items that might belong to the group.
The idea behind rough sets is to deal with situations where we don't have perfect information. For example, if we are trying to sort objects based on some features and some objects share the same features, rough sets help us understand what we know for certain and what is still unclear.
In the basic form of rough set theory, these lower and upper approximations are clear, exact sets. However, in other versions, they can be fuzzy sets, allowing for even more flexibility when dealing with uncertainty. This makes rough set theory a useful tool in many areas, such as data analysis, decision-making, and artificial intelligence.
Definitions
The following section explains the basic ideas of rough set theory, first described by Zdzisław I. Pawlak. Rough sets help us understand groups of items when we don’t have perfect information.
In simple terms, a rough set looks at a group of items and creates two other groups: one that we are sure belongs to the original group (the lower approximation) and one that might include items from the original group (the upper approximation). This helps us deal with uncertainty — we know what is definitely in the group and what could be in the group.
The theory uses information systems, which are tables of data where each row is an object and each column is an attribute. By looking at these attributes, we can group objects that are similar, called equivalence classes. These classes help us build the lower and upper approximations of any target group we are interested in.
| Object | P1 | P2 | P3 | P4 | P5 |
|---|---|---|---|---|---|
| O1 | 1 | 2 | 0 | 1 | 1 |
| O2 | 1 | 2 | 0 | 1 | 1 |
| O3 | 2 | 0 | 0 | 1 | 0 |
| O4 | 0 | 0 | 1 | 2 | 1 |
| O5 | 2 | 1 | 0 | 2 | 1 |
| O6 | 0 | 0 | 1 | 2 | 2 |
| O7 | 2 | 0 | 0 | 1 | 0 |
| O8 | 0 | 1 | 2 | 2 | 1 |
| O9 | 2 | 1 | 0 | 2 | 2 |
| O10 | 2 | 0 | 0 | 1 | 0 |
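As a sketch of how these ideas can be computed, here is a small Python example that builds the equivalence classes from the table above and derives the lower and upper approximations. The target set X = {O1, O2, O3} is a hypothetical choice for illustration, not something given in the article.

```python
from collections import defaultdict

# Information system from the table above: object -> attribute values (P1..P5).
table = {
    "O1": (1, 2, 0, 1, 1), "O2": (1, 2, 0, 1, 1),
    "O3": (2, 0, 0, 1, 0), "O4": (0, 0, 1, 2, 1),
    "O5": (2, 1, 0, 2, 1), "O6": (0, 0, 1, 2, 2),
    "O7": (2, 0, 0, 1, 0), "O8": (0, 1, 2, 2, 1),
    "O9": (2, 1, 0, 2, 2), "O10": (2, 0, 0, 1, 0),
}

def equivalence_classes(attrs):
    """Group objects that agree on every attribute in `attrs` (0-based indices)."""
    classes = defaultdict(set)
    for obj, values in table.items():
        key = tuple(values[i] for i in attrs)
        classes[key].add(obj)
    return list(classes.values())

def approximations(attrs, target):
    """Lower/upper approximation of `target` under the indiscernibility relation."""
    lower, upper = set(), set()
    for cls in equivalence_classes(attrs):
        if cls <= target:    # class entirely inside the target: certain members
            lower |= cls
        if cls & target:     # class overlapping the target: possible members
            upper |= cls
    return lower, upper

# Hypothetical target concept X = {O1, O2, O3}, using all five attributes.
lower, upper = approximations([0, 1, 2, 3, 4], {"O1", "O2", "O3"})
print(sorted(lower))  # ['O1', 'O2']
print(sorted(upper))  # ['O1', 'O10', 'O2', 'O3', 'O7']
```

Note that O3 lands only in the upper approximation: it shares all five attribute values with O7 and O10, so the data cannot tell them apart.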
Rule extraction
The category representations discussed above are all extensional in nature; that is, a category or complex class is simply the sum of all its members. To represent a category is, then, just to be able to list or identify all the objects belonging to that category. However, extensional category representations have very limited practical use, because they provide no insight for deciding whether novel (never-before-seen) objects are members of the category.
What is generally desired is an intensional description of the category, a representation of the category based on a set of rules that describe the scope of the category. The choice of such rules is not unique, and therein lies the issue of inductive bias. See Version space and Model selection for more about this issue.
Several rule-extraction methods exist; one well-known procedure is due to Ziarko & Shan (1995).
Rules induced from the lower approximation of the concept certainly describe the concept, so such rules are called certain. Rules induced from the upper approximation only possibly describe the concept, so these rules are called possible. For rule induction, the LERS (Learning from Examples based on Rough Sets) data mining system uses three algorithms: LEM1, LEM2, and IRIM.
The LEM2 algorithm of LERS is frequently used for rule induction and is used not only in LERS but also in other systems, e.g., in RSES. LEM2 explores the search space of attribute–value pairs. Its input data set is a lower or upper approximation of a concept, so its input data set is always consistent. In general, LEM2 computes a local covering and then converts it into a rule set.
| Object | O4 | O5 | O6 | O8 | O9 |
|---|---|---|---|---|---|
| O1 | (P1, 1), (P2, 2), (P3, 0) | (P1, 1), (P2, 2) | (P1, 1), (P2, 2), (P3, 0) | (P1, 1), (P2, 2), (P3, 0) | (P1, 1), (P2, 2) |
| O2 | (P1, 1), (P2, 2), (P3, 0) | (P1, 1), (P2, 2) | (P1, 1), (P2, 2), (P3, 0) | (P1, 1), (P2, 2), (P3, 0) | (P1, 1), (P2, 2) |
| O3 | (P1, 2), (P3, 0) | (P2, 0) | (P1, 2), (P3, 0) | (P1, 2), (P2, 0), (P3, 0) | (P2, 0) |
| O7 | (P1, 2), (P3, 0) | (P2, 0) | (P1, 2), (P3, 0) | (P1, 2), (P2, 0), (P3, 0) | (P2, 0) |
| O10 | (P1, 2), (P3, 0) | (P2, 0) | (P1, 2), (P3, 0) | (P1, 2), (P2, 0), (P3, 0) | (P2, 0) |
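Attribute-value pairs like those in the table above are the building blocks LEM2 searches through. The following Python sketch is a much-simplified greedy covering in the spirit of LEM2, not the full algorithm (real LEM2 also drops redundant conditions and computes a minimal local covering). The concept {O1, O2} is a hypothetical lower approximation used only for illustration.

```python
# Same information system as in the earlier table: object -> (P1..P5).
table = {
    "O1": (1, 2, 0, 1, 1), "O2": (1, 2, 0, 1, 1),
    "O3": (2, 0, 0, 1, 0), "O4": (0, 0, 1, 2, 1),
    "O5": (2, 1, 0, 2, 1), "O6": (0, 0, 1, 2, 2),
    "O7": (2, 0, 0, 1, 0), "O8": (0, 1, 2, 2, 1),
    "O9": (2, 1, 0, 2, 2), "O10": (2, 0, 0, 1, 0),
}

def block(pair):
    """Objects whose attribute pair[0] (0-based index) has value pair[1]."""
    attr, val = pair
    return {o for o, vals in table.items() if vals[attr] == val}

def greedy_covering(concept):
    """Greedy rule induction in the spirit of LEM2 (simplified sketch)."""
    rules, goal = [], set(concept)
    while goal:
        conds, covered, g = [], set(table), set(goal)
        while not covered <= concept:
            # Candidate pairs come from the objects still to be covered.
            candidates = {(a, v) for o in g for a, v in enumerate(table[o])}
            # Prefer pairs covering most of the goal; break ties by smaller block.
            best = max(sorted(candidates),
                       key=lambda p: (len(block(p) & g), -len(block(p))))
            conds.append(best)
            covered &= block(best)
            g &= block(best)
        rules.append(conds)
        goal -= covered  # these objects are now covered by a rule
    return rules

# One certain rule suffices here: (attribute 0, value 1), i.e. P1 = 1.
print(greedy_covering({"O1", "O2"}))  # [[(0, 1)]]
```

A single condition P1 = 1 covers exactly {O1, O2}, so the sketch stops after one rule.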
Incomplete data
Rough set theory helps us find rules even when some information is missing from our data. It can handle three types of missing details: values that were lost, values that can be any option within a group, and values that don't matter.
Scientists have studied two special cases: one where all missing values were lost, and another where all missing values didn't matter. When a missing value can be any option within a group, we can fill it in using what we know about that group. For example, if a patient's temperature is missing but we know they have the flu, and all flu patients have high or very-high temperatures, we might guess their temperature was high or very-high. Special methods also allow us to work with all three types of missing values together.
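The flu-and-temperature example above can be sketched in a few lines of Python: a missing value that "can be any option within a group" is replaced by the set of values observed among the other members of that group. The patient records here are made up for illustration.

```python
# Hypothetical records: a missing temperature (None) for a flu patient is
# restricted to the temperatures seen among the other flu patients.
patients = [
    {"flu": "yes", "temperature": "high"},
    {"flu": "yes", "temperature": "very-high"},
    {"flu": "yes", "temperature": None},   # attribute-concept missing value
    {"flu": "no",  "temperature": "normal"},
]

def concept_values(records, concept_attr, concept_val, attr):
    """Values of `attr` observed among records that belong to the concept."""
    return {r[attr] for r in records
            if r[concept_attr] == concept_val and r[attr] is not None}

possible = concept_values(patients, "flu", "yes", "temperature")
print(sorted(possible))  # ['high', 'very-high']
```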
Applications
Rough set methods are useful tools in machine learning and data mining. They help find important rules and reduce the number of variables to study, making analysis easier. These methods have been used in many fields, such as bioinformatics, economics, medicine, and software engineering. Recently, rough sets have been used to help make decisions by dividing problems into three groups: accept, reject, and defer, which opens up new possibilities for future uses.
History
The idea of rough sets was introduced by Pawlak in 1981 as a way to handle unclear or vague concepts. Many researchers have since explored the mathematical properties and applications of rough sets. These tools help us understand and work with ambiguity, vagueness, and general uncertainty in data.
Extensions and generalizations
Since rough sets were first introduced, many extensions and generalizations have been developed. Early work focused on how rough sets relate to fuzzy sets, with some researchers seeing them as complementary and others viewing rough sets as a broader category that includes fuzzy sets.
Three important extensions of rough set theory are:
- Dominance-based rough set approach for handling multiple criteria in decision-making.
- Decision-theoretic rough sets, which use probability to make decisions with minimum risk.
- Game-theoretic rough sets, which apply game theory to optimize decision-making processes.
Rough sets can also be defined using a rough membership function, which measures the fraction of an element's equivalence class that lies inside the set. This expresses the probability, based on the available information, that the element belongs to the set. The approach differs from fuzzy sets because the membership of a union or intersection of sets cannot, in general, be computed from the memberships of the individual sets.
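The rough membership function can be computed directly from the equivalence classes of the earlier table. As before, the target set X is a hypothetical choice for illustration.

```python
# Same information system as in the earlier table: object -> (P1..P5).
table = {
    "O1": (1, 2, 0, 1, 1), "O2": (1, 2, 0, 1, 1),
    "O3": (2, 0, 0, 1, 0), "O4": (0, 0, 1, 2, 1),
    "O5": (2, 1, 0, 2, 1), "O6": (0, 0, 1, 2, 2),
    "O7": (2, 0, 0, 1, 0), "O8": (0, 1, 2, 2, 1),
    "O9": (2, 1, 0, 2, 2), "O10": (2, 0, 0, 1, 0),
}

def rough_membership(obj, target, attrs):
    """mu_X(x) = |[x] & X| / |[x]|: fraction of x's class inside the target."""
    cls = {o for o, v in table.items()
           if all(v[i] == table[obj][i] for i in attrs)}
    return len(cls & target) / len(cls)

X = {"O1", "O2", "O3"}   # hypothetical target set
attrs = range(5)         # use all five attributes
print(rough_membership("O1", X, attrs))  # 1.0: the class {O1, O2} lies inside X
print(rough_membership("O3", X, attrs))  # class {O3, O7, O10}, only O3 is in X
```

A value of 1 means certain membership (the object is in the lower approximation); a value strictly between 0 and 1 means the object sits in the boundary region.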
This article is a child-friendly adaptation of the Wikipedia article on Rough set, available under CC BY-SA 4.0.