Value granulation (discretization/quantization) - Granular computing
Benefits of value granulation: implications exist here at the coarse resolution of $\{X_i, Y_j\}$ that do not exist at the higher resolution of $\{x_i, y_j\}$; in particular, $\forall x_i, y_j : x_i \not\to y_j$, while at the same time $\forall X_i \, \exists Y_j : X_i \leftrightarrow Y_j$.
For example, a simple learner or pattern recognition system may seek to extract regularities satisfying a conditional probability threshold such as $p(y = y_j \mid x = x_i) \geq \alpha$. In the special case where $\alpha = 1$, this recognition system is essentially detecting logical implication of the form $x = x_i \rightarrow y = y_j$ or, in words, "if $x = x_i$, then $y = y_j$". The system's ability to recognize such implications (or, in general, conditional probabilities exceeding the threshold) is partially contingent on the resolution at which the system analyzes the variables.
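The threshold test above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the data arrive as a list of observed `(x, y)` pairs and that probabilities are estimated by simple relative frequencies; the function name `detect_rules` and the sample observations are hypothetical.

```python
from collections import Counter

def detect_rules(pairs, alpha=1.0):
    """Return {(x_i, y_j): p(y=y_j | x=x_i)} for every pair meeting the threshold."""
    joint = Counter(pairs)                    # how often each (x, y) pair occurs
    marginal = Counter(x for x, _ in pairs)   # how often each x value occurs
    rules = {}
    for (x, y), n in joint.items():
        p = n / marginal[x]                   # relative-frequency estimate
        if p >= alpha:
            rules[(x, y)] = p
    return rules

# Hypothetical observations, for illustration only.
observations = [("a", "u"), ("a", "u"), ("a", "v"), ("b", "v"), ("b", "v")]

strict = detect_rules(observations, alpha=1.0)  # logical implications only
loose = detect_rules(observations, alpha=0.6)   # probabilistic regularities
```

With $\alpha = 1$ only the pair `("b", "v")` survives, since `"b"` never occurs with any other value of $y$; lowering the threshold to 0.6 also admits `("a", "u")`, whose estimated conditional probability is 2/3.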
As an example of this last point, consider the feature space shown to the right. The variables may each be regarded at two different resolutions. Variable $x$ may be regarded at a high (quaternary) resolution, wherein it takes on the four values $\{x_1, x_2, x_3, x_4\}$, or at a lower (binary) resolution, wherein it takes on the two values $\{X_1, X_2\}$. Similarly, variable $y$ may be regarded at a high (quaternary) resolution or at a lower (binary) resolution, where it takes on the values $\{y_1, y_2, y_3, y_4\}$ or $\{Y_1, Y_2\}$, respectively. It will be noted that at the high resolution there are no detectable implications of the form $x = x_i \rightarrow y = y_j$, since every $x_i$ is associated with more than one $y_j$ and thus, for all $x_i$, $p(y = y_j \mid x = x_i) < 1$. However, at the low (binary) variable resolution, two bilateral implications become detectable: $x = X_1 \leftrightarrow y = Y_1$ and $x = X_2 \leftrightarrow y = Y_2$, since $X_1$ occurs iff $Y_1$ occurs, and $X_2$ occurs iff $Y_2$ occurs. Thus, a pattern recognition system scanning for implications of this kind would find them at the binary variable resolution, but would fail to find them at the higher quaternary variable resolution.
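The effect can be reproduced on a toy data set. The sketch below is an illustration under stated assumptions, not a reconstruction of the figure: it assumes a hypothetical feature space in which fine values 1 and 2 of one variable co-occur only with fine values 1 and 2 of the other, and likewise for 3 and 4, and it assumes a hypothetical granulation `coarsen` that merges fine values 1, 2 into granule 1 and 3, 4 into granule 2.

```python
from collections import defaultdict

def implications(pairs):
    """Return {x: y} for each x value that co-occurs with exactly one y value."""
    seen = defaultdict(set)
    for x, y in pairs:
        seen[x].add(y)
    return {x: next(iter(ys)) for x, ys in seen.items() if len(ys) == 1}

def bilateral(pairs):
    """Return the (x, y) pairs for which x -> y and y -> x both hold."""
    forward = implications(pairs)
    backward = implications([(y, x) for x, y in pairs])
    return {(x, y) for x, y in forward.items() if backward.get(y) == x}

def coarsen(v):
    """Hypothetical granulation: fine values 1, 2 -> granule 1; 3, 4 -> granule 2."""
    return 1 if v <= 2 else 2

# Assumed quaternary-resolution observations (x_i, y_j), for illustration only.
fine = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4)]
# The same observations viewed at binary resolution.
coarse = [(coarsen(x), coarsen(y)) for x, y in fine]

fine_rules = bilateral(fine)      # no implications at the fine resolution
coarse_rules = bilateral(coarse)  # both bilateral implications appear
```

At the quaternary resolution every value of $x$ co-occurs with two values of $y$, so `fine_rules` is empty; after coarsening, `coarse_rules` contains the two pairs `(1, 1)` and `(2, 2)`, matching the two bilateral implications described in the text.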
Issues and methods
It is not feasible to exhaustively test all possible discretization resolutions on all variables in order to see which combination of resolutions yields interesting or significant results. Instead, the feature space must be preprocessed (often by an entropy analysis of some kind) so that some guidance can be given as to how the discretization process should proceed. Moreover, one cannot generally achieve good results by naively analyzing and discretizing each variable independently, since this may obliterate the very interactions that one had hoped to discover.
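As one possible reading of "entropy analysis", the sketch below chooses a single cut point on a numeric variable so that a second variable is as predictable as possible within the two resulting bins. It is an assumption-laden illustration rather than the method of any particular paper: the names `entropy` and `best_cut` and the sample data are hypothetical, only one binary split is searched, and no stopping criterion is applied.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(xs, ys):
    """Return (cut, weighted_entropy) for the binary split of xs that
    minimizes the size-weighted entropy of ys within the two bins."""
    data = sorted(zip(xs, ys))
    n = len(data)
    best = None
    for k in range(1, n):
        lo, hi = data[k - 1][0], data[k][0]
        if lo == hi:
            continue  # no boundary can be placed between identical values
        left = [y for _, y in data[:k]]
        right = [y for _, y in data[k:]]
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or h < best[1]:
            best = ((lo + hi) / 2, h)  # candidate cut at the midpoint
    return best

# Hypothetical sample: y changes from "a" to "b" somewhere between 0.9 and 1.2.
xs = [0.1, 0.4, 0.5, 0.9, 1.2, 1.6, 1.7, 2.0]
ys = ["a", "a", "a", "a", "b", "b", "b", "b"]
cut, h = best_cut(xs, ys)
```

Because the cut on $x$ is scored by how well it separates the values of $y$, the split is guided by the relationship between the two variables rather than by the distribution of $x$ alone, which is the kind of guidance the paragraph above calls for.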
A sample of papers that address the problem of variable discretization in general, and multiple-variable discretization in particular, is as follows: Chiu, Wong & Cheung (1991), Bay (2001), Liu et al. (2002), Wang & Liu (1998), Zighed, Rabaséda & Rakotomalala (1998), Catlett (1991), Dougherty, Kohavi & Sahami (1995), Monti & Cooper (1999), Fayyad & Irani (1993), Chiu, Cheung & Wong (1990), Nguyen & Nguyen (1998), Grzymala-Busse & Stefanowski (2001), Ting (1994), Ludl & Widmer (2000), Pfahringer (1995), An & Cercone (1999), Chiu & Cheung (1989), Chmielewski & Grzymala-Busse (1996), Lee & Shin (1994), Liu & Wellman (2002), Liu & Wellman (2004).