quantum machine learning

assigned to one of the pair of answers:

          | ψYY |
    ψ  =  | ψYN | .
          | ψNY |
          | ψNN |

We set the initial state to a uniform distribution, ψij = 1/2.
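A minimal NumPy sketch of this state and its uniform initialization (the basis ordering is the one shown above):

```python
import numpy as np

# Basis order (YY, YN, NY, NN): amplitude psi_ij for answering i to the
# first attribute and j to the second.
psi = np.full(4, 0.5, dtype=complex)   # uniform initial state, psi_ij = 1/2

norm = np.linalg.norm(psi)             # a quantum state must have unit norm
probs = np.abs(psi) ** 2               # initial response probabilities
```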

Second, for a context (A, B), where A represents the first attribute and B represents the second, we constructed two projectors: one projector, PA(Y), for answering Yes to the first attribute, and another projector, PB(Y), for answering Yes to the second attribute. These two projectors were defined as follows:

    I2 = | 1  0 | ,      MY = | 1  0 | ,
         | 0  1 |             | 0  0 |

    PA(Y) = UA† · (MY ⊗ I2) · UA,
    PB(Y) = UB† · (I2 ⊗ MY) · UB.

The projectors for the No answers were then defined by PA(N) = I4 − PA(Y) and PB(N) = I4 − PB(Y), where I4 = I2 ⊗ I2.

Third, the unitary matrix for each attribute was computed from a Hermitian matrix using the matrix exponential function

UA = exp(−i · (π/2) · HA),

UB = exp(−i · (π/2) · HB).
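The construction up to this point can be sketched directly in NumPy/SciPy. The 2×2 Hermitian block used to build HA below is a placeholder of our own (the model's actual Hamiltonian is defined in the next step); the projector properties checked at the end hold for any Hermitian choice:

```python
import numpy as np
from scipy.linalg import expm

I2 = np.eye(2)
MY = np.diag([1.0, 0.0])                 # Yes outcome for a single attribute
I4 = np.kron(I2, I2)

# Placeholder 2x2 Hermitian block; HA acts on the full 4-d space.
H2 = np.array([[0.3, 1.0],
               [1.0, -0.3]])
HA = np.kron(H2, I2)
UA = expm(-1j * (np.pi / 2) * HA)        # UA = exp(-i*(pi/2)*HA) is unitary

PA_Y = UA.conj().T @ np.kron(MY, I2) @ UA   # projector for Yes on attribute A
PA_N = I4 - PA_Y                            # projector for No on attribute A

# Sanity checks: PA(Y) is Hermitian, idempotent, and with PA(N) resolves I4.
assert np.allclose(PA_Y, PA_Y.conj().T)
assert np.allclose(PA_Y @ PA_Y, PA_Y)
assert np.allclose(PA_Y + PA_N, I4)
```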

Fourth, the Hamiltonian matrix for the first attribute within a context (A, B) was defined as follows:

    HA = VA ⊗ I2,

    VA = (1/√(1 + μA²)) · | μA    1 | .
                          |  1  −μA |

This “rotates” the amplitudes for the first attribute toward or away from the Yes answer depending on the parameter μA. The unitary matrix for the second attribute within the same context (A, B) was initially defined in a similar manner as

    HB = I2 ⊗ VB,

    VB = (1/√(1 + μB²)) · | μB    1 | ,
                          |  1  −μB |

which “rotates” the amplitudes for the second attribute toward or away from the Yes answer depending on the parameter μB. However, we also added another component, HC , to the Hamiltonian for the second attribute

    HB = (I2 ⊗ VB) − γA,B · HC,

    HC = (1/√2) · |  1   1   0   0 |
                  |  1  −1   0   0 | .
                  |  0   0  −1   1 |
                  |  0   0   1   1 |

The HC component in the last equation was a key parameter used to entangle the state. When γ is positive, it creates an entangled state with high amplitudes on ψYY and ψNN, and when γ is negative it creates high amplitudes on ψYN and ψNY. An entangled state is required to produce dependencies in the predicted joint probability tables. The parameter γA,B was permitted to vary across each of the 6 contexts to allow different dependencies across the 6 tables.

Finally, we used the quantum rules to compute the choice probabilities. For example, the probability of YN when asked about attributes AI was

computed by

p(YN|AI) = ‖PI(N) · PA(Y) · ψ‖²

and the probability of NY when asked about the attributes SI was computed by

p(NY|SI) = ‖PI(Y) · PS(N) · ψ‖².
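Putting the pieces together, the choice probabilities for one context can be computed in a few lines of NumPy/SciPy. The μ and γ values below are arbitrary illustrations, and the explicit matrix used for HC is our assumption (a block-diagonal entangler of the kind used in related quantum question-answering models), not necessarily the exact form used here; the completeness check at the end holds for any parameter values:

```python
import numpy as np
from scipy.linalg import expm

I2, MY = np.eye(2), np.diag([1.0, 0.0])
I4 = np.eye(4)

def V(mu):
    # 2x2 Hermitian block that rotates an attribute toward Yes when mu > 0
    return (1.0 / np.sqrt(1 + mu**2)) * np.array([[mu, 1.0], [1.0, -mu]])

# Assumed entangler: rotates the second answer toward agreement with the first
HC = (1.0 / np.sqrt(2)) * np.array([[1.0,  1.0,  0.0, 0.0],
                                    [1.0, -1.0,  0.0, 0.0],
                                    [0.0,  0.0, -1.0, 1.0],
                                    [0.0,  0.0,  1.0, 1.0]])

def joint_table(muA, muB, gamma):
    """Predicted 2x2 table for context (A, B): p(ij) = ||PB(j) PA(i) psi||^2."""
    psi = np.full(4, 0.5, dtype=complex)                # uniform initial state
    UA = expm(-1j * (np.pi / 2) * np.kron(V(muA), I2))
    UB = expm(-1j * (np.pi / 2) * (np.kron(I2, V(muB)) - gamma * HC))
    PA = {'Y': UA.conj().T @ np.kron(MY, I2) @ UA}
    PA['N'] = I4 - PA['Y']
    PB = {'Y': UB.conj().T @ np.kron(I2, MY) @ UB}
    PB['N'] = I4 - PB['Y']
    # First attribute's projector is applied to psi first, then the second's
    return {i + j: np.linalg.norm(PB[j] @ PA[i] @ psi) ** 2
            for i in 'YN' for j in 'YN'}

t0 = joint_table(0.4, 0.8, 0.0)   # gamma = 0: answers are independent
t1 = joint_table(0.4, 0.8, 1.0)   # gamma > 0: answers become dependent
print(t0, t1)
```

With γ = 0 the predicted table factorizes into its marginals (odds ratio 1); a nonzero γ is what lets the model reproduce the dependencies seen in the observed tables.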

This HSM model entails estimating 10 parameters for each type of stimulus: one parameter μX associated with each of the 4 variables, and one parameter γA,B associated with each of the 6 contexts. The parameters were estimated from the data separately for each type of stimulus using maximum likelihood methods. The parameter estimates are shown in Table 5. Positive values of μ tend to increase the marginal probabilities of a variable, and


Table 5: Parameters estimated for each type of avatar

                      μA        μI        μS        μH
    non-sexualized    0.4065    1.1792    0.7627    1.0852
    sexualized        0.3482   -0.4430    0.3339   -0.5470

                      γAI       γAS       γAH       γIS       γIH       γSH
    non-sexualized    0.3163    0.4465    0.2110   -1.6456   -1.9623    0.3717
    sexualized        0.1924   -1.5400    0.2267    0.2798   -2.0993    0.2769

Table 6: Probabilities predicted by the HSM model. The predictions are organized in the same manner as the observed data table.

                  non-sexualized                      sexualized
          YY      YN      NY      NN          YY      YN      NY      NN
    AI    0.6584  0.0344  0.2373  0.0700      0.2751  0.3923  0.0645  0.2682
    AS    0.6224  0.0704  0.1789  0.1284      0.5008  0.1666  0.1003  0.2323
    AH    0.6544  0.0384  0.2470  0.0603      0.2593  0.4081  0.0497  0.2829
    IS    0.7619  0.1517  0.0274  0.0590      0.2326  0.0594  0.3467  0.3613
    IH    0.8261  0.0875  0.0406  0.0458      0.1569  0.1350  0.1112  0.5969
    SH    0.7514  0.0699  0.1327  0.0460      0.2776  0.3834  0.0457  0.2933

negative values tend to decrease the marginals. Positive values of γ tend to entangle answers that agree (YY,NN), and negative values tend to entangle answers that disagree (YN,NY).

The predictions of the HSM model are shown in Table 6. Comparing Tables 4 and 6, one can see that the model makes fairly accurate predictions for the contexts from both stimuli.

7.3. Comparison of HSM and Joint probability models

We compared the fits of the 4-way Joint probability model to the HSM model using the Bayesian information criterion (BIC). The HSM model is not nested within (not a special case of) the Joint probability model, so we used the BIC, which can be used to compare non-nested models. The Joint probability model has 15 · 2 = 30 free parameters across both types of stimuli, which is 10 more than the 10 · 2 = 20 parameters used by the HSM model. A model with more parameters usually fits better simply because of the flexibility provided by the extra parameters. The BIC provides a balance between accuracy and parsimony. It compares the G2


for each model with a penalty for extra parameters: BIC(HSM vs. Joint) = (G2HSM − G2Joint) − p · ln(N), where p is the difference in the number of parameters (10 in this case) and N is the total number of observations (N = 8832 in this case). If BIC(HSM vs. Joint) is negative, then the simpler HSM model is preferred, and if it is positive, then the more complex Joint model is preferred.

Recall that for the Joint probability model, G2sat − G2Joint = 20.6521; the corresponding value for the HSM model is G2sat − G2HSM = 82.79, and the difference equals G2HSM − G2Joint = 62.14, which is far below the penalty 10 · ln(8832) = 90.86 for the extra 10 parameters used by the Joint model.

Therefore the BIC(HSM vs. Joint) is negative and clearly favors the HSM model over the Joint probability model.
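The BIC arithmetic above is easy to verify (all values are taken from the text):

```python
import math

g2_diff = 82.79 - 20.6521          # G2HSM - G2Joint, from the fits above
penalty = 10 * math.log(8832)      # p * ln(N) for the 10 extra Joint parameters
bic_hsm_vs_joint = g2_diff - penalty

# Negative value: the simpler HSM model is preferred.
print(round(g2_diff, 2), round(penalty, 2), round(bic_hsm_vs_joint, 2))
# → 62.14 90.86 -28.72
```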

8. Summary and alternative probabilistic models

8.1. Summary

HSM models provide a simple and low-dimensional method for data fusion when researchers collect multiple contingency tables formed from measurements of subsets of p variables. The power of HSM models to perform data fusion is produced by the inclusion of incompatible variables. When variables are compatible, quantum probability theory works like classical probability theory, and the Hilbert space dimensionality increases exponentially as the number of compatible variables increases. When variables are incompatible, however, quantum probability theory departs from classical probability theory, and the Hilbert space dimensionality remains constant as the number of incompatible variables increases. This reduction in dimensionality is achieved by rotating the basis vectors to generate new incompatible variables. In this way, the inclusion of additional variables does not increase the dimension of the vector space used to represent the data.

This article describes the general methods that we use to build HSM models. We illustrated these principles using an artificial data set in which 3 variables were used to generate 6 different contingency tables. An HSM model was built that perfectly reproduced the 6 tables, even though no 3-way joint probability could fit the data. We also applied these principles to a real data set consisting of 6 different 2 × 2 tables constructed from pairs of 4 binary variables. The joint probability model based on the 4 observed variables produced statistically significant deviations from the 6 observed data tables. A simpler HSM model produced a better account than the


joint probability model of the 6 observed data tables based on a BIC model comparison index.

HSM models provide new contributions to the current set of probabilistic and statistical tools for contingency table analysis. Loglinear/categorical data models apply only to a single table containing all p variables, whereas HSM models can be applied to multiple tables containing different subsets of the p variables. Bayesian network models can also be applied to collections of tables; however, they assume the existence of a complete p-way joint distribution, and it is often the case that no complete p-way joint distribution exists. HSM models can be applied to collections of tables even when no p-way joint distribution exists to reproduce the collection.

8.2. Alternative probabilistic models

In this article, we have provided empirical evidence that HSM theory provides a useful way to model collections of contingency tables formed from subsets of p variables that cannot be reproduced by a complete p-way joint distribution. However, this is not the only way, and there are other probabilistic models that could be considered. One way that we mentioned earlier is to postulate a higher-dimensional joint distribution with additional random variables [16] and use probabilistic database programming methods [8] to form the joint distribution. A second way is to relax some of the axioms of Kolmogorov theory [23] and form a generalized probability theory. A third way is to expand the field over which the vector space is defined from complex to hyperbolic [20]. A fourth way is to propose a general probabilistic mechanism capable of generating either classical or quantum probabilities as a special case [3]. However, at this point in time, the advantage of HSM over these alternatives is that HSM is based on a coherent set of axioms (supporting Gleason’s theorem), HSM provides a specific and well-defined algorithm, the models derived from HSM use a reasonably small number of parameters, and HSM models can be rigorously tested with empirical data. At this time, the alternatives mentioned above either lack an axiomatic foundation, or they are too general to specify and apply to real data, or they do not permit rigorous empirical tests.

9. References

[1] S. Abramsky. Relational databases and Bell’s theorem. In search of elegance in the theory and practice of computation, Tannen, V. and Wong,
