Subject: artificial intelligence

Note: if hosting this file violates your copyright, please notify us

suai.ru/our-contacts

quantum machine learning

the article for this topic). One group of participants rated the pair of topics in one order (e.g., one group rated relevance first) and another group in the opposite order (e.g., the other rated interest first). The results revealed strong and significant order effects for three of the five topics.

Similar findings of order effects on relevance judgments for document retrieval were reported by Wang et al. [31] using Chinese student participants. A total of 15 different pairs of topics were taken from articles appearing in Wikipedia and the Chinese Daily. In this study, a pair of articles was retrieved on a topic, and the participants had to rate each article in the pair for its relevance. The order in which the articles in a pair were rated was manipulated across groups. Out of the 15 topics, 4 produced statistically significant order effects.

Violations of marginal invariance were observed by DiNunzio, Bruza, and Sitbon [14]. Once again, a large sample of participants was recruited from the Amazon Mechanical Turk platform. In this study, participants were asked to categorize 82 different articles from Reuters about shipping and crude oil. For each article, one group was asked about a single category (e.g., was this article about shipping), and another group was asked about the conjunction (e.g., crude oil and shipping). The main comparison was between the proportion of answers to a single categorization (e.g., the proportion of shipping categorizations when asked alone) and the total proportion summed across the two mutually exclusive and exhaustive conjunctions (e.g., the proportion of oil and shipping plus the proportion of not oil and shipping). Large and significant differences were found between the two conditions: for experienced workers, the total probability from the conjunction task matched the categorization probabilities reported by Reuters, but their proportions for the single categories over- or underestimated those reported by Reuters.

4. Why apply quantum probability to data from information retrieval?

The idea of applying quantum probability to the field of information retrieval was proposed several years ago by Keith van Rijsbergen [28]. Dominic Widdows also promoted the use of quantum theory in information retrieval [30]. For the most recent developments concerning the application of quantum theory to information retrieval, see [22].

One of the reasons that van Rijsbergen gives for applying quantum theory to information retrieval is that it provides a sufficiently general yet rigorous formulation for integrating all three kinds of approaches – logical, vector space, and probabilistic – used in the past for information retrieval. As he points out, important concepts in quantum probability – state vectors, observables, uncertainty, complementarity, superposition, and compatibility – all readily translate to analogous concepts in information retrieval.

Another important reason for considering quantum theory is that much of the data of interest is generated by human judgments, which frequently violate various rules of classical (Kolmogorov) probability. There is now a large literature that applies quantum theory to human judgment and decision making [10] [20]. For example, human judgments have been found to violate the rule that the probability of a conjunction cannot exceed the probability of a constituent event, which can be interpreted as a violation of the law of total probability. Quantum probability provides a formulation for explaining these and other phenomena that appear puzzling from a classical probability point of view (see, e.g., [11] [2]).

5. Basics of quantum probability theory

HSM models are based on quantum probability theory, and so we need to briefly review some of the basic principles of this theory.3

Suppose we have p variables (Yi, i = 1, · · · , p) and each variable, such as Yi, produces one of a finite set of ni values when measured. In quantum theory, Yi is called an observable. The measurement outcome generated by measuring one of the p variables produces an event. For example, if variable Y1 is measured and it produces the value yi, then we observe the event (Y1 = yi).

Quantum theory represents events within a Hilbert space H. Quantum theory defines an event A as a subspace of the Hilbert space. Each subspace, such as A, corresponds to a projector, denoted PA for subspace A, which projects vectors into the subspace.

In quantum theory, a sequence of events, such as A and then B, denoted AB, is represented by the sequence of projectors PBPA. If the projectors commute, PAPB = PBPA, then the product of the two projectors is a projector corresponding to the subspace A ∩ B, that is, PBPA = P(A ∩ B), and the events A and B are said to be compatible. However, if the two projectors do not commute, PBPA ≠ PAPB, then their product is not a projector, and the events are incompatible.

3See [10], [20], [28] for tutorials.
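The distinction between compatible and incompatible events can be illustrated numerically. The following sketch, under illustrative assumptions (a 3-dimensional real space with standard-basis projectors, none of which comes from the article), shows two commuting projectors whose product is itself a projector onto the intersection subspace:

```python
import numpy as np

# Hedged sketch: two compatible events A and B in a 3-D space, built from
# the standard basis. This construction is an illustrative assumption.

e = np.eye(3)
P_A = np.outer(e[0], e[0]) + np.outer(e[1], e[1])   # projects onto span{e1, e2}
P_B = np.outer(e[1], e[1]) + np.outer(e[2], e[2])   # projects onto span{e2, e3}

# The projectors commute ...
assert np.allclose(P_A @ P_B, P_B @ P_A)

# ... and their product is itself a projector, onto A ∩ B = span{e2}.
P_AB = P_B @ P_A
assert np.allclose(P_AB, np.outer(e[1], e[1]))
assert np.allclose(P_AB, P_AB @ P_AB)               # idempotent: P^2 = P
```

When the projectors do not commute, the product PBPA fails the idempotence test and does not correspond to any subspace.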

Quantum theory uses a unit-length state vector, denoted |ψ⟩, to assign probabilities to events as follows:4

p(A) = ‖PA |ψ⟩‖² ,    (2)

Quantum probabilities satisfy an additive measure: p(A) ≥ 0, p(H) = 1, and if PAPB = 0, then p(A ∪ B) = p(A) + p(B). In fact, Equation (2) is the unique way to assign probabilities to subspaces that form an additive measure for dimensions greater than 2 [18].

In quantum theory, the definition of a conditional probability is

p(B|A) = ‖PBPA |ψ⟩‖² / p(A),

and so the probability of the sequence AB equals p(AB) = p(A) · p(B|A) = ‖PBPA |ψ⟩‖². Extensions to sequences with more than two events follow the same principles: the probability of the sequence (AB)C equals ‖PC (PBPA) |ψ⟩‖² for quantum theory.
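These probability rules can be checked with a small numerical sketch. The state vector, the 2-dimensional space, and the particular projectors below are illustrative assumptions, not taken from the article; the point is only that non-commuting projectors produce order effects, p(AB) ≠ p(BA):

```python
import numpy as np

# Hedged sketch: a 2-D Hilbert space with two incompatible events A and B.
# The state |psi> and the 30-degree angle are illustrative choices.

psi = np.array([0.6, 0.8])                # unit-length state vector

# Projector for event A: the ray spanned by the first basis vector.
P_A = np.outer([1.0, 0.0], [1.0, 0.0])

# Projector for event B: a ray rotated 30 degrees from A's basis.
theta = np.pi / 6
w = np.array([np.cos(theta), np.sin(theta)])
P_B = np.outer(w, w)

def prob(P, state):
    """p(event) = || P |psi> ||^2  (Equation 2)."""
    return np.linalg.norm(P @ state) ** 2

p_A = prob(P_A, psi)                      # p(A)
p_AB = prob(P_B @ P_A, psi)               # p(A then B) = || P_B P_A |psi> ||^2
p_B_given_A = p_AB / p_A                  # conditional probability p(B|A)
p_BA = prob(P_A @ P_B, psi)               # reversed order

# Because P_A and P_B do not commute, the order of events matters.
print(p_AB, p_BA)
```

Here p(A) = 0.36, p(B|A) = 0.75, and p(AB) = 0.27, while the reversed sequence gives a noticeably different p(BA), illustrating the order effects discussed in the empirical studies above.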

5.1. Building Projectors

This section describes a general way to construct the projectors for events in the Hilbert space, and to formally describe the conditions that produce incompatibility. In the following, |V⟩ denotes a vector in the Hilbert space, ⟨V|W⟩ denotes an inner product, |V⟩⟨V| denotes an outer product, and P† denotes the Hermitian transpose of P.

In general, a projector, denoted P, operating in an N-dimensional Hilbert space is defined by the two properties P = P† = P². By the first property, P is Hermitian, and so it can be decomposed into N orthonormal eigenvectors; by the second property, P has only two eigenvalues, which are simply 0 and 1. Define |Vj⟩, j = 1, · · · , N as the set of N orthonormal eigenvectors of P. The projector P can be expressed in terms of the eigenvectors as follows

P = Σj λj |Vj⟩⟨Vj| ,    (3)

4A more general approach uses what is called a density operator rather than a pure state vector, but to keep ideas simple, we use the latter.


where the outer product, |Vj⟩⟨Vj|, is the projector that projects into the ray spanned by eigenvector |Vj⟩, and λj = 1 if |Vj⟩ corresponds to an eigenvalue of 1, and λj = 0 if |Vj⟩ corresponds to an eigenvalue of 0. These N eigenvectors form an orthonormal basis that spans the Hilbert space. Every vector, such as |ψ⟩ ∈ H, can be expressed as a linear combination of these basis (eigen)vectors

|ψ⟩ = Σj φj |Vj⟩ , j = 1, · · · , N.    (4)
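Equations (3) and (4) can be sketched numerically. The 3-dimensional space, the randomly generated orthonormal basis, and the choice of eigenvalues below are illustrative assumptions, not the article's model:

```python
import numpy as np

# Hedged sketch of Equations (3) and (4): build a projector from orthonormal
# eigenvectors and expand a state in that basis.

N = 3
# An orthonormal basis |V_1>, |V_2>, |V_3>: columns of an orthogonal matrix.
V, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(N, N)))

# Eigenvalues: 1 for eigenvectors inside the event subspace, 0 otherwise.
lam = np.array([1.0, 1.0, 0.0])

# Equation (3): P = sum_j lambda_j |V_j><V_j|
P = sum(lam[j] * np.outer(V[:, j], V[:, j]) for j in range(N))

# A projector is Hermitian and idempotent: P = P^dagger = P^2.
assert np.allclose(P, P.T.conj())
assert np.allclose(P, P @ P)

# Equation (4): any state |psi> is a linear combination of the basis vectors,
# with coordinates phi_j = <V_j|psi>.
psi = np.array([1.0, 0.0, 0.0])
phi = V.T.conj() @ psi            # coordinates in the |V_j> basis
psi_rebuilt = V @ phi             # sum_j phi_j |V_j>
assert np.allclose(psi, psi_rebuilt)
```

The trace of P equals the sum of its eigenvalues, i.e., the dimension of the event subspace.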

If two projectors, PA, PB, share all of the same eigenvectors, then they commute. In other words, two events A, B are compatible if they are described in terms of the same basis. If the two projectors do not share all of the same eigenvectors, then they do not commute, and the events A, B are described by two different bases. Such events are incompatible and must be evaluated sequentially, because one needs to change from one basis to evaluate the first event to another basis to evaluate the second event.

Define |Vj⟩, j = 1, · · · , N as the basis used to describe event A, and define |Wj⟩, j = 1, · · · , N as the basis used to describe event B. We can change from one basis to another by a unitary transformation (a “rotation” in Hilbert space)

|Wj⟩ = U |Vj⟩ , j = 1, · · · , N,    (5)

where U is defined by U†U = I, that is, U is an isometric transformation that preserves inner products. Therefore, the projector for event B can be re-expressed in terms of the event-A basis |Vj⟩, j = 1, · · · , N as follows

PB = Σj λj |Wj⟩⟨Wj| = U ( Σj λj |Vj⟩⟨Vj| ) U† .    (6)

According to Equation (5), the unitary transformation U represents the transition from state |Wi⟩ to state |Vj⟩ by the inner product ⟨Vj|Wi⟩.

So far, we have presented a general method for building the projectors by defining a basis for the vector space and by transforming from one basis to another using a unitary transformation. The next question is how to build the unitary transformation. In general, any unitary transformation can be built from a Hermitian operator H as follows:

U = exp(−i · H).    (7)

The right-hand side is the matrix exponential of the Hermitian operator H.

In summary, the HSM program selects a Hermitian operator H for Equation (7), and then uses it to build the unitary operator U, which provides the relation between the projectors PA and PB for incompatible events. The beauty of using a vector space is that it provides an infinite number of ways to generate incompatible variables by unitary “rotation,” while remaining within the same N-dimensional space. This is how an HSM model maintains a low-dimensional representation even when there are a large number of variables.
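This recipe can be sketched numerically: pick a Hermitian H, exponentiate it to obtain U as in Equation (7), and rotate the basis for event A into the basis for event B as in Equation (5). The particular 2-dimensional H below is an illustrative assumption, not one used by the article:

```python
import numpy as np
from scipy.linalg import expm

# Hedged sketch of Equation (7): U = exp(-i H) for a Hermitian H.
# This specific H (a scaled Pauli-X matrix) is an illustrative choice.

H = np.array([[0.0, 1.0],
              [1.0, 0.0]]) * (np.pi / 8)   # Hermitian generator
U = expm(-1j * H)                           # unitary "rotation"

assert np.allclose(U.conj().T @ U, np.eye(2))   # U^dagger U = I

# Equation (5): rotate the event-A basis to obtain the event-B basis.
v = np.array([1.0, 0.0])          # |V_1>, basis vector for event A
w = U @ v                         # |W_1> = U |V_1>

P_A = np.outer(v, v.conj())
P_B = np.outer(w, w.conj())       # equals U P_A U^dagger, as in Equation (6)

# Incompatibility: the two projectors do not commute.
commutes = np.allclose(P_A @ P_B, P_B @ P_A)
print(commutes)
```

For any nonzero rotation angle the commutator is nonzero, so A and B are incompatible, yet both live in the same 2-dimensional space.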

5.2. Building the Hilbert space

This section describes how we construct a Hilbert space to represent the p variables. This construction depends on the compatibility relations between the variables. For this section, we need the Kronecker (tensor) product between two matrices, denoted P ⊗ Q.

To begin building the Hilbert space, suppose we measure a single variable, say Y1, that can produce n1 values corresponding to the mutually exclusive and exhaustive set of events (Y1 = yi), i = 1, · · · , n1. To represent these events in a Hilbert space, we partition the space into n1 orthogonal subspaces. Each subspace, such as (Y1 = yi), corresponds to a projector P(Y1 = yi). The projectors for all of the events are pairwise orthogonal, P(Y1 = yi)P(Y1 = yj) = 0 for i ≠ j, and complete, Σi P(Y1 = yi) = I (where I is the identity that projects onto the entire Hilbert space). These n1 events are all compatible, and the projectors are all commutative, because they are all orthogonal to each other. Each projector generates N1 ≥ n1 eigenvectors, and the projectors all share the same eigenvectors, but with different eigenvalues. These N1 eigenvectors provide the basis for spanning an N1-dimensional Hilbert space, HN1.
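The partition of the space by a single variable can be sketched as follows. The choice n1 = 3 and the standard-basis rays are illustrative assumptions; the checks mirror the orthogonality and completeness conditions just stated:

```python
import numpy as np

# Hedged sketch: represent one variable Y1 with n1 = 3 values by partitioning
# a 3-D Hilbert space into orthogonal rays (standard basis, for illustration).

n1 = 3
I = np.eye(n1)
# One projector per value (Y1 = y_i): here each projects onto a basis ray.
projectors = [np.outer(I[:, i], I[:, i]) for i in range(n1)]

# Pairwise orthogonal: P(Y1 = y_i) P(Y1 = y_j) = 0 for i != j ...
for i in range(n1):
    for j in range(n1):
        if i != j:
            assert np.allclose(projectors[i] @ projectors[j], 0.0)

# ... and complete: sum_i P(Y1 = y_i) = I.
assert np.allclose(sum(projectors), I)

# Because the projectors share the same eigenvectors, they all commute:
# the events (Y1 = y_i) are mutually compatible.
assert np.allclose(projectors[0] @ projectors[1], projectors[1] @ projectors[0])
```

Completeness guarantees that, for any unit state vector, the probabilities of the n1 mutually exclusive outcomes sum to 1.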

Nothing requires the number of eigenvectors of a projector to be equal to the number of observed values of the variable used by the researcher. For example, it would be an arbitrary decision by a researcher to use only 2 values (yes,no) for attribute A, but 3 values for attribute B, and 4 values for attribute C. The researcher could have used 5 rating values for all three variables. So it would not make sense to assume that variable A is represented in
