
suai.ru/our-contacts

quantum machine learning

172 D. Gkoumas et al.

Concerning Eq. (10), the probability that both documents are relevant with respect to the text-based modality (i.e., the state $|R_t R_t\rangle$) equals $|a_1 a_2|^2$. If we measure only the relevance of the first document with respect to the text-based modality, the probability of finding it relevant is again $|a_1 a_2|^2$. After this measurement, the probability for the second document to be relevant is equal to 1. Consequently, we can simultaneously predict the probability of the second document to be relevant concerning the image-based modality, which is equal to $\cos^2\theta$, where $\theta$ is the angle between the image-based and text-based bases (Fig. 10). Similar outcomes result once we measure the probability for both documents to be irrelevant (i.e., the state $|\bar{R}_t \bar{R}_t\rangle$ in Eq. (10)), one relevant and the other irrelevant (i.e., the state $|R_t \bar{R}_t\rangle$ in Eq. (11)), or one irrelevant and the other relevant (i.e., the state $|\bar{R}_t R_t\rangle$ in Eq. (11)).

In Sect. 3, we described the CHSH inequality in terms of four observables, each of which takes two binary values $\pm 1$ and thus gives two mutually exclusive outcomes. In a similar manner, in our case, for the document $D_{M_1}$ we have the variables $R_{t1}$ and $R_{i1}$, which take values $+1, -1$, where $R_{t1} = +1$ corresponds to the basis vector $|R_{t1}\rangle$ and $R_{t1} = -1$ corresponds to its orthogonal basis vector $|\bar{R}_{t1}\rangle$. Similarly, $R_{i1} = +1$ corresponds to the basis vector $|R_{i1}\rangle$ and $R_{i1} = -1$ corresponds to its orthogonal basis vector $|\bar{R}_{i1}\rangle$. For the document $D_{M_2}$ we have the variables $R_{t2}$ and $R_{i2}$, which take values $+1, -1$, where $R_{t2} = +1$ corresponds to the basis vector $|R_{t2}\rangle$ and $R_{t2} = -1$ corresponds to its orthogonal basis vector $|\bar{R}_{t2}\rangle$. Similarly, $R_{i2} = +1$ corresponds to the basis vector $|R_{i2}\rangle$ and $R_{i2} = -1$ corresponds to its orthogonal basis vector $|\bar{R}_{i2}\rangle$. Then Eq. (5) results in

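$$\left| \langle R_{t1} R_{t2} \rangle + \langle R_{t2} R_{i1} \rangle + \langle R_{t1} R_{i2} \rangle - \langle R_{i1} R_{i2} \rangle \right| \le 2, \tag{12}$$

where

$$\begin{aligned}
\langle R_{t1} R_{t2} \rangle &= \big((+1)p(R_{t1}) + (-1)p(\bar{R}_{t1})\big)\big((+1)p(R_{t2}) + (-1)p(\bar{R}_{t2})\big) \\
&= p(R_{t1})p(R_{t2}) - p(R_{t1})p(\bar{R}_{t2}) - p(\bar{R}_{t1})p(R_{t2}) + p(\bar{R}_{t1})p(\bar{R}_{t2}), \\
\langle R_{t2} R_{i1} \rangle &= \big((+1)p(R_{t2}) + (-1)p(\bar{R}_{t2})\big)\big((+1)p(R_{i1}) + (-1)p(\bar{R}_{i1})\big) \\
&= p(R_{t2})p(R_{i1}) - p(R_{t2})p(\bar{R}_{i1}) - p(\bar{R}_{t2})p(R_{i1}) + p(\bar{R}_{t2})p(\bar{R}_{i1}), \\
\langle R_{t1} R_{i2} \rangle &= \big((+1)p(R_{t1}) + (-1)p(\bar{R}_{t1})\big)\big((+1)p(R_{i2}) + (-1)p(\bar{R}_{i2})\big) \\
&= p(R_{t1})p(R_{i2}) - p(R_{t1})p(\bar{R}_{i2}) - p(\bar{R}_{t1})p(R_{i2}) + p(\bar{R}_{t1})p(\bar{R}_{i2}), \\
\langle R_{i1} R_{i2} \rangle &= \big((+1)p(R_{i1}) + (-1)p(\bar{R}_{i1})\big)\big((+1)p(R_{i2}) + (-1)p(\bar{R}_{i2})\big) \\
&= p(R_{i1})p(R_{i2}) - p(R_{i1})p(\bar{R}_{i2}) - p(\bar{R}_{i1})p(R_{i2}) + p(\bar{R}_{i1})p(\bar{R}_{i2}).
\end{aligned}$$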

The above products of probabilities are defined as joint probabilities between two independent outcomes. A violation of Eq. (12) is a sign of entanglement, in which case the pair of documents may be in one of the aforementioned Bell states (Eqs. (10), (11)), as described above.
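As a worked simplification, which is not part of the original derivation, each factor can be reduced to a single probability. This assumes that the probability of irrelevance is the complement of the probability of relevance, $p(\bar{R}) = 1 - p(R)$, as stated in the experimental procedure of Sect. 5.3. The numerical values below are illustrative only.

```latex
% Single-observable expectation under the assumption p(\bar{R}) = 1 - p(R)
\langle R \rangle = (+1)\,p(R) + (-1)\,p(\bar{R}) = 2\,p(R) - 1

% Each product term of Eq. (12) then factorises, e.g.
\langle R_{t1} R_{t2} \rangle = \big(2\,p(R_{t1}) - 1\big)\big(2\,p(R_{t2}) - 1\big)

% Illustrative values: p(R_{t1}) = 0.9, p(R_{t2}) = 0.6
\langle R_{t1} R_{t2} \rangle = (0.8)(0.2) = 0.16

% Check against the four-term expansion:
0.9 \cdot 0.6 - 0.9 \cdot 0.4 - 0.1 \cdot 0.6 + 0.1 \cdot 0.4
  = 0.54 - 0.36 - 0.06 + 0.04 = 0.16
```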


Investigating Non-classical Correlations


5 Experiment Settings

5.1 Dataset

The proposed model is tested on the ImageCLEF2007 data collection [12], the purpose of which is to investigate the effectiveness of combining image and text for retrieval tasks. Out of 60 test queries, we randomly selected 30, together with their ground truth data. Each query, describing a user information need, consists of three sample images and a text description, whereas each document consists of an image and a text description. For every query, we created a subset of 300 documents, which includes all the relevant documents for the query, with the rest being irrelevant documents. The dataset is used for investigating both Bell states (Eqs. (10) and (11)). The number of relevant documents per query ranges from 11 to 98.

5.2 Image and Text Representations - Mono-modal Baselines

The late fusion process is based on mono-modal retrieval scores. For the visual information, features are extracted using the representations learned by the VGG16 model [18], with weights pre-trained on ImageNet, resulting in a feature vector of 2048 floating-point values for each image. After feature extraction, we compute the similarity scores between a submitted visual query and the images in the dataset using the Cosine function. For textual information, a query expansion approach is applied, extending the query with the ten most frequent terms according to the ground truth text-based documents. This corresponds to a simulated explicit relevance feedback scenario. Then, the TF-IDF vector representation is used for calculating the text-based Cosine similarity between a query and the text documents. Cosine similarity is typically used in positive space, where the Cosine similarity score is bounded in [0, 1]. In our case, we use the Cosine similarity score to approximate the probability of relevance.
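The text-based scoring can be illustrated with a minimal sketch. It assumes whitespace tokenisation, raw term counts, and a smoothed IDF, none of which are specified in the paper; the function names `tf_idf_vectors` and `cosine` and the toy texts are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per tokenised text.

    Assumption: raw term counts and a smoothed IDF,
    ln((1 + N) / (1 + df)) + 1. The paper does not state its variant.
    """
    n_texts = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_texts) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all-zero."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: one query followed by two document descriptions.
# For simplicity the IDF is computed over the query and documents together.
texts = [
    "sunset over a mountain lake".split(),
    "mountain lake at sunset".split(),
    "city street at night".split(),
]
query_vec, *doc_vecs = tf_idf_vectors(texts)

# All weights are non-negative, so every score lies in [0, 1] and can
# stand in for a probability of relevance, as described in the text.
p_relevant = [cosine(query_vec, d) for d in doc_vecs]
p_irrelevant = [1.0 - p for p in p_relevant]
```

The same `cosine` function would apply to dense image feature vectors if they were stored as index-to-value dictionaries; that reuse is a design convenience of this sketch, not something the paper describes.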

5.3 Experimental Procedure

In the first step, for both the text-based and image-based modalities, the Cosine function is employed to approximate the probability of relevance according to a multi-modal query (Fig. 5). Then, we create pairs of relevant documents. In the next step, expectation values are computed from the probabilities of relevance according to the process described in Sect. 4. The probability for a document to be relevant concerning a modality is equal to the result of the Cosine function. Consequently, the probability for a document to be irrelevant concerning the same modality equals 1 minus the result of the Cosine function. Then, we substitute the calculated expectation values into the CHSH inequality and check for any violation. For each query, we calculate the percentage of documents that show a violation of the CHSH inequality. At the end of the experiment, we calculate the percentage of queries showing a violation.
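The procedure above can be sketched for a single query as follows. This is an illustration under stated assumptions rather than the authors' implementation: each document is represented by a hypothetical `(p_text, p_image)` tuple of Cosine scores, the percentage is taken over document pairs, and the names `expectation`, `chsh_value`, and `violation_rate` are made up for this example.

```python
from itertools import combinations


def expectation(p_relevant):
    """Expectation of a +/-1 observable: (+1) * p + (-1) * (1 - p)."""
    return 2.0 * p_relevant - 1.0


def chsh_value(doc1, doc2):
    """Left-hand side of Eq. (12) for two documents.

    Each document is a (p_text, p_image) tuple. Joint expectations are
    products of single-document expectations, i.e. the outcomes are
    treated as independent, as in the paper's setting.
    """
    t1, i1 = (expectation(p) for p in doc1)
    t2, i2 = (expectation(p) for p in doc2)
    return abs(t1 * t2 + t2 * i1 + t1 * i2 - i1 * i2)


def violation_rate(docs, bound=2.0):
    """Percentage of document pairs whose CHSH value exceeds the bound."""
    pairs = list(combinations(docs, 2))
    if not pairs:
        return 0.0
    violating = sum(chsh_value(a, b) > bound for a, b in pairs)
    return 100.0 * violating / len(pairs)


# Hypothetical relevance probabilities for three documents of one query.
docs = [(0.9, 0.8), (0.7, 0.2), (0.1, 0.95)]
rate = violation_rate(docs)
```

Repeating `violation_rate` over all queries and counting those with a non-zero rate would give the per-query percentage mentioned at the end of the procedure.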


6 Results and Discussion

The experimental results did not meet our expectations, since we did not observe any violation of Bell’s inequality. This implies that, in the context of our experimental setting, non-classical correlations between pairs of documents may not exist, and also that the hypothesis of rotation invariance does not hold. Thus, the image-based and text-based bases are not equal Bell states as defined in Eq. (4).

This result may be related to our experimental setting, in which the outcomes of the observables are initially independent. For instance, the probability of the text-based relevance of the first document does not affect the probability of the text-based relevance of the second document. Thus, the joint probability of relevance is calculated as a product of individual relevance probabilities. However, in [1, 2, 6, 7] the Bell inequality has been violated. In those experiments, the users are asked to report their judgments on composite states. Hence the joint probabilities can be directly estimated from the judgments. Thus, the expectation values are calculated under an implicit assumption that the outcomes can be incompatible. This assumption may result in the “conjunction fallacy” [20], which violates the monotonicity law of probability by overestimating the joint probability, thus violating the Bell inequality.
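The role of the independence assumption can be made explicit with a short bound. The following is a standard observation added for illustration, not a derivation given by the authors; the shorthand symbols $a, a', b, b'$ are introduced here and each lies in $[-1, 1]$ because it is the expectation of a $\pm 1$ observable.

```latex
% Shorthand (introduced for this sketch):
% a = <R_{t1}>,  a' = <R_{i1}>,  b = <R_{t2}>,  b' = <R_{i2}>

% With product-form joint expectations, the left-hand side of Eq. (12) is
S = ab + ba' + ab' - a'b' = a\,(b + b') + a'\,(b - b')

% Since |a| \le 1 and |a'| \le 1,
|S| \le |b + b'| + |b - b'| = 2\max(|b|, |b'|) \le 2
```

Under this factorisation the CHSH bound therefore cannot be exceeded for any choice of relevance probabilities, which is consistent with the absence of violations in the experiment.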

Our result may also be due to the dataset used to conduct the experiment. In ImageCLEF2007, the outcomes, i.e., the text-based and image-based relevance, are independent; therefore we cannot make the opposite assumption. Thus, we may need another dataset containing relevance judgments for pairs of documents. Additionally, we may search for a dataset where Bell states (i.e., Eq. (2)) preexist, such that an interaction between two documents cannot be validly decomposed and modeled as an interaction of separate documents. Then, the Bell inequality may be violated for those cases.

Finally, we experimentally investigated the violation of the Bell inequality in a small-scale experiment. In the current experiment, for each query, we focused on a small number of relevant and irrelevant multimodal documents, searching for non-classical correlations between two documents. However, it is worth conducting a large-scale experiment as well, looking also at a general first-round retrieval process, or even at a relevance feedback scenario. Moreover, it would be interesting to investigate the existence of non-classical correlations among many documents. In that case, the CHSH inequality should be generalized for systems with multiple settings or bases [10].

7 Conclusion

In this paper, we have investigated non-classical correlations between pairs of decision fused multimodal documents. We examined the existence of such correlations through the violation of the CHSH inequality. In this case, a violation implies that, by measuring a mono-modal decision in one document, we could instantaneously predict with certainty a mono-modal decision in the other system, acquiring information about how to fuse local decisions. Unfortunately, we did not find any violation of the Bell inequality. This result may be related to our assumption that the outcomes of the observables are initially independent. The result may also be due to the dataset. On the one hand, there is no real user involved in relevance judgment; on the other hand, there do not exist initial Bell states between two multimodal documents. Nevertheless, the experimental results and discussions may provide theoretical and empirical insights and inspiration for future development in this direction.

Acknowledgement. This work is funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 721321.

A Appendix

The expectation of a random variable $X$ that takes the values $\{+, -\}$ according to the probability distribution $P_X(+), P_X(-)$ is defined as

$$\langle X \rangle = (+)P_X(+) + (-)P_X(-).$$

For two random variables $X$, $\Psi$ that take the values $\{+, -\}$ according to the probability distributions $P_X(+), P_X(-)$ and $P_\Psi(+), P_\Psi(-)$ respectively, the expectation value is defined as the product, resulting in

$$\begin{aligned}
\langle X, \Psi \rangle &= \big((+)P_X(+) + (-)P_X(-)\big)\big((+)P_\Psi(+) + (-)P_\Psi(-)\big) \\
&= (+)(+)\,P_X(+)P_\Psi(+) + (+)(-)\,P_X(+)P_\Psi(-) \\
&\quad + (-)(+)\,P_X(-)P_\Psi(+) + (-)(-)\,P_X(-)P_\Psi(-).
\end{aligned}$$

References

1. Aerts, D., Sozzo, S.: Quantum structure in cognition: why and how concepts are entangled. In: Song, D., Melucci, M., Frommholz, I., Zhang, P., Wang, L., Arafat, S. (eds.) QI 2011. LNCS, vol. 7052, pp. 116–127. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24971-6_12

2. Aerts, D., Sozzo, S.: Quantum entanglement in concept combinations. Int. J. Theor. Phys. 53(10), 3587–3603 (2014)

3. Aspect, A., Grangier, P., Roger, G.: Experimental realization of Einstein-Podolsky-Rosen-Bohm Gedankenexperiment: a new violation of Bell’s inequalities. Phys. Rev. Lett. 49(2), 91 (1982)

4. Atrey, P.K., Hossain, M.A., El Saddik, A., Kankanhalli, M.S.: Multimodal fusion for multimedia analysis: a survey. Multimedia Syst. 16(6), 345–379 (2010)

5. Baltrušaitis, T., Ahuja, C., Morency, L.P.: Multimodal machine learning: a survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 41, 423–443 (2018)

6. Bruza, P.D., Kitto, K., Ramm, B., Sitbon, L., Song, D., Blomberg, S.: Quantum-like non-separability of concept combinations, emergent associates and abduction. Logic J. IGPL 20(2), 445–457 (2011)


7. Bruza, P.D., Kitto, K., Ramm, B.J., Sitbon, L.: A probabilistic framework for analysing the compositionality of conceptual combinations. J. Math. Psychol. 67, 26–38 (2015)

8. Cirel’son, B.S.: Quantum generalizations of Bell’s inequality. Lett. Math. Phys. 4(2), 93–100 (1980)

9. Clauser, J.F., Horne, M.A., Shimony, A., Holt, R.A.: Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett. 23(15), 880 (1969)

10. Gisin, N.: Bell inequality for arbitrary many settings of the analyzers. Phys. Lett. A 260(1–2), 1–3 (1999)

11. Gleason, A.M.: Measures on the closed subspaces of a Hilbert space. J. Math. Mech. 6, 885–893 (1957)

12. Grubinger, M., Clough, P., Hanbury, A., Müller, H.: Overview of the ImageCLEFphoto 2007 photographic retrieval task. In: Peters, C., et al. (eds.) CLEF 2007. LNCS, vol. 5152, pp. 433–444. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85760-0_57

13. Hou, Y., Song, D.: Characterizing pure high-order entanglements in lexical semantic spaces via information geometry. In: Bruza, P., Sofge, D., Lawless, W., van Rijsbergen, K., Klusch, M. (eds.) QI 2009. LNCS (LNAI), vol. 5494, pp. 237–250. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00834-4_20

14. Hou, Y., Zhao, X., Song, D., Li, W.: Mining pure high-order word associations via information geometry for information retrieval. ACM Trans. Inf. Syst. (TOIS) 31(3), 12 (2013)

15. Melucci, M.: Introduction to Information Retrieval and Quantum Mechanics, pp. 156–158, 176–181, 212–213, 217–221. Springer, Berlin (2015). https://doi.org/10.1007/978-3-662-48313-8

16. Nielsen, M.A., Chuang, I.: Quantum computation and quantum information (2002)

17. Pathak, A.: Elements of Quantum Computation and Quantum Communication, pp. 92–98. Taylor & Francis, Abingdon (2013)

18. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

19. Stenger, V.J.: Timeless Reality: Symmetry, Simplicity and Multiple Universes. (Chap. 12)

20. Tversky, A., Kahneman, D.: Extensional versus intuitive reasoning: the conjunction fallacy in probability judgment. Psychol. Rev. 90(4), 293 (1983)

21. Van Rijsbergen, C.J.: The Geometry of Information Retrieval. Cambridge University Press, Cambridge (2004)

22. Veloz, T., Zhao, X., Aerts, D.: Measuring conceptual entanglement in collections of documents. In: Atmanspacher, H., Haven, E., Kitto, K., Raine, D. (eds.) QI 2013. LNCS, vol. 8369, pp. 134–146. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54943-4_12