
E. Di Buccio and M. Melucci

experimentation, the theoretical impact has been addressed since the proposal of the Geometry of IR [26]. In this chapter we address both aspects.

With regard to the practical impact of the QTL, some types of users who are experts in their own application domains, such as journalists and scholars, may be willing to use meet and join to build complex queries and search a document collection by themes, rather than issuing simple, short queries to find specific resources. Such a user may meet and join subspaces in the context of vector spaces, instead of intersecting and complementing subsets. Although meet and join are well-known operators of quantum theory, we do not argue that documents and queries are quantum objects like subatomic particles. Instead, we are investigating whether the retrieval process involving expert users may exhibit some quantum-like behavior.
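To make the contrast between subspace operators and set operators concrete, the following is a minimal sketch rather than the implementation used in this chapter. It assumes that a theme is given as a list of real vectors spanning a subspace of R^dim; the function names (`orthonormalize`, `join`, `complement`, `meet`) and the tolerance `TOL` are illustrative choices. Join is computed as the span of the union of the two bases, and meet is obtained through orthogonal complements, using the identity that the meet of A and B is the complement of the join of their complements.

```python
from math import sqrt

TOL = 1e-9  # assumed numerical tolerance for treating a vector as zero


def orthonormalize(vectors, seed=()):
    """Gram-Schmidt: orthonormal vectors spanning `vectors`, orthogonal to `seed`.

    `seed` must already be an orthonormal list; only the new vectors are returned.
    """
    basis = list(seed)
    new = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(x * y for x, y in zip(w, b))
            w = [x - dot * y for x, y in zip(w, b)]
        norm = sqrt(sum(x * x for x in w))
        if norm > TOL:  # keep only directions not already covered
            w = [x / norm for x in w]
            basis.append(w)
            new.append(w)
    return new


def join(a, b):
    """Smallest subspace containing both span(a) and span(b)."""
    return orthonormalize(list(a) + list(b))


def complement(a, dim):
    """Orthogonal complement of span(a) in R^dim."""
    onb = orthonormalize(a)
    standard = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    return orthonormalize(standard, seed=onb)


def meet(a, b, dim):
    """Largest subspace contained in both span(a) and span(b)."""
    return complement(join(complement(a, dim), complement(b, dim)), dim)


# Two hypothetical one-dimensional themes in R^3.
theme_a = [[1.0, 0.0, 0.0]]
theme_b = [[1.0, 1.0, 0.0]]

print(len(join(theme_a, theme_b)))     # dimension of the join: 2
print(len(meet(theme_a, theme_b, 3)))  # dimension of the meet: 0
```

In this example the join is a plane that contains vectors such as (0, 1, 0), which belong to neither theme, whereas the union of two sets never contains elements outside both sets. This is the sense in which joining subspaces differs from combining subsets.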

From the theoretical perspective, there is a more profound reason for replacing sets with spaces. In fact, an initial replacement took place with the advent of the VSM, which views documents as points of a vector space and not merely as elements of Boolean sets. The main motivation for moving from Boolean sets to vector spaces was the need for a retrieval function that provides a ranking. The inner product of the VSM between document vectors and query vectors provides such a ranking because it sums up the weights of the memberships of a document in the sets to which a query belongs.
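The ranking role of the inner product can be illustrated with a small sketch. It assumes that documents and queries are sparse term-weight dictionaries; the document identifiers, terms, and weights below are hypothetical, and no particular weighting scheme is implied.

```python
def inner_product(doc, query):
    """Sum, over the terms of the document, of document weight times query weight."""
    return sum(weight * query.get(term, 0.0) for term, weight in doc.items())


def rank(docs, query):
    """Return (doc_id, score) pairs sorted by decreasing inner product."""
    scores = {doc_id: inner_product(vec, query) for doc_id, vec in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical term weights for three documents and one query.
docs = {
    "d1": {"quantum": 0.8, "retrieval": 0.3},
    "d2": {"retrieval": 0.9},
    "d3": {"boolean": 0.7},
}
query = {"quantum": 1.0, "retrieval": 0.5}

for doc_id, score in rank(docs, query):
    print(doc_id, round(score, 2))
```

A Boolean conjunction of the two query terms would return only d1, as an unordered set. The inner product instead orders all three documents, with d1 first, d2 second, and d3 last with a score of zero.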

Future work will focus on media other than text, terms, and words and on modalities other than querying. Indeed, the use of terms is bound to the easy recognition of terms in documents and to the user's intuition that a term corresponds to the set of documents about the concept represented by the term. When terms are combined by Boolean operators, a term has a semantics, and the results obtained by the operators, which are document sets, are an extensional representation of a concept. A set-based approach to retrieval with image, video, sound, or multimedia documents is less natural than with textual documents. The content descriptors of image, video, or sound, such as pixels, shapes, or chroma, cannot be described by terms, and the assumption that sets and set operators can express informative content does not seem as intuitive as it is for text. Similarly, multimodality fits less naturally with a set-based retrieval model. When click-through and user interaction data are collected, sets are not the most obvious representation of informative content. The reason is that the language of non-textual or multimodal traits is likely to describe individuals with a logic other than classical logic.

To experiment with different modalities, some experiments are underway that use the subtopics of the TREC 2010 Web Track Test Collection as themes.⁴ Instead of implementing themes using index terms, we will implement themes using subtopics, which may be viewed as aspects of the main topic. The experiments will simulate a more interactive scenario than the one simulated in this chapter. A user will submit a query (i.e., the main topic) and the retrieval system will extract a set of pertinent themes. We will measure the effectiveness of

⁴ http://trec.nist.gov/data/web/10/wt2010-topics.xml.


Searching for Information with Meet and Join Operators


the ranked list obtained from a representation based only on the query terms, of the list obtained by using all the distinct terms associated with the extracted themes, and of the list obtained by using the themes built through join and meet. Further experiments will be carried out on the Dynamic Domain Track Test Collections. The goal of the Dynamic Domain Track is to "support research in dynamic, exploratory search of complex information domains."⁵ The task is highly interactive, and the interaction with the user is simulated through the Jig, which returns explicit judgments on the top five retrieved documents along with relevant passages in those documents. We will investigate the use of relevant passages as a source for implementing themes.

Acknowledgements This chapter is part of a project that has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 721321.

References

1. Aerts, D., Melucci, M., de Bianchi, M. S., Sozzo, S., & Veloz, T. (2018). Special issue: Quantum structures in computer science: Language, semantics, retrieval. Theoretical Computer Science, 752, 1–4.

2. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

3. Broder, A. Z., Carmel, D., Herscovici, M., Soffer, A., & Zien, J. (2003). Efficient query evaluation using a two-level retrieval process. In Proceedings of the Twelfth International Conference on Information and Knowledge Management (pp. 426–434). New York, NY: ACM. http://doi.acm.org/10.1145/956863.956944.

4. Caputo, A., Piwowarski, B., & Lalmas, M. (2011). A query algebra for quantum information retrieval. In Proceedings of the IIR Workshop. http://ceur-ws.org/Vol-704/19.pdf.

5. Carpineto, C., & Romano, G. (2012). A survey of automatic query expansion in information retrieval. ACM Computing Surveys, 44(1), 1–50. http://doi.acm.org/10.1145/2071389.2071390.

6. Cooper, W. (1995). Some inconsistencies and misidentified modeling assumptions in probabilistic information retrieval. ACM Transactions on Information Systems, 13(1), 100–111.

7. Croft, W., & Lafferty, J. (Eds.). (2003). Language modeling for information retrieval. Berlin: Springer.

8. Croft, W., Metzler, D., & Strohman, T. (2009). Search engines: Information retrieval in practice. Boston: Addison Wesley. http://ciir.cs.umass.edu/downloads/SEIRiP.pdf.

9. Deerwester, S., Dumais, S., Furnas, G., Landauer, T., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science and Technology, 41(6), 391–407.

10. Frommholz, I., Larsen, B., Piwowarski, B., Lalmas, M., Ingwersen, P., & van Rijsbergen, K. (2010). Supporting polyrepresentation in a quantum-inspired geometrical retrieval framework. In Proceedings of IIiX (pp. 115–124).

11. Halmos, P. (1987). Finite-dimensional vector spaces. Undergraduate texts in mathematics. New York, NY: Springer.

12. Halmos, P. R. (1960). Naïve set theory. New York, NY: D. Van Nostrand Company, Inc.

⁵ http://trec-dd.org.


13. Hawking, D., Moffat, A., & Trotman, A. (2017). Efficiency in information retrieval: Introduction to special issue. Information Retrieval Journal, 20(3), 169–171. http://dx.doi.org/10.1007/s10791-017-9309-7.

14. Hughes, R. (1989). The structure and interpretation of quantum mechanics. Cambridge: Harvard University Press.

15. Ingwersen, P. (1992). Information retrieval interaction. London: Taylor Graham Publishing.

16. Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401, 788–791.

17. Melucci, M. (2015). Introduction to information retrieval and quantum mechanics. Berlin: Springer.

18. Melucci, M., & van Rijsbergen, C. J. (2011). Quantum mechanics and information retrieval (Chap. 6, pp. 125–155). Berlin: Springer.

19. Piwowarski, B., Frommholz, I., Lalmas, M., & van Rijsbergen, C. J. (2010). What can quantum theory bring to information retrieval. In Proceedings of CIKM (pp. 59–68). New York, NY: ACM.

20. Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, 3(4), 333–389.

21. Salton, G., & Buckley, C. (1988). Term weighting approaches in automatic text retrieval. Information Processing & Management, 24(5), 513–523.

22. Schmitt, I. (2008). QQL: A DB&IR query language. The VLDB Journal, 17(1), 39–56. http://dx.doi.org/10.1007/s00778-007-0070-1.

23. Sordoni, A., He, J., & Nie, J. Y. (2013). Modeling latent topic interactions using quantum interference for information retrieval. In Proceedings of CIKM (pp. 1197–1200). http://dl.acm.org/citation.cfm?doid=2505515.2507854.

24. Sordoni, A., Nie, J. Y., & Bengio, Y. (2013). Modeling term dependencies with quantum language models for IR. In Proceedings of SIGIR (pp. 653–662).

25. Turtle, H., & Flood, J. (1995). Query evaluation: Strategies and optimizations. Information Processing & Management, 31(6), 831–850. http://dx.doi.org/10.1016/0306-4573(95)00020-H.

26. Van Rijsbergen, C. J. (2004). The geometry of information retrieval. Cambridge: Cambridge University Press.

27. Widdows, D. (2004). Geometry and meaning. Stanford, CA: CSLI Publications.

28. Widdows, D., & Peters, S. (2003). Word vectors and quantum logic: Experiments with negation and disjunction. In R. T. Oehrle & J. Rogers (Eds.), Proceedings of the Mathematics of Language Conference (pp. 141–154).

29. Zellhöfer, D., Frommholz, I., Schmitt, I., Lalmas, M., & van Rijsbergen, K. (2011). Towards quantum-based DB+IR processing based on the principle of polyrepresentation. In Proceedings of ECIR (pp. 729–732). Berlin: Springer. http://dl.acm.org/citation.cfm?id=1996889.1996989.


Index

A

Akaike information criterion, 45, 47
Arithmetic formula on operands, 141
Aspect model, 101

B

Bag-of-words approach, 36
Bayesian information criterion, 45, 47
Bell test
  inequality and interpretation, 36–37
  semantic retrieving, 37–38
Bell test parameter vs. HAL window size, 39
Best Match N. 25 (BM25) extension, 146
BiLSTM-CRF architecture, 91
Binary Independence Retrieval (BIR) model, 146
Bloch representation of quantum mechanics, 26
Borel σ-algebra, 53
Brussels quantum approach, see Double-slit experiment

C

Car components
  atomic conditions, 116, 117
  car management, 116
  properties, 116
Classical (Boolean) logic, 36
Classical probability, see Kolmogorov's axiomatics
Clauser–Horne–Shimony–Holt (CHSH) type, 37
Cognitive semantic information retrieval, see Bell test
Cognitivistic/conceptualistic interpretation, 12–14
Collobert and Weston (C&W) approach, 87
Commuting quantum query language (CQQL)
  database condition, 135
  information retrieval, 134
  proximity condition, 135
  QM mathematics, 134
  set of commutative conditions, 135–137
  vector spaces, query processing, 134
Compatible and incompatible observables
  Born's rule, joint measurements, 62
  commutativity–noncommutativity, 62
  defined, 61–62
  projection postulate
    Hermitian operator, 63
    Lüders, 63–64
    von Neumann, 63
Complex-valued embedding, 100
Conceptual entities
  printed or printable webpages, 2
  QWeb composite entity, 16
Context effects, 21, 23, 24
Contextuality
  CHSH inequality, 37
  compatibility of different queries, 36
  user and smart information system, 36
Contextualized word embedding, 99
Copenhagen interpretation, 64–65

© Springer Nature Switzerland AG 2019


D. Aerts et al. (eds.), Quantum-Like Models for Information Retrieval and Decision-Making, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-25913-6


D

Database query language
  Boolean model, information retrieval, 130
  user search and text retrieval search, 129
  See also Weighting formula
Data type construction, 121–123
Decision-making
  classical (Boolean) logic violations, 36
  psychological aspects, 36
Degree of purity, 59
Density operators, 59
Diagonal values, 119
Distributional hypothesis, 86–87
Documental entities, see Double-slit experiment; Information retrieval (IR)
Document-level representation, 92
Document ranking, 156–158
Double-slit experiment
  Born rule, 10
  classical, 7
  classical probabilistic average, 12
  cognitivistic/conceptualistic interpretation, 12–14
  conceptual and classical logical thoughts, 21
  configurations, 6
  conjunction and disjunction, 23
  descriptions of, 6
  detection screen, 8
  Dirac's notation, 9
  effective Hilbert space, 10
  experimental probability, 8
  "first sector" modeling, 23
  interference contribution, 7
  paradigmatic projection operators, 23
  probabilities, 25
  projection operator, 24
  properties, 11
  quantum, 7, 8
  quantum field theory, 22
  real functions, 9
  superposition principle, 11
  Venn-diagram representation, 21
  weighted average, 25
  written documents, 6
Downstream task, 96–97

E

Elementary data types
  defined, 117
  finite domains, 118
  non-orthogonal data type, 117, 119–121
  orthogonal data type, 117, 118
Ensemble interpretation, 64
Entangled/non-separable state, 37

F

Fagin's approach, 140
Formula of total probability (FTP), 55
  Hermitian matrices, 67
  incompatible observables, 67
  quantum conditional probability, 68
  Växjö interpretation, 68–70
Fuzzy logic theory
  score function, 132–133
  t-conorm function, 132
  t-norm function, 132
  weighted score function, 133–134

G

Gaussian embedding, 100

H

Hilbert space multi-dimensional (HSM) modeling
  choice probabilities, 47
  compatibility relations, 45
  computer programs, 48
  contingency data tables, 42
  description, 41
  dimension determination, 45
  four-way joint distribution, 42
  initial state, defined, 46
  model parameters and test models, 47
  projectors, 46–47
  See also Quantum probability theory
Hyperbolic embedding, 100
Hyperspace analogue to language (HAL) algorithm, 37–38

I

Information retrieval (IR)
  Boolean model, 149
  click-through activity and natural language phrases, 145
  cognitive experiments with human participants, 2
  CQQL, 134
  decision making, 83
  description of, 2, 3
  deterministic processes, 3
  physical and conceptual entities, 2
  quantumness, 83