suai.ru/our-contacts |
quantum machine learning |
E. Di Buccio and M. Melucci |
experimentation, the theoretical impact has been addressed since the proposal of the Geometry of IR [26]. In this chapter we address both aspects.
With regard to the practical impact of the QTL, some types of users who are experts in their own application domains, such as journalists and scholars, may be willing to use meet and join to build complex queries and search a document collection by theme, rather than issuing simple, short queries to find specific resources. Such a user may meet and join subspaces of a vector space instead of intersecting and complementing subsets. Although meet and join are well-known operators of quantum theory, we do not argue that documents and queries are quantum objects like subatomic particles. Instead, we are investigating whether the retrieval process involving expert users may exhibit some quantum-like behavior.
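To illustrate how subspace operators differ from set operators, the following minimal sketch computes the join (the smallest subspace containing both operands) and the meet (the largest subspace contained in both) of two subspaces of R^n, each given as a list of spanning vectors. The function names, the tolerance `TOL`, and the toy three-dimensional themes are assumptions made for this example; it is not the implementation used in this chapter. The meet is obtained here through the orthocomplement identity, that is, the meet of A and B is the complement of the join of their complements.

```python
from typing import List

Vector = List[float]
TOL = 1e-9  # assumed tolerance below which a vector is treated as zero


def dot(u: Vector, v: Vector) -> float:
    """Standard inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors: List[Vector]) -> List[Vector]:
    """Modified Gram-Schmidt: orthonormal basis of span(vectors)."""
    basis: List[Vector] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > TOL:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def join(a: List[Vector], b: List[Vector]) -> List[Vector]:
    """Basis of the smallest subspace containing span(a) and span(b)."""
    return orthonormalize(a + b)


def complement(a: List[Vector], dim: int) -> List[Vector]:
    """Basis of the orthogonal complement of span(a) in R^dim."""
    basis = orthonormalize(a)
    standard = [[1.0 if i == j else 0.0 for j in range(dim)]
                for i in range(dim)]
    # Extend the basis to all of R^dim; the added vectors span the complement.
    return orthonormalize(basis + standard)[len(basis):]


def meet(a: List[Vector], b: List[Vector], dim: int) -> List[Vector]:
    """Basis of span(a) intersected with span(b), via the complement identity."""
    return complement(join(complement(a, dim), complement(b, dim)), dim)


# Hypothetical themes in a three-term vocabulary.
theme_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # spanned by terms 1 and 2
theme_b = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # spanned by terms 2 and 3

joined = join(theme_a, theme_b)
met = meet(theme_a, theme_b, dim=3)
print(len(joined), len(met))
```

In this toy case the join has dimension 3 and the meet has dimension 1 (the shared second axis). Unlike a set union, the join of two subspaces also contains every linear combination of their vectors.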
From the theoretical perspective, there is a more profound reason for replacing sets with spaces. An initial replacement actually took place with the advent of the VSM, which views documents as points of a vector space and not as mere elements of Boolean sets. The main motivation for moving from Boolean sets to vector spaces was the need for a retrieval function that provides a ranking. The inner product of the VSM between document vectors and query vectors provides such a ranking because it sums the weights of a document’s memberships in the sets to which a query belongs.
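The ranking role of the inner product can be sketched in a few lines. In this hedged example, documents and the query are sparse term-weight dictionaries; the term weights, document identifiers, and the helper name `inner_product` are invented for illustration and do not come from the experiments reported in this chapter.

```python
from typing import Dict

Weights = Dict[str, float]


def inner_product(doc: Weights, query: Weights) -> float:
    """Sum, over the query terms, of query weight times document weight."""
    return sum(q_w * doc.get(term, 0.0) for term, q_w in query.items())


# Hypothetical weighted document vectors.
docs: Dict[str, Weights] = {
    "d1": {"quantum": 0.8, "retrieval": 0.3},
    "d2": {"retrieval": 0.9, "logic": 0.4},
    "d3": {"logic": 0.7},
}
query: Weights = {"quantum": 1.0, "retrieval": 0.5}

scores = {doc_id: inner_product(vec, query) for doc_id, vec in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A Boolean conjunction of the two query terms would match no document in this toy collection, since none contains both with membership in every set, whereas the inner product still orders all three documents by graded score.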
Future work will focus on media other than text, terms, and words, and on modalities other than querying. Indeed, the use of terms is bound to the easy recognition of terms in documents and to the user’s intuition that a term corresponds to the set of documents about the concept represented by the term. When terms are combined by Boolean operators, a term has a semantics, and the results obtained by the operators, which are document sets, are an extensional representation of a concept. A set-based approach to retrieval with image, video, sound, or multimedia documents is less natural than with textual documents. The content descriptors of image, video, or sound, such as pixels, shapes, or chroma, cannot be described by terms, and the assumption that sets and set operators can express informative content does not seem as intuitive as for text. Similarly, multimodality fits less naturally with a set-based retrieval model. When click-through and user interaction data are collected, sets are not the most obvious representation of informative content. The reason is that the language of non-textual or multimodal traits is likely to describe individuals with a logic other than classical logic.
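The extensional reading of terms described above can be made concrete with a toy inverted index, in which each term maps to the set of documents that contain it and Boolean operators become set operators. The index contents and document identifiers below are assumptions made for this sketch only.

```python
from typing import Dict, Set

# Hypothetical inverted index: term -> set of documents containing the term.
index: Dict[str, Set[str]] = {
    "quantum": {"d1", "d2"},
    "retrieval": {"d2", "d3"},
    "logic": {"d3", "d4"},
}
all_docs: Set[str] = set().union(*index.values())

# Boolean operators as set operators on the term extensions.
and_result = index["quantum"] & index["retrieval"]  # quantum AND retrieval
or_result = index["quantum"] | index["retrieval"]   # quantum OR retrieval
not_result = all_docs - index["logic"]              # NOT logic

print(sorted(and_result), sorted(or_result), sorted(not_result))
```

Each result is again a plain document set with no ordering, which is the limitation that motivates ranking functions and, for non-textual content, representations other than sets.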
To experiment with different modalities, some experiments are under way that use the subtopics of the TREC 2010 Web Track Test Collection as themes.4 Instead of implementing themes using index terms, we will implement themes using subtopics, which may be viewed as aspects of the main topic. The experiments will simulate a more interactive scenario than the one simulated in this chapter. A user will submit a query (i.e., the main topic) and the retrieval system will extract a set of pertinent themes. We will measure the effectiveness of
4 http://trec.nist.gov/data/web/10/wt2010-topics.xml.
Searching for Information with Meet and Join Operators |
the ranked list obtained from a representation based only on the query terms, of the list obtained by using all the distinct terms associated with the extracted themes, and of the list obtained by using the themes built through join and meet. Further experiments will be carried out on the Dynamic Domain Track Test Collections. The goal of the Dynamic Domain Track is to “support research in dynamic, exploratory search of complex information domains.”5 The task is highly interactive, and the interaction with the user is simulated through the Jig, which returns explicit judgments on the top five retrieved documents along with relevant passages in those documents. We will investigate the use of relevant passages as a source for implementing themes.
Acknowledgements This chapter is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 721321.
1. Aerts, D., Melucci, M., de Bianchi, M. S., Sozzo, S., & Veloz, T. (2018). Special issue: Quantum structures in computer science: Language, semantics, retrieval. Theoretical Computer Science, 752, 1–4.
2. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
3. Broder, A. Z., Carmel, D., Herscovici, M., Soffer, A., & Zien, J. (2003). Efficient query evaluation using a two-level retrieval process. In Proceedings of the Twelfth International Conference on Information and Knowledge Management (pp. 426–434). New York, NY: ACM. http://doi.acm.org/10.1145/956863.956944.
4. Caputo, A., Piwowarski, B., & Lalmas, M. (2011). A query algebra for quantum information retrieval. In Proceedings of the IIR Workshop. http://ceur-ws.org/Vol-704/19.pdf.
5. Carpineto, C., & Romano, G. (2012). A survey of automatic query expansion in information retrieval. ACM Computing Surveys, 44(1), 1–50. http://doi.acm.org/10.1145/2071389.2071390.
6. Cooper, W. (1995). Some inconsistencies and misidentified modeling assumptions in probabilistic information retrieval. ACM Transactions on Information Systems, 13(1), 100–111.
7. Croft, W., & Lafferty, J. (Eds.). (2003). Language modeling for information retrieval. Berlin: Springer.
8. Croft, W., Metzler, D., & Strohman, T. (2009). Search engines: Information retrieval in practice. Boston: Addison Wesley. http://ciir.cs.umass.edu/downloads/SEIRiP.pdf.
9. Deerwester, S., Dumais, S., Furnas, G., Landauer, T., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science and Technology, 41(6), 391–407.
10. Frommholz, I., Larsen, B., Piwowarski, B., Lalmas, M., Ingwersen, P., & van Rijsbergen, K. (2010). Supporting polyrepresentation in a quantum-inspired geometrical retrieval framework. In Proceedings of IIiX (pp. 115–124).
11. Halmos, P. (1987). Finite-dimensional vector spaces. Undergraduate texts in mathematics. New York, NY: Springer.
12. Halmos, P. R. (1960). Naïve Set Theory. New York, NY: D. Van Nostrand Company, Inc.
5 http://trec-dd.org.
13. Hawking, D., Moffat, A., & Trotman, A. (2017). Efficiency in information retrieval: Introduction to special issue. Information Retrieval Journal, 20(3), 169–171. http://dx.doi.org/10.1007/s10791-017-9309-7.
14. Hughes, R. (1989). The structure and interpretation of quantum mechanics. Cambridge: Harvard University Press.
15. Ingwersen, P. (1992). Information retrieval interaction. London: Taylor Graham Publishing.
16. Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401, 788–791.
17. Melucci, M. (2015). Introduction to information retrieval and quantum mechanics. Berlin: Springer.
18. Melucci, M., & van Rijsbergen, C. J. (2011). Quantum mechanics and information retrieval (Chap. 6, pp. 125–155). Berlin: Springer.
19. Piwowarski, B., Frommholz, I., Lalmas, M., & van Rijsbergen, C. J. (2010). What can quantum theory bring to information retrieval. In Proceedings of CIKM (pp. 59–68). New York, NY: ACM.
20. Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, 3(4), 333–389.
21. Salton, G., & Buckley, C. (1988). Term weighting approaches in automatic text retrieval. Information Processing & Management, 24(5), 513–523.
22. Schmitt, I. (2008). QQL: A DB&IR query language. The VLDB Journal, 17(1), 39–56. http://dx.doi.org/10.1007/s00778-007-0070-1.
23. Sordoni, A., He, J., & Nie, J. Y. (2013). Modeling latent topic interactions using quantum interference for information retrieval. In Proceedings of CIKM (pp. 1197–1200). http://dl.acm.org/citation.cfm?doid=2505515.2507854.
24. Sordoni, A., Nie, J. Y., & Bengio, Y. (2013). Modeling term dependencies with quantum language models for IR. In Proceedings of SIGIR (pp. 653–662).
25. Turtle, H., & Flood, J. (1995). Query evaluation: Strategies and optimizations. Information Processing & Management, 31(6), 831–850. http://dx.doi.org/10.1016/0306-4573(95)00020-H.
26. Van Rijsbergen, C. J. (2004). The Geometry of information retrieval. Cambridge: Cambridge University Press.
27. Widdows, D. (2004). Geometry and meaning. Stanford, CA: CSLI Publications.
28. Widdows, D., & Peters, S. (2003). Word vectors and quantum logic: Experiments with negation and disjunction. In R. T. Oehrle & J. Rogers (Eds.), Proceedings of the Mathematics of Language Conference (pp. 141–154).
29. Zellhöfer, D., Frommholz, I., Schmitt, I., Lalmas, M., & van Rijsbergen, K. (2011). Towards quantum-based DB+IR processing based on the principle of polyrepresentation. In Proceedings of ECIR (pp. 729–732). Berlin: Springer. http://dl.acm.org/citation.cfm?id=1996889.1996989.
A
Akaike information criterion, 45, 47
Arithmetic formula on operands, 141
Aspect model, 101
B
Bag-of-words approach, 36
Bayesian information criterion, 45, 47
Bell test
    inequality and interpretation, 36–37
    semantic retrieving, 37–38
Bell test parameter vs. HAL window size, 39
Best Match N. 25 (BM25) extension, 146
BiLSTM-CRF architecture, 91
Binary Independence Retrieval (BIR) model, 146
Bloch representation of quantum mechanics, 26
Borel σ-algebra, 53
Brussels quantum approach, see Double-slit experiment
C
Car components
    atomic conditions, 116, 117
    car management, 116
    properties, 116
Classical (Boolean) logic, 36
Classical probability, see Kolmogorov’s axiomatics
Clauser–Horne–Shimony–Holt (CHSH) type, 37
Cognitive semantic information retrieval, see Bell test
Cognitivistic/conceptualistic interpretation, 12–14
Collobert and Weston (C&W) approach, 87
Commuting quantum query language (CQQL)
    database condition, 135
    information retrieval, 134
    proximity condition, 135
    QM mathematics, 134
    set of commutative conditions, 135–137
    vector spaces, query processing, 134
Compatible and incompatible observables
    Born’s rule, joint measurements, 62
    commutativity–noncommutativity, 62
    defined, 61–62
    projection postulate
        Hermitian operator, 63
        Lüders, 63–64
        von Neumann, 63
Complex-valued embedding, 100
Conceptual entities
    printed or printable webpages, 2
    QWeb composite entity, 16
Context effects, 21, 23, 24
Contextuality
    CHSH inequality, 37
    compatibility of different queries, 36
    user and smart information system, 36
Contextualized word embedding, 99
Copenhagen interpretation, 64–65
© Springer Nature Switzerland AG 2019
D. Aerts et al. (eds.), Quantum-Like Models for Information Retrieval and Decision-Making, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-25913-6
D
Database query language
    Boolean model, information retrieval, 130
    user search and text retrieval search, 129
    See also Weighting formula
Data type construction, 121–123
Decision-making
    classical (Boolean) logic violations, 36
    psychological aspects, 36
Degree of purity, 59
Density operators, 59
Diagonal values, 119
Distributional hypothesis, 86–87
Documental entities, see Double-slit experiment; Information retrieval (IR)
Document-level representation, 92
Document ranking, 156–158
Double-slit experiment
    Born rule, 10
    classical, 7
    classical probabilistic average, 12
    cognitivistic/conceptualistic interpretation, 12–14
    conceptual and classical logical thoughts, 21
    configurations, 6
    conjunction and disjunction, 23
    descriptions of, 6
    detection screen, 8
    Dirac’s notation, 9
    effective Hilbert space, 10
    experimental probability, 8
    “first sector” modeling, 23
    interference contribution, 7
    paradigmatic projection operators, 23
    probabilities, 25
    projection operator, 24
    properties, 11
    quantum, 7, 8
    quantum field theory, 22
    real functions, 9
    superposition principle, 11
    Venn-diagram representation, 21
    weighted average, 25
    written documents, 6
Downstream task, 96–97
E
Elementary data types
    defined, 117
    finite domains, 118
    non-orthogonal data type, 117, 119–121
    orthogonal data type, 117, 118
Ensemble interpretation, 64
Entangled/non-separable state, 37
F
Fagin’s approach, 140
Formula of total probability (FTP), 55
    Hermitian matrices, 67
    incompatible observables, 67
    quantum conditional probability, 68
    Växjö interpretation, 68–70
Fuzzy logic theory
    score function, 132–133
    t-conorm function, 132
    t-norm function, 132
    weighted score function, 133–134
G
Gaussian embedding, 100
H
Hilbert space multi-dimensional (HSM) modeling
    choice probabilities, 47
    compatibility relations, 45
    computer programs, 48
    contingency data tables, 42
    description, 41
    dimension determination, 45
    four-way joint distribution, 42
    initial state, defined, 46
    model parameters and test models, 47
    projectors, 46–47
    See also Quantum probability theory
Hyperbolic embedding, 100
Hyperspace analogue to language (HAL) algorithm, 37–38
I
Information retrieval (IR)
    Boolean model, 149
    click-through activity and natural language phrases, 145
    cognitive experiments with human participants, 2
    CQQL, 134
    decision making, 83
    description of, 2, 3
    deterministic processes, 3
    physical and conceptual entities, 2
    quantumness, 83