suai.ru/our-contacts |
quantum machine learning |
140
Fig. 5 Connected weights between conjunction and disjunction
|
|
I. Schmitt |
1 |
|
0.5 y = 0.5 + y − 0.5 y |
|
|
|
|
|
0.5 0,1 y = y |
0.5 |
|
0.5 0.5,0.5 y = 0.25 + 0.5 y |
|
|
0.5 1,0 y = 0.5 |
|
|
0.5 y = 0.5 y |
0 |
1 |
y |
|
(x + ¬θ )(y + θ ) = xy + xθ + y¬θ + ¬θ θ = xyθ + xy¬θ + xθ + y¬θ
= (xy + x)θ + (xy + y)¬θ = xθ + y¬θ.
This effect can be interpreted as a neutral combination of conditions. Figure 5 illustrates for a constant x = 0.5 that x θ,1−θ y lies exactly in the middle between conjunction and disjunction.
Weighting of non-Boolean query conditions is a well-known problem. Following [9], we distinguish four types of weight semantics: (1) weight as measure of importance of conditions, (2) weight as a limit on the amount of objects to be returned, (3) weight as threshold value, cf. [10, 11], and (4) weight as speciÞcation of an ideal database object. Next, we discuss related papers about weights as measure of importance. All logic-based weighting approaches supporting vagueness use the fuzzy t-norm/t-conorm min/max.4 In contrast, our approach is a general logicbased approach where min/max is just one special case.
Fagin’s Approach Fagin and Wimmers [4] propose a special arithmetic weighting formula to be applied on a score rule S. Let Θ = {θ1, . . . , θn} be weights with
θi |
[ |
0, 1 |
] |
|
i |
|
Θ= |
1, and θ1 |
≥ |
|
≥ |
|
≥ |
θn. The weighted version of the score is |
|
|
|
, |
|
θi |
|
|
θ2 |
|
. . . |
|
|||||
then deÞned as S |
|
(μ1(o), . . . , μn(o), θ1, . . . , θn) = (θ1−θ2)S(μ1(o))+2 (θ2−θ3) |
|||||||||||||
S(μ1(o), μ2(o))+. . .+n θnS(μ1(o), . . . , μn(o)). Since this weighting approach is on top of a logic it violates R4. Thus, associativity in combination with min/max, for example, cannot be guaranteed (see [12]). Furthermore, Sung [13] describes a so-called stability problem for FaginÕs approach. We can show that FaginÕs formula with weights θ1F , θ2F applied to our CQQL logic can be simulated by our weighting where θ1 = 1 and θ2 = 2 θ2F hold.
4min/max is the only fuzzy t-norm/t-conorm which supports idempotence and distributivity.
suai.ru/our-contacts |
quantum machine learning |
Weights in CQQL |
141 |
Arithmetic Formula on Operands The main idea of [14Ð16] is to apply arithmetical formulas on operands of a disjunction or conjunction. Thus, they all do not fulÞll R4. For example, Sung [13] deÞnes SΘ (μ1(o), . . . , μn(o), θ1, . . . , θn) = S(θ1μ1(o), . . . , θnμn(o)). Interestingly, the operand formula 1−θ (1−μ(o)) for the t-norm min proposed by Carson et al. [16] is very near to our evaluation formula on CQQL.
Weighted Sum Singitham et al. [17], Fuhr and Gro§johann [18], and Oracle Corporation [19] propose a weighted sum approach completely independent from a logic. As shown, we can simulate the weighted sum by connected weights.
Logic-Based Weighting Approach on min/max The approaches proposed by Dubois and Prade [20], Pasi [21], and Yager [22] are very similar to our weighting approach and fulÞll R1, R2, R3, and R4. However, they are strictly connected to the t-norm/t-conorm min/max. This results in a problem described by Fagin and Wimmers [4]. First, linearity cannot be fulÞlled. Second, if μ1(o) ≥ 1 − θ2/θ1 ≥ μ2(o) holds, then the result is completely independent from μ1(o) and μ2(o).
OWA Approach The OWA approach is discussed, for example, in [9, 23]. It was not developed to weight certain conditions. Instead, the user can assign weights to the highest score, to the second highest score, and so on. As a result, the characteristic of a conjunction can be gradually shifted to one of a disjunction. Thus, the weight does not express the importance of a condition.
We conclude that our weighting approach can be seen as a logic-based generalization of existing weighting approaches.
In this paper we propose a logic-based weighting mechanism which is completely embedded in a logic. As logic we used the logic of our query language CQQL. Later, we show that our approach can also be applied to other logics. A very interesting result are connected weights in CQQL. They produce the weighted sum by means of logic.
One problem not tackled here is the question, if a user is always able to specify weight values. We propose to use a kind of user interactions to infer that values. For example, a user starts with equal weight values and is able to adjust them after she/he sees the query result. Another approach is to learn weight values from required preferences over result objects.
We evaluated our approach successfully in a content-based image retrieval context [24]. There, weight values are not given by users but are learnt from user interactions.
suai.ru/our-contacts |
quantum machine learning |
142 |
I. Schmitt |
1.Zadeh, L. A. (1988). Fuzzy logic. IEEE Computer, 21, 83Ð93.
2.Schmitt, I. (2008). QQL: A DB&IT query language. The VLDB Journal: The International Journal on Very Large Data Bases, 17, 39Ð56.
3.Cooper, W. S. (1988). Getting beyond Boole. Information Processing & Management, 24, 243Ð248.
4.Fagin, R., & Wimmers, E. L. (2000). A formula for incorporating weights into scoring rules.
Theoretical Computer Science, 239, 309Ð338.
5.van Rijsbergen, C. J. (1979). Information retrieval. London: Butterworths.
6.Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern information retrieval. Essex : ACM Press.
7.van Rijsbergen, C. J. (2004). The geometry of information retrieval. Cambridge: Cambridge University Press.
8.Olteanu, D., Huang, J., & Koch, C. (2009). Sprout: Lazy vs. eager query plans for tupleindependent probabilistic databases. In 2009 IEEE 25th International Conference on Data Engineering (pp. 640Ð651).
9.Herrera-Viedma, E., Lopez-Herrera, A. G., Alonso, S., Porcel, C., & Cabrerizo, F. J. (2007).
Alinguistic multi-level weighted query language to represent user information needs. In
Proceedings of the IEEE International Conference on Fuzzy Systems, London, July 23–26, 2007 (pp. 1Ð6). Piscataway: IEEE.
10.Robertson, S. E., & Jones, K. S. (1976). Relevance weighting of search terms. Journal of the American Society for Information Science, 27, 129Ð146.
11.Van Rijsbergen, C. J., Robertson, S. E., & Porter, M. F. (1980). New models in probabilistic information retrieval. London: British Library Research and Development Department London.
12.Schulz, N., & Schmitt, I. (2003). Relevanzwichtung in komplexen €hnlichkeitsanfragen. In
G.Weikum, H. Schšning, & E. Rahm (Eds.), Datenbanksysteme in Business, Technologie und Web, BTW’03, 10. GI-Fachtagung, Leipzig, Februar 2003. Lecture notes in informatics (LNI)
(Vol. P-26, pp. 187Ð196). Bonn: Gesellschaft fŸr Informatik.
13.Sung, S. (1998). A Linear Transform Scheme for Combining Weights into Scores. Technical report, Rice University.
14.Bookstein, A. (1980). Fuzzy requests: An approach to weighted Boolean searches. Journal of the American Society for Information Science (JASIS), 31, 240Ð247.
15.Bookstein, A. (1980). A comparison of two weighting schemes for Boolean retrieval. In
R.N. Oddy, S. E. Robertson, C. J. van Rijsbergen, & P. W. Williams (Eds.), Information Retrieval Research, Proceedings of the Joint ACM/BCS Symposium in Information Storage and Retrieval, Cambridge, June 1980 (pp. 23Ð34).
16.Carson, C., Belongie, S., Greenspan, H., & Malik, J. (1997). Region-based image querying. In Proceedings of the IEEE Workshop CVPR ’97 Workshop on Content-Based Access of Image and Video Libraries, Puerto Rico (pp. 42Ð49).
17.Singitham, P. K. C., Mahabhashyam, M. S., & Raghavan, P. (2004). EfÞciency-quality tradeoffs for vector score aggregation. In M. A. Nascimento, M. T. …zsu, D. Kossmann, R. J. Miller,
J.A. Blakeley, K. B. Schiefer (Eds.), Proceedings of the Thirtieth International Conference on Very Large Data Bases, Toronto, August 31–September 3 2004 (pp. 624Ð635). Burlington: Morgan Kaufmann.
18.Fuhr, N., & Gro§johann, K. (2001). XIRQL: A query language for information retrieval in xml documents. In W. B. Croft, D. J. Harper, D. H. Kraft, J. Zobel (Eds.), SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September 9–13, 2001, New Orleans, LA (pp. 172Ð180). New York: ACM.
19.Oracle Corporation. (1999). Oracle8i visual information retrieval – user guide and reference.
suai.ru/our-contacts |
quantum machine learning |
Weights in CQQL |
143 |
20.Dubois, D., & Prade, H. (1986). Weighted minimum and maximum operations in fuzzy set theory. Information Science, 39, 205Ð210
21.Pasi, G. (1999). A logical formulation of the Boolean model and of weighted Boolean models. In Proceedings of the Workshop on Logical and Uncertainty Models for Information Systems (LUMIS 99). London: University College London.
22.Yager, R. R. (1987). A note on weighted queries in information retrieval systems. Journal of the American Society for Information Science, 38, 23Ð24.
23.Boughanem, M., Loiseau, Y., & Prade, H. (2005). Rank-ordering documents according to their relevance in information retrieval using reÞnements of ordered-weighted aggregations. In M. Detyniecki, J. M. Jose, A. NŸrnberger, C. J. van Rijsbergen (Eds.), Adaptive Multimedia Retrieval: User, Context, and Feedback, Third International Workshop, AMR 2005, Glasgow, July 28–29, 2005, Revised Selected Papers. Lecture notes in computer science (Vol. 3877, pp. 44Ð54). Berlin: Springer.
24.Zellhšfer, D. (2015). A Preference-Based Relevance Feedback Approach for Polyrepresentative Multimedia Retrieval. Doctoral thesis, BTU Cottbus-Senftenberg.
suai.ru/our-contacts |
quantum machine learning |
Emanuele Di Buccio
and Massimo Melucci 
Abstract Information Retrieval (IR) is the complex of models, languages, and techniques aimed to retrieve documents containing information relevant to the userÕs information needs. Current retrieval technology requires a retrieval model for guaranteeing effective results. While all retrieval models for term-based search bring into play the Boolean logic of sets, a document collection can be searched by themes, instead of terms, using the logic of vector spaces, instead of sets. Indeed, vector spaces may generalize sets by breaking some laws of algebra of sets. The main aim of this chapter is to provide an overview of the state-of-the-art formalism used in IR and explain how the novel model based on themes deÞned as vector spaces and inspired by quantum operators, such as two lattice operators known as meet and join, can be built upon this formalism.
When searching for information, the users of an IR system express their information needs through behavior (e.g., click-through activity) or queries (e.g., natural language phrases). By its nature, IR is inherently an interactive activity which is performed by a user accessing the collections managed by a system through very interactive devices immersed in context. Queries, which are the most used data for expressing information needs, are sentences expressed in a natural language, oftentimes very short (e.g., one word) or occasionally much longer (e.g., a text paragraph). However, queries are not the only means to communicate information needs. Other means such as click-through data can be observed during userÐsystem interaction or within social networks. Through interaction, the user aims to reÞne his query, to provide additional evidence describing his information need or to indirectly tell his needs to the system.
E. Di Buccio á M. Melucci ( )
Department of Information Engineering, University of Padova, Padova, Italy e-mail: dibuccio@dei.unipd.it; massimo.melucci@unipd.it
© Springer Nature Switzerland AG 2019 |
145 |
D. Aerts et al. (eds.), Quantum-Like Models for Information Retrieval and Decision-Making, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-25913-6_8