*** Dr. Merih Seran Uysal ***


Efficient Similarity Search in Large Multimedia Databases

With the increasing ubiquity of information systems and the reduced costs of storage devices and data repositories owing to advances in hardware technology, recent years have witnessed an explosion in the generation and storage of data. This rapid increase in data collection is observed in a wide range of fields, such as multimedia, biology, medicine, business, marketing, and engineering. In particular, the ease of capturing and generating multimedia content on various devices has resulted in a huge increase in the generation and dissemination of multimedia data comprising image, video, and audio data.

Content-based multimedia retrieval, which yields effective search results, is increasingly preferred and used in numerous application domains. To this end, data objects are first represented by data representation models, such as signatures, and a distance-based similarity measure is then applied to those models. Applying distance-based similarity measures to high-dimensional databases leads to effective search results, but at the cost of high query processing time. Retrieval that is both efficient and effective is therefore indispensable for vast high-dimensional databases.

This work presents novel efficient query processing techniques that are applicable to high-dimensional complex data represented by signatures. The introduced efficiency improvement techniques enable the use of multi-step filter-and-refine algorithms, which alleviates query time costs. In addition, the proposed filter approximation techniques and algorithms considerably reduce the number of exact distance computations while guaranteeing the completeness of result sets. Extensive experimental evaluations highlight the strengths of the introduced techniques, which outperform the state of the art. Overall, the primary contribution is a set of efficient similarity search techniques for high-dimensional databases comprising signatures.
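
The general filter-and-refine idea can be illustrated with a minimal sketch. It is not the method developed in this work: plain feature vectors stand in for signatures, the Euclidean distance stands in for the exact distance, and the filter is the Euclidean distance over the first m dimensions, which never exceeds the full distance. All function names and the parameter m are hypothetical and chosen only for illustration.

```python
import heapq
import math


def exact_distance(a, b):
    """Stand-in for an expensive exact distance: Euclidean over all dimensions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filter_distance(a, b, m=2):
    """Cheap lower bound (assumption): Euclidean distance over the first m dimensions.

    Dropping non-negative squared terms can only shrink the sum, so this value
    never exceeds exact_distance(a, b).
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:m], b[:m])))


def knn_filter_refine(query, database, k):
    """Return the k nearest neighbors and the number of exact distance computations."""
    # Filter step: rank all objects by their cheap lower-bound distance.
    candidates = sorted(
        (filter_distance(query, obj), idx) for idx, obj in enumerate(database)
    )

    result = []  # max-heap on exact distance, stored as (-distance, index)
    exact_computations = 0

    # Refinement step: compute exact distances in ascending lower-bound order.
    for lower_bound, idx in candidates:
        # Once the lower bound exceeds the current k-th exact distance, no
        # remaining object can enter the result, so the result is complete.
        if len(result) == k and lower_bound > -result[0][0]:
            break
        d = exact_distance(query, database[idx])
        exact_computations += 1
        if len(result) < k:
            heapq.heappush(result, (-d, idx))
        elif d < -result[0][0]:
            heapq.heapreplace(result, (-d, idx))

    neighbors = sorted((-neg_d, idx) for neg_d, idx in result)
    return neighbors, exact_computations
```

Because the filter is a lower bound of the exact distance, the early termination cannot discard a true neighbor, so the result matches that of a sequential scan. The tighter the lower bound, the fewer exact distance computations are needed.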