Applications and Search Paradigms#

One of the key goals of the OpenWebSearch.eu (OWS) project is the creation of an open web index (OWI) that enables the development of various types of applications using web data. Work package 4 aims to support the creation and development of vertical search engines (search applications) based on the OWI in various ways. In contrast to general-purpose, large-scale search engines such as Google or Bing, vertical search engines serve specific domains or purposes, which also provides opportunities to optimize search and retrieval strategies. This support is motivated by another key goal of OWS: building an ecosystem of vertical search engines and other applications based on OWS. Initial steps in this direction are the third-party call for search applications and hackathons in which the development of search applications can be tried out. Support for the development of vertical search applications is provided on different levels:

  • First, technical support is provided by explaining how a search application can be created using the OWS technical infrastructure. This is done by providing technical documentation and a prototype application, which serves as a blueprint for other applications (TODO Link).

  • Second, two search applications are being developed (TODO LINK), in order to demonstrate the feasibility and usefulness of the search application concept and the OWS technical infrastructure.

Search Paradigms#

This section investigates different search paradigms in order to outline relevant aspects of search engines related to OWS. These paradigms are not mutually exclusive; they partially overlap. They also highlight new ways of retrieving information through open web search, increasing trust, and protecting privacy, in contrast to the features of classical search engines.

Vertical Search Engines#

In contrast to general-purpose, large-scale search engines such as Google or Bing, OWS shares its web index with vertical search engines that serve specific domains or purposes, which also provides opportunities to optimize search and retrieval strategies. End users benefit from higher precision of the search results, from the use of domain knowledge in the form of ontologies or knowledge graphs, and from support for specific user tasks. Today’s popular and powerful vertical search solutions mostly pursue commercial purposes or are part of enterprises’ business models, such as Amazon’s product search, LinkedIn’s people search, or Booking.com’s hotel search. Even Google maintains vertical search engines (e.g. YouTube) and is interested in further vertical search solutions, as demonstrated by its purchase of the travel search company ITA Software. Vertical search engines are becoming an increasingly active field for commercial purposes such as marketing, product sales, and capturing awareness.

To enable the development of search verticals, OWS makes it possible to download a portion or subset of the index related to a certain topic or geographical region. A search application can then use this index subset and provide search features dedicated to its purpose and user expectations. Whereas a large index makes it expensive to retrieve relevant search results, a small index makes accurate ranking easier. Furthermore, the index subset can be stored and managed on the application's own server, which increases independence from external technologies. Third-party search applications can thus be built without crawling websites themselves, by re-using partitions of the OWS index, which also reduces network traffic.
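The idea of searching a small, topic-specific index subset can be illustrated with a minimal Python sketch. It is not the OWS tooling: the records, the field names (`id`, `topic`, `text`), and the helper functions are hypothetical stand-ins for a downloaded index subset, and BM25 is used here only as a common ranking function, not necessarily the one an OWS-based application would choose.

```python
import math
from collections import Counter, defaultdict

# Hypothetical records standing in for a downloaded index subset;
# the field names (id, topic, text) are assumptions for illustration.
DOCS = [
    {"id": "d1", "topic": "travel", "text": "cheap hotel booking in vienna"},
    {"id": "d2", "topic": "travel", "text": "vienna city guide and hotel tips"},
    {"id": "d3", "topic": "health", "text": "hotel hygiene and health advice"},
    {"id": "d4", "topic": "travel", "text": "train travel across austria"},
]


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def build_index(docs):
    """Build an in-memory inverted index: term -> {doc_id: term frequency}."""
    postings = defaultdict(dict)
    lengths = {}
    for doc in docs:
        tokens = tokenize(doc["text"])
        lengths[doc["id"]] = len(tokens)
        for term, tf in Counter(tokens).items():
            postings[term][doc["id"]] = tf
    return postings, lengths


def bm25_search(query, postings, lengths, k1=1.2, b=0.75):
    """Rank documents for a query with BM25; returns (doc_id, score) pairs."""
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        matches = postings.get(term, {})
        if not matches:
            continue
        idf = math.log(1 + (n_docs - len(matches) + 0.5) / (len(matches) + 0.5))
        for doc_id, tf in matches.items():
            norm = k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Restrict the collection to one topic, then index and search only that subset.
subset = [doc for doc in DOCS if doc["topic"] == "travel"]
postings, lengths = build_index(subset)
results = bm25_search("vienna hotel", postings, lengths)
```

Because the index is built from the travel subset only, the health document `d3` can never appear in `results`, even though it contains the word "hotel". Both `d1` and `d2` match both query terms, and the shorter `d1` ranks first due to BM25's length normalisation.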

Search Applications#

Prototype Search Application#

Small Demo

../../_images/applications-basic-demo.png

Fig. 9 Demo on two small CIFF+Parquet indices#

Readiness Level#

  • Data Readiness Level: completeness, precision, ELSA compliance, and availability of data

  • Infrastructure Readiness Level: availability and size of infrastructure

  • Software Readiness Level: documentation, tests, bugs, CI/CD, license