jxta-jxse / JXTA_JXSE-103

The Lucene/XQuery/RegEx Discovery Service for JXTA

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.5
    • Fix Version/s: 2.7 Cheeseburger
    • Component/s: code
    • Labels:
      None
    • Environment:

      Operating System: All
      Platform: Windows

    • Issuezilla Id:
      103

      Description

      DESIGN GOAL:

      Preserve the existing B-Tree based index/search implementation because of its
      effectiveness and performance for single keyword search.

      The XQuery capabilities are built alongside the existing search strategy,
      not on top of it. No API changes are needed. To make the discovery service
      perform an XQuery search, programmers only need to pass "XQuery" as the
      attribute name argument and a query conforming to XQuery syntax as the
      value argument when invoking getRemoteAdvertisements() or
      getLocalAdvertisements().

      For example,

      String attribute = "XQuery";
      String value = "for $result in //ad where contains"
              + "($result/adIndexFields/Name, \"pizza\") return $result";
      getRemoteAdvertisements(null, DiscoveryService.ADV, attribute, value, 10);

      Note that "ad" and "adIndexFields" are reserved names in this
      implementation because they are structural elements of the XML
      advertisement database (an XML file). An ADV-type advertisement would look
      like this in the database:

      <?xml version="1.0" encoding="UTF-8" ?>
      <Adv>
      <ad key="Adv/cbid-59616261646162614A78746150325033391E61DEF903411491B0B873D8546716370BBED460B48A6E1D4D9082F72D2F9C176D305701"
      advType="jxta:RA"
      lifetime="1197266226859"
      expiration="1197266226859">

      <adData>
      (string representation of the adv)
      </adData>

      <adIndexFields>
      (multiple name-value elements)
      </adIndexFields>
      </ad>
      ...
      </Adv>
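
      To make the layout above concrete, here is a minimal sketch that runs the
      "pizza" search against a store of this shape. It is not the code from the
      patch: it uses the JDK's built-in XPath 1.0 engine as a stand-in for an
      XQuery engine, since the JDK ships no XQuery processor. The class name
      AdStoreQuery, the sample keys, and the assumption that index fields appear
      as child elements such as <Name> are all illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AdStoreQuery {

    // Made-up sample store following the <Adv>/<ad>/<adIndexFields> layout.
    // The keys and the <Name> index field values are illustrative only.
    public static final String SAMPLE =
          "<Adv>"
        + "<ad key='Adv/example-1' advType='jxta:RA'>"
        + "<adData>...</adData>"
        + "<adIndexFields><Name>pizza delivery</Name></adIndexFields>"
        + "</ad>"
        + "<ad key='Adv/example-2' advType='jxta:RA'>"
        + "<adData>...</adData>"
        + "<adIndexFields><Name>file sharing</Name></adIndexFields>"
        + "</ad>"
        + "</Adv>";

    /** Returns the key attribute of every <ad> whose Name index field contains the word. */
    public static List<String> keysWhereNameContains(String storeXml, String word) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the search word as an XPath variable so it needs no quoting.
        xpath.setXPathVariableResolver(
                name -> "word".equals(name.getLocalPart()) ? word : null);
        try {
            // XPath counterpart of:
            //   for $result in //ad
            //   where contains($result/adIndexFields/Name, "pizza")
            //   return $result
            NodeList hits = (NodeList) xpath.evaluate(
                    "//ad[contains(adIndexFields/Name, $word)]",
                    new InputSource(new StringReader(storeXml)),
                    XPathConstants.NODESET);
            List<String> keys = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                keys.add(((Element) hits.item(i)).getAttribute("key"));
            }
            return keys;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("query failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(keysWhereNameContains(SAMPLE, "pizza"));
    }
}
```

      A real XQuery engine would accept the FLWOR expression directly; the
      XPath predicate is used here only because it selects the same <ad>
      elements for this particular query.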

      METHODOLOGY:

      I have analyzed how the Cm and SrdiIndex classes are referenced and used by the
      existing JXTA discovery service.

      From an edge peer's point of view, the discovery service uses Cm in three
      ways: 1) saving and indexing a discovered advertisement in the peer's local
      cache; 2) removing a specific advertisement from the cache; 3) searching the
      saved and indexed advertisements. Cm stores and manages three types of
      advertisements (Peers, Groups, and Adv) in a special version of Xindice.
      Similarly, we can build three simple XML files to store Peers, Groups, and
      Adv advertisements for XQuery-related operations. Whenever the discovery
      service uses Cm to save or remove an advertisement, we do the same on the
      XML files. Whenever a JXTA user searches for advertisements, the discovery
      service checks whether the user intends an XQuery search or a simple
      keyword/wildcard search. For an XQuery search, the discovery service runs
      the query against the XML files and returns the result. Otherwise, it
      performs the simple search.
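
      The search-time decision described above can be sketched as a small
      dispatcher. This is an illustration only, not the code in
      DiscoveryServiceImpl: the class SearchDispatch, its two pluggable search
      functions, and the exact (case-sensitive) match on the attribute name are
      assumptions made for the example.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

public class SearchDispatch {

    /** Attribute name that, per the description, selects the XQuery path. */
    public static final String XQUERY_ATTRIBUTE = "XQuery";

    // Hypothetical hook: evaluates a query against the XML files.
    private final Function<String, List<String>> xquerySearch;
    // Hypothetical hook: the existing keyword/wildcard search on the index.
    private final BiFunction<String, String, List<String>> keywordSearch;

    public SearchDispatch(Function<String, List<String>> xquerySearch,
                          BiFunction<String, String, List<String>> keywordSearch) {
        this.xquerySearch = xquerySearch;
        this.keywordSearch = keywordSearch;
    }

    /** Routes a search to the XQuery path or to the simple search. */
    public List<String> search(String attribute, String value) {
        if (XQUERY_ATTRIBUTE.equals(attribute)) {
            // The value argument carries the whole query text.
            return xquerySearch.apply(value);
        }
        // Any other attribute name keeps the original single-keyword behaviour.
        return keywordSearch.apply(attribute, value);
    }

    public static void main(String[] args) {
        SearchDispatch dispatch = new SearchDispatch(
                query -> List.of("xquery:" + query),
                (attr, val) -> List.of("keyword:" + attr + "=" + val));
        System.out.println(dispatch.search("XQuery", "for $r in //ad return $r"));
        System.out.println(dispatch.search("Name", "pizza*"));
    }
}
```

      Keeping the two paths behind one entry point is what lets the existing
      method signatures stay unchanged while the simple search remains the
      default.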

      From a rendezvous peer's point of view, similar tasks are done on SrdiIndex.

      IMPLEMENTATION:

      The entire implementation lives in one class:
      net.jxta.impl.discovery.DiscoveryServiceImpl. All the XQuery-related methods
      are prefixed with "xquery". Every changed section is delimited by the
      comment lines

      //<DC desc="XQuery discovery service implementation">

      //</DC>

      xqueryPushSrdi replaces pushSrdi because the former can push all the index
      fields of an advertisement as a whole, while the latter can only push
      individual index fields. XQuery needs all of an advertisement's fields
      together in order to evaluate queries that span more than one field.

      The XML databases (basically, XML files) store an advertisement in two major
      elements: the actual data which can be used to construct an advertisement
      instance, and the index fields.
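
      As a sketch of that two-part entry, the following builds an <ad> element
      with an <adData> child and an <adIndexFields> child using the JDK DOM API,
      and removes an entry by key. The class AdEntryBuilder and its method names
      are hypothetical, the lifetime attribute is left out for brevity, and
      writing each index field as an element named after the field is an
      assumption based on the example query.

```java
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class AdEntryBuilder {

    /** Creates an empty in-memory store with the given root, e.g. "Adv". */
    public static Document newStore(String rootName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement(rootName));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no XML parser available", e);
        }
    }

    /** Appends one <ad> entry holding the advertisement text and its index fields. */
    public static Element save(Document store, String key, String advType,
                               long expiration, String advText,
                               Map<String, String> indexFields) {
        Element ad = store.createElement("ad");
        ad.setAttribute("key", key);
        ad.setAttribute("advType", advType);
        ad.setAttribute("expiration", Long.toString(expiration));

        // Part 1: the data needed to rebuild the advertisement instance.
        Element data = store.createElement("adData");
        data.setTextContent(advText);
        ad.appendChild(data);

        // Part 2: the index fields, one child element per field (assumed layout).
        Element fields = store.createElement("adIndexFields");
        for (Map.Entry<String, String> field : indexFields.entrySet()) {
            Element e = store.createElement(field.getKey());
            e.setTextContent(field.getValue());
            fields.appendChild(e);
        }
        ad.appendChild(fields);

        store.getDocumentElement().appendChild(ad);
        return ad;
    }

    /** Removes the entry with the given key; returns false if none exists. */
    public static boolean remove(Document store, String key) {
        NodeList ads = store.getDocumentElement().getElementsByTagName("ad");
        for (int i = 0; i < ads.getLength(); i++) {
            Element ad = (Element) ads.item(i);
            if (key.equals(ad.getAttribute("key"))) {
                ad.getParentNode().removeChild(ad);
                return true;
            }
        }
        return false;
    }
}
```

      Calling save and remove next to the corresponding Cm operations would keep
      the XML store in step with the cache, as the methodology section
      describes; persisting the document to disk is left out of this sketch.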

      FUTURE DEVELOPMENT:

      Streaming XQuery. Search performance. In-memory priority data search and
      streaming for secondary data.

      Push srdi selectively.

      Review of synchronization on the doc objects.

      Validation of arguments.

      XML database housekeeping/cleanup, expiration policies.

      An advertisement owner ID concept, to prevent the rendezvous peer (rdv)
      from sending a query back to the peer that sent it.

      Database update performance.

      Attachments

      1. 103_api.diff (22 kB, adamman71)
      2. 103_impl.diff (243 kB, adamman71)
      3. DiscoveryServiceImpl.java (98 kB, thenetworker)
      4. DiscoveryServiceImpl.java (96 kB, thenetworker)
      5. Issue-103.zip (136 kB, thenetworker)

          Activity

          adamman71 added a comment -

          These two new patches are work performed by Nick (buzzlightyear) to include
          TheNetworker's contribution, based on 2.6-pre-alpha 790.

          These should be considered as part of the next release, be it 2.6.1 or not.

          Apparently, the code referring to the JXTA Shell has been stripped. PSE is
          enforced as the default membership. A login into pse during the StdPeerGroup
          init phase is performed.

          The code used by TheNetworker has been refined for years. It is unclear how much
          testing and further work remains. To be analyzed when 2.6 is out.

          thenetworker added a comment -

          Login to the PSE has been improved. I will upload it soon.

          Right now, login to PSE needs X Window on Linux/Unix, so it will stall when no
          display can be found. But JXTA often needs to be run as a service or daemon
          without any display.

          buzzheavyyear added a comment -

          Could you upload your PSE addition as a patch as all of your other additions
          have already been made into a patch on top of 790? Perhaps it might be wise to
          wait until the patch has been merged (once 2.6 is out). This is scheduled to go
          into 2.6.1.

          adamman71 added a comment -

          Not a real P1

          adamman71 added a comment -

          Code committed, next episode in issue 417 (if any).


            People

            • Assignee: jxta-jxse-issues
            • Reporter: thenetworker
            • Votes: 1
            • Watchers: 1
