Document-level security

A frequent requirement for search engines is to maintain document-level security. While a public search engine may expose all documents to all users, many intranet-oriented search engines hold information that is accessible to only a subset of users. Historically, the solution to document-level security has been to roll your own. The most common approaches are listed here:

  1. If your requirements allow it, enrich each indexed document with access tokens, and then apply a filter query built from the current user's access tokens. For a simplistic example, to allow only documents marked as accessible to the marketing department, or unclassified, you might add this parameter to your query: fq=group_label:(marketing_department OR UNCLASSIFIED). However, there will be syncing challenges if the authorization lists per document are managed elsewhere. ManifoldCF helps with that and uses this general approach to document security.
  2. Write a custom Solr post-filter QParser. This isn't too hard to get working correctly, but it's fundamentally difficult to scale when an external service must be consulted, because the filter operates on every document the search matches, not just the top X results.
  3. Separate from Solr, implement a postprocessing filter on the document result set that removes documents that the user shouldn't see. This approach is convenient because you can just wrap your calls to Solr with your own proprietary security model. However, this approach is often flawed, particularly when faceting is used: Solr computes the facet counts before your filter runs, so the counts still include documents that the user isn't allowed to see.
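
As a rough illustration of the first approach, the following Python sketch builds the fq parameter from a user's access tokens. The helper names (escape_term, build_access_filter, build_query_string) are hypothetical and not part of Solr or any client library, and group_label is simply the field name used in the example above. Failing when the user has no tokens is an assumption of this sketch, chosen so that an empty token list never results in an unfiltered query.

```python
import re
from urllib.parse import urlencode

# Characters that carry special meaning in the Lucene/Solr query syntax,
# plus whitespace. Each one is backslash-escaped so that a token is
# always treated as a literal term.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/\s])')


def escape_term(token: str) -> str:
    """Backslash-escape query-syntax characters in a single access token."""
    return _SPECIAL.sub(r'\\\1', token)


def build_access_filter(field: str, tokens: list[str]) -> str:
    """Return a filter query matching documents tagged with any of the tokens."""
    if not tokens:
        # Assumption: fail closed rather than send a query with no filter.
        raise ValueError("user has no access tokens")
    joined = " OR ".join(escape_term(t) for t in tokens)
    return f"{field}:({joined})"


def build_query_string(user_query: str, tokens: list[str],
                       field: str = "group_label") -> str:
    """URL-encode the q and fq parameters for a Solr select request."""
    params = [
        ("q", user_query),
        ("fq", build_access_filter(field, tokens)),
    ]
    return urlencode(params)


if __name__ == "__main__":
    tokens = ["marketing_department", "UNCLASSIFIED"]
    print(build_access_filter("group_label", tokens))
    print(build_query_string("solr", tokens))
```

Because the restriction travels with the query as a filter, Solr applies it before computing facet counts, which is what the third approach cannot offer.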