• Remove a Page from the Sitecore xDB (Sitecore 8 Technical Preview)

    Posted 10/01/2014 by techphoria414

    Note: Information in this post is based on the Sitecore 7.5 and 8 Technical Previews and is subject to change.

    Here's a quick one. I've been doing some JMeter traffic generation on my xDB for a forthcoming post/video on the Path Analyzer, and to push data from session to the xDB quickly, I implemented an "end session" page to hit at the end of each test thread.

    (I will hopefully also get a chance to share my work in JMeter, but in the meantime you should start where I started, with Martina Welander's awesome post.)

    Unfortunately, the first time around I forgot to exclude my /EndSession.aspx from analytics, so it was muddying my data. However, it was pretty easy to remove it directly from the xDB using a MongoDB query.

    Basically, I'm telling MongoDB to update documents in the Interactions collection and remove elements from the Pages array where the URL path is /EndSession.aspx. The final "true" argument tells MongoDB to update all documents which match the query (the empty first argument in this case), not just the first it finds. For more info on what's going on here, check the MongoDB documentation on the update() method.
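    A minimal sketch of what such a query could look like in the mongo shell. The `Url.Path` field name inside each Pages element is an assumption about the Interactions document shape, so check a sample document before running anything like this. The four arguments follow the legacy `update(query, update, upsert, multi)` signature.

    ```javascript
    // Empty filter: every document in the collection is a candidate.
    var filter = {};

    // $pull removes every element of the Pages array that matches the condition.
    // "Url.Path" is an assumed field name; verify it against your own data.
    var update = { $pull: { Pages: { "Url.Path": "/EndSession.aspx" } } };

    // Only runs inside the mongo shell, where `db` is defined.
    if (typeof db !== "undefined") {
      // upsert = false (never insert), multi = true (update all matches).
      db.Interactions.update(filter, update, false, true);
    }
    ```

    Running a find() with the same `"Pages.Url.Path"` condition first is a cheap way to see how many interactions would be touched.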

    After running this, I had to rebuild the sitecore_analytics_index using the Indexing Manager and rebuild the reporting database using /sitecore/admin/RebuildReportingDB.aspx.

    This query was used on a Sitecore 8 xDB but, based on the Sitecore 7.5 xDB Technical Preview, it should work with 7.5 as well.

    - Nick / techphoria414 

  • Sitecore 7 Computed Fields: All Templates and Datasource Content

    Posted 11/22/2013 by techphoria414

    We've been getting some great practical exposure to Sitecore 7 indexing here at Active Commerce, so I thought I'd quickly share a couple tidbits that others should find very useful.

    Index All Templates (no really, ALL templates)

    If you are making good use of template inheritance, it's likely that searching on the item template alone does you no good. Your search needs to include base templates as well. Unfortunately, the built-in Sitecore.ContentSearch.ComputedFields.AllTemplates class behaves exactly as its counterpart in the old Sitecore.Search API -- it only goes one level deep into base templates. Bummer! Fortunately it's easy to replace with our own computed field, which actually accounts for all templates. In our version, it stops crawling base templates when the Standard Template is reached. Find the code and configuration example below.

    Note: If you are sensitive to indexing performance in your environment, you will definitely want to test the impact of adding this. I've been told that performance constraints are the reason the built-in AllTemplates field does not do this. That renders it pretty useless, however.
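    The traversal idea can be sketched independently of the Sitecore API. This is not the computed field itself (that would be C# against Sitecore.ContentSearch), just the crawl logic over a hypothetical in-memory map of template name to direct base templates. The template names are made up, and leaving the Standard Template out of the result is an assumption.

    ```javascript
    // Hypothetical template graph: template name -> direct base templates.
    var STANDARD_TEMPLATE = "Standard template";
    var baseTemplates = {
      "Product Page": ["Page", "Taggable"],
      "Page": ["Content Base"],
      "Content Base": [STANDARD_TEMPLATE],
      "Taggable": [STANDARD_TEMPLATE]
    };

    // Collect the starting template plus every base template reachable from it,
    // at any depth. Crawling stops at the Standard Template, which is excluded.
    function allTemplates(templateName, graph) {
      var seen = [];
      var stack = [templateName];
      while (stack.length > 0) {
        var current = stack.pop();
        // Skip the Standard Template and anything already visited.
        if (current === STANDARD_TEMPLATE || seen.indexOf(current) !== -1) {
          continue;
        }
        seen.push(current);
        var bases = graph[current] || [];
        for (var i = 0; i < bases.length; i++) {
          stack.push(bases[i]);
        }
      }
      return seen;
    }

    var indexed = allTemplates("Product Page", baseTemplates);
    ```

    The visited check keeps shared base templates from being indexed twice and guards against accidental cycles, which matters for the indexing cost mentioned in the note above.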

    Indexing Datasource Content

    I've covered this topic before in relation to the Page Editor and what used to be called Advanced Database Crawler. If you are building any sort of site search, you need the indexed content for a page to reflect the contents of its datasource items as well. I was really hoping this would be built into Sitecore 7 somehow, but alas it is not. However, as we've done this before for Sitecore.Search, we can do it again for Sitecore.ContentSearch. This time we can take advantage of the fact that datasources are now accounted for in the Links DB. Note that I don't believe this includes search-based datasources, but someone can correct me there if needed. You can find the code and configuration for the computed field below. Once again, you may need to test the performance of this computed field in your environment for very large content databases.

    What Sitecore 7 also makes easier is ensuring that the page is re-indexed if the datasource item is saved. The indexing.getDependencies pipeline allows specifying other items that should be re-indexed. The name of the pipeline (and even the documentation in the config file) is a bit misleading. To me it's more of a "get dependents" pipeline than a "get dependencies" one, as it indicates items that are dependent on the current item. But this is semantics.

    Sitecore 7 includes a processor (disabled by default) that finds any items on which the current item is used as a datasource. Unfortunately, on enabling this (7.0 rev. 130918), I got a nasty exception.

    Exception: System.InvalidCastException
    Message: Unable to cast object of type 'Sitecore.Data.ItemUri' to type 'Sitecore.ContentSearch.IIndexableUniqueId'.
    Source: System.Core
       at System.Linq.Enumerable.<CastIterator>d__b1`1.MoveNext()
       at System.Collections.Generic.List`1.InsertRange(Int32 index, IEnumerable`1 collection)
       at Sitecore.ContentSearch.Pipelines.GetDependencies.GetDatasourceDependencies.Process(GetDependenciesArgs context)
       at (Object , Object[] )
       at Sitecore.Pipelines.PipelineMethod.Invoke(Object[] parameters)
       at Sitecore.Pipelines.CoreProcessor.Invoke(Object[] parameters)
       at Sitecore.Pipelines.CorePipeline.Run(PipelineArgs args)
       at Sitecore.Pipelines.CorePipeline.Run(String pipelineName, PipelineArgs args, String pipelineDomain, Boolean failIfNotExists)
       at Sitecore.Pipelines.CorePipeline.Run(String pipelineName, PipelineArgs args, String pipelineDomain)
       at Sitecore.Pipelines.CorePipeline.Run(String pipelineName, PipelineArgs args)
       at Sitecore.ContentSearch.Pipelines.GetDependencies.GetDependenciesPipeline.GetIndexingDependencies(IIndexable indexable)
       at Sitecore.ContentSearch.Crawler`1.UpdateDependents(IProviderUpdateContext context, T indexable)
       at Sitecore.ContentSearch.SitecoreItemCrawler.DoUpdate(IProviderUpdateContext context, SitecoreIndexableItem indexable)
       at Sitecore.ContentSearch.Crawler`1.Update(IProviderUpdateContext context, IIndexableUniqueId indexableUniqueId, IndexingOptions indexingOptions)
       at Sitecore.ContentSearch.LuceneProvider.LuceneIndex.PerformUpdate(IIndexableUniqueId indexableUniqueId, IndexingOptions indexingOptions)
       at Sitecore.ContentSearch.LuceneProvider.LuceneIndex.Update(IIndexableUniqueId indexableUniqueId)

    I've reported the issue to Sitecore, but in the meantime, included below is code and config for a replacement processor. Using this will ensure that when your datasource is updated, the page will be reindexed as well.

    Code below. Happy Sitecoring.

