We've been getting some great practical exposure to Sitecore 7 indexing here at Active Commerce, so I thought I'd quickly share a couple of tidbits that others should find very useful.
Index All Templates (no really, ALL templates)
If you are making good use of template inheritance, it's likely that searching on the item template alone does you no good. Your search needs to include base templates as well. Unfortunately, the built-in Sitecore.ContentSearch.ComputedFields.AllTemplates class behaves exactly as its counterpart in the old Sitecore.Search API -- it only goes one level deep into base templates. Bummer! Fortunately it's easy to replace with our own computed field, which actually accounts for all templates. In our version, it stops crawling base templates when the Standard Template is reached. Find the code and configuration example below.
Note: If you are sensitive to indexing performance in your environment, you will definitely want to test the impact of adding this. I've been told that performance constraints are the reason the built-in AllTemplates field does not do this. That renders it pretty useless, however.
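To make the traversal concrete, here is a minimal, language-neutral sketch in Python of the idea behind the computed field. It is not the Sitecore code itself: the `bases` dictionary is a hypothetical stand-in for template inheritance, `STANDARD_TEMPLATE` is a placeholder rather than a real Sitecore ID, and leaving the Standard Template out of the result is my assumption about what "stops crawling" means.

```python
from typing import Dict, List, Set

# Hypothetical placeholder for the Standard Template's ID (not a real Sitecore GUID).
STANDARD_TEMPLATE = "standard-template"


def all_templates(template_id: str, bases: Dict[str, List[str]]) -> List[str]:
    """Return template_id plus every base template reachable from it, at any depth.

    `bases` maps a template ID to the IDs of its direct base templates.
    The walk stops at the Standard Template, which is assumed to be excluded.
    """
    seen: Set[str] = set()
    ordered: List[str] = []
    stack = [template_id]
    while stack:
        current = stack.pop()
        # Skip templates already visited (shared bases) and the Standard Template.
        if current in seen or current == STANDARD_TEMPLATE:
            continue
        seen.add(current)
        ordered.append(current)
        # Push direct bases in reverse so they are visited in declared order.
        stack.extend(reversed(bases.get(current, [])))
    return ordered


# Usage: "article" inherits "page", which inherits "content-base".
bases = {
    "article": ["page"],
    "page": ["content-base", STANDARD_TEMPLATE],
    "content-base": [STANDARD_TEMPLATE],
}
print(all_templates("article", bases))
```

The `seen` set is what keeps the cost bounded when several templates share the same bases, which is where the indexing performance concern above is most likely to show up.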
Indexing Datasource Content
I've covered this topic before in relation to the Page Editor and what used to be called Advanced Database Crawler. If you are building any sort of site search, you need the indexed content for a page to reflect the contents of its datasource items as well. I was really hoping this would be built into Sitecore 7 somehow, but alas it is not. However, as we've done this before for Sitecore.Search, we can do it again for Sitecore.ContentSearch. This time, we can take advantage of the fact that datasources are now accounted for in the Links DB. Note that I don't believe this includes search-based datasources, but someone can correct me there if needed. You can find the code and configuration for the computed field below. Once again, you may need to test the performance of this computed field in your environment for very large content databases.
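As a rough illustration of what such a computed field has to do, here is a small Python sketch. Everything in it is hypothetical: `item_text` stands in for the indexable text of each item, and `datasource_links` stands in for what the Links DB would report as a page's datasource references. It only shows the aggregation step, not any Sitecore API.

```python
from typing import Dict, List


def indexed_page_content(
    page_id: str,
    item_text: Dict[str, str],
    datasource_links: Dict[str, List[str]],
) -> str:
    """Combine a page's own text with the text of each datasource item it references."""
    parts = [item_text.get(page_id, "")]
    for datasource_id in datasource_links.get(page_id, []):
        # A referenced datasource may be missing or empty; skip it in that case.
        text = item_text.get(datasource_id)
        if text:
            parts.append(text)
    return " ".join(part for part in parts if part)


# Usage: the home page pulls in content from two datasource items.
item_text = {
    "home": "Welcome",
    "promo": "Spring sale",
    "teaser": "New arrivals",
}
datasource_links = {"home": ["promo", "teaser"]}
print(indexed_page_content("home", item_text, datasource_links))
```

With this shape, a search for text that only lives in a datasource item can still match the page that renders it.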
What Sitecore 7 also makes easier is ensuring that the page is re-indexed if the datasource item is saved. The indexing.getDependencies pipeline allows specifying other items that should be re-indexed. The name of the pipeline (and even the documentation in the config file) is a bit misleading. To me it's more of a "get dependents" pipeline than a "get dependencies" one, as it indicates items that are dependent on the current item. But this is semantics.
Sitecore 7 includes a processor (disabled by default) that finds any items on which the current item is used as a datasource. Unfortunately, on enabling this (7.0 rev. 130918), I got a nasty exception.
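The "get dependents" idea amounts to a reverse lookup: given the item that was just saved, find the pages that use it as a datasource and queue them for re-indexing too. A minimal Python sketch of that lookup follows, assuming a hypothetical `datasource_links` mapping from page ID to the datasource IDs it references; it is not the pipeline processor itself.

```python
from typing import Dict, List


def dependents_of(saved_item_id: str, datasource_links: Dict[str, List[str]]) -> List[str]:
    """Return the IDs of pages that use saved_item_id as a datasource."""
    return sorted(
        page_id
        for page_id, datasource_ids in datasource_links.items()
        if saved_item_id in datasource_ids
    )


def items_to_reindex(saved_item_id: str, datasource_links: Dict[str, List[str]]) -> List[str]:
    """The saved item itself, followed by every page that depends on it."""
    return [saved_item_id] + dependents_of(saved_item_id, datasource_links)


# Usage: saving "promo" should also re-index both pages that render it.
datasource_links = {
    "home": ["promo", "teaser"],
    "landing": ["promo"],
    "about": [],
}
print(items_to_reindex("promo", datasource_links))
```

Scanning every page on each save would be wasteful on a large content tree; a real implementation would ask an existing reverse index for referrers instead, which is the role the Links DB plays here.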
Exception: System.InvalidCastException
Message: Unable to cast object of type 'Sitecore.Data.ItemUri' to type 'Sitecore.ContentSearch.IIndexableUniqueId'.
Source: System.Core
   at System.Linq.Enumerable.<CastIterator>d__b1`1.MoveNext()
   at System.Collections.Generic.List`1.InsertRange(Int32 index, IEnumerable`1 collection)
   at Sitecore.ContentSearch.Pipelines.GetDependencies.GetDatasourceDependencies.Process(GetDependenciesArgs context)
   at (Object , Object )
   at Sitecore.Pipelines.PipelineMethod.Invoke(Object parameters)
   at Sitecore.Pipelines.CoreProcessor.Invoke(Object parameters)
   at Sitecore.Pipelines.CorePipeline.Run(PipelineArgs args)
   at Sitecore.Pipelines.CorePipeline.Run(String pipelineName, PipelineArgs args, String pipelineDomain, Boolean failIfNotExists)
   at Sitecore.Pipelines.CorePipeline.Run(String pipelineName, PipelineArgs args, String pipelineDomain)
   at Sitecore.Pipelines.CorePipeline.Run(String pipelineName, PipelineArgs args)
   at Sitecore.ContentSearch.Pipelines.GetDependencies.GetDependenciesPipeline.GetIndexingDependencies(IIndexable indexable)
   at Sitecore.ContentSearch.Crawler`1.UpdateDependents(IProviderUpdateContext context, T indexable)
   at Sitecore.ContentSearch.SitecoreItemCrawler.DoUpdate(IProviderUpdateContext context, SitecoreIndexableItem indexable)
   at Sitecore.ContentSearch.Crawler`1.Update(IProviderUpdateContext context, IIndexableUniqueId indexableUniqueId, IndexingOptions indexingOptions)
   at Sitecore.ContentSearch.LuceneProvider.LuceneIndex.PerformUpdate(IIndexableUniqueId indexableUniqueId, IndexingOptions indexingOptions)
   at Sitecore.ContentSearch.LuceneProvider.LuceneIndex.Update(IIndexableUniqueId indexableUniqueId)
I've reported the issue to Sitecore, but in the meantime, included below is code and config for a replacement processor. Using it will ensure that when your datasource is updated, the page will be re-indexed as well.
Code below. Happy Sitecoring.