• Sitecore JavaScript Services

    Using the Sitecore JSS Manifest for Content Import

    Posted 03/15/2018 by techphoria414

    So it's been a year for me at Sitecore with nothing posted here; time for something a bit fun (and maybe useful?). One of the amazing things I've had the opportunity to contribute to at Sitecore is the revolutionary Sitecore JavaScript Services. If you don't know what I'm talking about, stop now, go read the docs, spin up one of the sample sites, run an import, then come back.

    One of the powerful aspects of Sitecore JSS is its support for code-first Sitecore development from JavaScript. You can define Sitecore IA (templates, renderings, etc) and content via the Manifest API, then trigger an import via the JSS CLI. The manifest "definition" code generates a JSON-based manifest, which is placed in an Update Package, installed via Sitecore Ship (by default), and then run through the JSS import pipelines.
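As a rough sketch of what definition code looks like: addRouteType/addRoute are the manifest API methods as I recall them, so verify names and signatures against the JSS docs, and the mock at the bottom simply stands in for the object the JSS CLI passes in.

```javascript
// Sketch of a *.sitecore.js manifest definition file. addRouteType/addRoute
// are from the JSS manifest API as I remember it -- check the JSS docs.
// Field type names are plain strings here; the real API provides constants
// (CommonFieldTypes) for this.
function defineConcerts(manifest) {
  // a template for the imported items
  manifest.addRouteType({
    name: 'ConcertRoute',
    fields: [
      { name: 'venue', type: 'Single-Line Text' },
      { name: 'date', type: 'Datetime' },
    ],
  });

  // a content item based on that template
  manifest.addRoute({
    name: 'home',
    template: 'ConcertRoute',
    fields: { venue: 'Red Rocks', date: '2018-03-15' },
  });
}

// Minimal mock standing in for the manifest object the JSS CLI passes in,
// so the shape can be demonstrated without the JSS packages installed.
const registered = { routeTypes: [], routes: [] };
const manifest = {
  addRouteType: (t) => registered.routeTypes.push(t),
  addRoute: (r) => registered.routes.push(r),
};
defineConcerts(manifest);
```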

    For JSS, this approach is useful for front-end developers, who don't need to learn all the ins and outs of Sitecore in order to deploy a fully functional Sitecore site. But recently I was looking for a means to import some test content for another project, and realized its potential as a more general import tool as well.

    Normally a content import (especially a one-time import) is something I would use Sitecore PowerShell Extensions to accomplish. I still love SPE, but for my purposes, the JSS import had some advantages:

    • I could define the templates, in addition to the content items. There's currently no SPE cmdlet for creating templates. (Though this may change based on some Slack discussions since then!)
    • The amount of code required to generate an item is much less.
    • I was planning to import from Setlist.fm's JSON-based REST API. Since the import definition is written in JavaScript, it's very easy to map/translate the data into the object shape expected by JSS.
    • By enabling the JSS full wipe mode, I could easily clean all imported templates and content on every incremental run of the import, as I added more fields, data, types, etc.
    • I've been doing way more JavaScript than PowerShell recently, so it just seemed easier. :)
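On that third point, the map/translate step is just plain JavaScript. A sketch, with a deliberately simplified input -- the real setlist.fm response shape and the JSS-side field names here are illustrative, not exact:

```javascript
// Translate a (simplified, hypothetical) setlist.fm API response into the
// item shape the manifest expects. Real response fields differ; this shows
// the pattern, not the actual schema.
const apiResponse = {
  setlist: [
    {
      eventDate: '15-03-2018',
      venue: { name: 'Red Rocks' },
      sets: { set: [{ song: [{ name: 'Opener' }] }] },
    },
  ],
};

const items = apiResponse.setlist.map((s) => ({
  // derive a Sitecore-friendly item name from the venue
  name: s.venue.name.replace(/\W+/g, '-').toLowerCase(),
  template: 'ConcertRoute', // hypothetical template name
  fields: {
    venue: s.venue.name,
    date: s.eventDate,
    // flatten nested set data into a simple aggregate field
    songCount: s.sets.set.reduce((n, set) => n + set.song.length, 0),
  },
}));
```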

    This approach does have some drawbacks:

    • The import currently expects to run in the context of a JSS app, so a JSS app needs to be configured and a JSS app item will be generated as the parent of the imported content.
    • The import process may create some unnecessary items, e.g. a root for the JSS app dictionary.
    • The JSS manifest does not have the full capabilities of the Sitecore data model. For example, interlinking between items is difficult or impossible (at the time of writing).
    • The packaging and installation process does have some overhead. Installation via Ship became an issue for me due to the size of the Update Package. Once I scaled the amount of data coming in, I started getting HTTP timeouts from Ship, and had to install via the Update Installation Wizard.

    It still worked great for my test data purposes, and I could see other one-time imports benefiting as well. And it's always interesting to find new, novel uses for a tool you helped create.

    How It's Done

    Start with the JSS Quick Start to get one of the basic sample apps up and running. You'll need to follow the steps to get the JSS server and infrastructure components installed and configured. Then, if you like, strip it down: get rid of all the React/Angular code, existing *.sitecore.js manifest definitions, etc. Then create manifest definition files for your templates and your content. See my examples below. Enable full wipe mode if you want a clean import each time. Then use the JSS CLI to generate the manifest, create the update package, and install it.

    jss deploy package --includeContent --noFiles

    The --includeContent flag is important to ensure your content items are imported. The --noFiles flag excludes any JSS code files from the package (we aren't deploying a JSS app, so we don't need them). If you just want to generate the update package, so you can install it via another means, you can execute:

    jss package --includeContent --noFiles

    The Code

    So again, this is an experimental/novel use of JSS, not a typical use case we had in mind when creating the manifest and import. But is it helpful/interesting to you? Have feedback? Let us know in #jss on the Sitecore Community Slack! -Nick

  • Sitecore MVP

    2017 Sitecore MVPs

    Posted 01/31/2017 by techphoria414

    Not much to say other than CONGRATS to all the 2017 MVPs! Happy to see my friend and former colleague Derek Dysart finally join the ranks. If you haven't been listening to Core Sampler, be sure to add it to your podcast app.

    This will be my sixth year as a Sitecore MVP. Thanks to the community for all your support, and thanks to Sitecore for the recognition.



  • Nominate more Women for Sitecore MVP

    Posted 11/01/2016 by techphoria414

    Hey Sitecore community! It’s our favorite time of the year: MVP Nomination! It’s your opportunity to help ensure that all the great contributors to the community are recognized for their efforts. If you know anyone who's not an MVP who you think deserves it, please take the time to fill out a nomination. If you know an existing MVP who has continued to serve the community, please re-nominate them as well to ensure continued MVP status!

    I want to take this chance though to ask you to nominate and re-nominate women of the Sitecore community in particular. The problems of monoculture, the advantages of diversity, and the challenges that technology companies have with gender diversity are well documented. I don’t think it’s a leap to argue that the Sitecore MVP program faces the same challenges. Diversity of people means diversity of ideas, and more female role models will only help to grow and strengthen the Sitecore community as a whole.

    There has certainly been progress since the conception of the MVP program. Cheers to all the talented women of the community who attended this year’s MVP Summit!

    But I believe we can do more. I’d ask everyone in the Sitecore community:

    • Are there women you know in the Sitecore community who should be nominated or re-nominated? Maybe a blog post that helped you out, or someone who helped you on Slack? Take the time and make the effort to bring more diversity into the MVP program.
    • Is everyone in your organization, regardless of gender, being given an opportunity to take the time and do the work needed to become an MVP?
    • Is your organization making efforts to hire more women in technology roles?

    The problem of gender diversity in technology is obviously bigger than our community. But we have a strong, tightknit and inclusive community that I have to believe rivals that of any other similar software platform. Let’s make it even stronger.


  • Rewriting Dates in Sitecore Analytics for Demo Data

    Posted 09/09/2016 by techphoria414

    Hello everyone! It's been a while since I've blogged, so I thought I'd post this useful bit before next week's MVP Summit and Symposium. Two plugs first though!

    Core Sampler

    If you haven't yet, be sure to check out the new Sitecore podcast from Derek Dysart, Core Sampler. Episode 1 is everyone's favorite Sitecore junkie, Mike Reynolds, and Episode 2 will be someone else very familiar...


    The Sitecore MVPs are leading an effort to help the Sitecore Community give back to the people of Louisiana, who are so graciously hosting us next week. Many many people not far from New Orleans have experienced horrific flooding recently, and are facing a long recovery.

    You can contribute to the effort through a financial donation or by volunteering the day before Symposium.

    And with that, let's get on to the post.

    Adjusting Dates in your xDB Demo Data

    In preparation for demoing Active Commerce at Symposium, we are busy creating JMeter scripts which simulate various user behaviors, conversions, traffic sources, etc. We are really hoping to show the power of the Sitecore marketing and analytics tools for e-commerce sites powered by our product. I won't get into the JMeter scripts here -- that's perhaps a topic for another day. The result of running these scripts, though, is a huge number of visits, all on the same day. Not very exemplary of real site traffic. So I set about creating a PowerShell script that allowed us to "rewrite history" and shape the traffic over a given time period.

    The script can be found below. It uses a string of integers to represent the traffic shape, and updates the mongodb analytics data in place. You'll obviously need to rebuild your reporting database (and maybe your analytics indexes??) after running this. It is dependent on the Mdbc (MongoDB Cmdlets for PowerShell) Module.
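The script itself is PowerShell using Mdbc, but the core date-rewriting idea is easy to sketch in JavaScript: a digit string describes the relative traffic weight per day, and each interaction's date gets reassigned to a day in the range, proportionally to those weights. The shape string and function names below are illustrative, not the actual script:

```javascript
// "Rewrite history": spread interactions across days according to a weight string.
// In '1125884211', day 5 ('8') gets 8x the visits of day 1 ('1').
function shapeTraffic(interactionCount, shape, startDate) {
  const weights = shape.split('').map(Number);
  const total = weights.reduce((a, b) => a + b, 0);
  const assignments = [];
  let assigned = 0;
  weights.forEach((weight, day) => {
    // this day's proportional share of the interactions
    const count = Math.round((weight / total) * interactionCount);
    for (let i = 0; i < count && assigned < interactionCount; i++, assigned++) {
      const d = new Date(startDate);
      d.setDate(d.getDate() + day); // shift to this day in the range
      assignments.push(d);
    }
  });
  return assignments;
}

// 100 interactions spread over a 10-day shape starting Sept 1, 2016
const dates = shapeTraffic(100, '1125884211', new Date(2016, 8, 1));
```

The real script then writes each rewritten date back to the corresponding interaction document in MongoDB.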

    If you are looking to generate dummy or demo data for use with Sitecore Analytics, this may be useful for you. Enjoy!

  • Black Art Revisited: Sitecore DataProvider/Import Hybrid with MongoDB

    Posted 09/11/2015 by techphoria414

    tl;dr -- Combine data import and DataProvider approaches for great justice. Find the source on GitHub.

    My Black Art of Sitecore DataProviders article still gets a lot of hits, which to some degree makes sense -- creating a DataProvider hasn't really changed since Sitecore 6.0 (though I very much need to explore pipeline-based item providers in Sitecore 8). But if I've learned anything myself since then, it's that there are actually very limited circumstances in which you want to implement a DataProvider. Why's that?

    • They're difficult to implement
    • It's difficult to invalidate Sitecore caches when data is updated
    • It's difficult to trigger search index updates when data is updated
    • You can't enrich the content or metadata in Sitecore (e.g. Analytics attributes)
    • You are dependent at runtime on the source system
    • They're even more difficult to implement in a way that performs well

    So the fallback is usually an import via the Item API. But I've never really seen anyone happy with this either, especially for large imports. Why's that?

    • Writing data to Sitecore is SLOW. Even with a BulkUpdateContext.
    • Publishing can be very slow as well, though Sitecore 7.2 improved this significantly.
    • Usually you end up using a BulkUpdateContext, which means needing to trigger a search index rebuild, and potentially a links database rebuild, after every import.
    • Scheduled updates mean that it can be hours and hours before a data change in the source system is reflected in Sitecore.

    Is there a way to integrate data with Sitecore that balances the immediacy of a data provider with the simplicity and enrichment ability of an import? Maybe! I'm presenting here a POC that combines the two approaches to try to achieve just that. We do this by introducing an intermediary data store that you may already have in your Sitecore 7.5 or 8 environment -- MongoDB.

    Data Provider / Import Hybrid

    The basic process:

    1. Data is pushed frequently from an external system into MongoDB. Writing data to MongoDB should be quick and easy, so it can be done very often, in theory. Maybe the data is already in MongoDB, in which case you are set.
    2. New items (products in this case) are imported frequently into Sitecore. This can be done quickly and more often, in theory, because we are implementing minimal data -- just creating the item, and populating a field with an external ID.
    3. We use a DataProvider with a simple implementation of GetItemFields to provide field data for the item directly from the MongoDB.
    4. To ensure caches are cleared and indexes are updated when data changes, we monitor the MongoDB oplog, a collection that MongoDB maintains to synchronize data between the members of a replica set.
    5. Content editors can enrich data on the item as needed. Externally managed fields can be denied Field Write to prevent futile edits.

    So, does this work? Glad you asked. I put together a POC and recorded a walkthrough, which you can find below. In the video, I go into more detail on import vs data provider, and some of the potential gotchas of the hybrid approach.

    Again, this is all theoretical. It has not been attempted in a production implementation. But I do think there is potential here, especially given that MongoDB is going to be found in more and more Sitecore environments going forward. Feedback is welcome, as are pull requests. :)

    Full source code can be found on GitHub.

  • Remove a Page from the Sitecore xDB (Sitecore 8 Technical Preview)

    Posted 10/01/2014 by techphoria414

    Note: Information in this post is based on the Sitecore 7.5 and 8 Technical Previews and is subject to change.

    Here's a quick one. I've been doing some JMeter traffic generation on my xDB for a forthcoming post/video on the Path Analyzer, and to push data from session to the xDB quickly, I implemented an "end session" page to hit at the end of each test thread.

    (I will hopefully also get a chance to share my work in JMeter, but in the meantime you should start where I started, with Martina Welander's awesome post.)

    Unfortunately the first time around I forgot to exclude my /EndSession.aspx from analytics, so it was muddying my data. However it was pretty easy to directly remove this from the xDB using a mongodb query.

    Basically, I'm telling mongodb to update documents in the Interactions collection and remove elements from the Pages array where the URL path is /EndSession.aspx. The final "true" argument tells mongodb to update all documents which match the query (the empty first argument in this case), not just the first it finds. For more info on what's going on here, check the mongodb documentation on the update() method.
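A reconstruction of that query from the description above. The Pages and Url.Path field names are from my memory of the xDB interaction schema, so verify against your own collection before running anything:

```javascript
// The mongo shell command, per the description above (field names from memory
// of the Sitecore 7.5/8 xDB interaction schema -- verify before running):
//
//   db.Interactions.update(
//     {},                                                     // match all documents
//     { $pull: { Pages: { "Url.Path": "/EndSession.aspx" } } },
//     false,                                                  // upsert: no
//     true                                                    // multi: all matches
//   );
//
// The $pull semantics, demonstrated on an in-memory interaction document:
const interaction = {
  Pages: [
    { Url: { Path: '/products' } },
    { Url: { Path: '/EndSession.aspx' } },
  ],
};
interaction.Pages = interaction.Pages.filter(
  (p) => p.Url.Path !== '/EndSession.aspx'
);
```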

    After running this, I had to rebuild the sitecore_analytics_index using the Indexing Manager and rebuild the reporting database using /sitecore/admin/RebuildReportingDB.aspx.

    This query was used on a Sitecore 8 xDB but, based on the Sitecore 7.5 xDB Technical Preview, it should work with 7.5 as well.

    - Nick / techphoria414 

  • Active Commerce

    Sitecore 8 Technical Preview: Active Commerce and the Experience Explorer

    Posted 09/26/2014 by techphoria414

    Today Sitecore released a Technical Preview of Sitecore 8 to the MVP community, and like good MVPs we are all scrambling to install the preview and write our first blog post on the beauty of Sitecore 8.

    Note: This article and video are based on a Technical Preview of Sitecore 8. Features and functionality are subject to change.

    Sitecore showed many amazing features of this forthcoming version during the Symposium events in Las Vegas and Barcelona. One "lesser" feature (really only in relation to the other amazing features) is the Experience Explorer. The basic idea is to simulate various visit and visitor segments in order to test personalization and other behaviors. Since Active Commerce utilizes the Sitecore Rule Engine for cart promotions, I was curious as to whether the Experience Explorer could be used to test cart discounts with Active Commerce. The answer was most definitely yes -- check it out below.

    It's worth noting that getting Active Commerce running on this Sitecore 8 Technical Preview took no code changes from the POC which I did on Sitecore 7.5. Neither did getting the Experience Explorer to work with our promotion engine. This is another great example of why we built Active Commerce natively within Sitecore, and why we say Active Commerce is "Sitecore e-commerce done right."

    It's worth noting that there is an Experience Explorer Module available for earlier versions of Sitecore as well.

    Great work Sitecore. Can't wait for Sitecore 8 to go gold.

    - Nick / techphoria414

  • One Month with Sitecore 7.5, Part 6: Extending Report Data via Aggregation

    Posted 08/28/2014 by techphoria414

    In the final part of this series investigating Sitecore 7.5, we’ll look at how the new analytics and reporting structure allows us to extend the processing framework, and create new data in the reporting SQL database.

    By moving collected analytics data to MongoDB, Sitecore solved issues of scalability and extensibility. However it did not help them with the problem of doing reporting on these massive data sets. While MongoDB is a great platform for storing and retrieving documents, relational databases still rule the world of complex queries and data analysis. So rather than eliminate the SQL database from analytics, Sitecore introduced a processing framework that can aggregate data into a new relational data structure which has been optimized for reporting.

    The new reporting database contains a series of fact and dimension tables, which is a common structure utilized by business intelligence and data warehousing tools. In short, a fact is an event, potentially with some sort of measurable data about the event. A fact for example might be a page view (including duration) or a website visit (including the number of pages visited). Facts are structured in a way that should allow easy summation, grouping, etc for reporting purposes. A fact record would contain foreign keys to dimension tables, which would contain data about the people or objects which were involved with the event. This could be the Sitecore item which was visited in a page view, or the contact which visited your site. In essence, dimensions are lookup tables.

    This new Sitecore 7.5 analytics framework also allows you to extend the reporting database with your own fact and dimension tables, and to extend data processing to populate them. You may perhaps want to do some reporting on data you have added to the contact, or on data you are collecting about user interactions via page events.

    In the preview release of Sitecore 7.5 provided to MVPs, the process for creating a custom aggregation is described in the Customization chapter of the xDB Configuration Guide.

    1. Utilize events or other analytics to log the data you wish to aggregate.
    2. Create a new Fact table.
    3. Create model classes for the key and value of your Fact.
    4. Create a new AggregationProcessor and register it in the aggregation pipeline.

    In this example, we are going to create a new fact table with data about what products are added to our users’ shopping carts. Note that the Sitecore documentation is much more thorough in describing this process -- be sure to reference it. Consider this your introduction/overview.

    Use Page Events to Collect Cart Data

    For this POC, I just added the event to the existing Active Commerce shopping cart logic. It’s obviously important here to include any data which you wish to include in your aggregation. You’ll also need to create the event in Sitecore.

    Create a new Fact Table

    The irony of Sitecore introducing a NoSQL database to its architecture in 7.5 is that for the first time, Sitecore is also giving you a reason to create new relational tables in SQL Server. Well, at least it’s ironic to me.

    My new fact table contains information on all the products which users have added to their carts. You are typically going to have an aggregate primary key which contains the columns that define the uniqueness of the event. For our “product added” fact, that will be the product code (unique product identifier), the date of the event, the site the user was browsing, and the contact who added the product to his/her cart. The only aggregated value we are tracking on this fact is the quantity added.

    We’ll also add foreign key constraints to the appropriate dimension tables. Note that Sitecore recommends that you create these constraints to document dependencies with dimension tables, but that you disable them to improve performance.

    Create Model Classes for your Fact

    Our next step is to create model classes for our new fact table, which Sitecore will map for us during the aggregation process. We’ll need a DictionaryKey subclass for our “key,” which contains our primary key fields, and a DictionaryValue subclass for our “value,” which contains the aggregated value(s) for the event. We’ll also create a Fact subclass which combines the two.

    Sitecore seems to do the table and field mapping based on naming, and also seems to handle the obvious type mappings between Guid/uniqueidentifier, DateTime/smalldatetime, string/varchar, long/bigint, etc. The current early-release documentation is incomplete on this subject. The use of the Hash32 type for our site dimension ID, for example, was based on reviewing existing facts and aggregation processors which Sitecore includes in 7.5.

    The constructor for the Fact base class accepts a reduction function which we must provide. This function combines, or aggregates, two values for a given fact key. If we were to process two events which have the same key, this function would be called to aggregate their values before the fact is written to the reporting database. In this example, we simply add the values together, as I suspect would often be the case. Your DictionaryValue subclass is a logical place to create the static function that’s needed here.
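The actual reduction function is C#, but the contract is simple enough to sketch. Names here are illustrative, not Sitecore's:

```javascript
// A fact reduction function: given two values recorded under the same fact key
// (same product, date, site, and contact), combine them into one aggregated value.
// Sketch of the concept only -- the real implementation is a C# static method
// on the DictionaryValue subclass.
function reduceProductAdded(left, right) {
  return { quantity: left.quantity + right.quantity };
}

// Two "product added" events for the same key collapse into one fact row:
const combined = reduceProductAdded({ quantity: 2 }, { quantity: 3 });
```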

    Create a New AggregationProcessor

    Here’s where the real work happens. As you might have expected, aggregation processing happens in a pipeline. When a visit is being processed, it is passed through the interactions pipeline and each processor has the opportunity to perform aggregation for the facts for which it is responsible. The processor itself can examine data in the visit, and “emit” facts.

    What’s interesting here is that you could theoretically call out to other data sources here in constructing your facts -- you aren’t limited to data being processed from the xDB. I’m also curious as to whether the processing API would allow distribution of processing work for other data sources beyond visits, perhaps calling a custom pipeline. But that’s an investigation for another time.

    For our processor here, we need to iterate over the pages in the visit, and look for any shopping cart events. If any are found, we’ll use the Fact API to construct a new fact, and “emit” it with its key and value. Behind the scenes, this will call our aggregation function as needed. The processing API also provides some other utility calls we need, to find or create the site dimension as needed, and to translate the date/time precision of our event as needed. The default precision strategy will “round” the date/time to the minute. This would, in theory, allow you to run and filter reports with minute-by-minute precision.
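The minute-precision idea can be sketched like so: any two events within the same minute map to the same timestamp, and therefore to the same fact key. This is a sketch only, not Sitecore's implementation:

```javascript
// Round a timestamp down to minute precision, mirroring the default date/time
// precision strategy described above (illustrative, not Sitecore's code).
function toMinutePrecision(date) {
  const d = new Date(date);
  d.setSeconds(0, 0); // zero out seconds and milliseconds
  return d;
}

// Two events 39 seconds apart within the same minute collapse to one key:
const a = toMinutePrecision(new Date('2014-08-28T10:15:42Z'));
const b = toMinutePrecision(new Date('2014-08-28T10:15:03Z'));
```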

    Finally, we’ll need to patch in this new processor to our Sitecore config. Note that there appears to be some new grouping available in the pipeline configuration now. As the number of pipelines in Sitecore continues to balloon, this totally makes sense. Perhaps Sitecore will shed more light on this new structure as 7.5 comes closer to release.

    Rebuild and Test

    To test our new processor, we need to rebuild our analytics data. To facilitate rebuilding of analytics data, Sitecore actually requires that you have two reporting databases, so that one can still be available for reporting, while the other is rebuilding. These are simply configured as the reporting and reporting.secondary connection strings. Testing of the rebuild can then be done through a new administrative screen, /sitecore/admin/RebuildReportingDB.aspx.

    Click “Start” and Sitecore will begin to process, and update you on progress as it goes.

    If you have a lot of data, rebuilding could obviously take some time. On large sites which have collected a lot of data, it may be necessary to keep a reduced data set around for testing purposes. Otherwise the debugging cycles for new aggregations could become very long and arduous.

    Once processing is completed, aggregated data should appear in your fact table.


    Now that we have this additional data available, how do we best report on it? One option I imagine would be creating some cool new SPEAK-based reporting UIs. I am not experienced enough with the framework yet myself to say, but it seems like it would be easy enough to wire up some SPEAK charting components along with a SQL-based data source to create your own reports. But that will be a post for another day, perhaps by someone else!

    I did want to attempt to push my data into a Stimulsoft report (Engagement Analytics) as well, which seems like it would be easier. But at the moment I’m getting an error when attempting to access report items in the Content Editor. And thus I am bailed out by beta software. But the point is -- you have some options for creating reports based on your new data.

    That’s it!

    And that brings us to the end of our series on Sitecore 7.5. This release of Sitecore truly brings the infrastructure and architecture of DMS to the next level. As always, it will be exciting to see what partners and customers do with the framework. We at Active Commerce are very much looking forward to using the framework to bring new functionality and great new data to our customers.

  • One Month with Sitecore 7.5, Part 5: Persisting Contact Data to xDB

    Posted 08/28/2014 by techphoria414

    With its flexible schema and scalable architecture, the xDB immediately becomes an attractive option in Sitecore 7.5 for storing all sorts of user-centric data, particularly anything you are interested in utilizing for reporting purposes. Developers who have worked with the .NET MongoDB Driver know how easy it is to persist any object data to the database. However, for good reason, your access to xDB is a bit more abstracted than this. You do, however, have three options for persisting contact data to the xDB.

    I myself only implemented one of the options below in my search for a means to persist shopping cart data in a POC for Active Commerce. But I’ve provided an overview of all three options.

    The submitContact Pipeline

    We saw in Part 3 of this series how data can be associated with the current contact via the Contact.Attachments dictionary.

    Though very useful, data in the Attachments collection is not persisted with the contact when the session is flushed. However, you could tap into the submitContact pipeline by creating your own SubmitContactProcessor, and persist the data to your own collection in MongoDB.

    As for how you persist, and how you load that data later, you’re a bit on your own. There is no corresponding loadContact pipeline at this time, and your best option for persistence appears to be accessing MongoDB directly via Sitecore.Analytics.Data.DataAccess.MongoDb.MongoDbDriver. You could then potentially access that data via an extension method on your Contact. Not ideal, and I’m not sure whether this would work with xDB Cloud.

    This did not seem to be the ideal option for me. I wanted something more straightforward which worked within the existing xDB data structures.


    This structure on the contact seems to allow storing of simple name/value string pairs that are persisted and loaded with the contact data. This is a step forward, but for a shopping cart, I needed something that could handle a complex object.

    Contact Facets

    Not to be confused with search facets, contact facets allow you to define entirely new model classes that can be stored with the contact, and accessed via Contact.GetFacet<T>(string). Here we have an option which allows us to store complex data with the contact, without having to worry about persisting the data ourselves. Sitecore 7.5 includes a number of contact facets, which can be utilized to store additional information about the contact. This data appears to help fill out the Experience Profile report.

    The facets are configured in a new /sitecore/model configuration element, which defines various data model interfaces and their implementations, and associates them to entities (a contact in this case) with a given name.

    For example, to fill in a contact’s first/last name, we can use the Personal facet.

    Implementing your own facet requires a few steps, but is not difficult. The steps below include my POC for persisting shopping cart data.

    1. Create an interface for your facet which inherits IFacet. Add your desired fields.
    2. Create an implementation which inherits Facet. Use base methods to “ensure,” “get,” and “set” member values.
    3. For composite object structures, create an IElement and Element following the same pattern.
    4. Register your element in the /sitecore/model/elements configuration.
    5. Register the facet in the /sitecore/model/entities/contact/facets configuration.
    6. Access the facet via Contact.GetFacet<T>(string).

    After the contact’s session is flushed, you can very plainly see your new data persisted with the Contact. Nice!!

    MongoDB Facet Data
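For illustration, a contact document with a persisted cart facet might look roughly like this in the Contacts collection. The shape and names are my assumption based on the facet registration name, not an exact dump of the 7.5 schema:

```javascript
// Hypothetical contact document after session flush: the facet's data is
// stored under the name it was registered with in /sitecore/model.
// All field names below are illustrative assumptions.
const contactDocument = {
  _id: 'contact-guid-here', // placeholder for the contact id
  ShoppingCart: {           // our custom facet, keyed by registration name
    Lines: [
      { ProductCode: 'AC-1001', Quantity: 2 },
    ],
  },
};
```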

    Facets are an easy and powerful means of persisting contact data to xDB.

    That’s it for Part 5! In the last part of this series, we’ll look at another new extension point available in Sitecore 7.5, data aggregation.

  • One Month with Sitecore 7.5, Part 4: Scalability Options, New and Old

    Posted 07/09/2014 by techphoria414

    It’s actually going on two months with Sitecore 7.5 at this point, and I’ve obviously gotten myself in over my head with this blog series, but I’m battling on. Here in Part 4, we’ll take a look at deployment options for Sitecore 7.5, both new and existing.

    A more in-depth review of existing Sitecore hosting architecture considerations can be found on Aware Web’s blog.

    Minimal Deployment

    Despite the additional complexity Sitecore has added to the DMS architecture in 7.5, it's still possible to run full Sitecore functionality in a single instance of the software. This obviously simplifies things for developers, and means that even small deployments can take advantage of the xDB. Running MongoDB is still necessary, however, unless you are utilizing the xDB Cloud Service together with Sitecore's "enhanced" SQL sessions (which support Session_End).

    Scaling via Server Role

    In 7.5, Sitecore continues to provide new options for offloading server roles onto dedicated hardware (or virtual hardware). This does allow you to vertically scale (in the traditional use of the term) individual servers according to the needs of the service they are running. Let's review all of our available options....

    Content Management and Content Delivery

    The longstanding scaling option in Sitecore, splitting your Content Management (CM) and Content Delivery (CD) servers is typically your first step in growing a Sitecore environment. This is sometimes called "content staging" and can be done for performance, availability, and security reasons.


    Publishing

    This has essentially been an option since Sitecore 6.3 (someone please correct me if I'm wrong on that), but since the publishing process was limited to a single thread, there was not much benefit to splitting publishing responsibility from your CM server. Sitecore 7.2 changed this however, and multi-threaded publishing can now easily consume CPU resources on a multi-core server. Large instances with many items and frequent content changes will benefit greatly from a dedicated publishing server.

    Content Search

    Using either the Solr or Coveo providers for Sitecore.ContentSearch, it’s possible to offload content indexing and searching onto a dedicated search server. Useful for deployments which are highly dependent on content search, or which need to handle searching of massive amounts of content.

    SQL Server

    Even the most minimal of Sitecore installations should likely have an independent SQL Server. In addition to the standard Sitecore databases (core, master, web), in 7.5 this would house the reporting database and potentially a session database. Depending on the amount of analytics data, you may want to take this a step further by splitting your reporting into a separate SQL Server. This is a must if your hosting architecture is geo-distributed, since you will need a central reporting database in which to aggregate data from your data centers.



    MongoDB

    Unless you are using xDB Cloud (which we’ll discuss shortly) with SQL Server for your sessions, you will need to run MongoDB to use analytics in 7.5. For low traffic sites, it may be possible to run it on the same hardware as SQL Server, but given the low cost of virtualized Linux servers, adding a dedicated MongoDB install is an easy scaling option. You might also consider separating the MongoDB session database onto its own server -- again, a must if you have a geo-distributed architecture.
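    To sketch what this looks like in practice, the xDB collection and tracking databases are wired up via MongoDB connection strings. The connection string names below are the 7.5 defaults as I recall them; hosts and database names are placeholders, with the session database split onto a second server as discussed above:

```xml
<connectionStrings>
  <!-- xDB collection and tracking databases on a dedicated MongoDB server -->
  <add name="analytics" connectionString="mongodb://mongo01:27017/sitecore_analytics" />
  <add name="tracking.live" connectionString="mongodb://mongo01:27017/sitecore_tracking_live" />
  <add name="tracking.history" connectionString="mongodb://mongo01:27017/sitecore_tracking_history" />
  <add name="tracking.contact" connectionString="mongodb://mongo01:27017/sitecore_tracking_contact" />
  <!-- Session state on its own MongoDB server, per the note above -->
  <add name="session" connectionString="mongodb://mongo02:27017/sitecore_session" />
</connectionStrings>
```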

    Processing / Data Aggregation

    For high traffic sites, splitting off processing and data aggregation responsibilities to separate hardware will decrease impact on content authoring, and speed up data processing. Another option might be to dual-purpose your CD servers, and take advantage of distributed processing using your existing cluster.

    Reporting Services

    High traffic sites, or those making heavy use of reporting, may also benefit from splitting responsibility for the Reporting Services, which run queries and combine data from the collection and reporting databases in order to support reporting UIs such as the Executive Insight Dashboard and Engagement Analytics.

    Scaling by Server Role

    Scaling via Clustering

    Many aspects of the Sitecore deployment architecture can also “scale out,” so that as your needs increase, you can add additional servers for load balancing and failover.

    Content Delivery

    Almost always your first need for scaling out. Adding additional content delivery servers is the primary mechanism by which you can improve your site’s performance and availability. With the publishing improvements in 7.2, and the new analytics architecture in 7.5, it’s now also much more practical to deploy a geo-distributed architecture, with multiple clusters of content delivery servers. By utilizing the SQL Server or MongoDB session state providers in Sitecore 7.5, it’s also possible to implement non-sticky load balancing within each webfarm, which gives a better load distribution between the servers, and increases the reliability / failover capability of your webfarm.
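    Non-sticky load balancing hinges on a shared, out-of-process session store. With the MongoDB session provider, the web.config sessionState element would look something like the following sketch; the provider type name is from the 7.5 session provider assembly as I recall it, and the attribute values are illustrative, so verify both against your install:

```xml
<sessionState mode="Custom" customProvider="mongo" timeout="20">
  <providers>
    <!-- A shared session store lets any CD server in the farm handle any request -->
    <add name="mongo"
         type="Sitecore.SessionProvider.MongoDB.MongoSessionStateProvider, Sitecore.SessionProvider.MongoDB"
         connectionStringName="session"
         pollingInterval="2"
         compression="true"
         sessionType="private" />
  </providers>
</sessionState>
```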

    Content Management

    For organizations with a large number of content authors, adding additional content management servers allows scaling of the authoring environment. Since you can only have a single master database, all content management servers must be local to the master database.


    MongoDB

    One of the primary benefits of MongoDB over SQL Server and other relational databases is its ability to scale horizontally. Sharding allows MongoDB to distribute data across multiple servers using a shard key. Replica sets mirror data between servers for failover. Combining the two gives you a sharded cluster. This allows MongoDB to scale horizontally for huge data sets, on low cost Linux servers. This is not without complexity however. See the MongoDB guide to Sharded Cluster Architectures.
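    For illustration only, enabling sharding for a collection database from the mongo shell looks roughly like this. The database name, collection name, and shard key here are hypothetical -- Sitecore does not prescribe a shard key, so you would choose one based on your own data distribution:

```javascript
// Run against a mongos router in an already-deployed sharded cluster
sh.enableSharding("sitecore_analytics");    // allow this database to be sharded

// Distribute a (hypothetical) Interactions collection across the shards,
// using a hashed shard key for even write distribution
sh.shardCollection("sitecore_analytics.Interactions", { _id: "hashed" });
```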

    Processing Servers

    The new Sitecore 7.5 data processing/aggregation services utilize worker processes which read items from a processing pool and feed them to “aggregators,” which process the data for reporting purposes. This makes it possible for deployments which are processing huge amounts of data to run multiple processing servers, all of which can process data from the Collection database and aggregate it to the Reporting database.

    Between versions 7 (content search), 7.2 (publishing), and 7.5 (analytics), Sitecore has made many improvements to the software's ability to scale to massive amounts of content and analytics data. But with these improvements has come additional deployment complexity. To mitigate this, and make it possible for any Sitecore deployment to easily take advantage of the xDB, Sitecore plans to offer xDB Cloud.

    Scaling by Cluster

    xDB Cloud Service

    Complementing the release of Sitecore 7.5, xDB Cloud will allow you to take full advantage of the Experience Database and all its reporting without having to collect, store, or process any analytics data locally. Because the service is 100% managed by Sitecore, you can avoid running MongoDB in your environment entirely -- provided you are using SQL Server for session data.

    Sitecore plans to offer xDB Cloud at a “low” cost based on the number of contacts stored in the xDB. They will also be offering non-production access at a lower cost (with a corresponding lower SLA) for use in development and testing.

    What’s most exciting about this new offering is that Sitecore is removing barriers for all their customers to better utilize the Digital Marketing System. Large customers with huge amounts of data can take advantage of the new, highly scalable architecture. Smaller and larger customers alike also have the option of outsourcing their analytics infrastructure to Sitecore with a service that they know will grow with their data needs.

    Configuration and Use

    Sitecore was kind enough to grant me access to a preview of xDB Cloud for evaluation with Active Commerce. After requesting an instance from Sitecore, a step which will be replaced with an easy App Center purchase in the future, I was given a Deployment ID. I enabled the Sitecore.Cloud.Xdb.config, filled in my Deployment ID within this file, and… that was it. I admittedly did not spend a lot of time testing this service, but it certainly appears that they have achieved their goal of making xDB easy to use via this SaaS offering.


    One of the most exciting aspects of xDB, which I will be investigating in the last part of this series, is the extension of reporting data. The data aggregation and reporting architecture in 7.5 makes it possible to extend the data collection and analysis performed by the DMS. However, in the initial release, it will not be possible to extend data aggregation in the xDB Cloud offering. Sitecore does plan on addressing this in a future release, and all appearances are that it is a high priority item for them.

    MongoDB Cloud Hosting

    This article was going to end here, but I was inspired to try one last experiment with Sitecore 7.5 this morning -- utilizing ObjectRocket’s hosted MongoDB service for the collection database and the other Sitecore 7.5 MongoDB databases. ObjectRocket, owned by Rackspace, has a very nice, scalable cloud offering for MongoDB. They have several data centers which are local to both AWS and Rackspace data centers, making it a very attractive option particularly for those who are already hosting with Amazon or Rackspace.

    I created all the needed MongoDB databases in an ObjectRocket instance and configured my connection strings to utilize them. This included a MongoDB session database. In a production scenario, this may or may not be practical, depending on your latency to ObjectRocket, as the session database would presumably be more sensitive to latency. But with just myself as a single user browsing my development site, there did not appear to be any performance degradation. Even the Experience Profile report, which utilizes the collection database, seemed to perform reasonably.
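    Pointing Sitecore at a hosted MongoDB instance is just a matter of swapping the connection string hosts for the provider's authenticated endpoints. The hostname, port, and credentials below are made up for illustration:

```xml
<connectionStrings>
  <!-- Hosted MongoDB: authenticated URI with the provider-assigned host and port -->
  <add name="analytics"
       connectionString="mongodb://sc_user:***@example-host.objectrocket.com:12345/sitecore_analytics" />
  <add name="session"
       connectionString="mongodb://sc_user:***@example-host.objectrocket.com:12345/sitecore_session" />
</connectionStrings>
```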

    Using a service such as ObjectRocket with Sitecore 7.5 would potentially give you the ability to scale to massive amounts of data, without maintaining your own MongoDB infrastructure. However unlike Sitecore’s xDB Cloud offering, you would still have the complexity of maintaining your own processing and reporting services. And you would certainly want to do additional performance testing of both the site and data aggregation!

    That’s it for Sitecore 7.5 deployment architectures. In Part 5 of this series, we’ll get back into some code, and look at how you can extend the Contact data which is persisted to the xDB.

