The SharePoint Structure

Have you ever thought about the way a SharePoint site is built? This section explains the key structural pieces of a SharePoint installation. SharePoint sites often grow organically, so it helps to know the basic building blocks of an installation’s structure; the section “Comparing a SharePoint Web Application to a Tree,” later in the chapter, introduces them. Table 2-1 lists the nine main SharePoint structural elements of interest.

Table 2-1. The Definition of Nine Main SharePoint Structural Elements

Structure element

Definition

Farm

The term farm is used in two main ways. The farm can be considered to be the physical computers and the software required to be running on them to host SharePoint. In addition, each farm’s settings are held in a unique Configuration Database stored inside SQL Server.

Service application

A service application runs within a farm to provide capability to the sites hosted on it or another farm.

Content database

The majority of the information added to a SharePoint Site is stored in a content database.

Web application

A SharePoint site is accessed through a web application that provides the address and authentication configuration among other configuration properties. A web application must have a corresponding website in Microsoft’s web server, Internet Information Services (IIS).

Site collection

One or more sites grouped together form a Site Collection. Sites are organized hierarchically in Site Collections, and some configuration settings and administrative actions applied to a Site Collection affect every site in the group.

Site

A site is a logical grouping of content within SharePoint. Each Site Collection has a root site, the main point of entry.

Document library

Document libraries are historically the most used element of a SharePoint site. Library settings control visibility and content types among other critical configuration.

List

The majority of all content in a SharePoint Site is held in a list. Sites have many lists of many types. Even a document library is a list, but a very specialized kind of list with tools designed to work best with documents.

Webpage

In SharePoint 2010, webpages take on new importance for SharePoint sites. Every webpage in SharePoint 2010 is a wiki page with rich text editing capability. Each web page you create from the browser is stored in a document library.

Comparing a SharePoint Web Application to a Tree

Think of a Web Application as a tree. Each trunk is a Site Collection, with the first site in the collection coming from the same set of roots as the other trunks. The branches are like sites branching off the first site and the leaves are list items and documents.

Some web applications are like a pomegranate tree, which can have more than one trunk in the same tree; SharePoint Server 2010 (a product built on SharePoint Foundation 2010) has a good example of this type of tree, the web application configured to be the My Site Host. Each individual’s My Site is itself a Site Collection; in an organization with 80,000 users you would end up with 80,000 Site Collections sprouting out of the same web application root base.

The base address of the application, my.litware.com, for example, redirects to the current authenticated user’s personal Site Collection, which is located at http://my.litware.com/personal/<username>. In this example, you could browse to any other My Site public profile by entering my.litware.com/personal/ followed by the other user’s name (if you know it).
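
The address pattern above can be sketched in a few lines of Python. This is only an illustration of how a personal Site Collection address is composed from the host and the user name; the function name my_site_url and the user name jsmith are hypothetical, and a real farm may transform account names before they appear in the address.

```python
from urllib.parse import quote

# Example My Site Host address taken from the text.
MY_SITE_HOST = "http://my.litware.com"


def my_site_url(username: str, host: str = MY_SITE_HOST) -> str:
    """Compose the address of a user's personal Site Collection."""
    # Percent-encode the user name so it is safe to place in a URL path.
    return f"{host}/personal/{quote(username, safe='')}"


# "jsmith" is a made-up user name used only for illustration.
print(my_site_url("jsmith"))
```

Every My Site shares the same host and the same personal path; only the last segment changes, which is why thousands of Site Collections can sit under one web application.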

It might help to picture the SharePoint components of a public website like a tree with many trunks and even more main branches. Figure 2-2 illustrates how the web addresses of such a site might map to the tree picture.

Figure 2-2. The SharePoint pomegranate tree.

Many web applications look more like a pecan tree. The pecan tree has one trunk, the Site Collection, with a few thick, strong branches off it that support other branches and lots of leaves (not to mention tasty nuts in the fall…). Figure 2-3 illustrates how the web addresses of this type of site might map to a single-trunk tree.

A classic intranet publishing portal matches this version of the metaphor: http://portal.contoso.com is the address of the web application and the first site of the Site Collection. The entry page for this web application is at the address http://portal.contoso.com/pages/welcome.aspx. Human Resources might have a main branch site at http://portal.contoso.com/hr. Benefits information might be stored in a leaf document library at http://portal.contoso.com/Locations/Lists/Benefits. The webpage about medical benefits might be at http://portal.contoso.com/Locations/Lists/Benefits/Medical.aspx, with a link to the provider’s benefit statement at http://portal.contoso.com/Locations/Lists/Benefits/ProviderStatement.pdf.

Figure 2-3. The SharePoint pecan tree.

The SharePoint Farm Supports the Web Applications

The SharePoint farm is the set of servers hosting all of the sites and the supporting services they need. A farm can have as few as one server, which would host the entire infrastructure needed for a small organization.

SharePoint is very scalable. A farm supporting higher user demand would benefit from a large amount of server resources. Such a farm could be much more like an industrial nut orchard that provides the benefits of its fruit to large numbers of people.

Some service applications provided to the farm are analogous to the water and fertilizer that are applied to an entire orchard. Other service applications are applied with more discretion, similar to spraying insecticide at the site of an infestation. Business Data Connectivity (BDC) is a service that can be applied across all the web applications and the content in their Site Collections. You can configure a connection to a business data source (such as Microsoft CRM) once and provide that data to all the web applications in a farm. In the preceding examples, a workflow that begins when a new client is added in CRM might add a task list item. Because BDC is a shared service, you can make use of the data it provides on a user’s My Site or in a department’s Team Site from the same BDC source.

The database is important in planning large implementations of SharePoint. You might or might not know about the relationship between SharePoint and the database. The designers of SharePoint chose to leverage the power of the entire Microsoft platform stack. One place mature technology was exploited is the storage of items added to or created in SharePoint. By taking advantage of the efficient, secure, and reliable platform provided by Microsoft SQL Server, all of those benefits are passed on to the users and administrators of SharePoint sites.

All of the content in a SharePoint 2010 farm is stored in one or more databases on one or more servers running SQL Server. When SharePoint use really takes off in a large organization, it becomes very important to understand the relationship between the items discussed above and a content database.

The relationship of the content database to the web application and Site Collection is explained as follows: one content database holds content from one or more Site Collections of one web application. In the example web applications presented earlier, at least two content databases would be required to hold the two web applications for the My Sites and the intranet. Further, the contents of a single Site Collection must be stored together in the same database; however, one content database can hold the content of multiple Site Collections. A web application can spread out the storage of multiple Site Collections across multiple content databases.
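
The containment rules in the preceding paragraph can be sketched as a small data model. This is a minimal illustration, not SharePoint’s object model: the class names, the database names (Portal_Content, MySites_Content_1, MySites_Content_2), and the user addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentDatabase:
    name: str
    # Addresses of the Site Collections stored in this database.
    site_collections: List[str] = field(default_factory=list)


@dataclass
class WebApplication:
    url: str
    # A content database belongs to exactly one web application.
    content_databases: List[ContentDatabase] = field(default_factory=list)

    def database_for(self, site_collection_url: str) -> Optional[ContentDatabase]:
        """Return the content database holding a Site Collection, if any."""
        for database in self.content_databases:
            if site_collection_url in database.site_collections:
                return database
        return None

    def add_site_collection(self, site_collection_url: str, database_name: str) -> None:
        """Store a new Site Collection in exactly one content database."""
        if self.database_for(site_collection_url) is not None:
            raise ValueError(f"{site_collection_url} already has a content database")
        for database in self.content_databases:
            if database.name == database_name:
                database.site_collections.append(site_collection_url)
                return
        raise KeyError(f"{database_name} does not belong to {self.url}")


# Two web applications need at least two content databases between them.
portal = WebApplication("http://portal.contoso.com", [ContentDatabase("Portal_Content")])
my_sites = WebApplication(
    "http://my.litware.com",
    [ContentDatabase("MySites_Content_1"), ContentDatabase("MySites_Content_2")],
)

portal.add_site_collection("http://portal.contoso.com", "Portal_Content")
# One web application spreads its Site Collections across two databases.
my_sites.add_site_collection("http://my.litware.com/personal/user1", "MySites_Content_1")
my_sites.add_site_collection("http://my.litware.com/personal/user2", "MySites_Content_2")

print(my_sites.database_for("http://my.litware.com/personal/user2").name)
```

The model refuses to place the same Site Collection in a second database, which mirrors the rule that the contents of a single Site Collection must be stored together.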

The tree metaphor is helpful to get across many important administrative concepts of SharePoint. Visualizing the structure of your SharePoint environment can be helpful in decisions about content upload, creation of webpages, and new site creation. You will be well on your way toward understanding the structure if you can keep in mind the relationship between the first five main structural elements described previously. The farm, service application, content database, web application, and Site Collection are critical concepts for the intermediate to advanced user of SharePoint who wants to speak to IT professionals about the supporting structures of her SharePoint site or sites.

The Content Database as a Unit of Storage

Of all the SharePoint structural concepts introduced so far, the content database might be the most important point to understand toward achieving a very successful SharePoint implementation. If the users in your organization begin to depend on SharePoint for hosting all of their critical files, lists, and webpages, the amount of storage used can grow dramatically. You can take your understanding of the tree metaphor, add your understanding of content databases, and apply it to an example where quick storage growth becomes a challenge for performance and stability of the SharePoint implementation. You will also be able to see how the same elements can explain the solution.

Let’s go back to the intranet portal example and assume that the entire organizational structure is represented in the site structure, as the Human Resources department is. A common business structure might also have sites for Sales, Marketing, and Operations. Teams under those groups might also receive sites below their parent group site. As more and more sites are created, with more and more members of the organization creating and uploading documents, pictures, list items, and webpages, the one content database for this one tree in the orchard is storing all the content.

This implementation is a classic example of three issues common in successful SharePoint implementations: 1) disorganization of information; 2) delays backing up and restoring the content; and 3) deteriorating performance of the entire web application. All of these issues develop gradually, so a good approach is to monitor growth and change, and to plan for reorganization or new hardware purchases ahead of time.

Using a Content Database as a Unit of Backup and Restoration

First, consider the case of backing up and restoring the content of your intranet in this example. A successful SharePoint implementation with a SharePoint structure like this can result in a database measured in terabytes of storage space. If you’ve ever tried to back up a 200 GB hard disk, you understand the amount of time it requires to save your important information.

The amount of time it takes to back up data affects two critical restore parameters. The first is how often data can be backed up; if it takes eight hours to back up the content database and you only run one backup at a time, your SharePoint content will only be safely backed up once during a business day. For certain tasks at some organizations, it is acceptable to lose a day’s worth of information; for others, losing even a minute of information could be trouble.

The second restore parameter affected by large content databases is the amount of time it takes to restore. Again, it is up to the task and the organization to determine how long is too long. However, in some situations, waiting a day for SharePoint to be restored after a disaster or an unexpected failure is just too long.
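
The arithmetic behind both parameters is simple enough to sketch. The following is a rough estimate only, assuming a steady transfer rate; the throughput figure of 50 MB per second and the 1 TB database size are assumptions chosen for illustration, and real backups are affected by many other factors.

```python
def backup_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate hours needed to copy a database at a steady transfer rate."""
    seconds = size_gb * 1024 / throughput_mb_per_s
    return seconds / 3600


def backups_per_window(window_hours: float, backup_duration_hours: float) -> int:
    """Count how many back-to-back backups fit in a window, one at a time."""
    return int(window_hours // backup_duration_hours)


# The example from the text: an eight-hour backup fits once in an
# eight-hour business day.
print(backups_per_window(8, 8))

# Assumed figures: a 1 TB content database copied at 50 MB per second.
print(round(backup_hours(1024, 50), 1), "hours")
```

Halving the size of the database, or doubling the transfer rate, halves the estimate, which is why both reorganization and hardware are discussed later as ways to shorten backup and restore times.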

Tip

INSIDE OUT Your backups might be interrupting SharePoint operations

You might not care for the full detail of the backup operation, but as a site user and business influencer, some backup details are important to pay attention to. Certain types of backup operations can diminish your SharePoint experience. For example, the method called the Site Collection Backup requires that the Site Collection is placed in a Read-Only mode for the duration of the backup. Other types of backups are resource intensive and might cause competition with user operations. Both issues can be mitigated by scheduling backup tasks during off hours if you do not run a 24-hour operation. Awareness of how and when your backups are running can help you to understand the performance implications for your users.

Managing Content Database Size for Performance

Next, consider the case of deteriorating performance in your web application. In a really large content database, there are probably a few ways performance can be affected. One of the biggest performance issues in large SharePoint 2007 implementations is that of the large list. SharePoint 2010 has been improved in many ways in how it deals with large lists. The effect of a large list on the performance of a web application has been reduced, but there is still a chance that you might be affected by it.

The detail of why large lists impact performance is mostly uninteresting to the average SharePoint user. The highlight is that a database can be locked in certain situations and one of these can occur with large lists. A lock occurs on the database level when more than 5,000 items in a list are queried at one time. During this lock, all other reads and writes for all lists in all Site Collections are queued until the previous transaction is complete and the lock is released. When a database is locked, everyone accessing the database experiences a delay in receiving results. If this lock situation happens in the one database holding all the content for your one Site Collection holding all the sites for every group in your organization, many people will be unhappy.
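
One general way to stay under such a limit is to work through a large set of items in smaller batches. The sketch below shows only the batching idea in plain Python; it does not call any SharePoint API, the limit constant reflects the 5,000-item figure given above, and the default batch size of 2,000 is an arbitrary assumption.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Figure taken from the text: querying more than this many items at once
# can trigger a lock.
LIST_QUERY_LIMIT = 5000


def in_batches(items: Sequence[T], batch_size: int = 2000) -> Iterator[List[T]]:
    """Yield consecutive slices of items, each no larger than batch_size."""
    if not 0 < batch_size <= LIST_QUERY_LIMIT:
        raise ValueError(f"batch_size must be between 1 and {LIST_QUERY_LIMIT}")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Stand-in for 12,000 list item IDs; a real list would be read through
# whatever query interface your environment provides.
item_ids = list(range(12000))
for batch in in_batches(item_ids):
    print(len(batch))
```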

Note

If you or someone in your organization is interested in the full technical details of database locking in SharePoint, read the White Paper, “Working with large lists in Office SharePoint Server 2007,” which you can download at http://technet.microsoft.com/en-us/library/cc262813(office.12).aspx. Although written for 2007, the database and list fundamentals still apply in the 2010 version of the product. Many of the SharePoint Customization issues detailed in the White Paper have been addressed, but the underlying principle of locking is still possible (and well explained by the paper).

Organizing for Content Database Growth

Finally, consider the case of disorganization of information. This one is probably not hard to imagine if you’ve been using a computer for many years and you’ve ever lost a file on your hard disk or a file share. Over time, file storage tends to become filled with documents that are rarely accessed and out of date. The same can happen to any kind of content in a SharePoint site. Remember that a SharePoint site is intended to be dynamic; therefore, you need to plan accordingly. Identify the areas that are important to the users of your site. Plan to repeatedly highlight timely, relevant information. Collect feedback from your users on organization and usefulness to ensure that your growing site meets not just your needs, but the needs of your collaborators, as well.

The best case scenario for your organization is to plan ahead of time and accommodate growth where anticipated. SharePoint sites intended for team collaboration and document sharing tend to grow in database size over time. Document sharing sites are popular, but they can often be isolated into Site Collections by audience. Site collections are natural security and audience boundaries. Interaction within a legal team, for example, deserves this type of isolation for the sensitivity of the information alone. However, other teams can also follow the model, whereby internally important information is contributed on one Site Collection and more generally interesting information is uploaded to a shared portal. Again, if we look to the SharePoint Product Team’s example of building out My Sites in SharePoint Server, we see this model in its extreme. A My Site gives a user a place to upload content and control access in his own Site Collection. If the team designing SharePoint puts that architecture forward as a model, you can feel safe in assigning small to medium sized project teams similar workspaces that they can control.

Creating Site Collections for unique audiences reduces the amount of content in each Site Collection. The added benefit beyond security is added mobility of the content within the content database. Individual Site Collections can be moved between content databases with more flexibility than sites or lists. Also, storage size quotas can be placed on Site Collections but not sites or lists. If the size of a content database becomes an operational issue, the ability to move content to reduce the size of existing databases becomes a big benefit.

If you find yourself in the situation where too much content has been added too quickly, there is good news. Others have been through this before and strategies have been developed to overcome all three of these issues related to the inevitable growth in a successful SharePoint implementation. For example, if you find that you want to reduce your backup or restore time, there are two possible paths. The first is hardware: relying on IT professional analysis if necessary, identify whether the current read or write speed for your content database storage, your backup storage, or the network in between can be increased by the purchase of new hardware or optimization of the current hardware.

At the same time, use your understanding of the SharePoint structure to look at how you might reorganize your sites to meet your goals. If all of your sites are currently held in one Site Collection and, consequently, one content database, you have an opportunity to create a new Site Collection in a new content database where existing sites can be moved. However, moving is always stressful and sometimes items become lost in the move. Professional tools and consultants will help you move more quickly with less loss, but there is always a cost associated with that kind of help. The best strategy is to be proactive and move early. If you identify the level of service you need from SharePoint and you can estimate when potential support milestones will be hit, you can build a roadmap for potential changes ahead.

The time to back up or restore is a common service level requirement for electronic information. When you leverage the Site Collection as a mobile unit of SharePoint content, you can arrange your content databases to accomplish your service level requirements. In a common method of backup, the content database is the container that is backed up or restored. Moving your most critical Site Collections to a new content database will allow your IT professionals to back up and restore those sites more quickly. If you consider this type of reorganization, keep in mind that you can host multiple Site Collections under one web application. In that way, you can maintain one base web address for multiple Site Collections and reduce the backup and restore time of your most critical information.
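
A planning exercise like this can start as a simple tally. The sketch below groups Site Collections by whether they are critical and totals the size each content database would hold. The addresses under /sites/, the sizes, and the critical flags are invented for illustration, and the code only plans the split; it does not move anything.

```python
from typing import Dict, Iterable, List, NamedTuple


class SiteCollectionInfo(NamedTuple):
    url: str
    size_gb: float
    critical: bool


def split_by_criticality(
    site_collections: Iterable[SiteCollectionInfo],
) -> Dict[str, List[SiteCollectionInfo]]:
    """Group Site Collections into a critical and a standard content database."""
    plan: Dict[str, List[SiteCollectionInfo]] = {"critical": [], "standard": []}
    for site_collection in site_collections:
        key = "critical" if site_collection.critical else "standard"
        plan[key].append(site_collection)
    return plan


def total_size_gb(site_collections: Iterable[SiteCollectionInfo]) -> float:
    """Add up the storage used by a group of Site Collections."""
    return sum(site_collection.size_gb for site_collection in site_collections)


# All three Site Collections share one base web address.
inventory = [
    SiteCollectionInfo("http://portal.contoso.com", 400.0, False),
    SiteCollectionInfo("http://portal.contoso.com/sites/legal", 50.0, True),
    SiteCollectionInfo("http://portal.contoso.com/sites/finance", 70.0, True),
]

plan = split_by_criticality(inventory)
for database, members in plan.items():
    print(database, total_size_gb(members), "GB")
```

With these made-up numbers, the critical database is much smaller than the original single database, so it would take correspondingly less time to back up or restore.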
