dCS Product Overview

Chapter 2

Introducing Content Server

Figure 12: Content Server provides low-level services for dCS.

Content Server is the dCS base product. It provides the underlying features that the rest of the dCS family uses. If dCS is a kind of content management operating system, then Content Server is its kernel and some of the inner programming layers. Content Server ultimately implements all the fundamental features of the products. As in most operating systems, the outer layers (such as CS-Direct and CS-Direct Advantage) provide friendlier interfaces than the inner layer. In addition, the outer layers provide more programming leverage for developers. Nevertheless, Content Server provides programmers with a significant low-level toolkit. Content Server also provides many features that support the delivery system.

As you read this chapter, remember that, although Content Server or CS-Satellite provides these features, you might interact with them through a higher-level product.

Publish Static or Dynamic Pages

After developing a site and adding content to it within the management system, you can publish it. Publishing means moving information from the management system to the delivery system so that visitors can view it. dCS provides the following publishing mechanisms:

Mirror to Server (or more simply, mirroring) -- The management system essentially copies the elements and data to the delivery system.

Export to Disk (or more simply, exporting) -- The management system processes its elements and data and generates HTML. Then, the management system transmits this HTML to the delivery system.

Export to XML -- The management system generates XML files and transmits them to the delivery system.

Figure 13: Static versus Dynamic page generation

When a visitor accesses a URL on the delivery system, the way the delivery system serves the page depends on the publishing mechanism.

If a site uses Mirror to Server publishing, the delivery system composes a web page when a visitor requests it. This is sometimes called dynamic page delivery. The generated page can be personalized for this visitor.

If a site uses Export to Disk, all the pages are preconstructed. Thus, the delivery system simply serves the page. This is sometimes called static page delivery. The page cannot be personalized.

If a site uses Export to XML, all the pages are preconstructed. Note that the delivery system may or may not be a web site. The delivery system just needs to be some destination that requires XML files, which might be a non-dCS system or a database.

Content Server can use either HTTP or HTTPS to transfer data during publishing. Furthermore, publishing will work even if there is a proxy server between the management system and the delivery system.

How Are Pages Dynamically Generated?

Consider an online newspaper. Readers get to the online newspaper by entering a URL into their browsers; for example:
http://www.gloopytimes.com
The web server maps this URL to a Content Server URL such as:
http://www.gloopytimes.com/servlet/ContentServer?pagename=gl/start
The end of a Content Server URL is called the pagename. In the preceding example, the pagename is gl/start. Upon receiving a URL, Content Server looks up the pagename in the SiteCatalog. The SiteCatalog is a database table that contains entries for all legal pagenames on this web site.

Each pagename entry in the SiteCatalog lists the name of a root element that is associated with the pagename. For example, the SiteCatalog might indicate that for pagename gl/start, the root element is named layout. An element points to an XML file. The root element is the first element that Content Server processes to handle the request for a particular pagename.

The root element can, in turn, call other elements. Thus, you can divide complex web tasks into discrete elements much the same way that programmers divide a large program into discrete functions. Different root elements can call the same elements.

Processing all the elements called by the root element generates one page. The page is the HTML that the web server sends to the visitor's browser. Notice that Content Server generates the page; that is, the page did not exist prior to the visitor's request for a particular URL.

The elements are essentially smart templates containing the logic to react to changing data. For example, elements can display different data for different visitors. You can set up the elements to react to changing information in databases, to the current time of day, or to the values of input variables. Thus, requests for the same pagename can potentially generate different pages.
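The lookup described above can be sketched in a few lines of Python. This is only an illustration of the flow (URL to pagename, pagename to root element, root element to page), not Content Server code: the SITE_CATALOG dictionary, the element functions, and the render helper are all hypothetical stand-ins, and real elements are XML or JSP files rather than Python functions.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical stand-in for the SiteCatalog table:
# each legal pagename maps to the name of its root element.
SITE_CATALOG = {"gl/start": "layout"}


# Hypothetical elements. Each one returns a fragment of HTML.
def headline(visitor):
    return "<h1>The Gloopy Times</h1>"


def greeting(visitor):
    # This element reacts to an input value, so its output varies by visitor.
    return f"<p>Welcome, {visitor}!</p>"


def layout(visitor):
    # The root element calls other elements to assemble the page.
    return "<html><body>" + headline(visitor) + greeting(visitor) + "</body></html>"


ELEMENTS = {"layout": layout, "headline": headline, "greeting": greeting}


def pagename_from_url(url):
    """Extract the pagename query parameter from a Content Server style URL."""
    query = parse_qs(urlparse(url).query)
    return query["pagename"][0]


def render(url, visitor):
    """Look up the pagename, find its root element, and generate the page."""
    pagename = pagename_from_url(url)
    root = SITE_CATALOG.get(pagename)
    if root is None:
        raise KeyError(f"unknown pagename: {pagename}")
    return ELEMENTS[root](visitor)
```

Calling render with the same URL for two different visitors returns two different pages, which is the point of the paragraph above: the page does not exist until it is requested.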

Figure 14: When a user types a URL, one or more elements generate a page.

Cache Pages to Improve Performance

Parsing elements into HTML takes time. To improve performance, the system should not parse the same element over and over again unnecessarily. After all, if the same element is going to generate the same page each time, why bother parsing again? Why not just save the generated page?

In fact, Content Server allows you to save (to cache) the generated HTML on the disk or in memory. Then, when the page is requested, Content Server simply retrieves the page from the cache. In the diagram, John requests a page. Content Server finds the root element (poodle.xml) for this page and parses this XML into some HTML. Then, Content Server caches the HTML. A few minutes later, Mary requests the same page. Content Server knows that the page is already cached and delivers it to Mary.
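As a rough sketch of the idea (not of Content Server's actual cache), the following Python keeps generated HTML in a dictionary keyed by pagename. The PageCache class, the render_poodle function, and the parse_count counter are hypothetical names introduced only for illustration.

```python
class PageCache:
    """A minimal in-memory page cache: parse once, then serve the saved HTML."""

    def __init__(self, render):
        self._render = render   # function that turns a pagename into HTML
        self._pages = {}        # pagename -> cached HTML
        self.parse_count = 0    # how many times an element was actually parsed

    def get(self, pagename):
        if pagename not in self._pages:
            # Cache miss: parse the element and remember the result.
            self.parse_count += 1
            self._pages[pagename] = self._render(pagename)
        return self._pages[pagename]


def render_poodle(pagename):
    # Hypothetical stand-in for parsing poodle.xml into HTML.
    return f"<html><body>Page for {pagename}</body></html>"


cache = PageCache(render_poodle)
john_page = cache.get("poodle")   # parsed and cached
mary_page = cache.get("poodle")   # served from the cache
```

John's request triggers the only parse; Mary receives the same HTML without any parsing.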

Figure 15: Once a page is cached, subsequent users can access that cached page quickly.

Cache Pagelets Locally

As noted earlier, a page might consist of the HTML generated by multiple elements. We call the HTML generated by one element a pagelet. One page often comprises multiple pagelets.

Content Server can separately cache each pagelet. For example, consider the online newspaper, The Gloopy Times. If Content Server could only cache entire pages, the caching would not be very helpful. Rachel's entire page would be cached, and since Danny's page is different, Rachel's cached page would not help Danny view the information he wants. However, most of the elements that create this page have no personalization logic. For example, the weather forecast that Danny sees is the same weather forecast that Rachel sees. Therefore, the designer of The Gloopy Times should probably cache the pagelets created by the root element, the weather forecast pagelet, and the headline pagelet. When anyone visits the site, Content Server can construct most of the page from the cached pagelets. Content Server need only parse the personalized element.

In addition, the web site developer could also decide to cache a personalized pagelet. This would be a wise strategy if the personalized pagelet is used a second time, possibly by another page. Conversely, caching personalized pagelets can rapidly fill up the cache on a busy site.

Content Server lets you mark the expiration date and time of any pagelet in the cache. When a pagelet expires, Content Server flushes it to prevent the cache from filling up with obsolete pagelets.
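Pagelet caching with expiration might look roughly like the sketch below. It is an illustration under stated assumptions: the PageletCache class, the ten-minute lifetime, and the build_page helper are hypothetical, and the clock is passed in so that the behavior is easy to check.

```python
import time


class PageletCache:
    """Caches pagelets individually, each with its own expiration time."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}   # key -> (expires_at, html)

    def put(self, key, html, ttl_seconds):
        self._entries[key] = (self._clock() + ttl_seconds, html)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, html = entry
        if self._clock() >= expires_at:
            # Expired: flush the pagelet so that it is regenerated.
            del self._entries[key]
            return None
        return html


def build_page(cache, visitor, render_weather):
    """Assemble a page from one cached pagelet and one personalized pagelet."""
    weather = cache.get("weather")
    if weather is None:
        weather = render_weather()
        cache.put("weather", weather, ttl_seconds=600)   # assumed lifetime
    # The personalized pagelet is parsed on every request and never cached.
    personal = f"<p>Stories picked for {visitor}</p>"
    return weather + personal
```

In this sketch the weather pagelet is generated once and shared by Rachel and Danny, while the personalized pagelet is rebuilt for each of them.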

Resultset Caching

When you query a database, the result of the query is called a resultset. Resultsets contain the rows and columns that match the query. Content Server caches resultsets. If the same query is repeated, Content Server simply returns the cached resultset instead of repeating the query. This is much faster than repeatedly querying the database for the same content.

When a database is modified, Content Server automatically flushes all resultsets that are associated with the modified database.

Content Server provides multiple properties that you can use to control the maximum number of resultsets cached, as well as absolute or relative expiration timeouts.
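The following sketch shows the general technique using Python's built-in sqlite3 module. It is not Content Server's implementation: the ResultsetCache class and the articles table are hypothetical, and the sketch assumes that resultsets are tracked per table so that a change to a table flushes only the resultsets tied to it.

```python
import sqlite3


class ResultsetCache:
    """Caches query results and flushes them when the underlying table changes."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = {}    # (table, sql, params) -> list of rows
        self.db_hits = 0    # number of queries that actually reached the database

    def query(self, table, sql, params=()):
        key = (table, sql, params)
        if key not in self._cache:
            self.db_hits += 1
            self._cache[key] = self._conn.execute(sql, params).fetchall()
        return self._cache[key]

    def modify(self, table, sql, params=()):
        self._conn.execute(sql, params)
        self._conn.commit()
        # Flush every cached resultset associated with the modified table.
        self._cache = {k: v for k, v in self._cache.items() if k[0] != table}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO articles (title) VALUES ('Poodle wins show')")

rs = ResultsetCache(conn)
select = "SELECT title FROM articles ORDER BY id"
first = rs.query("articles", select)    # reaches the database
second = rs.query("articles", select)   # served from the cache
rs.modify("articles", "INSERT INTO articles (title) VALUES (?)", ("Beagle escapes",))
third = rs.query("articles", select)    # cache was flushed, so the query runs again
```

A production cache would also enforce the maximum size and the expiration timeouts mentioned above; those are omitted here to keep the sketch short.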

Cache Pages Remotely

Content Server, working in combination with another product named CS-Satellite, can cache pagelets on remote machines. Caching on remote machines reduces the load on the Content Server host and spreads it across multiple hosts. CS-Satellite offers an excellent scalability solution--as your needs increase, simply add more hosts running CS-Satellite.

CS-Satellite runs on a remote machine. On the remote machine, you install a web server and CS-Satellite; you do not, however, install an application server or DBMS on these remote machines. You also need to install a load balancer, which will manage incoming requests and distribute them efficiently among the CS-Satellite hosts.

A typical delivery configuration, containing Content Server and four CS-Satellite hosts, looks as follows:

Figure 16: CS-Satellite runs on remote machines.

There are two different ways to serve and cache pages with CS-Satellite:

Using the Satellite servlet

Using the FileServer servlet

The Satellite servlet caches pages at the pagelet level. You add CS-Satellite XML or JSP tags to your elements. The CS-Satellite tags identify the pagelet to be cached and the time it should expire. Use the Satellite servlet to cache pages that are generated by Content Server and composed of pagelets.

The FileServer servlet caches pages at the page level. If a requested page is not in FileServer's cache, it works as a proxy server, passing page requests directly to the specified URL. Use the FileServer servlet to cache dynamically generated pages that do not contain Satellite tags or dynamically generated pages that were not generated by Content Server.

As of dCS 5.0, a CS-Satellite is automatically installed on the host running Content Server. This co-resident CS-Satellite (really, the Satellite servlet) also provides an additional performance boost.

Determine Who Can See What

Security is the mechanism that permits or denies access to a resource, such as to a particular page. For example, you can indicate that only people in the payroll group are allowed to access payroll pages. To make this system work, users need login names and passwords. System administrators can assign these themselves or web site developers can write forms to allow users to pick their own.

In addition to a login name and password, you must also assign ACLs (access control lists) to each user. The ACLs identify the group or groups to which this user belongs. For example, user Danny might belong to the Browser group, while user Rachel might belong to both the ContentEditor group and the Manager group. When creating an ACL, the system administrator selects some combination of the following permissions: Read, Retrieve, Write, Create, Delete, Revision Tracking Audit, and Revision Tracking Admin.

Figure 17: Danny is in the Browser group; this group can read and retrieve pages.

The system administrator identifies the ACLs of individuals or groups who can access pages. For example, the system administrator might mark the poodle page as being accessible to those in the Browser group. Therefore, the system will allow Danny to access the poodle page.

Figure 18: Danny (a Browser) can access the Poodle page, but not the Beagle page.

Once a visitor's security information is validated, Content Server maintains that user name and ACL as part of the session information. Therefore, as long as the session is valid, visitors do not have to log in again. Visitors who are not validated receive the default username (DefaultReader) and the minimum ACL (Browser).
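The access check described in this section can be sketched as a set intersection between the ACLs a user holds and the ACLs a page accepts. Everything in the sketch is hypothetical except the names taken from the text (Danny, Browser, ContentEditor, DefaultReader); in particular, the ACL assigned to the beagle page is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    acls: set = field(default_factory=set)


# Visitors who are not validated get the default user and the minimum ACL.
DEFAULT_USER = User("DefaultReader", {"Browser"})

# Hypothetical page permissions: page name -> ACLs allowed to access it.
PAGE_ACLS = {
    "poodle": {"Browser"},
    "beagle": {"ContentEditor"},   # assumed for illustration
}


def can_access(user, page):
    """A user may access a page if the user holds at least one ACL the page accepts."""
    allowed = PAGE_ACLS.get(page, set())
    return bool(user.acls & allowed)


def current_user(session):
    """Return the validated user stored in the session, or the default user."""
    return session.get("user", DEFAULT_USER)
```

For example, with danny = User("Danny", {"Browser"}), can_access(danny, "poodle") is true and can_access(danny, "beagle") is false.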

Don't confuse security with personalization. Security blocks some or all kinds of access to content, whereas personalization filters content according to users' preferences or behavior. Both require user identification.

Using LDAP or the NT 4.0 Authentication Plug-In

You can replace Content Server's own security storage with LDAP (Lightweight Directory Access Protocol). Many Content Server customers use LDAP to store user profile information. When you use this option, user names and attributes are stored in your directory server rather than in the Content Server database.

You can also use the NT 4.0 authentication plug-in rather than Content Server's native system or LDAP. In this system, NT 4.0 authenticates users but stores the user information in the Content Server user management tables.

Manage Sessions and Cookies

A user visits your web site's home page and then visits a second page. Perhaps the user visits a third page before departing. The collection of visits constitutes one session. HTTP, by itself, cannot track sessions. That is because HTTP is a stateless protocol, meaning that nothing about the previous click is remembered in subsequent clicks. So, if the user supplied some information about herself on the home page, that information is not available to subsequent pages.

Content Server automatically creates a session when a user first visits the web site. You can use Content Server tags to record information about the user's visit into special session variables. For example, you could code the main page to ask a user to state her preferences, then store the preferences as session variables. Subsequent elements can all use these session variables. Subsequent elements can use Content Server tags to create new session variables or to modify the values of existing ones.

Session variables last only as long as the session. To store information about visitors more permanently, you may use cookies rather than session variables. Your elements can use Content Server tags to write cookies to the visitor's browser, where the cookie records certain key facts. When the visitor returns to the web site, the visitor's browser automatically sends that cookie's data back to your elements. Your elements can then use this data to customize pages.
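The difference between session variables and cookies can be illustrated with Python's standard http.cookies module. This is a sketch of the general web technique, not of the Content Server tags: the SESSIONS dictionary and the helper functions are hypothetical names.

```python
import uuid
from http.cookies import SimpleCookie

# Hypothetical server-side store: session id -> session variables.
# These values disappear when the session ends.
SESSIONS = {}


def start_session():
    """Create a new, empty session and return its id."""
    sid = uuid.uuid4().hex
    SESSIONS[sid] = {}
    return sid


def set_cookie_header(name, value, max_age_seconds):
    """Build a Set-Cookie header that asks the browser to remember a value."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds
    cookie[name]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def read_cookie(cookie_header, name):
    """Read one value from the Cookie header that the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None
```

A preference stored in SESSIONS[sid] is available to every element during the visit, whereas a preference written with set_cookie_header is sent back by the browser on later visits and can be recovered with read_cookie.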

Instead of cookies, you can optionally store information about users in a Content Server database or within LDAP.

Content Server maintains session variables across all members of a cluster. CS-Satellite provides features to maintain cookies on remote hosts.

Use Development Tools

Content Server includes the following tools for developers and system administrators:

Content Server Explorer -- The interface to this utility is similar to Windows Explorer. It provides an easy way for developers to navigate and alter the dCS portions of the database. From Content Server Explorer you can launch your choice of editor to create or modify source code.

CatalogMover -- This utility helps developers move database tables to another dCS site. Note that since all code for a dCS application is stored in a database, this provides a convenient mechanism for moving a dCS application.

debuggers -- Content Server provides separate XML and JSP code debuggers. These debuggers allow you to set breakpoints, step through code, examine the values of variables, and so on.

XMLPost -- This utility lets you import well-formed XML documents into your dCS database. For example, an online newspaper can use XMLPost to grab articles off a newsfeed and put them right into the dCS database.

In addition to these out-of-the-box tools, you can also use Content Server with many popular editors (such as Macromedia Dreamweaver) and IDEs (such as Symantec Visual Cafe).

Trigger Programs to Run at Certain Times

Many UNIX users are familiar with cron, which is a daemon that runs designated programs at a specified date and time. Content Server offers a similar feature called event management. This feature lets you specify elements that should be run at a certain date and time. You use Content Server tags to set up the element to be invoked and the date and time it should run. You can also use Content Server tags to send e-mail announcing the completion of an event.

The date and time can be recurring. For example, you could set up Content Server to run an element that scans a directory every hour to add new information to a database table. Or, you could publish content every morning at 6:00.

Publishing, backing up, and notifications are prime candidates for event management.
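To make the recurring schedules concrete, the sketch below computes when a daily or hourly event would next fire and which events are due. It illustrates cron-style scheduling in general; the function names are hypothetical and do not correspond to Content Server tags.

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour, minute=0):
    """Return the next occurrence of hour:minute strictly after now."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so run tomorrow.
        candidate += timedelta(days=1)
    return candidate


def next_hourly_run(now):
    """Return the top of the next hour after now."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return top_of_hour + timedelta(hours=1)


def due_events(events, now):
    """Given a mapping of event name to scheduled time, list the events that are due."""
    return [name for name, run_at in events.items() if run_at <= now]
```

With these helpers, a publish event requested at 7:30 in the morning for a 6:00 schedule is placed at 6:00 the following day, and a directory scan is placed at 8:00.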

Perform Searches

Content Server provides a simple, SQL-based, built-in search engine. More importantly, Content Server provides interfaces to the following popular search and personalization engines:

AltaVista

Verity

For example, if you build a financial web site, Content Server hands over each of the site's articles to the search engine, and the search engine builds an index. The technique used for indexing varies among the search engines.

Content Server provides XML tags, JSP tags, and Java methods for interacting with these search engines.

Since content changes frequently, you probably need to rebuild parts of the index frequently. Fortunately, the system can incrementally update the index. For example, on a delivery system, you can have the system automatically update the index incrementally whenever you publish a new article.
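Incremental indexing means touching only the entries for the article that changed, rather than rebuilding the whole index. The toy inverted index below shows the idea; it is not how AltaVista or Verity index content, and the SearchIndex class is a hypothetical name.

```python
import re
from collections import defaultdict


class SearchIndex:
    """A toy inverted index that supports incremental updates."""

    def __init__(self):
        self._postings = defaultdict(set)   # word -> ids of articles containing it
        self._words_by_article = {}         # article id -> words indexed for it

    def add_or_update(self, article_id, text):
        # Remove the article's old entries first, then index the new text.
        self.remove(article_id)
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        for word in words:
            self._postings[word].add(article_id)
        self._words_by_article[article_id] = words

    def remove(self, article_id):
        for word in self._words_by_article.pop(article_id, set()):
            self._postings[word].discard(article_id)

    def search(self, word):
        return sorted(self._postings.get(word.lower(), set()))
```

Publishing a new or revised article then costs one add_or_update call, regardless of how many other articles are already indexed.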

Manipulate XML Documents with CS-Bridge XML

CS-Bridge XML comes with divine Content Server 5. You use it to receive, deliver, process, route, and transform XML documents to and from other enterprise applications over the web. CS-Bridge XML enables you to create rules that automate the processing of repetitive, document-based business transactions. CS-Bridge XML is suitable for document transformations, including sharing, syndicating, and rebranding of content (using XSLT), and for exchanging business documents with web market exchanges, customers, or partners.

CS-Bridge XML consists of the following primary components:

Input Post Handler -- Receives the client's HTTP request to access CS-Bridge XML; the request contains the user's authentication credentials and the transformation request. The Input Post Handler authenticates the user and then passes the XML document to the appropriate Document Handler.

Document Handlers -- A set of Content Server elements that you code. Each document handler performs whatever operations you require on the input XML documents. For example, you might write a document handler that converts an XML-based shopping cart into an order form. A document handler can optionally call other Content Server elements or access information from the Content Server database.

JAXP (Java APIs for XML Processing) -- A set of standards-based interfaces (provided by an application server) to features such as an XML parser and an XSLT processor.

Partner Database -- A database table that contains authentication information for all CS-Bridge XML users. It also contains the mappings that enable the Input Post Handler to call the appropriate Document Handler.

Output Post Handler -- Sends the processed XML document to the target system via HTTP post. This component is not always invoked; in other words, the flow of an XML document might stop after the Document Handler processes it. For example, a Document Handler could simply write a processed XML document into the Content Server database.
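The way these components cooperate can be sketched as follows. The sketch is a simplification under stated assumptions: the PARTNERS dictionary stands in for the Partner Database, cart_to_order is a hypothetical Document Handler based on the shopping-cart example above, and the plain-text password is for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the Partner Database: credentials plus the
# mapping from each partner to the Document Handler for its documents.
PARTNERS = {"acme": {"password": "s3cret", "handler": "cart_to_order"}}


def cart_to_order(doc):
    """Hypothetical Document Handler: convert a shopping cart into an order."""
    order = ET.Element("order")
    for item in doc.findall("item"):
        line = ET.SubElement(order, "line")
        line.set("sku", item.get("sku"))
        line.set("qty", item.get("qty", "1"))
    return order


DOCUMENT_HANDLERS = {"cart_to_order": cart_to_order}


def input_post_handler(user, password, xml_text, output_post=None):
    """Authenticate the sender, route the document, and optionally post the result."""
    partner = PARTNERS.get(user)
    if partner is None or partner["password"] != password:
        raise PermissionError("authentication failed")
    handler = DOCUMENT_HANDLERS[partner["handler"]]
    result = handler(ET.fromstring(xml_text))
    result_xml = ET.tostring(result, encoding="unicode")
    if output_post is not None:
        # The Output Post Handler step is optional, as noted above.
        output_post(result_xml)
    return result_xml
```

Passing no output_post callable models the case in which the flow stops after the Document Handler has processed the document.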

Figure 19 shows a simplified flow of an XML document through CS-Bridge XML.

Figure 19: CS-Bridge XML transforms XML documents.

CS-Bridge XML Caches DTDs

DTDs (document type definitions) are associated with each document sent to or sent by CS-Bridge XML. DTDs either reside in an external location specified by a URI (Uniform Resource Identifier) or are embedded directly in the XML document. CS-Bridge XML automatically caches external DTDs; caching improves performance because the system does not need to download the same DTD repeatedly over the Internet. DTDs that are embedded in the XML document are not cached.

Schedule Document Processing

CS-Bridge XML checks the document queue whenever you choose to manually invoke it, or it checks periodically according to a specified interval. You use a component called the Event Handler to control processing frequency. For example, you can specify that the document queue be processed every hour, on the hour.

Communicate with Middleware Using CS-Bridge Enterprise

CS-Bridge Enterprise is an add-on module for Content Server. It provides an interface between Content Server and back-end applications such as ERP systems. More precisely, CS-Bridge Enterprise provides an interface between Content Server and webMethods; it is webMethods that ultimately communicates with the back-end application. Using CS-Bridge Enterprise, you do not need to know the interfaces to any back-end systems; you only need to learn the interfaces to CS-Bridge Enterprise.

CS-Bridge Enterprise is a configurable processing engine that enables you to call back-end application services from web pages delivered by Content Server.

With CS-Bridge Enterprise, creating business logic is a process of transforming internal data objects--service requests, shopping carts, and so on--into formats required to make a request against a target application via webMethods. To do this, CS-Bridge Enterprise alternately converts between Content Server format and webMethods format depending on the direction of data flow.

To ensure that you pass data that conforms to the appropriate interface, the framework enables you to manipulate data at different points during the translation process using XSLT style sheets--both from and to Content Server. Using XSLT style sheets to manage logic specific to the target application enables you to reuse container objects across multiple target systems.

Components in the CS-Bridge Enterprise System

A Content Server integration comprises the following integrated software components:

XML tags for web presentation.

XML and JSP tags for structuring data in DOMList format.

JSP integration controllers that manage data traffic and execute the data conversion process.

XSLT style sheets for describing data conversions.

JAXP (Java APIs for XML Processing) -- A set of standards-based interfaces (provided by an application server) to features such as an XML parser and an XSLT processor.

Middleware services layer that converts data to middleware document format and handles responses from the enterprise middleware.

webMethods for requesting services from back-end systems.

This combination of components enables you to exchange data between dCS and any enterprise-information system supported by webMethods, including enterprise-resource planning (ERP) systems, customer-relationship management (CRM) applications, mainframe systems, and databases. To assist in your development effort, CS-Bridge Enterprise also includes a fully coded sample integration that uses SAP as the target application. Sample integration points include example code for JSP controllers, XSLT style sheets, and XML tags that show how the configurable framework works.

Architecture

CS-Bridge Enterprise divides development into layers that separate site design from business logic, and business logic from integration logic. This division makes it possible to work on different aspects of solution design in parallel, requiring minimal code to access EIS services at the web-design level using custom tags.

CS-Bridge Enterprise application logic is based on a three-layered design that separates web presentation from the underlying technical configuration, as follows:

Presentation layer

Integration logic layer

Middleware services layer

Presentation Layer

The top presentation layer comprises tags that enable a web designer to add functionality to the end-user experience. Tags provide a simple command syntax that allows the designer to include functions provided by integrated Enterprise Information Systems (EIS) in presentation logic. Some tags resolve on integration logic in the layers below, and others provide command syntax for services such as user authentication and account creation, and creation and maintenance of data structures that carry data between layers.

Integration Logic Layer

The middle integration logic layer contains the code that implements or supports integration tags in the presentation layer. This layer, which contains predefined integration points for the tags shipped with CS-Bridge Enterprise, also provides the environment for you to develop and manage your own tags. Each tag resolves on a JSP Integration Controller and a set of at least two XSLT style sheets: one transforms the data structures passed from the presentation layer into output appropriate for delivery to the middleware or EIS directly, and another transforms the results of EIS processing into a data structure appropriate for access using tags at the presentation layer (for presentation back to the browser).

Middleware Services Layer

The middleware services layer provides a Middleware Services interface and an application bridge that is specific to webMethods. This layer provides services such as delivery and translation of XML data from the integration logic into syntax and calls that are specific to webMethods. This layer also handles aspects of delivery, error recovery, and connection maintenance.

Create and Consume Web Services

Content Server provides tags and tools for your site to create and to consume web services. Content Server can interact with any application that has a SOAP (Simple Object Access Protocol) interface.

Use Predefined Web Services

Content Server includes a complete set of asset-delivery functions implemented as web services. For example, a client with the proper credentials can request a list of assets from a site run by Content Server. To implement web services, Content Server provides a complete set of WSDL (web services description language) files.

In most cases, you will have control over the client interaction with Content Server. For access by potentially unknown client applications, however, the supplied WSDL files can also be posted to a URL and registered via UDDI (universal description, discovery, integration).

Create Custom Web Services

With Content Server, you can create web services that map data from any Content Server functions you want to make available. Because of its support for XML, Java, and JSP, the existing Content Server development environment also provides a familiar platform for developing web services. A supplied tag set enables you to build a SOAP response and stream SOAP encapsulated data to and from applications. As with the prepackaged web services, the Content Server delivery capability and page-evaluation pipeline are used to process SOAP requests.

Building custom web services requires the following steps:

1. Writing a Content Server page and element to handle data. The element contains the logic for your web services function, including required SOAP XML tags for formatting the SOAP response. Content Server automatically generates the XML for the SOAP envelope.

2. Defining the SOAP operation and SOAP action for the service in your client program. Optionally, you can create a WSDL file that defines the SOAP operation and action.
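For readers unfamiliar with the shape of a SOAP response, the sketch below builds one with Python's standard library. It illustrates the envelope structure only, assuming SOAP 1.1; Content Server generates the envelope itself, and the build_soap_response function, the operation name, and the namespace in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_response(operation, namespace, values):
    """Wrap the result values of an operation in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # By convention, the response element is named after the operation.
    response = ET.SubElement(body, f"{{{namespace}}}{operation}Response")
    for name, value in values.items():
        ET.SubElement(response, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")
```

For example, build_soap_response("getAssetCount", "urn:example:assets", {"count": 3}) returns an envelope whose body contains a getAssetCountResponse element with a single count child.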

Consume Web Services

Content Server can act as a web services client--requesting and consuming web services made available by any remote application that offers a web service and that returns a data type supported by the XML schema.

Using the information contained in the WSDL file for a given web service, you configure a supplied Content Server invocation tag that specifies the location of the web service, the operation to invoke, and the name of the object in which the return data is to be stored. An associated parameter tag specifies the input parameters for the particular operation.

When the data is transmitted and stored in Content Server, it becomes available for display or further processing. Content Server pages or APIs handle the return data according to your custom business logic. By presenting the web service as a tag, Content Server handles data as if it were an ordinary content tag for a native Content Server function.
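The client side can be sketched in the same spirit. The invoke function below takes the three pieces of information named above (the location of the service, the operation, and the name under which to store the result) plus the input parameters. All names are hypothetical, SOAP 1.1 is assumed, and the network call is replaced by a transport function supplied by the caller so that the sketch runs without a live service.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_request(operation, params):
    """Build a SOAP request for one operation and its input parameters."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text):
    """Return the children of the first element in the SOAP body as a dictionary."""
    body = ET.fromstring(xml_text).find(f"{{{SOAP_ENV}}}Body")
    result = body[0]
    return {child.tag: child.text for child in result}


def invoke(location, operation, params, store, result_name, transport):
    """Call the service and keep the returned data under result_name."""
    reply = transport(location, build_request(operation, params))
    store[result_name] = parse_response(reply)
    return store[result_name]
```

Once invoke has stored the result, later code can read it from the store by name, much as an element would read data produced by any other tag.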

Summary

Content Server is the base platform for the dCS product set. It provides an extensive low-level XML and JSP toolkit to help develop web applications. It also provides underlying services that are essential for managing and delivering content. However, rather than use those services directly, most dCS sites take advantage of the additional features provided by a higher level product, such as CS-Direct, which is the topic of the next chapter.



dCS Product Overview Revision 5.0, Document version 12/08/02 Customer Support