Friday, October 3, 2014

Architecture of Apache Tomcat

In this article, we look into the different component of Apache Tomcat architecture that will help us to understand Tomcat in more detail. The Apache Tomcat server is an open source, Java-based web application container that was created to run servlet and JavaServer Pages (JSP) web applications.

Apache Tomcat is very stable and has all of the features of a commercial web application container – yet comes under Open Source Apache License. Tomcat also provides additional functionality that makes it a great choice for developing a complete web application solution. Some of the additional features provided by Tomcat—other than being open source and free—include the Tomcat Manager application, specialized realm implementations, and Tomcat valves.
Regarding the latest release of Tomcat always check Apache Tomcat site.

The Architecture of Tomcat 
Before reading this you should look into tomcat conf/server.xml in order to understand this. Download server.xml file.
A Tomcat instance, or server, is the top-level component in Tomcat’s container hierarchy. Only one Tomcat instance can live in a single Java Virtual Machine (JVM). This approach makes all other Java applications, running on the same physical machine as Tomcat server, safe in case Tomcat and/or its  JVM crashes.

Tomcat instance consists of grouping of the application containers, which exist in the well-defined hierarchy. The key component in that hierarchy is the Catalina servlet engine. Catalina is the actual Java servlet container implementation as specified in Java Servlet API

XML representation of the relationships between the different Tomcat containers. 

This instance can be broken down into a set of containers including a server, a service, a connector, an engine, a host, and a context. By default, each of these containers is configured using the server.xml


The Server 
In the Tomcat world, a Server represents the whole container. Tomcat provides a default implementation of the Server interface which is rarely customized by users.

The Server is Tomcat itself — an instance of the Web application server — and is a top-level component. It owns a port that is used to shut down the server. In addition, the Server can be set in debug mode, which instantiates a version of the Java Virtual Machine (JVM) that enables debugging.
Only one instance of the Server can be created inside a given Java Virtual Machine (JVM).
Separate Servers configured to different ports can be set up on a single machine to separate applications so that they can be restarted independently. That is, if one Server running in a JVM were to crash, the other applications would be safe in another Server instance. This is sometimes done in hosting environments in which each customer has a separate instance of a JVM, so a badly configured/written application will not cause others to crash.

A Server element represents the entire Catalina servlet container. Therefore, it must be the single outermost element in the conf/server.xml configuration file. Its attributes represent the characteristics of the servlet container as a whole.
All implementations of Server support the following attributes:

  • className : Java class name of the implementation to use. This class must implement the org.apache.catalina.Server interface. If no class name is specified, the standard implementation will be used.
  • address: The TCP/IP address on which this server waits for a shutdown command. If no address is specified, localhost is used.
  • port : The TCP/IP port number on which this server waits for a shutdown command. Set to -1 to disable the shutdown port.
    Note: Disabling the shutdown port works well when Tomcat is started using Apache Commons Daemon (running as a service on Windows or with jsvc on un*xes). It cannot be used when running Tomcat with the standard shell scripts though, as it will prevent shutdown.bat|.sh and catalina.bat|.sh from stopping it gracefully.
  • shutdown : The command string that must be received via a TCP/IP connection to the specified port number, in order to shut down Tomcat.
  • debug : The debug attribute is available to all Tomcat elements. It states the debug level to use when logging messages to a defined Logger.

Standard Implementation The standard implementation of Server is org.apache.catalina.core.StandardServer.
The following components may be nested inside a Server element:

  • Service - One or more service element.
  • GlobalNamingResources - Configure the JNDI global resources for the server.

The Service
A Service is an intermediate component which lives inside a Server and ties one or more Connectors to exactly one Engine. The Service element is rarely customized by users, as the default implementation is simple and sufficient

"A Service element represents the combination of one or more Connector components that share a single Engine component for processing incoming requests. One or more Service elements may be nested inside a Server element."

An Engine is a request-processing component that represents the Catalina Servlet engine. It examines the HTTP headers to determine the virtual host or context to which requests should be passed.
Each Service represents a grouping of Connectors (components that manage the connection between the client and server) and a single container, which accepts requests from the Connectors and processes the requests to present them to the appropriate Host. Each Service is named so that administrators can easily identify log messages sent from each Service.
In other words, the container contains the Web Applications. It is responsible for accepting requests, routing them to the specified Web application and specific resource, and returning the result of the processing of the request. Connectors stand between the client making the request and the container. They provode additional services such as SSL request

All implementations of Service support the following attributes

  • className : Java class name of the implementation to use. This class must implement the org.apache.catalina.Service interface. If no class name is specified, the standard implementation will be used
  • name : The display name of this Service, which will be included in log messages if you utilize standard Catalina components. The name of each Service that is associated with a particular Server must be unique.

The standard implementation of Service is org.apache.catalina.core.StandardService

Engine
An Engine represents request processing pipeline for a specific Service. As a Service may have multiple Connectors, the Engine receives and processes all requests from these connectors, handing the response back to the appropriate connector for transmission to the client.
Exactly one Engine element MUST be nested inside a Service element, following all of the corresponding Connector elements associated with this Service.

A container object that cannot be contained by another container. This means that it is guaranteed not to have a parent container. It is at this level that the objects begin to aggregate child components. 
Strictly speaking, the container does not need to be an Engine, it just has to implement the container interface. This interface mandates the following: that the object implementing it is aware of its position in the hierarchy (it knows its parent and its children), that it provides access to logging, that it provides a Realm for user authentication and role-based authorization, and that it has access to a number of resources, including its session manager .

All implementations of Engine support the following attributes:

  • backgroundProcessorDelay : This value represents the delay in seconds between the invocation of the backgroundProcess method on this engine and its child containers, including all hosts and contexts. Child containers will not be invoked if their delay value is not negative (which would mean they are using their own processing thread). Setting this to a positive value will cause a thread to be spawn. After waiting the specified amount of time, the thread will invoke the backgroundProcess method on this engine and all its child containers. If not specified, the default value for this attribute is 10, which represent a 10 seconds delay.
  • className : Java class name of the implementation to use. This class must implement the org.apache.catalina.Engine interface. If not specified, the standard value (defined below) will be used.
  • defaultHost : The default host name, which identifies the Host that will process requests directed to host names on this server, but which are not configured in this configuration file. This name MUST match the name attributes of one of the Host elements nested immediately inside.
  • jvmRoute : Identifier which must be used in load balancing scenarios to enable session affinity. The identifier, which must be unique across all Tomcat servers which participate in the cluster, will be appended to the generated session identifier, therefore allowing the front end proxy to always forward a particular session to the same Tomcat instance.
  • name : Logical name of this Engine, used in log and error messages. When using multiple Service elements in the same Server, each Engine MUST be assigned a unique name.
  • startStopThreads : The number of threads this Engine will use to start child Host elements in parallel. The special value of 0 will result in the value of Runtime.getRuntime().availableProcessors() being used. Negative values will result in Runtime.getRuntime().availableProcessors() + value being used unless this is less than 1 in which case 1 thread will be used. If not specified, the default value of 1 will be used.

Special Features

  • Logging
  • Access Logs
  • Lifecycle Listeners
  • Lifecycle Listeners


Host
The Host element represents a virtual host, which is an association of a network name for a server (such as "www.mycompany.com" with the particular server on which Tomcat is running. For clients to be able to connect to a Tomcat server using its network name, this name must be registered in the Domain Name Service (DNS) server that manages the Internet domain you belong to.
One or more Host elements are nested inside an Engine element. Inside the Host element, you can nest Context elements for the web applications associated with this virtual host. Exactly one of the Hosts associated with each Engine MUST have a name matching the defaultHost attribute of that Engine.

Clients normally use host names to identify the server they wish to connect to. This host name is also included in the HTTP request headers. Tomcat extracts the host name from the HTTP headers and looks for a Host with a matching name. If no match is found, the request is routed to the default host. The name of the default host does not have to match a DNS name (although it can) since any request where the DNS name does not match the name of a Host element will be routed to the default host.

All implementations of Host support the following attributes:

  • appBase : The Application Base directory for this virtual host. This is the pathname of a directory that may contain web applications to be deployed on this virtual host. You may specify an absolute pathname, or a pathname that is relative to the $CATALINA_BASE directory.
  • xmlBase : The XML Base directory for this virtual host. This is the pathname of a directory that may contain context XML descriptors to be deployed on this virtual host. You may specify an absolute pathname for this directory, or a pathname that is relative to the $CATALINA_BASE directory. 
  • createDirs : If set to true, Tomcat will attempt to create the directories defined by the attributes appBase and xmlBase during the startup phase. The default value is true. If set to true, and directory creation fails, an error message will be printed out but will not halt the startup sequence.
  • autoDeploy
  • backgroundProcessorDelay
  • className
  • deployIgnore
  • deployOnStartup : This flag value indicates if web applications from this host should be automatically deployed when Tomcat starts. The flag's value defaults to true
  • failCtxIfServletStartFails
  • name
  • startStopThreads
  • undeployOldVersions : This flag determines if Tomcat, as part of the auto deployment process, will check for old, unused versions of web applications deployed using parallel deployment and, if any are found, remove them. This flag only applies if autoDeploy is true. If not specified the default value of false will be used.

The standard implementation of Host is org.apache.catalina.core.StandardHost
Special Features:
  • Logging
  • Access Logs
  • Automatic Application Deployment
  • Host Name Aliases
  • Lifecycle Listeners
  • Request Filters
  • Single Sign On
  • User Web Applications

Connector

A Connector handles communications with the client. There are multiple connectors available with Tomcat. These include the HTTP connector which is used for most HTTP traffic, especially when running Tomcat as a standalone server, and the AJP connector which implements the AJP procotol used when connecting Tomcat to a web server such as Apache HTTPD server. Creating a customized connector is a significant effort.

Connectors connect the applications to clients. They represent the point at which requests are received from clients and are assigned a port on the server. The default port for nonsecure HTTP applications is kept as 8080 to avoid interference with any Web server running on the standard HTTP port, but there is no reason why this cannot be changed as long as the port is free. Multiple Connectors may be set up for a single Engine or Engine-level component, but they must have unique port numbers.
The default port to which browsers make requests if a port number is not specified is port 80. If Tomcat is run in standalone mode, the port for the primary Connector of the Web application can be changed to 80 by reconfiguring this component.

The default port to which browsers make requests if a port number is not specified is port 80. If Tomcat is run in standalone mode, the port for the primary Connector of the Web application can be changed to 80 by reconfiguring this component.

HTTP Connector
The HTTP Connector element represents a Connector component that supports the HTTP/1.1 protocol. It enables Catalina to function as a stand-alone web server, in addition to its ability to execute servlets and JSP pages. A particular instance of this component listens for connections on a specific TCP port number on the server. One or more such Connectors can be configured as part of a single Service, each forwarding to the associated Engine to perform request processing and create the response.
If you wish to configure the Connector that is used for connections to web servers using the AJP protocol (such as the mod_jk 1.2.x connector for Apache 1.3), please refer to the AJP Connector documentation.

All implementations of Connector support the following attributes:

  • allowTrace : A boolean value which can be used to enable or disable the TRACE HTTP method. If not specified, this attribute is set to false.
  • asyncTimeout : The default timeout for asynchronous requests in milliseconds. If not specified, this attribute is set to the Servlet specification default of 30000 (30 seconds).
  • enableLookups : Set to true if you want calls to request.getRemoteHost() to perform DNS lookups in order to return the actual host name of the remote client. Set to false to skip the DNS lookup and return the IP address in String form instead (thereby improving performance). By default, DNS lookups are disabled.
  • maxHeaderCount
  • maxParameterCount
  • maxPostSize
  • maxSavePostSize
  • parseBodyMethods : A comma-separated list of HTTP methods for which request bodies will be parsed for request parameters identically to POST.
  • port : The TCP port number on which this Connector will create a server socket and await incoming connections.
  • protocol : Sets the protocol to handle incoming traffic
  • proxyName
  • proxyPort
  • redirectPort : If this Connector is supporting non-SSL requests, and a request is received for which a matching <security-constraint> requires SSL transport, Catalina will automatically redirect the request to the port number specified here.
  • scheme
  • secure
  • URIEncoding
  • useBodyEncodingForURI
  • useIPVHosts
  • xpoweredBy
The standard HTTP connectors
  • BIO
  • NIO
  • NIO2
  • APR/native
Special Features
  • HTTP/1.1 and HTTP/1.0 Support
  • Proxy Support
  • SSL Support


Context
Finally, there is the Web application, also known as a context. Configuration of a Web application includes informing the Engine/Hosts of the location of the root folder of the application.
Finally, there is the Web application, also known as a context. Configuration of a Web application includes informing the Engine/Hosts of the location of the root folder of the application.
The context may also include specific error pages, which enable a system administrator to configure error messages that are consistent with the look and feel of the application, and usability features (such as a search Engine, useful links, or a report-creating component that notifies the administrator of errors in the application). Finally, a context can also be configured with initialization parameters for the application it represents and for access control (authentication and authorization restrictions).

The web application used to process each HTTP request is selected by Catalina based on matching the longest possible prefix of the Request URI against the context path of each defined Context. Once selected, that Context will select an appropriate servlet to process the incoming request, according to the servlet mappings defined by the web application deployment.
You may define as many Context elements as you wish. Each such Context MUST have a unique context name within a virtual host. The context path does not need to be unique . In addition, a Context must be present with a context path equal to a zero-length string. This Context becomes the default web application for this virtual host, and is used to process all requests that do not match any other Context's context path.

Parallel deployment
You may deploy multiple versions of a web application with the same context path at the same time. The rules used to match requests to a context version are as follows:

  • If no session information is present in the request, use the latest version.
  • If session information is present in the request, check the session manager of each version for a matching session and if one is found, use that version.
  • If session information is present in the request but no matching session can be found, use the latest version.

Naming
When autoDeploy or deployOnStartup operations are performed by a Host, the name and context path of the web application are derived from the name(s) of the file(s) that define(s) the web application. Consequently, the context path may not be defined in a META-INF/context.xml embedded in the application and there is a close relationship between the context name, context path, context version and the base file name (the name minus any .war or .xml extension) of the file.

If no version is specified then the context name is always the same as the context path. If the context path is the empty string them the base name will be ROOT (always in upper case) otherwise the base name will be the context path with the leading '/' removed and any remaining '/' characters replaced with '#'.
If a version is specified then the context path remains unchanged and both the context name and the base name have the string '##' appended to them followed by the version identifier.

Defining a context
It is NOT recommended to place <Context> elements directly in the server.xml file. This is because it makes modifying the Context configuration more invasive since the main conf/server.xml file cannot be reloaded without restarting Tomcat.

Individual Context elements may be explicitly defined:

  • In an individual file at /META-INF/context.xml inside the application files
  • In individual files (with a ".xml" extension) in the $CATALINA_BASE/conf/[enginename]/[hostname]/ directory
  • Inside a Host element in the main conf/server.xml.

Default Context elements may be defined that apply to multiple web applications. Configuration for an individual web application will override anything configured in one of these defaults. Any nested elements, e.g. <Resource> elements, that are defined in a default Context will be created once for each Context to which the default applies. They will not be shared between Context elements.

  • In the $CATALINA_BASE/conf/context.xml file: the Context element information will be loaded by all web applications.
  • In the $CATALINA_BASE/conf/[enginename]/[hostname]/context.xml.default file: the Context element information will be loaded by all web applications of that host.
With the exception of server.xml, files that define Context elements may only define a single Context element.


The Valves
This is how valves look like in server.xml

Valves are components that enable Tomcat to intercept a request and pre-process it. They are similar to the filter mechanism of the Servlet specifications, but are specific to Tomcat. Hosts, contexts, and Engines may contain Valves
Values are commonly used to enable Single Sign-on for all Hosts on a server, as well as log request patterns. client IP addresses ans server usage patterns (peak traffic, bandwidth use, average request per unit time and so on). This is know as request dumping,  and a request dumper valve records the header information and any cookies send with the request. Response dumpong logs the response header and cookies to file.

Valves are typically reusable components, and can therefore be added and removed from the request path according to need. Their inclusion is transparent to Web  applications, although the response time will increase if a Valve is added). An application that wishes to intercept requests for pre-processing and responses for post-processing should use the filters that are a part of the Servlet specifications.

A Valve may intercept a request between an Engine and a Host/context, between a Host and a context,and between a context and a resource within the Web application.

The Realm

The Realm for an Engine manages user authentication and authorization. During the configuration of an application, the administrator sets the roles that are allowed for each resource or group of resources, and the Realm is used to enforce this policy.

Realms can authenticate against text files, database tables, LDAP servers, and the Windows network identity of the user

A Realm applies across the entire Engine or top-level container, so applications within a container share user resources for authentication. This means that, for example, a manager for the intranet will have the same rights as the manager of the e-commerce site should both these applications be in the same Engine.
By default, a user must still authenticate separately to each Web application on the server.




If you know anyone who has started learning java, why not help them out! Just share this post with them. Thanks for studying today!...

8 comments:

  1. thank you very much, for article very good

    ReplyDelete
  2. Thank you, your article made my day.

    ReplyDelete
  3. This article is awesome! Thanks a lot!

    ReplyDelete
  4. Nice Article,
    this is also a very good source although somewhat outdated : How Tomcat Works

    ReplyDelete