
Things you will need on hand –
Apache Tomcat [version 7.0.20 used here]
Latest SOLR Release [3.4.0 used here]

1. Install Tomcat and set the $TOMcat_HOME environment variable.
2. Edit $TOMCAT_HOME\conf\server.xml and add the attribute URIEncoding="UTF-8".
The modified portion of server.xml should look like:
<Connector port="8080" protocol="HTTP/1.1"
    connectionTimeout="20000"
    redirectPort="8443" URIEncoding="UTF-8"/>

Solr uses UTF-8 encoding, so we need to make sure Apache Tomcat knows that all requests and responses should use that encoding.

3. Create a proper context file [solr.xml] under $TOMCAT_HOME\conf\Catalina\localhost. The content of the file should look like:
<?xml version="1.0" encoding="UTF-8" ?>
<Context path="/solr">
  <Environment name="solr/home" type="java.lang.String" value="D:\solr" override="true" />
</Context>

This context definition says that the Solr application will be available under the /solr context and points Solr at its home folder. The Solr home folder is where we place the Solr configuration files.

4. Now deploy solr.war into $TOMCAT_HOME\webapps. If Solr needs to see any additional libraries, add them to the $TOMCAT_HOME\lib directory.

5. Now it's time to add the Solr configuration files [solrconfig.xml, schema.xml, elevate.xml] to the solr/home folder. Please don't forget that you need to ensure the proper directory structure; if you are not familiar with the Solr directory structure, take a look at the example deployment provided with the standard Solr package, or the sketch below.
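For reference, a minimal single-core Solr 3.x home layout might look like this (a sketch; the data directory is created by Solr itself on first start):

D:\solr\
    conf\
        solrconfig.xml
        schema.xml
        elevate.xml
    data\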

6. Now start Tomcat and hit http://localhost:8080/solr/admin/ to view the admin console.

If you followed the above steps exactly, then congratulations: you have just successfully configured and run the Apache Tomcat servlet container with Solr deployed.


I just read Sachin's blog post on Maven, and it prompted me to write this one: integrating SURF into your web application using Maven.

Just to let you know: if you are not using Maven and you need to install SURF into your web application, all you need to do is drop the below-mentioned JARs into the WEB-INF/lib directory of your project. The JARs to be included are:

  1. alfresco-core-3.3.0-SNAPSHOT.jar
  2. alfresco-jlan-3.3.0-SNAPSHOT.jar
  3. alfresco-web-framework-3.3.0-SNAPSHOT.jar
  4. alfresco-webscript-framework-3.3.0-SNAPSHOT.jar

Now make sure that your web MVC application has the following imports:

<import resource="classpath*:org/alfresco/web/scripts/*-context.xml" />
<import resource="classpath*:org/alfresco/web/framework/*-context.xml" />

Once you are done with the above steps, you just need to restart your application server, and SURF will be bootstrapped and available as a view resolver.

Now, to achieve the same in Maven, all you need to do is add the below-mentioned dependency to your Maven-powered web application:

<dependency>
  <groupId>org.alfresco</groupId>
  <artifactId>alfresco-web-framework</artifactId>
  <version>3.3.0-SNAPSHOT</version>
</dependency>

along with the above-mentioned import statements.

Now run mvn install and the build process will get all the dependencies related to SURF.

Happy SURFing until next time!!!

A Tour of Maven

What is Maven?

Maven is a software project management and build automation tool. Maven incorporates the concept of convention over configuration by providing sensible default behaviors for projects: source code is assumed to be in {basedir}/src/main/java, resources are assumed to be in {basedir}/src/main/resources, and tests are assumed to be in {basedir}/src/test.

The goals of Maven are:
• Make the build process easy
• Provide a uniform build system
• Provide quality project information
• Provide guidelines for best practices

Everything in Maven is controlled via the pom.xml (Project Object Model) file, which contains both information and configuration details on the project.

Lifecycles, Phases and Goals of Maven
Maven is designed around the concept of project lifecycles. While you can define your own, there are three built-in lifecycles: default, clean and site.
The default lifecycle builds and deploys your project.
The clean lifecycle cleans (deletes) compiled objects or anything else that needs to be removed or reset to get the project to a pristine pre-build state.
Finally, the site lifecycle generates the project documentation.
Within each lifecycle there are a number of phases that define various points in the development process.  The most commonly used phases in the default lifecycle are:
compile – compiles the main source code of the project
test – tests the main code using a suitable unit testing framework. These tests should not require that the code is packaged or deployed. This phase implicitly compiles the test case source code first (the testCompile goal).
package – packages the compiled code into its distributable format, such as a JAR. The POM controls how a project is packaged through the <packaging/> element
install – installs the package into the local repository for use as a dependency in other projects locally
deploy – used in an integration or release environment. Copies the final package to the remote repository for sharing with other developers and projects.

Maven is typically run from the command line by executing the command "mvn <phase>", where <phase> is one of the phases listed above. Since phases are defined in order, all phases up to the one you specify will be run. For example, if you want to package your code, simply run "mvn package" and the compile and test phases will automatically be run. You can also execute specific goals for the various plugins that Maven uses. Execution of a specific goal is done with the command "mvn <plugin>:<goal>". For instance, the compile phase actually calls the compiler:compile goal by default. You can also specify multiple phases/goals in one command line, and Maven will execute them in order. This is useful, for instance, if you want to do a clean build of your project. Simply run "mvn clean jetty:run" and the clean lifecycle will run, followed by the jetty:run goal (and all of the prerequisites for jetty:run, such as compile).

Repositories
Repositories are one of the key features of Maven. A repository is a location that contains plugins and packages for your project to use. There are two types of repository: local and remote.
Local repositories are local to the machine and represent a cache of artifacts downloaded from remote repositories, as well as packages that you've installed from your own projects. The default location of your local repository is <home>/.m2/repository.
You can override the local repository location by setting the <localRepository> element in the <home>/.m2/settings.xml file (or by passing -Dmaven.repo.local on the command line).
Remote repositories are repositories that are reachable via protocols like HTTP and FTP, and are generally where you will find the dependencies needed for your projects. Repositories are defined in the POM. Maven has an internal default set of repositories, so usually you don't need to define many extra repositories.
Defining a repository:
<repositories>
  <repository>
    <id>Repository ID</id>
    <name>Repository Name</name>
    <url>URL of repository</url>
  </repository>
</repositories>
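For example, declaring Maven Central explicitly (normally unnecessary, since it is built in; shown here purely as an illustration):

<repositories>
  <repository>
    <id>central</id>
    <name>Maven Central</name>
    <url>http://repo1.maven.org/maven2</url>
  </repository>
</repositories>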

Plugins
Plugins add functionality to the Maven build system.
A plugin provides a set of goals that can be executed using the following syntax:

mvn [plugin-name]:[goal-name]

Configuring the Plugin:
<plugins>
  <plugin>
    <groupId>Plugin group ID</groupId>
    <artifactId>Plugin artifact ID</artifactId>
    <version>Plugin version</version>
  </plugin>
</plugins>
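For instance, pinning the compiler plugin to a specific version and Java level might look like this (a sketch; the version shown is just an example):

<plugins>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>2.3.2</version>
    <configuration>
      <source>1.6</source>
      <target>1.6</target>
    </configuration>
  </plugin>
</plugins>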

Dependencies
Dependency management is one of the more useful features of Maven.
The details of the specification are straightforward:
• The groupId and artifactId specify the artifact. A given group may have many artifacts under it.
• The version is specified either directly or with a range.
• The scope of the dependency is optional, and controls exactly where the dependency is used (for example, compile, test, or provided).
Adding a Dependency:
<dependency>
  <groupId>Group ID</groupId>
  <artifactId>Artifact ID</artifactId>
  <version>Version</version>
  <scope>Scope</scope>
</dependency>
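For example, pulling in JUnit only for the test phases might look like:

<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.8.2</version>
  <scope>test</scope>
</dependency>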

I hope this gives you a basic understanding of Maven.

1. Minimize HTTP Requests

80% of the end-user response time is spent on the front-end. Most of this time is tied up in downloading all the components in the page: images, stylesheets, scripts, Flash, etc. Reducing the number of components in turn reduces the number of HTTP requests required to render the page. This is the key to faster pages.

Here are some techniques for reducing the number of HTTP requests, while still supporting rich page designs.

CSS Sprites are the preferred method for reducing the number of image requests. Combine your background images into a single image and use the CSS background-image and background-position properties to display the desired image segment.
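As an illustration, a minimal sprite setup might look like this (sprites.png and the offsets are made up for the example):

.icon        { background-image: url('sprites.png'); width: 16px; height: 16px; }
.icon-home   { background-position: 0 0; }
.icon-search { background-position: -16px 0; }

Each element shares the single sprite image, and background-position shifts the visible 16x16 window to the desired segment.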

Image maps combine multiple images into a single image. The overall size is about the same, but reducing the number of HTTP requests speeds up the page. Image maps only work if the images are contiguous in the page, such as a navigation bar. Defining the coordinates of image maps can be tedious and error-prone, and image-map navigation is not accessible either, so it's not recommended.

Inline images use the data: URL scheme to embed the image data in the actual page. This can increase the size of your HTML document. Combining inline images into your (cached) stylesheets is a way to reduce HTTP requests and avoid increasing the size of your pages. Inline images are not yet supported across all major browsers.

2. Use a Content Delivery Network

The user’s proximity to your web server has an impact on response times. Deploying your content across multiple, geographically dispersed servers will make your pages load faster from the user’s perspective. But where should you start?

As a first step to implementing geographically dispersed content, don’t attempt to redesign your web application to work in a distributed architecture. Depending on the application, changing the architecture could include daunting tasks such as synchronizing session state and replicating database transactions across server locations. Attempts to reduce the distance between users and your content could be delayed by, or never pass, this application architecture step.

Remember that 80-90% of the end-user response time is spent downloading all the components in the page: images, stylesheets, scripts, Flash, etc. This is the Performance Golden Rule. Rather than starting with the difficult task of redesigning your application architecture, it’s better to first disperse your static content. This not only achieves a bigger reduction in response times, but it’s easier thanks to content delivery networks.

A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity. For example, the server with the fewest network hops or the server with the quickest response time is chosen.

Some large Internet companies own their own CDN, but it's cost-effective to use a CDN service provider, such as Akamai Technologies, EdgeCast, or Level 3. For start-up companies and private web sites, the cost of a CDN service can be prohibitive, but as your target audience grows larger and becomes more global, a CDN is necessary to achieve fast response times. At Yahoo!, properties that moved static content off their application web servers to a CDN (both third-party, as mentioned above, and Yahoo!'s own CDN) improved end-user response times by 20% or more. Switching to a CDN is a relatively easy code change that will dramatically improve the speed of your web site.

3. Add an Expires or a Cache-Control Header

There are two aspects to this rule:

1. For static components: implement a "never expire" policy by setting a far future Expires header.
2. For dynamic components: use an appropriate Cache-Control header to help the browser with conditional requests.

Web page designs are getting richer and richer, which means more scripts, stylesheets, images, and Flash in the page. A first-time visitor to your page may have to make several HTTP requests, but by using the Expires header you make those components cacheable. This avoids unnecessary HTTP requests on subsequent page views. Expires headers are most often used with images, but they should be used on all components including scripts, stylesheets, and Flash components.

Browsers (and proxies) use a cache to reduce the number and size of HTTP requests, making web pages load faster. A web server uses the Expires header in the HTTP response to tell the client how long a component can be cached. This is a far future Expires header, telling the browser that this response won’t be stale until April 15, 2010.

Expires: Thu, 15 Apr 2010 20:00:00 GMT

If your server is Apache, use the ExpiresDefault directive to set an expiration date relative to the current date. This example of the ExpiresDefault directive sets the Expires date 10 years out from the time of the request.

ExpiresDefault "access plus 10 years"

Keep in mind, if you use a far future Expires header you have to change the component’s filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component’s filename, for example, yahoo_2.0.6.js.

Using a far future Expires header affects page views only after a user has already visited your site. It has no effect on the number of HTTP requests when a user visits your site for the first time and the browser’s cache is empty. Therefore the impact of this performance improvement depends on how often users hit your pages with a primed cache. (A “primed cache” already contains all of the components in the page.) We measured this at Yahoo! and found the number of page views with a primed cache is 75-85%. By using a far future Expires header, you increase the number of components that are cached by the browser and re-used on subsequent page views without sending a single byte over the user’s Internet connection.
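On the Cache-Control side, a far-future policy can be expressed in seconds relative to the request, which avoids the absolute-date pitfalls of Expires. A sketch, assuming Apache with mod_headers enabled (the one-year max-age is just an example):

<FilesMatch "\.(js|css|png|gif|jpg)$">
  Header set Cache-Control "max-age=31536000"
</FilesMatch>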

4. Gzip Components

The time it takes to transfer an HTTP request and response across the network can be significantly reduced by decisions made by front-end engineers. It’s true that the end-user’s bandwidth speed, Internet service provider, proximity to peering exchange points, etc. are beyond the control of the development team. But there are other variables that affect response times. Compression reduces response times by reducing the size of the HTTP response.

Starting with HTTP/1.1, web clients indicate support for compression with the Accept-Encoding header in the HTTP request.

Accept-Encoding: gzip, deflate

If the web server sees this header in the request, it may compress the response using one of the methods listed by the client. The web server notifies the web client of this via the Content-Encoding header in the response.

Content-Encoding: gzip

Gzip is the most popular and effective compression method at this time. It was developed by the GNU project and standardized by RFC 1952. The only other compression format you’re likely to see is deflate, but it’s less effective and less popular.

Gzipping generally reduces the response size by about 70%. Approximately 90% of today’s Internet traffic travels through browsers that claim to support gzip. If you use Apache, the module configuring gzip depends on your version: Apache 1.3 uses mod_gzip while Apache 2.x uses mod_deflate.

There are known issues with browsers and proxies that may cause a mismatch in what the browser expects and what it receives with regard to compressed content. Fortunately, these edge cases are dwindling as the use of older browsers drops off. The Apache modules help out by adding appropriate Vary response headers automatically.

Servers choose what to gzip based on file type, but are typically too limited in what they decide to compress. Most web sites gzip their HTML documents. It’s also worthwhile to gzip your scripts and stylesheets, but many web sites miss this opportunity. In fact, it’s worthwhile to compress any text response including XML and JSON. Image and PDF files should not be gzipped because they are already compressed. Trying to gzip them not only wastes CPU but can potentially increase file sizes.
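On Apache 2.x, a minimal mod_deflate setup covering the common text types might look like this (a sketch; adjust the MIME types to match the content you actually serve):

AddOutputFilterByType DEFLATE text/html text/css text/xml application/x-javascript application/json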

Gzipping as many file types as possible is an easy way to reduce page weight and accelerate the user experience.

5. Put Stylesheets at the Top

While researching performance at Yahoo!, we discovered that moving stylesheets to the document HEAD makes pages appear to be loading faster. This is because putting stylesheets in the HEAD allows the page to render progressively.

Front-end engineers that care about performance want a page to load progressively; that is, we want the browser to display whatever content it has as soon as possible. This is especially important for pages with a lot of content and for users on slower Internet connections. The importance of giving users visual feedback, such as progress indicators, has been well researched and documented. In our case the HTML page is the progress indicator! When the browser loads the page progressively the header, the navigation bar, the logo at the top, etc. all serve as visual feedback for the user who is waiting for the page. This improves the overall user experience.

The problem with putting stylesheets near the bottom of the document is that it prohibits progressive rendering in many browsers, including Internet Explorer. These browsers block rendering to avoid having to redraw elements of the page if their styles change. The user is stuck viewing a blank white page.

The HTML specification clearly states that stylesheets are to be included in the HEAD of the page: “Unlike A, [LINK] may only appear in the HEAD section of a document, although it may appear any number of times.” Neither of the alternatives, the blank white screen or flash of unstyled content, are worth the risk. The optimal solution is to follow the HTML specification and load your stylesheets in the document HEAD.

6. Put Scripts at the Bottom

The problem caused by scripts is that they block parallel downloads. The HTTP/1.1 specification suggests that browsers download no more than two components in parallel per hostname. If you serve your images from multiple hostnames, you can get more than two downloads to occur in parallel. While a script is downloading, however, the browser won’t start any other downloads, even on different hostnames.

In some situations it’s not easy to move scripts to the bottom. If, for example, the script uses document.write to insert part of the page’s content, it can’t be moved lower in the page. There might also be scoping issues. In many cases, there are ways to workaround these situations.

An alternative suggestion that often comes up is to use deferred scripts. The DEFER attribute indicates that the script does not contain document.write, and is a clue to browsers that they can continue rendering. Unfortunately, Firefox doesn’t support the DEFER attribute. In Internet Explorer, the script may be deferred, but not as much as desired. If a script can be deferred, it can also be moved to the bottom of the page. That will make your web pages load faster.
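For reference, a deferred script looks like this (example.js is a made-up URL):

<script type="text/javascript" src="example.js" defer="defer"></script>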

7. Avoid CSS Expressions

CSS expressions are a powerful (and dangerous) way to set CSS properties dynamically. They were supported in Internet Explorer starting with version 5, but were deprecated starting with IE8. As an example, the background color could be set to alternate every hour using CSS expressions:

background-color: expression( (new Date()).getHours()%2 ? "#B8D4FF" : "#F08A00" );

As shown here, the expression method accepts a JavaScript expression. The CSS property is set to the result of evaluating the JavaScript expression. The expression method is ignored by other browsers, so it is useful for setting properties in Internet Explorer needed to create a consistent experience across browsers.

The problem with expressions is that they are evaluated more frequently than most people expect. Not only are they evaluated when the page is rendered and resized, but also when the page is scrolled and even when the user moves the mouse over the page. Adding a counter to the CSS expression allows us to keep track of when and how often a CSS expression is evaluated. Moving the mouse around the page can easily generate more than 10,000 evaluations.

One way to reduce the number of times your CSS expression is evaluated is to use one-time expressions, where the first time the expression is evaluated it sets the style property to an explicit value, which replaces the CSS expression. If the style property must be set dynamically throughout the life of the page, using event handlers instead of CSS expressions is an alternative approach. If you must use CSS expressions, remember that they may be evaluated thousands of times and could affect the performance of your page.
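A sketch of the one-time expression technique described above (altBgcolor is a made-up helper name; IE-only, like all CSS expressions):

<style>
p { background-color: expression( altBgcolor(this) ); }
</style>
<script type="text/javascript">
function altBgcolor(el) {
  // Assigning a literal value to the element's inline style overrides the
  // CSS expression, so this function runs only once per element.
  el.style.backgroundColor = (new Date()).getHours() % 2 ? "#B8D4FF" : "#F08A00";
}
</script>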

8. Make JavaScript and CSS External

Many of these performance rules deal with how external components are managed. However, before these considerations arise you should ask a more basic question: Should JavaScript and CSS be contained in external files, or inlined in the page itself?

Using external files in the real world generally produces faster pages because the JavaScript and CSS files are cached by the browser. JavaScript and CSS that are inlined in HTML documents get downloaded every time the HTML document is requested. This reduces the number of HTTP requests that are needed, but increases the size of the HTML document. On the other hand, if the JavaScript and CSS are in external files cached by the browser, the size of the HTML document is reduced without increasing the number of HTTP requests.

The key factor, then, is the frequency with which external JavaScript and CSS components are cached relative to the number of HTML documents requested. This factor, although difficult to quantify, can be gauged using various metrics. If users on your site have multiple page views per session and many of your pages re-use the same scripts and stylesheets, there is a greater potential benefit from cached external files.

Many web sites fall in the middle of these metrics. For these sites, the best solution generally is to deploy the JavaScript and CSS as external files. The only exception where inlining is preferable is with home pages, such as Yahoo!’s front page and My Yahoo!. Home pages that have few (perhaps only one) page view per session may find that inlining JavaScript and CSS results in faster end-user response times.

For front pages that are typically the first of many page views, there are techniques that leverage the reduction of HTTP requests that inlining provides, as well as the caching benefits achieved through using external files. One such technique is to inline JavaScript and CSS in the front page, but dynamically download the external files after the page has finished loading. Subsequent pages would reference the external files that should already be in the browser’s cache.

9. Reduce DNS Lookups

The Domain Name System (DNS) maps hostnames to IP addresses, just as phonebooks map people’s names to their phone numbers. When you type http://www.yahoo.com into your browser, a DNS resolver contacted by the browser returns that server’s IP address. DNS has a cost. It typically takes 20-120 milliseconds for DNS to lookup the IP address for a given hostname. The browser can’t download anything from this hostname until the DNS lookup is completed.

DNS lookups are cached for better performance. This caching can occur on a special caching server, maintained by the user’s ISP or local area network, but there is also caching that occurs on the individual user’s computer. The DNS information remains in the operating system’s DNS cache (the “DNS Client service” on Microsoft Windows). Most browsers have their own caches, separate from the operating system’s cache. As long as the browser keeps a DNS record in its own cache, it doesn’t bother the operating system with a request for the record.

Internet Explorer caches DNS lookups for 30 minutes by default, as specified by the DnsCacheTimeout registry setting. Firefox caches DNS lookups for 1 minute, controlled by the network.dnsCacheExpiration configuration setting. (Fasterfox changes this to 1 hour.)

When the client’s DNS cache is empty (for both the browser and the operating system), the number of DNS lookups is equal to the number of unique hostnames in the web page. This includes the hostnames used in the page’s URL, images, script files, stylesheets, Flash objects, etc. Reducing the number of unique hostnames reduces the number of DNS lookups.

Reducing the number of unique hostnames has the potential to reduce the amount of parallel downloading that takes place in the page. Avoiding DNS lookups cuts response times, but reducing parallel downloads may increase response times. My guideline is to split these components across at least two but no more than four hostnames. This results in a good compromise between reducing DNS lookups and allowing a high degree of parallel downloads.

10. Minify JavaScript and CSS

Minification is the practice of removing unnecessary characters from code to reduce its size thereby improving load times. When code is minified all comments are removed, as well as unneeded white space characters (space, newline, and tab). In the case of JavaScript, this improves response time performance because the size of the downloaded file is reduced. Two popular tools for minifying JavaScript code are JSMin and YUI Compressor. The YUI compressor can also minify CSS.
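For example, minifying a script with the YUI Compressor from the command line (the file names and jar version here are placeholders):

java -jar yuicompressor-2.4.2.jar myfile.js -o myfile-min.js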

Obfuscation is an alternative optimization that can be applied to source code. It’s more complex than minification and thus more likely to generate bugs as a result of the obfuscation step itself. In a survey of ten top U.S. web sites, minification achieved a 21% size reduction versus 25% for obfuscation. Although obfuscation has a higher size reduction, minifying JavaScript is less risky.

In addition to minifying external scripts and styles, inlined <script> and <style> blocks can and should also be minified. Even if you gzip your scripts and styles, minifying them will still reduce the size by 5% or more. As the use and size of JavaScript and CSS increases, so will the savings gained by minifying your code.

11. Avoid Redirects

Redirects are accomplished using the 301 and 302 status codes. Here’s an example of the HTTP headers in a 301 response:

HTTP/1.1 301 Moved Permanently
Location: http://example.com/newuri
Content-Type: text/html

The browser automatically takes the user to the URL specified in the Location field. All the information necessary for a redirect is in the headers. The body of the response is typically empty. Despite their names, neither a 301 nor a 302 response is cached in practice unless additional headers, such as Expires or Cache-Control, indicate it should be. The meta refresh tag and JavaScript are other ways to direct users to a different URL, but if you must do a redirect, the preferred technique is to use the standard 3xx HTTP status codes, primarily to ensure the back button works correctly.

The main thing to remember is that redirects slow down the user experience. Inserting a redirect between the user and the HTML document delays everything in the page since nothing in the page can be rendered and no components can start being downloaded until the HTML document has arrived.

One of the most wasteful redirects happens frequently and web developers are generally not aware of it. It occurs when a trailing slash (/) is missing from a URL that should otherwise have one. For example, going to http://astrology.yahoo.com/astrology results in a 301 response containing a redirect to http://astrology.yahoo.com/astrology/ (notice the added trailing slash). This is fixed in Apache by using Alias or mod_rewrite, or the DirectorySlash directive if you’re using Apache handlers.

Connecting an old web site to a new one is another common use for redirects. Others include connecting different parts of a website and directing the user based on certain conditions (type of browser, type of user account, etc.). Using a redirect to connect two web sites is simple and requires little additional coding. Although using redirects in these situations reduces the complexity for developers, it degrades the user experience. Alternatives for this use of redirects include using Alias and mod_rewrite if the two code paths are hosted on the same server. If a domain name change is the cause of using redirects, an alternative is to create a CNAME (a DNS record that creates an alias pointing from one domain name to another) in combination with Alias or mod_rewrite.

12. Remove Duplicate Scripts

It hurts performance to include the same JavaScript file twice in one page. This isn’t as unusual as you might think. A review of the ten top U.S. web sites shows that two of them contain a duplicated script. Two main factors increase the odds of a script being duplicated in a single web page: team size and number of scripts. When it does happen, duplicate scripts hurt performance by creating unnecessary HTTP requests and wasted JavaScript execution.

Unnecessary HTTP requests happen in Internet Explorer, but not in Firefox. In Internet Explorer, if an external script is included twice and is not cacheable, it generates two HTTP requests during page loading. Even if the script is cacheable, extra HTTP requests occur when the user reloads the page.

In addition to generating wasteful HTTP requests, time is wasted evaluating the script multiple times. This redundant JavaScript execution happens in both Firefox and Internet Explorer, regardless of whether the script is cacheable.

One way to avoid accidentally including the same script twice is to implement a script management module in your templating system. The typical way to include a script is to use the SCRIPT tag in your HTML page.

An alternative in PHP would be to create a function called insertScript.

In addition to preventing the same script from being inserted multiple times, this function could handle other issues with scripts, such as dependency checking and adding version numbers to script filenames to support far future Expires headers.
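A minimal sketch of such a function (dependency checking and versioning omitted; the registry here is a simple global array):

<?php
// Remember which scripts have already been emitted on this page.
$insertedScripts = array();

function insertScript($jsfile) {
    global $insertedScripts;
    if (isset($insertedScripts[$jsfile])) {
        return; // already on the page; skip the duplicate
    }
    $insertedScripts[$jsfile] = true;
    echo '<script type="text/javascript" src="' . $jsfile . '"></script>';
}

insertScript("menu.js");
insertScript("menu.js"); // silently ignored the second time
?>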

13. Configure ETags

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser's cache matches the one on the origin server. (An "entity" is another word for "component": images, scripts, stylesheets, etc.) ETags were added to provide a mechanism for validating entities that is more flexible than the last-modified date. An ETag is a string that uniquely identifies a specific version of a component. The only format constraint is that the string be quoted. The origin server specifies the component's ETag using the ETag response header.

HTTP/1.1 200 OK
Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
ETag: "10c24bc-4ab-457e1c1f"
Content-Length: 12195

Later, if the browser has to validate a component, it uses the If-None-Match header to pass the ETag back to the origin server. If the ETags match, a 304 status code is returned, reducing the response by 12195 bytes in this example.

GET /i/yahoo.gif HTTP/1.1
Host: us.yimg.com
If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
If-None-Match: "10c24bc-4ab-457e1c1f"
HTTP/1.1 304 Not Modified

The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won’t match when a browser gets the original component from one server and later tries to validate that component on a different server, a situation that is all too common on Web sites that use a cluster of servers to handle requests. By default, both Apache and IIS embed data in the ETag that dramatically reduces the odds of the validity test succeeding on web sites with multiple servers.

The ETag format for Apache 1.3 and 2.x is inode-size-timestamp. Although a given file may reside in the same directory across multiple servers, and have the same file size, permissions, timestamp, etc., its inode is different from one server to the next.

IIS 5.0 and 6.0 have a similar issue with ETags. The format for ETags on IIS is Filetimestamp:ChangeNumber. A ChangeNumber is a counter used to track configuration changes to IIS. It’s unlikely that the ChangeNumber is the same across all IIS servers behind a web site.

The end result is ETags generated by Apache and IIS for the exact same component won’t match from one server to another. If the ETags don’t match, the user doesn’t receive the small, fast 304 response that ETags were designed for; instead, they’ll get a normal 200 response along with all the data for the component. If you host your web site on just one server, this isn’t a problem. But if you have multiple servers hosting your web site, and you’re using Apache or IIS with the default ETag configuration, your users are getting slower pages, your servers have a higher load, you’re consuming greater bandwidth, and proxies aren’t caching your content efficiently. Even if your components have a far future Expires header, a conditional GET request is still made whenever the user hits Reload or Refresh.

If you’re not taking advantage of the flexible validation model that ETags provide, it’s better to just remove the ETag altogether. The Last-Modified header validates based on the component’s timestamp. And removing the ETag reduces the size of the HTTP headers in both the response and subsequent requests. This Microsoft Support article describes how to remove ETags. In Apache, this is done by simply adding the following line to your Apache configuration file:

FileETag none

14. Make Ajax Cacheable

One of the cited benefits of Ajax is that it provides instantaneous feedback to the user because it requests information asynchronously from the backend web server. However, using Ajax is no guarantee that the user won’t be twiddling his thumbs waiting for those asynchronous JavaScript and XML responses to return. In many applications, whether or not the user is kept waiting depends on how Ajax is used. For example, in a web-based email client the user will be kept waiting for the results of an Ajax request to find all the email messages that match their search criteria. It’s important to remember that “asynchronous” does not imply “instantaneous”.

To improve performance, it’s important to optimize these Ajax responses. The most important way to improve the performance of Ajax is to make the responses cacheable, as discussed in Add an Expires or a Cache-Control Header. Some of the other rules also apply to Ajax:

* Gzip Components
* Reduce DNS Lookups
* Minify JavaScript
* Avoid Redirects
* Configure ETags

Let’s look at an example. A Web 2.0 email client might use Ajax to download the user’s address book for autocompletion. If the user hasn’t modified her address book since the last time she used the email web app, the previous address book response could be read from cache if that Ajax response was made cacheable with a future Expires or Cache-Control header. The browser must be informed when to use a previously cached address book response versus requesting a new one. This could be done by adding a timestamp to the address book Ajax URL indicating the last time the user modified her address book, for example, &t=1190241612. If the address book hasn’t been modified since the last download, the timestamp will be the same and the address book will be read from the browser’s cache eliminating an extra HTTP roundtrip. If the user has modified her address book, the timestamp ensures the new URL doesn’t match the cached response, and the browser will request the updated address book entries.

Even though your Ajax responses are created dynamically, and might only be applicable to a single user, they can still be cached. Doing so will make your Web 2.0 apps faster.

Benefits of desktop development

There are a number of benefits to desktop development, from technical advantages to fundamental benefits. Here are some reasons why you should be developing desktop applications.

Technical advantages

There are a number of basic technical advantages to desktop-based applications. They can interact with the local filesystem, allowing settings to be saved on the user’s computer, files to be retrieved and manipulated, databases to be stored and so on. They generally integrate with the GUI toolkit of the operating system, providing a UI style that the user is familiar with – users are still getting used to the web application interfaces of our web 2.0 applications. And finally, they use consistent interfaces, unlike the hope-for-the-best approach to writing HTML, allowing for complex UI elements.

Fundamental advantages

However, perhaps most importantly, they are geared to user-driven applications, especially where the user has content to share. Web applications can easily bring users together for effective sharing of information. Desktop applications, on the other hand, can easily work with the user and manage information, especially when there is no need to share that information.

Gaming is an excellent example; interface and graphics issues aside, games are built around the user, and as a result work very well on the desktop. Image editing is another; despite the availability of online Photoshop alternatives, everything at the higher end of image editing is still squarely on the desktop. Sure, there's Flickr, but Flickr is more geared towards the sharing of photos and basic editing.

As a result, if you're considering building an application that fits into one of these categories, and you've currently got a web application in mind, a desktop application is definitely worth considering.

To demonstrate basic application development in PHP, we'll take a look at the PHP-GTK extension. PHP-GTK offers GTK bindings for PHP, allowing you to build scripts that create windows using the GTK graphics toolkit.

What is PHP-GTK?

GTK is an acronym for the GIMP Toolkit, and GIMP is an acronym for GNU Image Manipulation Program, a fully featured graphics editing program that runs on Linux. It has many (if not all) of the features of popular Windows programs such as Photoshop and Paint Shop Pro. It's the graphics editor of choice for most Linux users.

GTK is actually part of a set of libraries written in C called GTK+. GTK+ was built up over time and is now a core part of GNOME, a Linux GUI desktop environment. GTK+ has an object-oriented design and also includes two other libraries:

1. GLib: A library of tools that can be used to assist developers when creating applications with GTK+.
2. GDK: Similar to GDI for Win32. GDK stands for GIMP Drawing Kit, and it wraps a set of lower-level drawing functions into classes that make developing applications with GTK+ easier. If you're thinking along the lines of MFC for C++ then you're making a fair comparison: MFC wraps several controls and hides the calls to the underlying Windows APIs from the developer. GDK does the same thing for GTK+.

Where to get it?

We can download binary as well as source-code versions of PHP-GTK from http://gtk.php.net/download.php. For a beginner, downloading and installing this way can be a difficult process, since we need to set up a separate php.ini file for PHP-GTK. There is another way of installing it: we can get PHP-GTK2 in executable form, just as we get WAMP.EXE (Windows, Apache, MySQL, PHP). All we have to do is download the files from http://www.gnope.org/download.php, unzip them, and double-click the GnopeSetup-1.5.1.exe icon. It will run through a step-by-step process and set up PHP-GTK automatically.
How to test the installation?

Once the installation is done, we will be eager to see what is special about it. When we install PHP, we normally run phpinfo() from the root directory to test it. Here, let us run a sample script which displays "Hello world" (as usual). We can use Dreamweaver for editing the code. Another important point to keep in mind is to save the file with the extension .phpw; it can be saved anywhere on your hard disk.

Here is the sample code:

<?php
// Create the top-level window and set its title.
$wnd = new GtkWindow();
$wnd->set_title('Hello world');
// Quit the GTK main loop when the window is closed.
$wnd->connect_simple('destroy', array('Gtk', 'main_quit'));
$lblHello = new GtkLabel("hello world");
$wnd->add($lblHello);
$wnd->show_all();
// Hand control over to the GTK event loop.
Gtk::main();
?>

I have saved this sample file as hello.phpw in c:\test\. We should run this sample code from the command line interface (CLI). You may be wondering why we run it from the command prompt rather than just double-clicking it, since it is a standalone application. That is possible, but it requires a PHP compiler that converts PHP-GTK code to an EXE file. For now, let us try running it from the command prompt.

Steps for Executing a sample code:

1. Start -> Run -> cmd (for XP SP2 and later versions) or command (for Windows 98).
2. Now you could see a black window which is Command Prompt. Key in the following commands as shown below.

z:>c:
c:> cd test
c:\test>php hello.phpw

Once we finish this line and hit the Enter key, we should see the desired output.

This confirms the successful installation of PHP-GTK2 and shows us PHP output without a web browser.

Some interesting Websites on PHP-GTK:

1. http://www.kksou.com/php-gtk2/
2. http://phpgtk.activeventure.com/gtk/gtk.gtkbox.html (describes all the base classes).
3. http://gtk.php.net/download.php
4. http://www.gnope.org/download.php


Introduction to PHP5

Hey readers,

We'll be going through the features of PHP 5. Before getting to the new features added in PHP 5, let's brush up on some PHP basics.

What is PHP?

PHP (PHP: Hypertext Preprocessor) is a server-side scripting language.

What is a server-side scripting language?

A server-side scripting language is one whose scripts are run directly on the web server to generate dynamic web pages. It is usually used to provide interactive web sites that interface with databases or other data stores. The primary advantage of server-side scripting is the ability to highly customize the response based on the user's requirements, access rights, or queries into data stores.

The Zend Engine is the brain of PHP 5. Before going into the features of PHP 5, we should know what the Zend Engine is.

What is the Zend Engine?

The Zend Engine is the internal compiler and runtime engine used by PHP 4. Developed by Zeev Suraski and Andi Gutmans, "Zend" is an abbreviation of their first names. PHP 5 now uses Zend Engine 2.

Coming back to the Zend Engine: a PHP script is loaded by the Zend Engine and compiled into Zend opcodes. Opcodes, short for operation codes, are low-level binary instructions. The opcodes are then executed, and the generated HTML is sent to the client. The opcodes are flushed from memory after execution.

Now we'll briefly discuss the new features of PHP 5.

An Introduction to PHP – PHP 5 Features

* Vastly improved object-oriented capabilities:
The object-oriented features of PHP 5 make it an instant success.
Version 5 includes numerous functional additions such as explicit constructors and destructors, object cloning, class abstraction, variable scoping, interfaces, and a major improvement in object management.

The var keyword for identifying class properties is deprecated and will throw an E_STRICT warning:

PHP Strict Standards: var: Deprecated. Please use the public/private/protected modifiers in obj.php on line 3.

Instead, you should use the public, private, or protected keywords.

To copy an object in PHP 5 you need to make use of the clone keyword.

This keyword does the job that $obj2 = $obj; did in PHP 4.
Being a keyword, clone supports a number of different, but equivalent syntaxes.
class A { public $foo; }

$a = new A;
$a_copy = clone $a;
$a_another_copy = clone($a);

$a->foo = 1; $a_copy->foo = 2; $a_another_copy->foo = 3;

echo $a->foo . $a_copy->foo . $a_another_copy->foo;
// will print 123
__clone() can be extended to further modify the newly made copy.
class A {
    public $is_copy = FALSE;

    public function __clone() {
        $this->is_copy = TRUE;
    }
}
$a = new A;
$b = clone $a;
var_dump($a->is_copy, $b->is_copy); // false, true

You can declare final classes and methods in PHP 5.
For methods, this means they cannot be overridden by a child class.

Classes declared final cannot be extended.

You also have an __autoload() function in PHP 5:

<?php
function __autoload($class_name) {
    require_once "/php/classes/{$class_name}.inc.php";
}
$a = new Class1;
?>

The function will be used to automatically load any needed class that is not yet defined.

Objects in PHP 5 support a number of magic methods. Three worth noting:

__sleep() – allows the scope of object serialization to be limited (not new to PHP 5).

__wakeup() – restores an object's properties after deserialization.

__toString() – provides an object-to-string conversion mechanism.
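A quick illustration of __toString() (the Money class is made up for the example):

<?php
class Money {
    private $amount = 42;

    // Called when the object is used in a string context, e.g. by echo.
    public function __toString() {
        return "$" . $this->amount;
    }
}

$m = new Money();
echo $m; // prints $42
?>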

* Try/catch exception handling: you can now handle errors using an exception-handling mechanism, as sketched below.
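A minimal example of the mechanism (the error message is made up):

<?php
try {
    // Code that may fail goes here.
    throw new Exception('Something went wrong');
} catch (Exception $e) {
    // Handle the failure instead of halting the script.
    echo 'Caught: ' . $e->getMessage();
}
?>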

* Improved XML and Web Services support: XML support is now based on the libxml2 library, and a new and rather promising extension for parsing and manipulating XML, known as SimpleXML, has been introduced. In addition, a SOAP extension is now available.
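A taste of SimpleXML (a minimal sketch; the XML string is made up):

<?php
$xml = simplexml_load_string('<books><book><title>PHP 5 Basics</title></book></books>');
// Element access reads like object property access.
echo $xml->book[0]->title; // prints "PHP 5 Basics"
?>

Now a bit about SOAP.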

SOAP stands for Simple Object Access Protocol.
This is a lightweight XML-based protocol for exchanging structured information between distributed applications over native web protocols, such as HTTP. SOAP specifies the formats that XML messages should use, the way in which they should be processed, a set of encoding rules for standard and application-defined data types, and a convention for representing remote procedure calls and responses.

* Native support for SQLite: PHP 5 ships with support for SQLite, an embedded database. SQLite offers a convenient solution for developers looking for many of the features found in some of the heavyweight database products without incurring the accompanying administrative overhead. A small example is sketched below.
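A quick taste of the bundled SQLite extension (a sketch using an in-memory database; the table and values are made up):

<?php
// Open an in-memory SQLite database: no file, no server process.
$db = sqlite_open(':memory:');
sqlite_query($db, 'CREATE TABLE greetings (message TEXT)');
sqlite_query($db, "INSERT INTO greetings VALUES ('Hello from SQLite')");
$result = sqlite_query($db, 'SELECT message FROM greetings');
$row = sqlite_fetch_array($result, SQLITE_ASSOC);
echo $row['message']; // prints "Hello from SQLite"
sqlite_close($db);
?>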

That covers the newly added features of PHP 5.