Web Architecture and Frameworks

This lecture is an introduction to the technologies that enable mobile web apps and make programming them easier. The lecture attempts to introduce important terms and give an overview of the structure of the technologies. It is not intended to be an exhaustive list of all the potential technologies you could use to create a web app or a comprehensive description of the technologies. It does contain references where you can find more information about them. I hope it is a good start for you and that it will give us all a common language for speaking about the technologies.

The Internet: Client-Server Model and HTTP

The internet is a network of machines that interact using a client-server model. In that model, the client sends “requests” to the server, and the server sends “responses” back to the client. For our web apps, the client is the browser and the server is Apache Tomcat running on the server machine. The machines in the network need a common language, or “protocol,” to speak with each other, and there are many protocols. In fact, the protocols are organized in layers, called the “internet protocol suite,” going from the link layer to the internet, transport, and finally application layers. We are only concerned with the topmost layer, the “application layer.” HTTP (Hypertext Transfer Protocol) is one of the application protocols; others that machines can use to speak with each other include POP, SSH, Telnet, and FTP.

HTTP is the most pervasive protocol and is used for serving webpages. In HTTP/1.x, the request and response are sent as plain ASCII text. A request consists of three parts: the “request line,” the “header,” and an optional “body.” The request line contains the “request method,” the “request URL,” and the HTTP version. The header contains information about the request and, since HTTP 1.1, must contain the host domain name. The response contains the “response line” (also called the status line), the “header,” and almost always the “body.” The response line contains the HTTP version, the “status code,” and the “status message.” The response header contains information about the response, for example the “content-type” and “connection.” The response body is typically the content that was requested, for example the web page expressed as HTML.
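To make the message structure concrete, here is a minimal sketch that splits a raw HTTP/1.1 request into its request line, header, and body. The function name parseRequest and the sample message are made up for illustration; real servers use far more robust parsers.

```javascript
// Hypothetical helper: split a raw HTTP/1.1 request into its three parts.
function parseRequest(raw) {
  // A blank line (CRLF CRLF) separates the header section from the body.
  const split = raw.indexOf("\r\n\r\n");
  const head = split === -1 ? raw : raw.slice(0, split);
  const body = split === -1 ? "" : raw.slice(split + 4);
  const lines = head.split("\r\n");

  // Request line: method, request URL, HTTP version.
  const [method, url, version] = lines[0].split(" ");

  // The remaining lines are "Name: value" header fields.
  const headers = {};
  for (const line of lines.slice(1)) {
    const i = line.indexOf(":");
    if (i === -1) continue; // skip anything that is not a header field
    headers[line.slice(0, i).toLowerCase()] = line.slice(i + 1).trim();
  }
  return { method, url, version, headers, body };
}

// Sample request, written out by hand for illustration.
const sample =
  "GET /index.html HTTP/1.1\r\n" +
  "Host: example.com\r\n" +
  "Accept: text/html\r\n" +
  "\r\n";

const request = parseRequest(sample);
console.log(request.method, request.url, request.version);
console.log(request.headers.host);
```

A GET request like this one has an empty body; a POST request would carry the form data after the blank line.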

You can view the request and response headers using the browser’s developer tools. After opening the developer tools, find the network tab, which shows a list of request-response pairs. Even relatively simple webpages make several requests for a single page. In fact, it is hard to find a webpage consisting of a single request. Go to

example.com

to find an example of a single GET request. Click on a request line and you should be able to see the header information. Note that the response body is the HTML code for the page. Point the browser to other webpages and you will see that the typical webpage makes many requests. Many of the requests are for loading images, JavaScript, or CSS into the webpage. In fact, every resource referenced by the page is fetched with its own GET request.

Status Codes

You should become familiar with the common status codes and messages that the server sends.

200s are success

  • 200 OK

400s are client errors

  • 401 Unauthorized
  • 403 Forbidden
  • 404 Not Found
  • 408 Request Timeout

500s are server errors

  • 500 Internal Server Error
  • 504 Gateway Timeout

For the complete list of status codes:

http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
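The grouping of status codes by their first digit can be sketched in a few lines. The function name statusClass is hypothetical. The 100s and 300s classes are not listed above but are part of the same scheme.

```javascript
// Hypothetical helper: name the class of a status code from its first digit.
function statusClass(code) {
  const classes = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
  };
  return classes[Math.floor(code / 100)] || "unknown";
}

console.log(statusClass(200)); // success
console.log(statusClass(404)); // client error
console.log(statusClass(504)); // server error
```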

Methods

You should become familiar with some of the common methods.

  • GET – should only retrieve information from the server. It is the method for static webpages, but can also be used for dynamic webpages that only read from the database. A GET request normally does not have a body.
  • POST – requests that the body be added to the resource, typically a database table. The map (meaning the data) from a web form is sent in the body of a POST request. Typically, creating a new database entry is a POST request.
  • PUT – requests that the body of the request replace the resource. Typically, PUT requests are updates to the database.
  • DELETE – requests removal of an entry from the resource. Removing an entry from the database is typically a DELETE request.

For the complete list of HTTP request methods:

http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods

The use of proper request methods is only a guideline and is enforced only by your server coding standards. Note that the code Grails generates automatically adheres to these standards. GET is considered a safe request because it does not change anything on the server, while POST, PUT, and DELETE change data on the server. PUT and DELETE are also idempotent, meaning that multiple identical calls have the same effect as a single call.
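The difference between POST and the idempotent methods can be seen in a small sketch. The in-memory Map below stands in for a database table, and the function names post, put, and del are made up for illustration; no real server or framework is involved.

```javascript
// In-memory stand-in for a database table (hypothetical, for illustration).
const table = new Map();
let nextId = 1;

// POST: add a new entry; each call creates another row.
function post(body) {
  const id = nextId++;
  table.set(id, body);
  return id;
}

// PUT: replace the entry with the given id.
function put(id, body) {
  table.set(id, body);
}

// DELETE: remove the entry with the given id (no effect if already gone).
function del(id) {
  table.delete(id);
}

post({ name: "Ann" });
post({ name: "Ann" }); // the same request twice creates two rows
const afterPosts = table.size;

put(1, { name: "Bob" });
put(1, { name: "Bob" }); // a repeated PUT leaves the same state
const afterPuts = table.size;

del(2);
del(2); // a repeated DELETE leaves the same state
const afterDeletes = table.size;

console.log(afterPosts, afterPuts, afterDeletes);
```

Repeating the POST changes the table again, which is why POST is not idempotent, while repeating the PUT or the DELETE changes nothing further.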

Representational State Transfer (REST)

REST is an architecture for web services. It is only a design standard for web services and can be enforced only by server coding practices. Some of the important properties of REST:

  • Client-server – separation of concerns between the client and server. For example the client should not be concerned with data storage or the database that is provided by the server.
  • Stateless – client-server communication does not depend on the state of the server or client. No client information is stored on the server. Session state is held by the client.
  • Uniform Interface – contains “identification of resources” and “self-descriptive messages” besides others.

Specifically for web services, the REST standard implies using a base URI, JSON or XML for data transfer, proper use of the request methods (GET, PUT, POST, and DELETE), and hypertext links for representing the state of the client or requested resource. The advantages of the REST architecture are that it scales very well with the number of clients requesting services and that it makes the client request code very visible. As much as possible, websites should adhere to the REST architecture.

But it is not always possible to adhere to the REST architectural standards. For example, the typical practice is not to store all the session data on the client. Generally only the session ID is stored on the client. For security, we would not want all the session data stored on the client and then transmitted for every request.  For example, should the authentication happen on every webpage request? Also if the web app is to work offline then the client must be concerned with data storage.
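As a rough sketch of how a base URI combines with the request methods in a REST-style service, the function below names the operation for a given method and path. The base URI /api/books and the function name describeRoute are assumptions made up for this example.

```javascript
// Hypothetical REST-style routes for a "books" resource under a base URI.
const baseUri = "/api/books";

function describeRoute(method, path) {
  // A path such as /api/books/7 addresses a single book by its id.
  const hasId =
    path.startsWith(baseUri + "/") && path.length > baseUri.length + 1;
  if (path !== baseUri && !hasId) return "not a books route";

  switch (method) {
    case "GET":
      return hasId ? "read one book" : "list all books";
    case "POST":
      return hasId ? "not supported" : "create a book";
    case "PUT":
      return hasId ? "replace a book" : "not supported";
    case "DELETE":
      return hasId ? "delete a book" : "not supported";
    default:
      return "not supported";
  }
}

console.log(describeRoute("GET", "/api/books"));
console.log(describeRoute("PUT", "/api/books/7"));
```

The URL identifies the resource and the method identifies the operation, which is what makes the client requests easy to read.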

HTTP Cookies

Cookies enable the server to store a small amount of data on the client. Browsers typically limit each cookie to about 4 KB, and it is considered bad technique to store more than a few hundred bytes in a cookie. Cookies are set in the header of the server response, stored by the browser, and associated with the domain of the URL. On the next request, the browser checks whether any cookies are associated with the domain of the URL in the request. If there are, the browser attaches them to the header of the request. Cookies contain a map of names and values. You can see the cookies being sent using the browser’s developer tools.

A typical use of cookies is to store state information on the client; such cookies are called session cookies. For example, when a user logs in, the server creates a session ID and a file on the server containing the session parameters. A file is used to store the session values because the session parameters are not long lasting, but a database can also be used. The session ID is sent to the browser in a cookie. On the next request, the cookie is sent back to the server. The server can then use the session ID to find the file containing the session values for this request.
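The round trip of a session cookie can be sketched as follows. Everything here is a simplified assumption: the names parseCookieHeader, login, and sessionFor are hypothetical, a Map stands in for the session files on the server, and the session ID is a simple counter, whereas a real ID must be long and random.

```javascript
// Hypothetical helper: turn a Cookie request header into a name -> value map.
function parseCookieHeader(header) {
  const cookies = {};
  for (const pair of header.split(";")) {
    const i = pair.indexOf("=");
    if (i === -1) continue; // skip malformed pieces
    cookies[pair.slice(0, i).trim()] = pair.slice(i + 1).trim();
  }
  return cookies;
}

// Session id -> session values (stand-in for files kept on the server).
const sessions = new Map();

// On login the server stores the session values and returns the cookie
// it would send in a response header such as "Set-Cookie: sessionId=s1".
function login(user) {
  const id = "s" + (sessions.size + 1); // real ids must be long and random
  sessions.set(id, { user });
  return "sessionId=" + id;
}

// On a later request the server reads the Cookie header sent by the browser
// and uses the session id to look up the stored values.
function sessionFor(cookieHeader) {
  const id = parseCookieHeader(cookieHeader).sessionId;
  return sessions.get(id);
}

const cookie = login("ann");
const session = sessionFor(cookie + "; theme=dark");
console.log(session.user);
```

Only the session ID travels with each request; the session values themselves stay on the server.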

Dynamic Web Technology

 Common Gateway Interface (CGI)

In the early 90s, the National Center for Supercomputing Applications (NCSA) developed a specification for calling scripts on their servers from the URL. It was quickly adopted by other web servers. The technique is simple.  The last portion of the URL is divided into two sections.

 <script name>?<parameter-value list>

The script name and parameter-value list are separated by a question mark, “?”. The parameter-value list is a list of parameter name and value pairs separated by ampersands, “&”.

 <parameter-value list> = <parameter 1 name>=<value 1>&<parameter 2 name>=<value 2>

An example from the NOAA website for calling the script to generate weather for Houghton is

 http://forecast.weather.gov/MapClick.php?CityName=Houghton&state=MI&site=MQT

An advantage of CGI is that all the information for generating the webpage is contained in the URL. So, the user can save the URL and reuse it just as any other link.
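As an illustration, the NOAA URL above can be taken apart with the standard URL class, which is available in browsers and in Node.js. The variable names are arbitrary.

```javascript
// Split the example URL into the script name and its parameter-value list.
const url = new URL(
  "http://forecast.weather.gov/MapClick.php?CityName=Houghton&state=MI&site=MQT"
);

// The script name is the path portion before the "?".
console.log(url.pathname);

// searchParams iterates over the name=value pairs separated by "&".
for (const [name, value] of url.searchParams) {
  console.log(name + " = " + value);
}

// The pairs can also be collected into a plain object (a map).
const params = Object.fromEntries(url.searchParams);
console.log(params.CityName);
```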

Before PHP, Perl was the scripting language for the server backend. The Perl script would generate the entire HTML code and send it to “standard out”. The server then included the output in the body of the response. This was a very clumsy technique that mixed backend (Perl scripts) with frontend (HTML) code. Consequently, web developers had to be proficient at both Perl and HTML.

PHP 3 (PHP: Hypertext Preprocessor) (1997) helped to separate frontend and backend code by using PHP tags, <?php … ?>, in the HTML pages. The frontend developer can learn a little PHP and write the rest of the webpage using HTML tags.

Java Servlet

In 1997, Pavni Diwanji specified the Servlet while working at Sun Microsystems. Servlets are a middle layer between a request from the browser and the database or applications on the server. Java Servlets are a Java alternative to CGI. They have several advantages over CGI:

  • Performance: a servlet container creates a thread for each request, not a process. CGI opens a new process for each request, which consumes more memory because each process gets its own block of memory, whereas threads in a process share memory. Opening a new process also requires more overhead and is slower than opening a new thread.
  • Portability: because servlets use the Java language.
  • Robustness: servlets are managed by the JVM, so we do not need to worry as much about memory leaks, garbage collection, etc. But this does not always work.
  • Security: because servlets can use the Java language and Java APIs.

Java Servlets require a web container to manage them. Apache Tomcat is an open-source Java web server developed by the Apache Software Foundation (ASF). Tomcat implements several Java EE specifications, including Java Servlet and JavaServer Pages (JSP). Tomcat has many components. The most important component for us is Catalina. Catalina is the servlet container, often called the “container” for short. The container packages the request, decides which servlet should process it, and passes the request to that servlet.

Servlets have a life cycle that includes:

  • init – runs when the servlet is instantiated
  • service – processes client requests
  • destroy – runs when the servlet is terminated

Consequently, the servlet has access to objects in several scopes:

  • Web context – life of the servlet
  • Session – across multiple requests from a client
  • Request – a single request
  • Page – the JSP page that creates the object

Backend Frameworks

After the development of backend scripting languages for web servers, backend frameworks emerged. Some of the key components of the backend frameworks are

  • Model-View-Controller design pattern
  • Routing or URL Mapping.
  • Database including Object-Relational Mapping (ORM)
  • Security
  • Template System and additional tags
  • Scaffolding

The Model-View-Controller (MVC) design pattern has become the standard for web development, although one very popular web development tool, WordPress, is not MVC based. MVC is very effective at separating programming concerns.

Routing or URL Mapping using the MVC paradigm has also become standard with frameworks. Nearly all frameworks use controller-action URL mappings.

<domain>/<controller>/<action>/<parameters>

For PHP frameworks the real mapping is

<domain>/index.php/<controller>/<action>/<parameters>

The index.php is added by the .htaccess file. This mapping illustrates how the framework works. The index.php script is a bootstrap script that calls all the supporting scripts, sets parameter values, and parses the rest of the URL to call the action in the controller. As you know, the controller then accesses the database through the model and sends the appropriate map to the view.
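The controller-action mapping can be sketched as a tiny dispatcher. This is only an illustration of the idea, not how any particular framework implements it; the controllers object, its book controller, and the dispatch function are all hypothetical.

```javascript
// Hypothetical controllers; each action receives the remaining URL segments.
const controllers = {
  book: {
    show: (id) => "showing book " + id,
    list: () => "listing books",
  },
};

// Bootstrap sketch: parse <controller>/<action>/<parameters> and call the action.
function dispatch(path) {
  const [controllerName, actionName, ...params] = path
    .split("/")
    .filter(Boolean); // drop empty segments such as the one before the first "/"
  const controller = controllers[controllerName];
  const action = controller && controller[actionName];
  if (typeof action !== "function") return "404 Not Found";
  return action(...params);
}

console.log(dispatch("/book/show/42"));
console.log(dispatch("/book/list"));
console.log(dispatch("/author/show/1"));
```

A real framework would also pass the request to the action and hand the returned map to a view template.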

Grails is an example of a JSP/Servlet framework. Grails is built on top of the Spring framework which is a larger framework for rapid development of Java native and web applications.

https://spring.io/

The Spring Web MVC component of a Spring application creates a DispatcherServlet and registers the DispatcherServlet with the container. The DispatcherServlet delegates the request to the proper controller, which returns the model/map to the DispatcherServlet. The DispatcherServlet then sends the model/map to the view to compose the response. The response is passed to the container.

request  -----> Front      ----------> specific
response <----- Controller <--model--- Controller
                 ^   |
                 | model
                 |   |
                 |   V
                View
                template

The DispatcherServlet is the Front Controller.

https://docs.spring.io/spring/docs/current/spring-framework-reference/html/mvc.html

A template system reduces HTML code duplication. Frameworks also generally provide a few simple HTML tags to reduce the amount of backend coding a frontend developer has to learn. These tags also simplify or clean up the appearance of the HTML code in the view.

Most frameworks offer an ORM for accessing the database.

Most frameworks offer some form of authentication support. They generally offer login, session parameters, and tags to expose sections of pages to different user types. Grails is fairly unique in offering annotations and URL mapping control.

Only a few frameworks offer scaffolding or automatic code generation for the CRUD and backend administration. The scaffolding is almost always only appropriate for what I call the “administrative backend views.”

I have chosen Grails for the development of your web apps because it offers all these features, and because it is well documented and supported. In addition, the school can support it. The only other technology the school will support is PHP.

JavaScript

JavaScript was written by Netscape in the mid-1990s for its web browsers. It is not related to Java; Netscape licensed the Java name from Sun for marketing. JavaScript syntax looks very much like C, but JavaScript is a prototype-based object language. Objects are made on the fly rather than defined by a class, but objects can inherit through the prototype.
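A short sketch of an object made on the fly and a second object that inherits from it through the prototype. The animal and dog objects are invented for the example.

```javascript
// An object made on the fly, with no class definition.
const animal = {
  name: "animal",
  describe() {
    return this.name + " makes a sound";
  },
};

// dog inherits from animal through its prototype.
const dog = Object.create(animal);
dog.name = "dog";

// describe is not defined on dog itself; it is found on the prototype.
console.log(dog.describe());
console.log(Object.getPrototypeOf(dog) === animal);
console.log(Object.prototype.hasOwnProperty.call(dog, "describe"));
```

The lookup walks from dog to animal to find describe, while this still refers to dog, so the method uses the dog's own name.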

The original role of JavaScript was to enable frontend dynamics such as button shading to indicate clicking on a button or to animate portions of the webpage.

CGI and JavaScript combined with the HTML “form” tag formed the original basis for dynamic webpages. CGI enabled interaction with backend scripts that could then interface with the database, while JavaScript could enhance the frontend interaction with the human.

As mentioned before, this was a clunky way to develop websites. Although PHP tags helped to separate frontend code from backend code, there was not much more to provide structure to the code.

AJAX (Asynchronous JavaScript + XML)

AJAX is a combination of technologies used to generate portions of webpages asynchronously (in the background). The technologies are:

  • HTML and CSS for the presentation
  • Document Object Model (DOM) for dynamic display of the data
  • XML for data transfer from the server to the client, although now JSON is more common.
  • XMLHttpRequest object for asynchronous communication with server
  • JavaScript to tie everything together

Document Object Model (DOM)

The Document Object Model (DOM) is a language- and platform-independent standard for representing objects in the HTML coded page. The browser internally models all the elements of the web page as nodes in a tree. This is possible because HTML is inherently hierarchical. The entire page is contained between the <html> … </html> tags, which represent the “root” node. The rest of the tags, <div> etc., are nested within each other, designating parent-child relations and constructing a tree.

Each object type also implements an interface. This is a Java-like interface, meaning that element objects have methods that can be called and properties that can be accessed. Not only is the webpage represented hierarchically, but the interfaces are hierarchical as well.
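To picture the tree, here is a sketch that models a small page with plain objects and walks it depth first. This is only a stand-in for illustration and not the browser's real DOM API; the page object and the walk function are made up.

```javascript
// Plain-object stand-in for a DOM tree (not the browser's real DOM API).
const page = {
  tag: "html",
  children: [
    { tag: "head", children: [{ tag: "title", children: [] }] },
    {
      tag: "body",
      children: [{ tag: "div", children: [{ tag: "p", children: [] }] }],
    },
  ],
};

// Depth-first walk that records each tag, indented by its depth in the tree.
function walk(node, depth = 0, out = []) {
  out.push("  ".repeat(depth) + node.tag);
  for (const child of node.children) {
    walk(child, depth + 1, out);
  }
  return out;
}

console.log(walk(page).join("\n"));
```

In a browser, the corresponding starting point would be document.documentElement, and each element exposes its child elements through its children property.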

Important DOM objects and interfaces

  • document – is the root node of the DOM. It contains all the content of the webpage, including the body tag.
  • element – represents a tag in the page, for example a <div>. It is one of the fundamental types.
  • node – is the base type; the document, elements, and text are all nodes.
  • nodeList – is an array-like collection of nodes.

There are also interfaces for additional objects that are not really in the DOM

  • window – is an object representing the browser window.
  • text – is the object representing the text in a node or an attribute.
  • event – events generated by the document elements.

There are too many properties and methods in the interfaces for me to list. A good resource is the summary on the Mozilla Developer Network (MDN):

https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model

and this introduction:

https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction

JavaScript Object Notation (JSON)

JSON has replaced XML for most AJAX data exchanges for two primary reasons:

  • It is smaller than XML because it has very little mark up.
  • It is more human readable because it has less mark up.

There are only two basic structures or mark ups

  • {…} – representing an object
  • […] – representing an array

Objects may contain string-value pairs

{"string 1": value1, "string 2": value2}

Arrays are ordered and can contain values which are strings, numbers, objects or other arrays (also true, false, null).

[value1, value2]

so it could also be

[value1, {"string 2": value2}]

or

{"string 1": [value1, value2]}

JSON is very compatible with JavaScript, and a JSON object or array looks much like JavaScript code for an object or array.

The official JSON page:
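JavaScript's built-in JSON object converts between JSON text and JavaScript values. The sample data below is made up for illustration.

```javascript
// JSON text as it might arrive from a server (sample data for illustration).
const text = '{"name": "Houghton", "tags": ["city", "MI"], "population": null}';

// JSON.parse turns the text into ordinary JavaScript objects and arrays.
const place = JSON.parse(text);
console.log(place.name);
console.log(place.tags[1]);

// JSON.stringify goes the other way, from a JavaScript value to JSON text.
const back = JSON.stringify({ id: 1, ok: true, items: [1, "two"] });
console.log(back);
```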

http://json.org/

XMLHttpRequest

XMLHttpRequest is a JavaScript object that was designed by Microsoft and released in 1999. It was later adopted by Mozilla in 2001. It is the basic mechanism in JavaScript for making a request.

Coding for the XMLHttpRequest requires only four steps

  1. Create the XMLHttpRequest object
  2. Attach the callback handler for the response
  3. Open the request using the XMLHttpRequest open method
  4. Send the request using the XMLHttpRequest send method

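An example:

// Callback run when the response has loaded; `this` is the request object.
function reqListener() {
  console.log(this.responseText);
}
var oReq = new XMLHttpRequest();
oReq.onload = reqListener;
oReq.open("get", "yourFile.txt", true);
oReq.send();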

The three parameters of the open method:

  1. A string representing the request method
  2. A string representing the URL, which can be an absolute path or a relative path
  3. A boolean that is true for an asynchronous request

Another important method is setRequestHeader, which specifies a name-value pair.

Event Listeners can be set, but must be set before the request is opened. This makes sense because you need to have the listeners attached before the request is made. A common event listener is the onreadystatechange.

The response to the request is found in the XMLHttpRequest object’s responseText attribute or, for an XML response, in its responseXML attribute.

A good reference is the Mozilla Developer Network (MDN):

https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest

https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Using_XMLHttpRequest

Note that AJAX is only a proposed standard, but like most web standards, it was implemented by browsers long before the standard became fixed.

HTML5

HTML5 is a series of proposed JavaScript APIs that attempt to make HTML modern. Many of these APIs have not yet become final standards, yet most web browsers already implement them.

A website listing the APIs and the browsers implementing them is Mobile HTML 5.

http://mobilehtml5.org/

There are quite a few APIs, but the ones that interest us the most are

  • Application Cache
  • GeoLocation
  • Multimedia
  • File API
  • HTML Media Capture
  • XMLHttpRequest 2.0
  • Web storage

You’ll notice that all of the above are implemented by the big four browsers: Safari, Chrome, IE, and Firefox.

In the left column is a link to the W3C API specification. After you get through the boilerplate in the specification, these guides explain the methods and use of the JavaScript objects they define.

For example the GeoLocation API defines the GeoLocation interface.

interface Geolocation {
    void getCurrentPosition(PositionCallback successCallback,
                            optional PositionErrorCallback errorCallback,
                            optional PositionOptions options);
    long watchPosition(PositionCallback successCallback,
                       optional PositionErrorCallback errorCallback,
                       optional PositionOptions options);
    void clearWatch(long watchId);
};

callback PositionCallback = void (Position position);

callback PositionErrorCallback = void (PositionError positionError);
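In JavaScript, the two callbacks in this interface are ordinary functions passed to getCurrentPosition. The sketch below assumes the position object carries its coordinates in coords.latitude and coords.longitude, as the specification describes. The function names and the hand-made sample position are illustrative only, and the browser call is guarded so that the code does nothing where the API is missing.

```javascript
// Success callback: receives a Position-like object with a coords property.
function onPosition(position) {
  const { latitude, longitude } = position.coords;
  return "lat " + latitude.toFixed(2) + ", lon " + longitude.toFixed(2);
}

// Error callback: receives a PositionError-like object.
function onPositionError(positionError) {
  return "error " + positionError.code + ": " + positionError.message;
}

// In a browser that implements the API, the call would look like this.
if (typeof navigator !== "undefined" && navigator.geolocation) {
  navigator.geolocation.getCurrentPosition(
    (position) => console.log(onPosition(position)),
    (positionError) => console.log(onPositionError(positionError))
  );
}

// Hand-made object with the same shape, for trying the callback elsewhere.
const samplePosition = { coords: { latitude: 47.1211, longitude: -88.5694 } };
console.log(onPosition(samplePosition));
```

The call returns immediately; the browser invokes one of the two callbacks later, once the position is known or the request has failed.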

But the specifications read like specifications, so a better reference is the text “Dive into HTML 5” by Mark Pilgrim with contributions from the community.

http://diveintohtml5.info/index.html

It covers many of the important APIs with examples, including GeoLocation, File, and Application Cache.

Frontend Frameworks

There are only a few more technologies left for us to talk about. These frameworks make programming the frontend easier. There are basically three types of frontend frameworks.

  • Styling or CSS frameworks, although these typically include some JavaScript
  • JavaScript libraries for manipulating the DOM and making AJAX calls
  • JavaScript frameworks for data binding

We’ll only discuss the first two. Styling and CSS frameworks provide CSS style sheets for your code. Many are available for free, but two of the most popular and powerful are Twitter Bootstrap and Foundation. I have chosen Twitter Bootstrap because

  • It is mobile first, which means designing for the mobile devices will be easy for you.
  • There is a Grails plugin that makes installing it easy
  • There is a W3 Schools tutorial

http://www.w3schools.com/bootstrap/default.asp

The official website is also well documented.

http://getbootstrap.com/

There are also many JavaScript libraries for manipulating the DOM and making AJAX calls. Three of the most common are jQuery, Prototype, and Dojo (used by ArcGIS). I have chosen jQuery because:

  • It is very popular
  • It is elegant and has a simple syntax
  • Grails comes with a compact jQuery and has a Grails plugin for it
  • There is a W3 Schools tutorial

http://www.w3schools.com/jquery/default.asp

The official jQuery website is also well documented.

http://jquery.com/

jQuery syntax is very simple:

$("<selector>").<method>(…)

$("<selector>") selects a DOM element or element list and returns a jQuery object tied to the element or element list. The selectors look like the selectors in CSS style sheets, so there is not much overhead in learning them.

There is a vast array of methods, examples:

  • add()
  • ajax()
  • attr()
  • html()

jQuery follows the method chaining design pattern, which means that nearly all methods return the jQuery object, so you can keep adding method calls to the same selected elements to create complex expressions.
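The chaining idea itself needs nothing more than methods that return the object they were called on. The Wrapper class below is a hypothetical mini-wrapper written for this illustration; it is not jQuery and works on plain objects rather than DOM elements.

```javascript
// Hypothetical mini-wrapper (not jQuery itself) to show method chaining.
class Wrapper {
  constructor(items) {
    this.items = items;
  }

  // Add a class name to every wrapped item.
  addClass(name) {
    for (const item of this.items) item.classes.push(name);
    return this; // returning the wrapper is what enables chaining
  }

  // Set an attribute on every wrapped item.
  attr(key, value) {
    for (const item of this.items) item.attrs[key] = value;
    return this;
  }
}

// Plain objects standing in for the selected elements.
const items = [
  { classes: [], attrs: {} },
  { classes: [], attrs: {} },
];

// Each call returns the same wrapper, so the calls can be strung together.
new Wrapper(items).addClass("active").attr("title", "hello").addClass("big");

console.log(items[0].classes.join(" "), items[0].attrs.title);
```

With jQuery loaded in a page, an analogous chain would be $("p").addClass("active").attr("title", "hello").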

All these technologies enable mobile web apps and make it easy to program them.