COMM 260, Principles of Internet Web-Based Design
Instructor: Ross Collins

Lecture Synopsis Seven: Extending capabilities of the web

XML
XML (Extensible Markup Language) is related to HTML in that both are based on SGML (Standard Generalized Markup Language), but it is more flexible. Webmasters can define tags specifically for a document, such as <ingredients> and <cooking-time> for a recipe site. (XML tag names cannot contain spaces, so a two-word name is joined with a hyphen or run together.)
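
To make the idea of custom tags concrete, here is a minimal sketch in Python that reads a small recipe document. The recipe, its tag names, and the function name summarize_recipe are all made up for illustration; the cooking-time tag is hyphenated because XML names cannot contain spaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe document; every tag name here is invented for the example.
RECIPE_XML = """
<recipe>
  <title>Pancakes</title>
  <ingredients>
    <ingredient>flour</ingredient>
    <ingredient>milk</ingredient>
    <ingredient>eggs</ingredient>
  </ingredients>
  <cooking-time units="minutes">15</cooking-time>
</recipe>
"""


def summarize_recipe(xml_text):
    """Return (title, list of ingredients, cooking time in minutes)."""
    root = ET.fromstring(xml_text.strip())
    title = root.findtext("title")
    # iter() walks the whole tree, so nested <ingredient> tags are found.
    ingredients = [item.text for item in root.iter("ingredient")]
    minutes = int(root.findtext("cooking-time"))
    return title, ingredients, minutes


title, ingredients, minutes = summarize_recipe(RECIPE_XML)
print(title, ingredients, minutes)
```

Because the tags describe what the content is rather than how it looks, a program can pull out just the ingredients or just the cooking time without guessing.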

Web browsers
A browser is an application designed to read pages coded using HTML and related mark-up languages. The first widely used graphical browser was Mosaic, whose developers went on to build Netscape. AOL announced its acquisition of Netscape in 1998. When AOL wound down development of the browser, the Mozilla Foundation was established to continue the open-source (code available to everyone) browser code that Netscape had released. The Firefox browser by Mozilla has captured about 20 percent of the browser market, compared to about 70 percent for Microsoft's Internet Explorer. Other browsers include Apple's Safari and Opera, a Norwegian browser that is fast, small, and often used on mobile phones.

Plug-ins
"Helper" applications that extend the capabilities of a browser are called plug-ins. These may be bundled with a browser program, or may be downloaded separately. Common plug-ins include Macromedia Flash for multimedia content; Shockwave, also for multimedia content but creating larger files so often used for CDs; Adobe Acrobat to view PDF (Portable Document Format) files, usually forms and booklets; RealPlayer and Quick-Time to view streaming video and audio--that is, data transferred to a user's computer and played before the entire package is downloaded (saves time). MP3 is one of several plug-in applications used to download music. Java is a programming languages used to write applets that can add functions to a browser, such as animation or automated advertisements. Javascript is not a true programming languages, but offers scripts to control attributes such as rollovers, clocks, counters, etc. I have several on my web site that I've downloaded free from the 'net. ASP, JSI, CGI, PERL, PHP are programming languages used to set up special services, including databases, shopping carts, and dynamic (changing) content.

Cookies
People sometimes worry that these small text files stored on a user's hard drive will steal content or e-mail addresses. They can't do that: a cookie is data, not a program, and it holds only what the web site that set it chose to put there. Cookies can, however, be used to track usage around the web, and while this is normally done to target advertisements to users' interests, some people don't appreciate this snooping. Cookies can be a valuable time-saver, though, offering automatic log-ins to sites and remembering a user's profile for financial or retail sites. For more information and ways to bake cookies, check out Cookie Central.
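
It may help to see how little is actually inside a cookie. The sketch below uses Python's standard http.cookies module to read a header a server might send and to build what the browser would send back on the next visit. The cookie name session_id, its value, and the two function names are hypothetical, chosen only for illustration.

```python
from http.cookies import SimpleCookie

# What a server might send in a Set-Cookie header (hypothetical name and value).
SET_COOKIE_HEADER = "session_id=abc123; Path=/; Max-Age=3600"


def parse_cookie(header_value):
    """Return the name/value pairs found in a cookie header as a dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    # Path and Max-Age are attributes of the cookie, not cookies themselves.
    return {name: morsel.value for name, morsel in cookie.items()}


def cookie_request_header(cookies):
    """Build the Cookie header value a browser would return to the site."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


stored = parse_cookie(SET_COOKIE_HEADER)
print(stored)
print(cookie_request_header(stored))
```

Nothing here can read files or e-mail addresses: the site hands over a name and a value, and the browser hands the same name and value back.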

Search engines and domain names
Web search engines gather pages to index by "crawling" or "spidering" through the networks. Pages are gathered, indexed, and offered to users searching for information. How they do that depends on the crawler, but generally users need to know that much of the web never reaches a search engine. It is estimated that only one-third of the web is indexed. Search engines index pages by evaluating content based on META tags, titles, and keywords in content. To find pages, crawlers normally begin by examining submitted domain names, or by referring to domain name registries.

A domain name is the readable name that corresponds to a numeric IP (Internet Protocol) address, such as www.ndsu.edu. The top-level domain is the suffix or extension: .edu for higher education, .mil for military, .gov for government, .org for non-profit organizations, .net for net-related services, etc. Every country also has a suffix, such as .ca for Canada and .fr for France. Domain names must be registered through authorized registrars. Is your name a registered domain name? Check it out at Network Solutions. For the general domain name list and more information on registry, consult InterNIC.
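
As a small illustration of how a domain name breaks down, this Python sketch picks the top-level domain off the end of a name and looks up its meaning. The table contains only the suffixes mentioned above and is far from complete, and the function name describe_domain is made up for the example.

```python
# Meanings taken from the paragraph above; deliberately incomplete.
TLD_MEANINGS = {
    "edu": "higher education",
    "mil": "military",
    "gov": "government",
    "org": "non-profit organization",
    "net": "net-related services",
    "ca": "Canada",
    "fr": "France",
}


def describe_domain(domain):
    """Return (top-level domain, meaning) for a name such as www.ndsu.edu."""
    # Domain names are case-insensitive and read right to left:
    # the last label is the top-level domain.
    labels = domain.lower().rstrip(".").split(".")
    tld = labels[-1]
    return tld, TLD_MEANINGS.get(tld, "unknown")


print(describe_domain("www.ndsu.edu"))
```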

If you wish to be found by web spiders, set up a META tag in the Head section of your HTML, such as <META NAME="keywords" CONTENT="?">, with the ? replaced by a comma-separated list of words you think define your page. If, on the other hand, you want to keep your page away from the spider's beady little prying eyes, write a META tag such as <META NAME="robots" CONTENT="noindex, nofollow">. For more tips on optimizing your site for search engines, check out this article by Jennifer Kyrnin in About.com (sorry about all those ads).
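
To see roughly what a spider does with these tags, here is a minimal sketch using Python's standard html.parser module. It is not how any particular search engine works; the class MetaTagReader, the functions read_meta and may_index, and the sample page are all assumptions made up for illustration.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect NAME/CONTENT pairs from the META tags of a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...>
        # arrives here as tag "meta" with attribute "name".
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def read_meta(html_text):
    """Return a dict of the META tags found in html_text."""
    reader = MetaTagReader()
    reader.feed(html_text)
    return reader.meta


def may_index(meta):
    """A polite spider skips any page whose robots tag says noindex."""
    directives = [d.strip().lower() for d in meta.get("robots", "").split(",")]
    return "noindex" not in directives


# Hypothetical page using the two tags described above.
PAGE = (
    '<html><head>'
    '<META NAME="keywords" CONTENT="recipes, pancakes">'
    '<META NAME="robots" CONTENT="noindex, nofollow">'
    '</head><body></body></html>'
)

meta = read_meta(PAGE)
print(meta)
print(may_index(meta))
```

Note that the robots tag is only a request. Well-behaved spiders honor it, but nothing forces a crawler to do so.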

Copyright 2004 by Ross F. Collins <www.ndsu.edu/communication/collins>