A Newsletter Enabling Information Technologies by the IRMC IT Department

 

Spring 1999 (continuation)

What's Inside

The NDU Knowledge Net for CIO's is Born! - Dr. Jay Alden writes about a planned initiative to provide quality continuing education over the World Wide Web.

Data Administration: The Nuts and Bolts - In her second of two articles, Prof. Polydys shares her knowledge on this critical component of information management.

The Lighter Side of Y2K - Forget the doom and gloom.

A Manager's Guide to Neural Network Technology - Dr. William Hodson provides a concise tutorial on this fascinating technology!


The NDU Knowledge Net For CIO's is Born!

By Jay Alden

Like many other corporate universities, the IRM College recognizes that formal courses are only one way of serving the information needs of its constituents. The emergence of the World Wide Web, its growing accessibility, and its rapidly developing functionality offer a supplemental approach for achieving the College's mission of continuing education. The model conceived to take advantage of the evolving capabilities of the World Wide Web has been designated as the NDU Knowledge-Net.

The NDU Knowledge-Net is planned as a web-enabled service of the IRM College. It is seen as an alternative Information Age strategy for the National Defense University to deliver just-in-time continuing education to its constituency and provide a vehicle for life-long learning. The NDU Knowledge Net will ultimately consist of multiple channels for different university constituents. The first channel being developed as a pilot is intended for CIO staff, primarily in DoD organizations. It provides the following services for the ten competencies identified by OSD:

BASICS: Fundamental descriptions of the CIO competencies

NEWS: Summary of recent developments and emerging issues concerning the competencies

LINKS: Annotated links to other web reference sources grouped by the competencies

EVENTS: A listing of, and links to, conferences, training, and educational programs grouped by the competencies

PEOPLE: Names, titles, profiles, and e-mail addresses of special interest group members

DISCUSS: An asynchronous threaded conferencing service for discussion of topical issues

CHAT: Scheduled synchronous chats on particular topical issues

The home page for the CIO Knowledge-Net (which will have a link from the IRM College home page) contains a brief description of the service, a disclaimer in accordance with DoD policy, information on services available to registered users and non-registered users, links to the topical pages, as well as links to a registration page, a site map, and a search function.

Standardized web pages are used for the various Knowledge Net services - Basics, News, Links, Events, People, Discussions, Chats (some of which are restricted to registered users). A common set of navigational tools will be used in each topical area. (See example on first page).

Plans call for augmenting the basic Knowledge Net information services with an intelligent search capability to help users locate particular kinds of information, streaming media presentations from selected courses and special events, and "push" technologies that automatically e-mail updates and relevant announcements concerning each topical area to registered subscribers.

The Knowledge Net service will be pilot tested with a group of about 50 volunteers in the third quarter of 1999 and go public by the end of the year.


Data Administration and Management: "The Nuts and Bolts"

Final Part in a Series of Two Articles by Mary Linda Polydys

In the Department of Defense (DoD), the Joint Technical Architecture (JTA) requires compliance with DoD information and data element standards. In order to be compliant with the JTA, information technology acquisition managers need to understand the "ins and outs" of data administration in DoD as well as how to apply data administration concepts and practices in their acquisition programs. This article discusses many of those DoD data administration concepts and practices, including an examination of their use in information technology acquisitions.

Functional Elements of Data Administration

Typical functional elements of data administration might include some of the following activities:

These functions are performed in DoD under the DoD Data Administration Program. The following sections address some key elements of the DoD Data Administration Program including the data administration organization, data architecture, data models, data element standards, metadata, tools in DoD for metadata management, and standard data element usage in information technology acquisition programs.

Organizing for Data Administration

Defining, organizing, supervising, and protecting data across functions and across multiple information systems is a nontrivial task. This requires an organization of individuals skilled in data analysis representing all critical mission areas across an enterprise. They must be skilled negotiators and knowledgeable in their mission or business area.

In DoD, such skilled individuals reside in a data administration organization that consists of "enterprise" data administrators. The "enterprise" data administrators include a DoD Data Administrator, Functional Data Administrators (at the Secretarial level and often the Principal Staff Assistants), and Component Data Administrators (for separate Agencies and military Services). The Functional Data Administrators accept stewardship for data elements that support their function. Stewardship involves creating and maintaining data models and standard data elements for the functional area information needs. Lastly, "enterprise" data administrators are supported by a central organization that provides services such as policy development and central maintenance of the data models and data repositories.

Data Architectures and Models

One of the major elements of data administration includes developing an enterprise data architecture. The Federal Enterprise Architecture Framework describes data architecture as a component of the technology architecture. The Framework further states that the data architecture contains the contents of data models, includes a metadata repository, is used to standardize data for sharing across systems, and is used to define system and infrastructure requirements. In other words, the data architecture, as a minimum, is the data model (or object model) and description of the data model (e.g., the metadata). A data/object model contains entities/object classes and attributes and relationships that are depicted by connecting lines between entities/objects.

In DoD, the DoD Data Architecture includes the contents of the DoD Data Model (DDM). The DDM defines overall information requirements and business rules that support the warfighter and depicts the logical relationship between DoD data standards. The data are depicted graphically through entity-relationship diagrams using the Logic Works, Inc. ERwin data modeling tool. The DDM is available on-line for downloading and reuse.

Figure 1 shows an example of a data model, which supports the information needs of the Information Resources Management (IRM) College. This model is called a data model "view" because it is part of a larger enterprise data architecture or model.

The IRM College data model "view" organizes information needed to identify people associated with the College. This data model "view" is depicted graphically through an entity-relationship diagram using the ERwin data modeling tool and contains entities (e.g., race, person, person-association) and attributes (e.g., race code, person birth date, person name text). The reader should note that the data model "view" was extracted from the DoD Data Model, as mandated by the JTA. Notice that the Integrated Definition For Information Modeling (IDEF1X) standard (FIPS PUB 184), as mandated by the JTA, is also used.

The IRM College uses the data model and data modeling process to elicit, define, and document the College’s information needs. To that end, data models are an important coordination and communication tool. And lastly, the model is used as a basis for implementing DoD data element standards in database applications for the College.

Data Element Standards from Data Models

A data/object model is the basis for data element standards. More specifically, data elements for use throughout the information systems environment and life cycle are representations of data/object model entities/objects and their attributes. The entities and attributes in Figure 1 are DoD data element standards. These standards are normally implemented in databases using a shortened or truncated name.

Metadata for Data Models and Standards

Supporting data models and the data element standards are metadata repositories. These repositories are normally automated databases, which contain data about the models, entities and attributes, data element standards, and information systems that use the standards. This metadata includes such information as data model names, data standard definitions, security classification, and standard access names for each data element standard.
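To make the idea concrete, the following is a minimal sketch of what a single record in such a repository might look like. It is not the DDDS schema: the class names, field names, and the shortened access name "PERS_BRTH_DT" are all hypothetical and chosen only for illustration; only the kinds of information stored (model name, definition, security classification, access name) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementStandard:
    """One hypothetical metadata record describing a data element standard."""
    model_name: str               # data model the element comes from
    entity: str                   # entity (object class) in that model
    attribute: str                # attribute of the entity
    definition: str               # agreed definition of the element
    security_classification: str
    access_name: str              # shortened name used in databases


class MetadataRepository:
    """A toy in-memory repository keyed by standard access name."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # Access names must be unique so that systems can share them safely.
        if record.access_name in self._records:
            raise ValueError(f"duplicate access name: {record.access_name}")
        self._records[record.access_name] = record

    def lookup(self, access_name):
        return self._records[access_name]

    def elements_of(self, entity):
        return [r for r in self._records.values() if r.entity == entity]


repo = MetadataRepository()
repo.register(DataElementStandard(
    model_name="Example Data Model View",
    entity="PERSON",
    attribute="person birth date",
    definition="The calendar date on which a person was born.",
    security_classification="UNCLASSIFIED",
    access_name="PERS_BRTH_DT",   # illustrative abbreviation only
))
print(repo.lookup("PERS_BRTH_DT").definition)
```

Keying the repository on the access name reflects the point that each standard element should have exactly one agreed name for use across systems.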

Tools in DoD for Metadata Management

The DoD’s current tools for managing data about data models (i.e., metadata) and data element standards include a metadata repository called the Defense Data Dictionary System (DDDS) and the online DDM mentioned earlier. A personal computer version of the DDDS called PC Access Tool (PCAT) is also available. Lastly, DoD maintains a repository of reusable data objects for physical implementation. These installable objects are created under the SHADE program and are available online.

Using Data Elements in Information Technology Acquisition Programs

Using data architectures (data models) and data element standards in information technology acquisitions is not an easy task and requires planning before, during, and after software development. Planning includes reusing logical data model objects to construct a system's data model view, and reusing installable data objects developed by SHADE. In addition, planning should deal with current data stores in legacy systems as well as data exchanges through system-to-system interfaces. Lastly, the policy of relying on commercial items (including COTS software) complicates data standards implementation, so planning for the use (or nonuse) of COTS data elements is also important.

Last Thoughts

Implementing data administration throughout the DoD, including information technology acquisition programs, will improve the enterprise’s management of its information resource. The responsibility for implementation lies with those who plan for, design, develop, field, and maintain information systems, regardless of whether these actions occur within DoD or through contractors. The lightning speed of change associated with the information age has a major effect on data administration concepts and practices. As an example, this article deals only with transaction-type data; data administration for multimedia data such as audio and visual data is still in its infancy. In addition, DoD has learned that implementing data element standards using current processes is more complex than anticipated. To that end, DoD is refining its data element standards approach to include processes and reusable structures for physical implementation. Maintaining currency in emerging data administration concepts and practices is a challenge to all information technology professionals.


The Lighter Side of Y2K

Dear Boss:

I hope that I haven't misunderstood your instructions. Because to be honest, none of this Y to K problem makes any sense to me. At any rate I have finished the conversion of all of the months on all the company calendars for next year (year 2000). The calendars have returned from the printer and are ready to be distributed with the following new months:

Januark, Februark, Mak, Julk

______________________________________________

January 1, 2000

Dear Valued Employee:

Re: Vacation Pay

Our records indicate that you have not used any vacation time over the past 100 year(s). As I'm sure you are aware, employees are granted 3 weeks of paid leave per year or pay in lieu of time off. One additional week is granted for every 5 years of service.

Please either take 9,400 days off work or notify our office and your next pay check will reflect payment of $8,277,432.22 which will include all pay and interest for the past 1,200 months.

Sincerely,

Automated Payroll Processing

______________________________________________

I understand that Garry Kasparov has a plan. He's going to schedule the rematch with the computer for just after midnight, 01 January 2000.

______________________________________________

Q. How do Y2K practitioners greet each other?

A. "May the 'source' be with you."

______________________________________________

An executive is vacationing on the beach. A bottle washes up. He picks it up and uncorks it. A genie oozes out and says, "Look. It's been a tough week and I'm all tuckered out. I can only grant you one wish."

The exec thinks for a moment and says, "I've always wanted a bridge from California to Hawaii." Genie says, "Give me a break. No can do a bridge. Try again."

The exec says, "OK. Tell me everything I need to know to keep my business from failing in the Year 2000." Genie sighs and says, "All right. Do you want that bridge two lanes or four?"

______________________________________________

Q. What are Y2K analysts and programmers going to do after Year 2000?

A. Become expert witnesses.

______________________________________________

Q. Why is getting an elephant pregnant like fixing the Y2K problem?

A. Both require tremendous resources, are logistically very difficult, and you won't know for a couple of years if you got the job done.

______________________________________________

Experts warned today of a new and deadly threat to our beleaguered civilization: the 100GB Bug.

As most people know, McDonald's restaurant signs show the number of hamburgers the giant chain has sold. That number now stands at 99 billion burgers, or 99 Gigaburgers (GB). Within months or even weeks, that number will roll over to 100GB. McDonald's signs, however, were designed years ago, when the prospect of selling one hundred billion hamburgers seemed unthinkably remote. So the signs have only two decimal places.

This means that, after the sale of the 100 billionth burger, McDonald's signs will read "00 Billion Burgers Sold." This, experts predict, will convince the public that, in over thirty years, no McDonald's hamburgers have ever in fact been sold, causing a complete collapse of consumer confidence in McDonald's products.

The ensuing catastrophic drop in sales is seen as almost certain to force the already-troubled company into bankruptcy. This, in turn, will push the teetering American economy over the brink, which, finally, will complete the total devastation of the global economy, ending civilization as we know it, and forcing us all to live on beetles.

"The people who know -- the sign-makers -- are really scared of 100GB," one expert said. "I don't know about you, but I'm digging up a copy of THE FIELD GUIDE TO INSECTS and heading for the hills."

______________________________________________

Q. What are the four B's you need on January 1, 2000?

A. Bottled water, blankets, batteries, and booze.


A Manager's Guide to Neural Network Technology (Abridged Version)

By William T. Hodson

The digital computer, as we know it today, processes information in a serial manner -- one instruction after another in order. While this has turned out to be a sound approach for many applications, neurophysiologists have found that this approach is quite different from the way in which our brains process information. The brain operates in a highly parallel manner, enabling it to see patterns in data and to connect information from several sources. While this capacity is as yet not completely understood, the very architecture of the brain, wherein each brain cell, or neuron, is connected to hundreds or even thousands of others, is suggestive of an immense capability for parallel processing. So, if the brain can perform certain types of tasks which conventional computers cannot, researchers in the field of artificial intelligence reasoned that a computer specifically designed to emulate the brain, both in terms of its architecture and the way in which it acquires and processes information, might well be able to perform in a similar way.

Architecture and Operation. Of all the neural network architectures which have been proposed for these purposes, the Multi-Layer Perceptron (MLP) is the most commonly used in practice, and for that reason we will focus on it in our explanations. The MLP is composed of a number of individual processing elements (artificial neurons) organized into three or more layers: the input layer, one or more "hidden" layers, and the output layer, as shown in the figure below.

Its operation is quite straightforward. A "signal" (simply a number) is applied to each of the inputs across the bottom of the network. These numbers are then multiplied by a number (the "weight") assigned to each of the branches which connect these inputs to the first layer of artificial neurons. The sum of these products is then computed by each neuron, and the result is transformed by a non-linear function to become the output of the neuron. These outputs serve as the inputs to the next layer. The process continues until a signal appears at the output nodes at the top of the figure. So if the weights stay the same, the same input signal will always yield the same output signal. But if the weights are changed, a different output will be generated.
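For readers who like to see the arithmetic, the operation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name the non-linear function, so the commonly used sigmoid is assumed here, and the tiny network and its weight values are made up for the example.

```python
import math


def sigmoid(x):
    """An assumed non-linear function; it squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights):
    """Compute one layer; weights[j][i] is the branch from input i to neuron j."""
    outputs = []
    for neuron_weights in weights:
        # Multiply each input by its branch weight and sum the products.
        total = sum(w * x for w, x in zip(neuron_weights, inputs))
        # Transform the sum with the non-linear function.
        outputs.append(sigmoid(total))
    return outputs


def forward(inputs, layers):
    """Pass a signal through every layer in turn, bottom to top."""
    signal = inputs
    for weights in layers:
        signal = layer_output(signal, weights)
    return signal


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
output_weights = [[1.0, -1.0]]
network = [hidden_weights, output_weights]

same_a = forward([1.0, 0.0], network)
same_b = forward([1.0, 0.0], network)
changed = forward([1.0, 0.0], [hidden_weights, [[1.0, 1.0]]])
print(same_a, same_b, changed)
```

The first two results are identical because neither the input nor the weights changed; the third differs because one output weight was altered.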

For a given application, some thought must be given to the specifics of the architecture, e.g., how many input neurons will there be, how many hidden layers and how many neurons in each, how many output neurons will there be, will each neuron in each layer be connected to all neurons in the next layer, etc. These questions are best left to a specialist, but while they're not directly the concern of the manager, he or she should recognize that the ability of the network to perform as hoped for is heavily dependent on these choices.

Training the Network. The concept of training is what fundamentally separates neural network approaches from other types of decision technologies. Rather than developing a mathematical procedure which will be programmed to give a certain output (representing the decision) when a particular input (representing the problem) is applied, many actual historical examples of sets of inputs and outputs are collected and then "presented to the network" to train it. In other words, the network learns by example rather than by being told the precise logic of the decision process through a step-by-step procedure for "getting the answer". This, in fact, is how the human brain apparently approaches the decision-making process.

Changing the weights on the branches of the network is the key to this process of learning. The set of weights in the network is essentially the network’s memory; there is no special storage area for memory as there is in the conventional digital computer. This is how it is in the brain as well; information is stored electro-chemically on the dendrites and axons of each of the neurons and changes as learning takes place. The key to training, then, is to find a way to change the weights in an efficient and effective way based on the training examples which are presented to the network. In the early 1980's several researchers operating independently found a method to do this which has been called "backpropagation". Since that time many enhancements to this approach have been developed to increase the speed and stability of training.

In backpropagation training, the network is initialized by assigning weights at random to each of the branches. Then the inputs in the first training example are applied to the inputs of the network. The network operates as described above and an output is generated. This output is then compared to the actual results in the training example, and the difference between the two (the so-called "error term") is calculated. A rather complex formula, beyond the scope of this paper, is used to change the weights in the network to reduce the value of this error term, thereby making the network's response closer to the way in which the decision was actually made in the training example. Each of the available training examples is presented to the network in turn and the weights are readjusted. It often happens that changing the weights to reduce the error term in one example actually increases the error term in another. To overcome this effect, the entire set of training examples is presented again and again and again -- sometimes hundreds of times -- with the goal of finding the best "compromise" for all the weights, wherein the total error in all of the training examples is reduced to the smallest possible level. Generally speaking, the network is considered to be fully trained when the error is not further reduced by repeated presentations of the training set.
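The training loop described above can be sketched as follows for a very small network. This is a simplified illustration rather than a production method: it assumes a sigmoid as the non-linear function, a single hidden layer, a squared-error measure, and a fixed learning rate, and it appends a constant input of 1.0 to each layer (a common "bias" addition not discussed in this article). The training set, the logical OR of two inputs, is a made-up toy problem.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Return the layer inputs (with bias appended) and the network output."""
    xb = x + [1.0]                      # assumed bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in w_hidden]
    hb = hidden + [1.0]                 # assumed bias input
    out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return xb, hb, out


def total_error(examples, w_hidden, w_out):
    """Sum of squared errors over the whole training set."""
    return sum((t - forward(x, w_hidden, w_out)[2]) ** 2 for x, t in examples)


def train(examples, n_hidden=2, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(examples[0][0])
    # Initialize every weight at random.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = [total_error(examples, w_hidden, w_out)]
    for _ in range(epochs):             # present the whole set again and again
        for x, target in examples:      # one training example at a time
            xb, hb, out = forward(x, w_hidden, w_out)
            # Error term at the output, scaled by the sigmoid's slope.
            delta_out = (target - out) * out * (1.0 - out)
            # Propagate the error back to each hidden neuron.
            delta_hidden = [delta_out * w_out[j] * hb[j] * (1.0 - hb[j])
                            for j in range(n_hidden)]
            # Adjust each weight in the direction that reduces the error.
            for j in range(n_hidden + 1):
                w_out[j] += rate * delta_out * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] += rate * delta_hidden[j] * xb[i]
        history.append(total_error(examples, w_hidden, w_out))
    return w_hidden, w_out, history


# Toy training set: logical OR of two inputs.
examples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden, w_out, history = train(examples)
print(f"total error before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

The `history` list records the total error after each pass through the training set, which is the quantity one would watch to decide when further presentations no longer help.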

Testing the Network. The purpose of the neural network, of course, is to make decisions in new situations where the result is not already known. In other words, the value of the network lies in its ability to generalize from what it learns from the examples in the training set to new situations. To test its ability to do this, a number of the available examples are held back and not used in the training phase. After the network has been trained, these examples are presented to the network and the output which the network produces is compared with the actual results as contained in the example. If the magnitude of the errors in the testing phase is about the same as in the training phase, this means that the network is capable of generalizing to new situations as it was intended to do.
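The hold-back procedure can be sketched as below. The function names, the 25 percent hold-out fraction, and the factor of 1.5 used to decide whether two error levels are "about the same" are all assumptions made for this illustration. To keep the sketch short, a single fitted slope stands in for a trained network, and the data are synthetic.

```python
import random


def split_examples(examples, holdout_fraction=0.25, seed=0):
    """Shuffle the examples and hold a fraction back for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (training, testing)


def mean_abs_error(predict, examples):
    """Average size of the error over a set of (input, actual) examples."""
    return sum(abs(predict(x) - y) for x, y in examples) / len(examples)


def generalizes(train_error, test_error, tolerance=1.5):
    """True if the test error is 'about the same' as the training error.

    The factor of 1.5 is an arbitrary threshold chosen for this sketch.
    """
    return test_error <= tolerance * train_error


# Synthetic examples: the actual result is roughly twice the input.
rng = random.Random(1)
examples = [(float(x), 2.0 * x + rng.uniform(-1.0, 1.0)) for x in range(1, 21)]
training, testing = split_examples(examples)

# Stand-in for a trained network: one slope fitted to the training set only.
slope = sum(x * y for x, y in training) / sum(x * x for x, _ in training)


def predict(x):
    return slope * x


train_error = mean_abs_error(predict, training)
test_error = mean_abs_error(predict, testing)
print(train_error, test_error, generalizes(train_error, test_error))
```

The essential point is that the held-back examples play no part in fitting the model, so the test error is an honest estimate of performance on new situations.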

At this point the overall quality of the decision-making capabilities of the network can be assessed. In doing this, two distinct questions are addressed. First, is the quality of the decision which is being produced by the network good enough for the particular application? If it is not, it might mean that a different network architecture should be pursued, that more training examples are required, or that a modified method of learning should be tried. Or it might be that neural networks are really not appropriate for the application. (Although neural networks are a fascinating decision tool, they are not a panacea for situations in which all other decision methodologies have failed!) The second question is whether neural networks are the best approach, or whether expert systems, statistical methods, or some other technique is superior. Here, the quality of the decision is of great importance, but so also are ease of application, reasonable cost, and a high degree of credibility in the results.

Public Sector Neural Network Applications

Military Operations. A large number of specific problems in military operations have been addressed using neural network technology. A particularly important one is Automatic Target Recognition, in which a target (such as a truck, tank, or bunker) is identified automatically in a photograph or radar image without the aid of a human analyst. This is accomplished by "showing" the network images of each class of target under a variety of lighting conditions and from a variety of different perspectives. The network then learns to identify the type of target. Weapon-to-Target Assignment problems have been solved by training a neural network to seek the patterns which emerge as human targeteers perform this function. The neural networks which control the flight of fighter aircraft simulators have been trained to emulate the Air Combat Maneuvers of experienced fighter pilots by "observing" the way in which they react to a wide variety of flight situations. Military transportation requirements have been generated by neural networks after they have been "shown" the transportation requirements generated by human logisticians to satisfy a wide variety of campaign plans.

Weather. Establishing cause and effect relationships in weather prediction has always been difficult, but certain patterns of weather phenomena often seem to lead to particular effects. Neural networks have been used for a number of different types of predictions by training them to see these patterns and associate them with the resulting effects. One of the earliest efforts, dating to the 1960's, was used for general weather prediction in the San Francisco Bay area. In this effort, tomorrow's weather in San Francisco was predicted on the basis of the pattern of today's weather in San Francisco as well as two specific locations in the Pacific Ocean. In a more recent effort, neural network pattern recognition techniques were used to forecast the Short-term Movement and Intensification of Cyclones, with results better than those of conventional weather forecasting. Predicting Cloud-to-Ground lightning based upon ambient weather conditions has been accomplished with neural networks with the same or slightly better accuracy than with current techniques. Cloud Pattern Recognition based on information received from satellite imaging systems is another area of weather research being addressed with neural networks. In each of these cases, the large quantity of data available for training and testing networks has contributed in a significant way to the success of the research.

Medical Diagnosis. Military units are often deployed to remote locations without physicians capable of diagnosing medical problems. This has led to research in the Department of Defense to develop high quality computer-assisted medical diagnosis techniques. Statistical methods have been tried, as have expert systems, but most recently neural network technology has been employed in this area. Essentially, neural networks are trained to interpret the patterns of symptoms and medical histories of individuals in making a diagnosis. This is really a classification problem, in that the network associates a certain set of conditions with one disorder, another set with a different disorder, and so on. The networks are also capable of providing secondary and tertiary responses. Armed with a diagnosis, medical corpsmen assigned to the units are often capable of implementing a treatment plan.
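One plausible way a classification network yields secondary and tertiary responses is by ranking the values at its output neurons, one neuron per disorder. The sketch below assumes that arrangement; the disorder labels and output values are placeholders invented for the illustration, not results from any actual system.

```python
def ranked_responses(scores, top_n=3):
    """Return the top_n (disorder, score) pairs, highest score first."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ordered[:top_n]


# Hypothetical output-neuron values for one patient, one neuron per disorder.
outputs = {
    "disorder A": 0.12,
    "disorder B": 0.71,
    "disorder C": 0.05,
    "disorder D": 0.44,
}

primary, secondary, tertiary = ranked_responses(outputs)
print("primary:", primary)
print("secondary:", secondary)
print("tertiary:", tertiary)
```

Presenting the runners-up alongside the primary response gives the corpsman alternatives to consider when the leading diagnosis does not fit.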

Signal Recognition and Interpretation. Neural networks have been found to be remarkably adept at recognizing and interpreting various types of signals. For example, sonar signal classification with neural networks has turned out to be even more accurate than with all but the most experienced U.S. Navy sonar operators. The networks have been taught to distinguish among various types of objects found in the ocean based upon the acoustical signals which bounce off them when exposed to a sonar system. Intelligence-gathering organizations are using Communications Signal Processing by neural networks to classify various types of radio signals by type and location of emitter. And Seismic Signal Classification is also being accomplished with neural networks, to locate underground explosions and estimate their magnitude.

Personnel Profiling. Motivated in part by the work being done in the private sector in using neural networks to assess the credit risk of individuals, personnel security evaluations with neural networks have been considered. Such a system would train a network on the background characteristics of individuals who had applied for security clearances in the past, coupled with the determination which was made concerning the awarding of a clearance. The network would then attempt to replicate the adjudication process to save the staff time currently required to do this. In another research effort, a profile of characteristics of student pilots was matched with their successful completion of the pilot training program. This data was then used to train a neural network in forecasting successful completion of pilot training by applicants in subsequent classes. Clearly this type of admissions selection process could be used in other types of educational programs as well.

Closing Caveat

1. The fact that using a neural network eliminates the need for developing an algorithm to reach a decision on the basis of the input data provided does not relieve the decision maker from thinking about the problem! Countless failures in neural network development programs can be attributed to "throwing data at a network" to see what comes out. Some understanding of the way in which the decision will be made is essential in deciding on the architecture of the network and the type of pre-processing which should be done to the data which is presented to it.

2. The fact that neural networks do not use identifiable cause and effect relationships in reaching decisions (and in this way, they are like statistical decision methods) can result in ethical and legal issues when they are used. Consider, for example, using the results of a neural network alone to deny an individual a security clearance, based on the fact that his or her profile indicated a higher degree of risk, but with no specific reason. Or consider a mistake in a medical diagnosis caused by interpreting a pattern of symptoms: who's to blame -- the developer of the neural network or the one who used its results?

3. The fact that there must really be patterns and relationships in the data in order for the neural network to extract them and make high quality decisions is sometimes overlooked. All too often neural networks are tried when other decision methods have failed on a problem, only to discover that it was the lack of good data -- not the methodology -- that prevented a good decision from being made.

Summary

Neural networks provide yet another decision tool to the public sector manager. As the discipline matures, we can expect to see more and more applications, both in stand-alone systems and embedded in large Automated Information Systems. Their principal application seems to be in those areas in which it is difficult to specify hard and fast rules for decisions, but where it is clear that a certain "pattern of factors" is used by a human in arriving at a decision. If enough data from past decisions is available to train the networks to accomplish decisions of a similar type in the future, neural network technology may be a good candidate to consider.



Visit us at http://www.ndu.edu/irmchp

Editor Les Pang, e-mail: pangl@ndu.edu, (202) 685-2060, http://members.aol.com/lpang10473/default.htm

Graphics Designer Jim Looney