Thursday, May 10, 2007

Semantically Enabled SOA

SSOA makes it possible for underlying technology to make decisions

Arunava Chatterjee

Semantic Service-Oriented Architectures introduce semantic enhancements to services so that agents can dynamically combine services to satisfy business goals.

Arunava is a manager at BearingPoint working on SOA and Java Enterprise Architectures. He obtained a Ph.D. in Physics from Florida State University and can be contacted at chatterjeeb@yahoo.com.


Service-Oriented Architectures aren't new. The underlying ideas of encapsulation, location independence, and programming-by-contract have been in use ever since it became apparent that dependencies between components (processes, objects, and the like) should be minimized. SOA is the most recent incarnation of these concepts in current technologies.

Like other attempts to support software intercommunication on a large scale, SOA has its strengths and weaknesses. It must overcome differences in technology, communication protocol, and Quality-of-Service (QoS). If you're familiar with earlier approaches to distributed enterprise computing (such as CORBA or DCOM), you will find that the elements of an SOA have clear analogues in those older technologies.

In short, SOA is a means by which a distributed, heterogeneous grouping of hardware and software can exchange information to satisfy a business goal.

The principal motivation for SOA is to improve business agility—allowing organizations to adapt to changes as quickly as possible. To achieve this end, the enterprise must adopt policies that favor agility; for instance:

· Consolidating business logic and data. Components used by different groups for essentially the same purpose should be consolidated. Data should be consistent across groups and used in a consistent way.

· Leveraging existing systems. Systems the organization already uses to support business goals should be leveraged to satisfy the current business need.

To fulfill the business motivations, SOA espouses certain design principles:

· Services based. The fundamental element in an SOA is the "service." A service has certain responsibilities that can be used by the rest of the enterprise.

· Loosely coupled. The client (other services, end users, or external programs) leveraging the services need not be aware of how the services fulfill their responsibilities. Internal data structures, calls to other services, transaction management, and storage requirements should be hidden from the client.

· Location independent. The client need not be aware of where a service resides or executes.

· Interoperable. Services should be able to exchange data with one another.

· Combinable. Services should be designed and implemented such that they can be combined to create new services.

· Business aligned. Services exposed to end users or external programs should represent business process functionality. The inputs/outputs they expose should be expressible in terms of business-aligned needs.

The strategies used to implement SOA remain topics of discussion and research. However, certain ideas have been demonstrated as being useful across projects. In particular, services should exist on different levels of granularity, ranging from business process services to technical function services.

· Business process services that support end-user business processes or external program requirements.

· Business transaction services that support business process services, addressing the need to maintain transactional integrity.

· Business function services used by multiple business process or transaction services to fulfill a business need. These services encapsulate the smallest business-related functionality that can be used by other services.

· Technical function services that are independent of the business needs, but are necessary to satisfy functional and nonfunctional requirements.

In aligning SOA and Business Process Management (BPM), the capture and correlation of business events becomes an enabler of agility. Bringing trends to the awareness of decision makers or possessing the intelligence to act on a business trend improves the agility of the architecture and the agility of the business. Two business event-handling approaches are at the forefront of discussion:

· Business Event Management (BEM) captures events in real time. On matching a pattern, BEM notifies a decision-maker to act accordingly.

· Complex Event Processing (CEP) correlates events and automatically orchestrates a response. In general, to orchestrate a response that has business meaning without human intervention requires semantic knowledge of the domain.
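
The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration: the event names, the threshold rule, and the notify/respond callbacks are hypothetical, and a real CEP engine would correlate events over time windows and consult a domain model before acting.

```python
from collections import Counter
from typing import Callable, Iterable


def matches_pattern(events: Iterable[str], event_type: str, threshold: int) -> bool:
    """Return True when event_type occurs at least threshold times."""
    return Counter(events)[event_type] >= threshold


def bem_handle(events: Iterable[str], notify: Callable[[str], None]) -> None:
    # BEM: on a pattern match, hand the decision to a person.
    if matches_pattern(events, "order_cancelled", 3):
        notify("3 or more cancellations observed; please review")


def cep_handle(events: Iterable[str], respond: Callable[[str], None]) -> None:
    # CEP: on a pattern match, trigger an automated response.
    if matches_pattern(events, "order_cancelled", 3):
        respond("start_retention_process")


# Hypothetical event stream; lists stand in for a mailbox and a process engine.
stream = ["order_placed", "order_cancelled", "order_cancelled", "order_cancelled"]
alerts, actions = [], []
bem_handle(stream, alerts.append)
cep_handle(stream, actions.append)
```

Both handlers detect the same pattern. What differs is who decides the next step, which is why CEP depends on the semantic knowledge discussed in the next section.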

Semantic Processing

Because satisfying business needs defines the value of Information Technology (IT), it becomes apparent that for businesses to be agile, the underlying technology must be able to make decisions. This is the fundamental motivation for introducing "semantic processing" into IT.

Semantic processing, which expresses relationships among concepts represented by phrases, has been a favorite research topic in academia and industry. Enabling software to infer information from context has long been one of the goals of artificial intelligence. Semantic analysis has been studied in the context of:

· Natural language processing. Text and speech recognition (language translation, for instance).

· Correlation and data mining. Trend analysis (for data warehousing, threat detection, and so on).

· Thematic searches. Refined searches and queries that leverage an awareness of the business context.

Likewise, Semantic Service-Oriented Architectures (SSOAs) have been proposed. SSOA introduces semantic enhancement to services such that an agent aware of the semantic model can combine services dynamically to satisfy business goals.

For instance, IBM uses semantic processing in its WebSphere Business Fabric software (www-306.ibm.com/software/solutions/soa/servicesfabric.html), and Software AG is doing likewise with its Information Integrator (www.softwareag.com/Corporate/products/cv/inf_int). Progress Software has partnered with Microsoft to provide its Progress Apama Event Processing Platform as a component of Microsoft's "Markets in Financial Instruments Directive" suite. Semagix (recently purchased by Fortent) used its Semantic Enhancement Technology to create an SSOA framework to build a money-laundering detection application called "CIRAS" (short for "Customer Identification and Risk Assessment"). And Ontology Works (www.ontologyworks.com) is involved in a project with NASA to create an SSOA for internal use (www.semantic-conference.com).

But before discussing semantic analysis in an SOA environment, it is useful to understand the relevant terms. The definitions assume that a vocabulary (formally, a controlled vocabulary) exists for a domain of interest:

· Taxonomy. A classification scheme that uses parent-child or associative relationships among terms.

· Ontology. Both the vocabulary and a set of formal rules for combining elements to express something meaningful in the domain. Taxonomies can be considered a subset of ontologies.
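
The distinction can be made concrete with a small Python sketch. The banking vocabulary and the single "owns" relation are hypothetical, and the rule check is far simpler than what a formal ontology language offers. The taxonomy only records parent-child links, while the ontology adds a rule about which kinds of terms may be combined.

```python
# Taxonomy: child term -> parent term (hypothetical banking vocabulary).
TAXONOMY = {
    "SavingsAccount": "Account",
    "CheckingAccount": "Account",
    "Account": "FinancialProduct",
}


def is_a(term: str, ancestor: str) -> bool:
    """Walk parent links to test whether term falls under ancestor."""
    while term in TAXONOMY:
        term = TAXONOMY[term]
        if term == ancestor:
            return True
    return False


# Ontology: the same vocabulary plus rules on how terms may be combined.
# Each relation names the type required of its subject and of its object.
RELATIONS = {"owns": ("Customer", "Account")}


def is_valid(subject_type: str, relation: str, object_type: str) -> bool:
    """Check a statement against the relation's subject and object types."""
    if relation not in RELATIONS:
        return False
    required_subject, required_object = RELATIONS[relation]
    subject_ok = subject_type == required_subject or is_a(subject_type, required_subject)
    object_ok = object_type == required_object or is_a(object_type, required_object)
    return subject_ok and object_ok
```

With these definitions, is_valid("Customer", "owns", "SavingsAccount") holds because the taxonomy places SavingsAccount under Account, while the reversed statement is rejected.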

To semantically enhance services, ontologies need to be defined on the domain of interest. There are different approaches to building an ontology:

· Deterministic approaches, which usually imply manual ontology creation. This generally involves tools that help create an ontology in a specific format. The ontology is created by domain experts, a process that can be tedious and time consuming.

· Statistical approaches, which attempt to address the fact that domain knowledge may be incomplete, uncertain, or may change over time (changes in business models, for example). Statistical approaches try to build ontologies based on input sampling. Starting with an initial ontology, relevant documents associated with a particular concept are sampled. Using a technique such as Bayesian networks (www.ddj.com/dept/architect/184406064) to extend the First Order Logic of a language such as the Web Ontology Language (OWL), inputs are associated with types or subtypes using conditional probabilities. Inference is then used to suggest the addition or removal of nodes and leaves.
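
As a rough illustration of how conditional probabilities can associate an input with a type, the sketch below applies Bayes' rule once. It is not a Bayesian network, and the type names and all of the numbers are hypothetical.

```python
from typing import Dict


def posterior(priors: Dict[str, float], likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(type | input) is proportional to P(input | type) * P(type)."""
    joint = {t: priors[t] * likelihoods[t] for t in priors}
    total = sum(joint.values())
    return {t: p / total for t, p in joint.items()}


# Hypothetical numbers for a sampled document that mentions "overdraft".
priors = {"CheckingAccount": 0.5, "SavingsAccount": 0.5}        # P(type)
likelihoods = {"CheckingAccount": 0.8, "SavingsAccount": 0.2}   # P(input | type)
result = posterior(priors, likelihoods)

# The type with the highest posterior is the suggested association.
suggested = max(result, key=result.get)
```

A statistical ontology builder would repeat this kind of update over many sampled inputs and use the resulting probabilities to propose changes to the ontology.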

Semantic Web

A real-world implementation of semantic analysis is the Semantic Web (www.w3.org/2001/sw). The Semantic Web, sometimes referred to as "Web 3.0," introduces interpretive analysis to the vast amounts of information available over the World Wide Web. To achieve this goal, authors must augment existing data in a formal way that lets machines interpret the information unambiguously. For example, the meaning of "hand" is context sensitive. Therefore, an ontology must somehow account for the different ways in which the word may be used. The Semantic Web as proposed by W3C provides two mechanisms to handle semantic information and ontology creation: Resource Description Framework (RDF) and Web Ontology Language (OWL).

The Resource Description Framework (www.w3.org/RDF) is a means to create simple associations between elements. RDF introduces a 3-tuple (triple) to define associations between resources, in a manner similar to sentence structure:

· Subject. A resource as defined by a URI.

· Predicate. Indicates the type of relationship between subject and object.

· Object. The resource that is the terminating end of the relationship.

Put another way, a resource (subject) has a property (predicate) given by another resource (object). There is no constraint on the number of objects associated with a subject. That is to say, a resource can have multiple resources as properties. For example, Listing One uses RDF to define a website with two coauthors. RDF triples can be represented by directed graphs like that in Figure 1.

http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:dc="http://purl.org/DC/"

xmlns:os="http://somesite.org/Schema/">

http://rama.cpe.fr/index.html">

Index of my web site

[http://bat710.univ-lyon1.fr/~champin/rdf-tutorial/node23.html]

Listing One


Figure 1: RDF triples.
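
The subject-predicate-object structure can also be sketched directly in Python, with plain tuples standing in for an RDF store. The page URI and the dc namespace come from Listing One. The "Title" and "Creator" property names and the two author URIs are hypothetical placeholders.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

PAGE = "http://rama.cpe.fr/index.html"
DC = "http://purl.org/DC/"  # namespace taken from Listing One

# The property names and author URIs below are hypothetical.
triples: List[Triple] = [
    (PAGE, DC + "Title", "Index of my web site"),
    (PAGE, DC + "Creator", "http://example.org/people/author-a"),
    (PAGE, DC + "Creator", "http://example.org/people/author-b"),
]


def objects(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object linked to subject by predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


creators = objects(triples, PAGE, DC + "Creator")
```

Because nothing limits the number of objects per subject, the same subject and predicate appear in two triples, one for each coauthor.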

The Web Ontology Language (www.w3.org/TR/owl-features) is a formal language for specifying resources and their relationships. OWL is built on top of RDF, and supports a number of mathematical relations among resources.

OWL comes in several sublanguages. OWL Full is an extension of RDF that provides a formal ontology language. OWL Description Logic (OWL DL) supports semantics expressible in a decidable fragment of First Order Logic. OWL Lite is a restricted subset of OWL DL constructed for ease of implementation.

SSOA Implementation

What I've discussed up to this point might lead you to the conclusion that to implement SSOA, you must create a domain-specific ontology, semantically enhance services, and provide a process-aware mechanism, such as a Business Process Execution Language (BPEL) engine, that leverages the ontology and the semantic extensions to build business processes. RDF and OWL arise, therefore, as a means to enable SSOA.

Semantically enhancing services implies including metadata in the service definition. This can be handled within the service definition itself or within the service discovery infrastructure. In a web services environment, extensions to WSDL may be used, or the UDDI registry can be extended to provide additional semantic-level detail.

As suggested by the desire for BEM and CEP, the ultimate goal of SSOA is dynamic process creation and management. Deterministic, statistical, and hybrid approaches may be used to create processes upon deployment or at runtime. A number of approaches are being studied.

Deterministic process creation can be represented in an activity diagram and achieved by creating process templates using a business-process-specific language, such as BPEL or the Business Process Modeling Language (BPML). The process template specifies start and end states, and a sequence of activities that occur during the process. The service implementations are not defined in the template. Instead, each activity is bound to a service at deployment time or runtime.
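
The separation between a template and its service bindings can be sketched as follows. This is a minimal Python illustration, not BPEL or BPML: the activity names and the three stand-in services are hypothetical.

```python
from typing import Callable, Dict, List

# A template fixes the order of activities but names no implementation.
TEMPLATE: List[str] = ["validate_order", "check_credit", "ship_order"]


def run_process(
    template: List[str],
    bindings: Dict[str, Callable[[dict], dict]],
    state: dict,
) -> dict:
    """Execute each activity using whichever service is bound to it."""
    for activity in template:
        service = bindings[activity]  # binding is looked up at run time
        state = service(state)
    return state


# Hypothetical service implementations, supplied separately from the template.
bindings = {
    "validate_order": lambda s: {**s, "valid": True},
    "check_credit": lambda s: {**s, "credit_ok": s["amount"] < 1000},
    "ship_order": lambda s: {**s, "shipped": s["valid"] and s["credit_ok"]},
}

final = run_process(TEMPLATE, bindings, {"amount": 250})
```

Swapping the entry for "check_credit" changes the behavior of the process without touching the template, which is the point of deferring the binding.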

The service-binding mechanism can use a ranking system to determine the best service for the activity in the given context. As an example, the relative importance of properties of the resource (see RDF) can be weighted. A weighted sum over the properties leads to the best candidate service for the activity. In this fashion, properties such as QoS can be defined to select the appropriate service.
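
A weighted-sum ranking of this kind might look like the following Python sketch. The service names, property names, values, and weights are all hypothetical, and the properties are assumed to be normalized so that higher values are better.

```python
from typing import Dict


def score(properties: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum over the properties named in weights."""
    return sum(w * properties.get(name, 0.0) for name, w in weights.items())


def best_service(candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> str:
    """Return the name of the highest-scoring candidate."""
    return max(candidates, key=lambda name: score(candidates[name], weights))


# Hypothetical QoS properties, normalized to the range 0..1.
candidates = {
    "credit_check_a": {"availability": 0.99, "speed": 0.60},
    "credit_check_b": {"availability": 0.95, "speed": 0.90},
}
weights = {"availability": 0.7, "speed": 0.3}
chosen = best_service(candidates, weights)
```

Changing the weights changes the outcome: with availability weighted at 0.9 and speed at 0.1, the first candidate scores higher instead, so the same mechanism can serve different contexts.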

Statistical process creation aims to discover processes by inference from business events. In this approach, the candidate processes are determined using input/output information and probabilities based on business event correlation. Given the input information, probabilities are calculated for activities that may satisfy the output requirements. This requires ontological reasoning services such as Bayesian networks or case-based reasoning. The topic remains an area of active debate and research.
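
The selection step can be illustrated with a short Python sketch. It assumes the probabilities have already been estimated elsewhere, and it does not perform any reasoning itself. The activity names, input/output labels, and probability values are hypothetical.

```python
from typing import List, NamedTuple, Set


class Activity(NamedTuple):
    name: str
    inputs: Set[str]
    outputs: Set[str]
    probability: float  # hypothetical estimate from event correlation


def rank_candidates(
    activities: List[Activity], available: Set[str], required: Set[str]
) -> List[Activity]:
    """Keep activities whose inputs are available and whose outputs cover
    the requirement, ordered from most to least probable."""
    usable = [a for a in activities if a.inputs <= available and required <= a.outputs]
    return sorted(usable, key=lambda a: a.probability, reverse=True)


activities = [
    Activity("manual_review", {"order"}, {"approval"}, 0.3),
    Activity("auto_approve", {"order", "credit_score"}, {"approval"}, 0.6),
    Activity("ship", {"approval"}, {"shipment"}, 0.9),
]
ranked = rank_candidates(activities, {"order", "credit_score"}, {"approval"})
```

Here "ship" is excluded because its input is not yet available, and the remaining two activities are ordered by their estimated probability.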

Conclusion

While it is clear that SOA can be seen as a reinvention of older programming concepts at the enterprise level, the introduction of semantic processing is a more complex endeavor that broaches mathematics and artificial intelligence to allow machine-based analysis and interpretation. The foundations of the latter have been provided through efforts in the Semantic Web, and more generally, semantic analysis in other fields. In the business arena, the use of reasoning engines as part of real-time processing remains in a nascent state. Institutions in academia and industry are taking the lead in semantically enabling business technology.

For More Information

Rong Pan, Zhongli Ding, Yang Yu, Yun Peng. "A Bayesian Network Approach to Ontology Mappings" (ebi.seu.edu.cn/ISWC2005/papers/3729/37290563.pdf).

Rama Akkiraju, Richard Goodwin, Prashant Doshi, Sascha Roeder. "A Method for Semantically Enhancing the Service Discovery Capabilities of UDDI" (www.isi.edu/infoagents/workshops/ijcai03/papers/Akkiraju-SemanticUDDI-IJCA%202003.pdf).

Alexandra Galatescu, Taisia Greceanu. "Ontology-Driven Improvement of Business Process Quality" (www.i3s.unice.fr/odbis2005/Ontology-driven%20Improvement.pdf).

Martin Hepp, Frank Leymann, Chris Bussler, John Domingue, Alexander Wahler, Dieter Fensel. "Semantic Business Process Management: Using Semantic Web Services for Business Process Management" (dip.semanticweb.org/documents/Hepp-et-al-Semantic-Business-Process-Management-Using-Semantic-Web-Services-for-Business-Pro.pdf).

Mathieu d'Aquin, Jean Lieber, Amedeo Napoli. "Decentralized Case-Based Reasoning for the Semantic Web" (www.loria.fr/equipes/orpailleur/Documents/daquin05b.pdf).

Juhnyoung Lee, Richard Goodwin, Rama Akkiraju, Anand Ranganathan, Kunal Verma, SweeFen Goh. "Towards Enterprise-Scale Ontology Management" (uk.builder.com/whitepapers/0,39026692,60132313p-39000926q,00.htm).

Pavel Hruby. "Ontology-Based Domain-Driven Design" (www.softmetaware.com/oopsla2005/hruby.pdf).

Brand Neimann. "Data Reference Model: Update on Status" (web.gov/scope03072005.ppt).

Paulo Cesar G. da Costa, Kathryn B. Laskey, Kenneth J. Laskey. "PR-OWL: A Bayesian Ontology Language for the Semantic Web" (ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-173/paper3.pdf).

Shashi Kant, Evangelos Mamas. "Statistical Reasoning: A Foundation for Semantic Web Reasoning" (ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-173/pos_paper6.pdf).
