Thursday, April 30, 2009

HP and Oracle team up on 'data warehouse appliances' that re-architect database-storage landscape

Oracle CEO Larry Ellison today introduced the company’s first hardware products, a joint effort with Hewlett-Packard, to re-architect large database and storage configurations and gain whopping data warehouse and business intelligence performance improvements from the largest data sets.

The Exadata Programmable Storage Server appliance and the HP Oracle Database Machine, a black-and-red, refrigerator-size database, storage and networking data center on wheels, made their debut at the Oracle OpenWorld conference in San Francisco. Ellison called the Machine the fastest database in the world.

HP Chairman and CEO Mark Hurd called the HP Oracle Database Machine a “data warehouse appliance.” It leverages the architecture improvements in the Exadata Programmable Storage Server, but at a much larger scale and with additional optimization benefits. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

The hardware-software tag team also means Oracle is shifting its relationships with storage array vendors, including EMC, Netezza, NetApp and Teradata. The disk array market has been hot, but the HP-Oracle appliance may upset the high end of the market, and then bring the price-performance story down market, across more platforms.

I think we can safely say that HP is a preferred Oracle storage partner, and that Oracle, along with HP, wants some of those high-growth storage market profits for itself. There’s no reason not to expect a wider portfolio of Exadata appliances and more configurations like the HP Oracle Database Machine to suit a variety of market segments.

“We needed radical new thinking to deliver high performance,” said Ellison of the new hardware configurations, comparing the effort to the innovative design for his controversial America’s Cup boat. “We need much more performance out of databases than what we get.”

This barnburner announcement may also mark a market shift to combined and optimized forklift data warehouses, forcing the other storage suppliers to find database partners. IBM will no doubt have to respond as well.

The reason for the 10x to 72x performance improvements cited by Ellison is that the “intelligence” is brought closer to the data: the Exadata Programmable Storage Server appliance sits in close proximity to the Oracle database servers, connected through InfiniBand. In essence, this architecture mimics some of the performance value created by cloud computing environments like Google’s, with its MapReduce technology.

Ellison said that rather than large data sets moving between storage and database servers, which can slow performance for databases of 1TB and larger, the new Exadata-driven configuration moves only the query information across the networks. The current versions of these optimized boxes use Intel dual-core technology, but they will soon also be fired up by six-way Intel multi-core processors.
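To make that shift concrete, here is a rough, hypothetical Python sketch (not Oracle's implementation) of why pushing the filtering down to the storage tier moves far fewer rows across the network than shipping whole tables to the database servers; the table, predicate and row counts are invented for illustration.

```python
# Hypothetical illustration of predicate pushdown: the storage tier filters
# rows before they cross the network, so only relevant data is shipped.

def scan_on_db_server(storage_rows, predicate):
    """Classic model: every row travels to the database server, then is filtered."""
    shipped = list(storage_rows)              # entire table crosses the network
    return [r for r in shipped if predicate(r)], len(shipped)

def scan_on_storage(storage_rows, predicate):
    """Exadata-style model: the storage cell applies the predicate itself."""
    shipped = [r for r in storage_rows if predicate(r)]  # only matches cross the network
    return shipped, len(shipped)

# Toy data: 100,000 orders, of which only a small fraction match the query.
orders = [{"id": i, "amount": i % 1000} for i in range(100_000)]
big_orders = lambda r: r["amount"] > 995

classic, rows_moved_classic = scan_on_db_server(orders, big_orders)
pushed, rows_moved_pushed = scan_on_storage(orders, big_orders)
print(rows_moved_classic, rows_moved_pushed)  # 100000 vs 400 rows shipped
```

The query result is identical in both cases; only the amount of data crossing the interconnect changes, which is the crux of the performance claim.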

Talk about speeds and feeds ... But the market driver behind these moves is massive data sets that need to produce near real-time analytics payback. We’re seeing more and more data, and more varied kinds of data, brought into data warehouses and banged on by queries from applications and BI servers on behalf of a variety of business users across the enterprise.

HP and Oracle share some 150,000 joint customers worldwide, said Ann Livermore, HP Executive Vice President of the Technology Solutions Group. That means these database boxes will have an army of sales and support personnel behind them. HP will support the Machine hardware, Oracle the software. Both will sell it.

50% of data warehouse projects to fail in 2005-2007

Through 2007, more than 50% of data warehouse projects will have limited acceptance, or will be outright failures, as a result of a lack of attention to data quality issues, according to Gartner. CIOs said in a recent Gartner survey that BI implementation will be a significant factor in delivering IT’s contribution to business growth. However, most businesses are failing to use BI strategically. Gartner analysts said integration of business and IT requirements is critical to any successful BI strategy.

Monday, April 27, 2009

Benefits and Harms of Knowledge Management in an Organization

Benefits of KM
Since roughly eighty percent of the knowledge in an organization is tacit knowledge, calculating ROI (return on investment) is very difficult, and may not even be possible. The impact of a successful knowledge management program can instead be seen in new and better product development, higher customer satisfaction, reduced input costs, higher productivity, and so on.

KM creates a platform for extensive data mining and business intelligence knowledge. It assists today’s managers with better decision support systems. By sharing data and information across the organization and gathering details from customers and suppliers, managers can forecast future trends and make better decisions. The knowledge creation process is not limited to the organization itself: conferences and seminars bring new learning into the organization and introduce one’s knowledge and expertise to others. KM encourages partnership and collaboration to develop new products and services by leveraging the experience of different organizations. Because of the strong networking culture that knowledge management creates, it helps develop the collaborative decision-making and threat-analysis capabilities of an organization. Organizations even enter new lines of business through their knowledge and their collaborations with others.
Today's businesses are no longer located near the availability of raw materials or capital; they flow to the areas where knowledge is available. A company’s future valuation depends heavily on its future market potential, which in turn rests on today's knowledge-creating capability.


The dark side of KM

The implementation of Knowledge Management may face widespread resistance because people do not like the very idea of their knowledge being managed, particularly in non-profit organizations, where a business case for knowledge management may not resonate with the values of the organization. Furthermore, other factors such as the extensive use of information technology and complex information-sharing systems may alienate individuals who are not computer literate.
Cultural factors, such as a high degree of competitiveness among employees, can be a barrier to sharing information through an information system. Furthermore, individuals may hesitate to share their knowledge out of fear of criticism or of misleading the community members (Ardichvili et al., 2003).
Organizational resistance to internal Knowledge Management efforts also stems from hierarchical structures that reinforce norms of competition by creating winners and losers (De Long and Seeman, 2000). When management decides to introduce knowledge management techniques, it is probable that employees will resist, because they dislike change for various reasons, such as fear of job losses.
De Long and Seeman (2000) argue that the successful implementation of Knowledge Management and its integration with strategic aims may itself become a source of conflict. Top managers and executives may clash over who benefits from the success of applying Knowledge Management systems in the business. Although the literature focuses on the early stages of knowledge management and on the process of implementing it, it seems that even a success can give rise to conflict. This means that from the very beginning managers and executives should define who owns the plan and how power is delegated, so that conflicts do not disrupt an otherwise successful knowledge management plan.

Piyush Patel
Software Engineer

Monday, April 20, 2009

Data Mining in the 21st Century

Data Mining in Economic Crisis

In this time of economic turmoil, the whole world is slowly plunging into recession. We have seen governments hand out bailouts worth billions of dollars to giant companies in the US, Europe and Asia. Businesses are finding it hard to operate, so the ability to make smarter, more intelligent business decisions is imperative to survive such harsh conditions.

Businesses that have already invested in business intelligence solutions will be in a better position to take the right measures to survive and continue their growth. However, many may argue that most of the giant companies were using data mining and BI solutions and still could not avoid collapse. The important thing to note here is that data mining solutions provide an analytical perspective on the performance of an organization based on historical data, whereas the economic impact on an organization is linked to many issues, in many cases external forces and unscrupulous activities. The failure to predict these does not undermine the role of data mining for organizations; on the contrary, it makes it more important, especially for government regulatory bodies, to predict and identify such practices in advance and take the necessary measures to avoid similar circumstances in the future.


Application of Data Mining

Data Mining has grasped the attention of many in scientific research, business, banking, intelligence agencies and many other fields since the early days of its inception. However, its use was not as easy as it is now. The rapid growth of tools and software over the past few years has enabled it to be used more widely than ever before. The ease with which one can carry out complex data mining techniques using these tools is simply outstanding.

Data Mining is used by businesses to improve their marketing and to understand the buying patterns of their clients. Attrition analysis, customer segmentation and cross-selling are the most important ways in which data mining is showing businesses new ways to multiply their revenue.
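As a hedged illustration of the customer-segmentation idea, the sketch below clusters customers by spending behavior with k-means, assuming scikit-learn is available; the feature columns and figures are invented for the example.

```python
# Minimal customer-segmentation sketch (illustration only, invented data).
import numpy as np
from sklearn.cluster import KMeans

# Each row: [annual spend, store visits per month]
customers = np.array([
    [1200, 2], [1500, 3], [300, 1],     # occasional, mid-spend shoppers
    [8000, 12], [9500, 15], [7200, 10], # frequent, high-spend shoppers
    [150, 1], [90, 1],                  # rare, low-spend shoppers
])

segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(customers)
for row, seg in zip(customers, segments):
    print(f"spend={int(row[0]):>5}  visits/month={int(row[1]):>2}  segment={seg}")
```

Each segment can then be targeted with a different campaign, which is the marketing payoff the paragraph above describes.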

Data Mining is now used in the banking sector for credit card fraud detection, by identifying the patterns involved in fraudulent transactions. It is also used to reduce credit risk by classifying potential clients and predicting bad loans.

Data Mining is used by intelligence agencies like the FBI and CIA to identify threats of terrorism. After the 9/11 attacks it became one of the prime means of uncovering terrorist plots. However, this has led to public concern, as the data collected for such work undermines the privacy of a large number of people.



Posted By Piyush Patel
Software Engineer

Friday, April 17, 2009

HP teams with Microsoft, VMware to expand appeal of desktop virtualization solutions

As the sour economy pushes more companies into the arms of virtual desktop infrastructure (VDI) for cost cutting, the vendor community is eagerly wiping out obstacles to adoption by broadening the appeal of desktops as a service for more users, more types of applications and media.

This became very clear this week with a flurry of announcements that package the various parts of VDI into bundled solutions, focus on the need to make rich applications and media perform well, and expand the types of endpoints that can be on the receiving end of VDI.

Hewlett-Packard (HP) expanded its thin-client portfolio with new offerings designed to extend virtualization across the enterprise, while providing a more secure and reliable user experience. The solutions bundle software from Microsoft and VMware along with HP's own acceleration and performance software, as well as three thin client hardware options.

I can hardly wait for HP to combine the hardware and software on the server side, too. I have no knowledge that HP is working up VDI appliances that could join the hardware configurations on the client side. But it sure makes a lot of sense.

Seriously, there are few companies in a better position to bring VDI to the globe, given the technologies HP gained with Mercury and Opsware, along with its internal development ... Oh, and there's EDS to make VDI hosting a service in itself. Look for a grand push from HP into this enterprise productivity solutions area.

Leading the pack in this latest round of VDI enhancements are the three thin clients -- the HP gt7720 Performance Series, and the HP t5730w and t5630w Flexible Series. These offer new rich multimedia deployment and management functionality -- rich Internet applications (RIA), Flash, and streaming media support -- that enhance the Microsoft Windows Embedded Standard. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

The Palo Alto, Calif. company also announced several other new features:
The thin clients feature Microsoft Internet Explorer 7, Windows Media Player 11, and the ability to run applications locally. They also include Microsoft Remote Desktop Protocol (RDP) 6.1, which enables devices to connect to and take advantage of the latest security and enterprise management technologies from Windows Server 2008.

RDP enhancements for multimedia and USB redirection enable users to easily run web applications, videos and other files within a virtual desktop environment, while avoiding frame skipping and audio or video synchronization issues. The software offloads the processing directly to the thin client, creating an enhanced multimedia experience while lowering the load on the server, which results in increased server scalability.

This also creates a near-desktop experience for VMware View environments, including support for the latest VMware View Manager 3 broker with no need for additional employee training. Users simply log in on the thin client to take advantage of its multimedia features, such as training videos, and USB device support.

HP and VMware also are working together to enable VMware View Manager’s universal access feature to leverage HP Remote Graphics Software (RGS) for remote desktop sessions.

RGS is designed for customers requiring secure, high-performance, collaborative remote desktop access to advanced multimedia streaming and workstation-class applications. The software includes expanded, real-time collaboration features to allow multiple workers from remote locations to see and share content-rich visualizations, including 2-D design, 3-D solid modeling, rendering, simulation, full-motion video, heavy flash animation and intense Web 2.0 pages.

Not surprisingly, a lot of the technology being used in these VDI bundles originated with secure CAD/CAM virtual workstation implementations, where graphics and speed are essential. If it works for developers in high-security areas, it should work for bringing ERP apps and help desk apps to the masses of workers who don't need a full PC on every desktop. They just need an interactive window into the apps and data.

Expected to be available in early May, the new thin clients will be priced from $499 to $799. More information is available through HP or authorized resellers or from http://www.hp.com/go/virtualization. I would expect that EDS is going to have some packages that drive the total cost down even more.

Welcome to BI-AJ! Business Intelligence - Agile Initiatives

Welcome to my blog on software development with primary focus on data warehousing and business intelligence projects.

Often, data warehouse and business intelligence initiatives have long delivery cycles before tangible results appear, while businesses rapidly change and adjust to the market. The consequence of long delivery cycles is usually that by the time a solution is in place the business has already changed, and the effect of the BI initiative is reduced. Agile development methods are especially useful for rapid software development where requirements are hard to define and the business is changing quickly, and they often come with a high success rate. However, agile development methodologies have not yet been widely acknowledged as delivery models for building business intelligence systems. Many claim that small iterations are not suitable for data warehousing because of the demand for a strong and reliable infrastructure. The purpose of this blog is to show that agile foundations certainly can be used with great success for developing and deploying business intelligence systems in small iterations.

Throughout a number of posts I will give my thoughts and ideas on how to develop software in the context of DW/BI, using small iterations and a pragmatic perspective on development. I am clearly influenced by agile development; however, I try to stay pragmatic about rules and structures, as all development initiatives are different. They take place in different organisations with different cultures, different goals, different employees and different traditions. These are things that need to be considered when planning and deploying software efficiently.

I hope that I can start a debate on how agile principles can or cannot be used to deliver software within DW/BI projects, and I hope that I get feedback and responses from you. I will continuously discuss and (try to) solve any questions or scenarios that I receive from you.

Stay tuned!

GoogleTV being created by Motorola

A couple of years ago I blogged about what I thought would eventually happen: Google would create an internet-connected set-top box. Well, it took years for something to really happen on that front, but now it sounds like Motorola is going to be taking the project on.

Android, the operating system designed and developed for mobile devices like the G1, is now said to be the brains behind a new set-top box being created by Motorola called “au Box”. The time is definitely ripe for something like this.

An internet connected cable box holds many possibilities. Would you consider buying a set-top box that would let you log into your Google Account? Staying connected (to email, news, ads, etc), even while enjoying television or a movie could either be extremely annoying, or great — let’s hear what you think in the TalkBack!

Thursday, April 16, 2009

Data Mining Interview Questions

1. What is “Data Warehousing”?
2. What are Data Marts?
3. What are Fact tables and Dimension Tables?
4. What is Snow Flake Schema design in database?
5. What is ETL process in Data warehousing?
6. How can we do ETL process in SQL Server?
7. What is “Data mining”?
8. Compare “Data mining” and “Data Warehousing”?
9. What is BCP?
10. How can we import and export using BCP utility?
11. In BCP, if we want to change the field position or eliminate some fields, how can we achieve this?
12. What is Bulk Insert?
13. What is DTS?
14. Can you brief about the Data warehouse project you worked on?
15. What is an OLTP (Online Transaction Processing) System?
16. What is an OLAP (On-line Analytical processing) system?
17. What is Conceptual, Logical and Physical model?
18. What is Data purging?
19. What is Analysis Services?
20. What are CUBES?
21. What are the primary ways to store data in OLAP?
22. What is META DATA information in Data warehousing projects?
23. What is multi-dimensional analysis?
24. What is MDX?
25. How did you plan your Data warehouse project?
26. What are different deliverables according to phases?
27. Can you explain how analysis service works?
28. What are the different problems that “Data mining” can solve?
29. What are different stages of “Data mining”?
30. What are discrete and continuous data in the Data mining world?
31. What is a MODEL in the Data mining world?
32. How are models actually derived?
33. What is a Decision Tree Algorithm?
34. Can decision tree be implemented using SQL?
35. What is Naïve Bayes Algorithm?
36. Explain clustering algorithm?
37. Explain in detail Neural Networks?
38. What is Back propagation in Neural Networks?
39. What is Time Series algorithm in data mining?
40. Explain Association algorithm in Data mining?
41. What is Sequence clustering algorithm?
42. What are algorithms provided by Microsoft in SQL Server?
43. How does data mining and data warehousing work together?
44. What is XMLA?
45. What is Discover and Execute in XMLA?

C4.5 algorithm in Data mining

1.1 Introduction

Systems that construct classifiers are one of the commonly used tools in data mining. Such systems take as input a collection of cases, each belonging to one of a small number of classes and described by its values for a fixed set of attributes, and output a classifier that can accurately predict the class to which a new case belongs. These notes describe C4.5 [64], a descendant of CLS [41] and ID3 [62]. Like CLS and ID3, C4.5 generates classifiers expressed as decision trees, but it can also construct classifiers in more comprehensible ruleset form. We will outline the algorithms employed in C4.5, highlight some changes in its successor See5/C5.0, and conclude with a couple of open research issues.

1.2 Decision trees

Given a set S of cases, C4.5 first grows an initial tree using the divide-and-conquer algorithm as follows (a rough code sketch appears after the two steps):
• If all the cases in S belong to the same class or S is small, the tree is a leaf labeled with the most frequent class in S.
• Otherwise, choose a test based on a single attribute with two or more outcomes. Make this test the root of the tree with one branch for each outcome of the test, partition S into corresponding subsets S1, S2, . . . according to the outcome for each case, and apply the same procedure recursively to each subset.
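A minimal Python sketch of that divide-and-conquer procedure, not C4.5's actual code, with the stopping test and attribute selection simplified:

```python
from collections import Counter

def grow_tree(cases, attributes, choose_test, min_cases=2):
    """Recursively grow a decision tree from (attribute_dict, class_label) cases."""
    classes = [label for _, label in cases]
    majority = Counter(classes).most_common(1)[0][0]
    # Leaf: all one class, too few cases, or no attributes left to test.
    if len(set(classes)) == 1 or len(cases) < min_cases or not attributes:
        return {"leaf": majority}
    attr = choose_test(cases, attributes)          # e.g. highest gain ratio
    node = {"test": attr, "branches": {}, "default": majority}
    for value in {c[0][attr] for c in cases}:      # one branch per outcome
        subset = [c for c in cases if c[0][attr] == value]
        remaining = [a for a in attributes if a != attr]
        node["branches"][value] = grow_tree(subset, remaining, choose_test, min_cases)
    return node
```

Here `choose_test` stands in for the attribute-selection criterion; it could, for example, pick the attribute with the highest gain ratio, as sketched after the next paragraph.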

There are usually many tests that could be chosen in this last step. C4.5 uses two heuristic criteria to rank possible tests: information gain, which minimizes the total entropy of the subsets {Si } (but is heavily biased towards tests with numerous outcomes), and the default gain ratio that divides information gain by the information provided by the test outcomes. Attributes can be either numeric or nominal and this determines the format of the test outcomes. For a numeric attribute A they are {A ≤ h, A > h} where the threshold h is found by sorting S on the values of A and choosing the split between successive values that maximizes the criterion above. An attribute A with discrete values has by default one outcome for each value, but an option allows the values to be grouped into two or more subsets with one outcome for each subset. The initial tree is then pruned to avoid overfitting. The pruning algorithm is based on a pessimistic estimate of the error rate associated with a set of N cases, E of which do not belong to the most frequent class. Instead of E/N, C4.5 determines the upper limit of the binomial probability when E events have been observed in N trials, using a user-specified confidence whose default value is 0.25.
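The two ranking criteria can be computed directly. The following sketch (an illustration, not C4.5's source) evaluates information gain and gain ratio for a nominal attribute, with cases represented as (attribute-dict, class-label) pairs:

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_and_gain_ratio(cases, attr):
    """cases: list of (attribute_dict, class_label); attr: a nominal attribute name."""
    labels = [label for _, label in cases]
    subsets = {}
    for values, label in cases:
        subsets.setdefault(values[attr], []).append(label)
    n = len(cases)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    gain = entropy(labels) - remainder                                # information gain
    # Information provided by the test outcomes (entropy of the partition sizes).
    split_info = entropy([v for v, s in subsets.items() for _ in s])
    return gain, (gain / split_info if split_info else 0.0)
```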

Pruning is carried out from the leaves to the root. The estimated error at a leaf with N cases and E errors is N times the pessimistic error rate as above. For a subtree, C4.5 adds the estimated errors of the branches and compares this to the estimated error if the subtree is replaced by a leaf; if the latter is no higher than the former, the subtree is pruned. Similarly, C4.5 checks the estimated error if the subtree is replaced by one of its branches and when this appears beneficial the tree is modified accordingly. The pruning process is completed in one pass through the tree. C4.5’s tree-construction algorithm differs in several respects from CART [9], for instance:

• Tests in CART are always binary, but C4.5 allows two or more outcomes.
• CART uses the Gini diversity index to rank tests, whereas C4.5 uses information-based criteria.
• CART prunes trees using a cost-complexity model whose parameters are estimated by cross-validation; C4.5 uses a single-pass algorithm derived from binomial confidence limits.
• This brief discussion has not mentioned what happens when some of a case’s values are unknown. CART looks for surrogate tests that approximate the outcomes when the tested attribute has an unknown value, but C4.5 apportions the case probabilistically among the outcomes.
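Returning to the pruning step described before the CART comparison, the pessimistic estimate can be approximated numerically. The sketch below finds the upper confidence limit on the error rate by bisection on the binomial distribution; it is a simplification for illustration, not C4.5's own routine.

```python
from math import comb

def binom_cdf(e, n, p):
    """P(X <= e) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(e + 1))

def pessimistic_error_rate(e, n, cf=0.25, tol=1e-6):
    """Upper confidence limit on the error rate given e errors in n cases,
    found by bisection so that P(X <= e) equals cf (default confidence 0.25)."""
    lo, hi = e / n, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(e, n, mid) > cf:   # rate too small: seeing <= e errors is still too likely
            lo = mid
        else:
            hi = mid
    return hi

# Estimated errors at a leaf with 6 cases and 0 observed errors:
print(6 * pessimistic_error_rate(0, 6))   # about 6 * 0.206 = 1.24 estimated errors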

1.3 Ruleset classifiers

Complex decision trees can be difficult to understand, for instance because information about one class is usually distributed throughout the tree. C4.5 introduced an alternative formalism consisting of a list of rules of the form “if A and B and C and ... then class X”, where rules for each class are grouped together. A case is classified by finding the first rule whose conditions are satisfied by the case; if no rule is satisfied, the case is assigned to a default class. C4.5 rulesets are formed from the initial (unpruned) decision tree. Each path from the root of the tree to a leaf becomes a prototype rule whose conditions are the outcomes along the path and whose class is the label of the leaf. This rule is then simplified by determining the effect of discarding each condition in turn. Dropping a condition may increase the number N of cases covered by the rule, and also the number E of cases that do not belong to the class nominated by the rule, and may lower the pessimistic error rate determined as above. A hill-climbing algorithm is used to drop conditions until the lowest pessimistic error rate is found.
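A rough sketch of that condition-dropping hill climb follows; the representation of rules as lists of predicates, and the `error_rate` callback (for example the `pessimistic_error_rate` sketch above), are assumptions made for illustration.

```python
def simplify_rule(conditions, cases, target_class, error_rate):
    """Greedily drop rule conditions while the pessimistic error rate does not rise.
    conditions: predicates over a case's attribute dict;
    error_rate(errors, covered): e.g. the pessimistic_error_rate sketch above."""
    def score(conds):
        covered = [c for c in cases if all(p(c[0]) for p in conds)]
        errors = sum(1 for _, label in covered if label != target_class)
        return error_rate(errors, len(covered)) if covered else 1.0

    current, best = list(conditions), score(conditions)
    while current:
        # Find the single condition whose removal lowers (or keeps) the error rate most.
        trials = [(score([c for c in current if c is not cond]), cond) for cond in current]
        trial_score, drop_cond = min(trials, key=lambda t: t[0])
        if trial_score > best:
            break                                    # any further drop would hurt
        current = [c for c in current if c is not drop_cond]
        best = trial_score
    return current
```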

To complete the process, a subset of simplified rules is selected for each class in turn. These class subsets are ordered to minimize the error on the training cases and a default class is chosen. The final ruleset usually has far fewer rules than the number of leaves on the pruned decision tree. The principal disadvantage of C4.5’s rulesets is the amount of CPU time and memory they require. In one experiment, samples ranging from 10,000 to 100,000 cases were drawn from a large dataset. For decision trees, moving from 10,000 to 100,000 cases increased CPU time on a PC from 1.4 to 61 s, a factor of 44. The time required for rulesets, however, increased from 32 to 9,715 s, a factor of 300.

1.4 See5/C5.0

C4.5 was superseded in 1997 by a commercial system See5/C5.0 (or C5.0 for short). The changes encompass new capabilities as well as much-improved efficiency, and include:

• A variant of boosting [24], which constructs an ensemble of classifiers that are then voted to give a final classification. Boosting often leads to a dramatic improvement in predictive accuracy.
• New data types (e.g., dates), “not applicable” values, variable misclassification costs, and mechanisms to pre-filter attributes.
• Unordered rulesets—when a case is classified, all applicable rules are found and voted. This improves both the interpretability of rulesets and their predictive accuracy.
• Greatly improved scalability of both decision trees and (particularly) rulesets. Scalability is enhanced by multi-threading; C5.0 can take advantage of computers with multiple CPUs and/or cores. More details are available from http://rulequest.com/see5-comparison.html.


1.5 Research issues

We have frequently heard colleagues express the view that decision trees are a “solved problem.” We do not agree with this proposition and will close with a couple of open research problems.

Stable trees. It is well known that the error rate of a tree on the cases from which it was constructed (the resubstitution error rate) is much lower than the error rate on unseen cases (the predictive error rate). For example, on a well-known letter recognition dataset with 20,000 cases, the resubstitution error rate for C4.5 is 4%, but the error rate from a leave-one-out (20,000-fold) cross-validation is 11.7%. As this demonstrates, leaving out a single case from 20,000 often affects the tree that is constructed! Suppose now that we could develop a non-trivial tree-construction algorithm that was hardly ever affected by omitting a single case. For such stable trees, the resubstitution error rate should approximate the leave-one-out cross-validated error rate, suggesting that the tree is of the “right” size.
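The gap described here is easy to reproduce. The hedged sketch below contrasts resubstitution error with leave-one-out error, using scikit-learn's DecisionTreeClassifier as a stand-in for C4.5 on the bundled iris data rather than the letter-recognition dataset cited above.

```python
# Resubstitution vs. leave-one-out error for an unpruned decision tree (illustration only).
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0)

resub_error = 1 - tree.fit(X, y).score(X, y)                    # error on the training cases
loo_error = 1 - cross_val_score(tree, X, y, cv=LeaveOneOut()).mean()
print(f"resubstitution error: {resub_error:.3f}, leave-one-out error: {loo_error:.3f}")
```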

Decomposing complex trees. Ensemble classifiers, whether generated by boosting, bagging, weight randomization, or other techniques, usually offer improved predictive accuracy. Now, given a small number of decision trees, it is possible to generate a single (very complex) tree that is exactly equivalent to voting the original trees, but can we go the other way? That is, can a complex tree be broken down to a small collection of simple trees that, when voted together, give the same result as the complex tree? Such decomposition would be of great help in producing comprehensible decision trees.

Research on C4.5 was funded for many years by the Australian Research Council. C4.5 is freely available for research and teaching, and source can be downloaded from http://rulequest.com/Personal/c4.5r8.tar.gz.

What is Data Mining

Data mining is usually defined as searching, analyzing and sifting through large amounts of data to find relationships, patterns, or any significant statistical correlations. With the advent of computers, large databases and the internet, it is easier than ever to collect millions, billions and even trillions of pieces of data that can then be systematically analyzed to help look for relationships and to seek solutions to difficult problems. Besides governmental uses, many marketers use data mining to find strong consumer patterns and relationships. Large organizations and educational institutions also data mine to find significant correlations that can enhance our society.

While data mining is amoral in that it only looks for strong statistical correlations or relationships, it can be used for either good or not-so-good purposes. For instance, many government organizations depend on data mining to help them create solutions for societal problems; marketers use data mining to help them pinpoint and focus their attention on certain segments of the market to sell to; and in some cases black-hat hackers use data mining to steal from and scam thousands of people.

How does data mining work? The quick answer is that large amounts of data are collected. Most entities that perform data mining are large corporations and government agencies; they have been collecting data for decades and have lots of data to sift through. If you are a fairly new business or an individual, you can purchase certain types of data to mine for your own purposes. Data can also be stolen from large repositories by hackers who break into a database or simply steal laptops that are poorly protected.

If you are interested in a small case study of how mined data is collected, used and profited from, look at your local supermarket. A supermarket is usually an extremely lean and organized entity that relies on data mining to make sure it is profitable. It typically employs a POS (point of sale) system that collects data on each item purchased: the brand name, category, size, time and date of the purchase, and the price at which the item was purchased. In addition, the supermarket usually has a customer rewards program, which is also fed into the POS system; this information can directly link the products purchased to an individual. All of this data, for every purchase made over years and years, is stored by the supermarket in a database.

Now that you have a database with millions upon millions of fields and records, what are you going to do with it? Well, you data mine it. Knowledge is power, and with so much data you can uncover trends, statistical correlations, relationships and patterns that can help your business become more efficient, effective and streamlined.

The supermarket can now figure out which brands sell the most, what time of the day, week, month or year is the busiest, and which products consumers buy together with certain items. For instance, if a person buys white bread, what other items would they be inclined to buy? Typically we find it's peanut butter and jelly. There is a great deal of useful information a supermarket can extract just by mining the data it has collected.
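A small hedged sketch of that kind of basket analysis: counting which items co-occur with a target item across point-of-sale transactions. The baskets below are invented for the example.

```python
from collections import Counter
from itertools import chain

# Invented POS transactions; each inner list is one customer's basket.
baskets = [
    ["white bread", "peanut butter", "jelly"],
    ["white bread", "peanut butter", "milk"],
    ["white bread", "jelly"],
    ["milk", "eggs"],
    ["white bread", "peanut butter", "jelly", "eggs"],
]

def co_purchases(baskets, target):
    """Share of target-item baskets that also contain each other item."""
    with_target = [b for b in baskets if target in b]
    counts = Counter(chain.from_iterable(with_target))
    del counts[target]
    # confidence of the rule  target -> item  =  co-occurrences / baskets with target
    return {item: n / len(with_target) for item, n in counts.most_common()}

print(co_purchases(baskets, "white bread"))
# e.g. {'peanut butter': 0.75, 'jelly': 0.75, 'milk': 0.25, 'eggs': 0.25}
```

Run over years of real POS history, the same counting idea surfaces the bread, peanut butter and jelly pattern described above.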

Thursday, April 9, 2009

Mobile Knowledge Management

Research on mobile devices has largely been limited to technical issues. As a result, mobile networks and devices are mostly used for mobile telephony and email communication.

Knowledge management has quite a history of proven concepts for supporting users in their daily work, and for making knowledge-intensive work in particular more effective.

Combining the two research streams, knowledge management and mobile computing, yields a new concept called Mobile Knowledge Management, sometimes referred to as mKM.


Mobile Knowledge Management Architecture:



As shown above, the mobile knowledge management architecture is divided into four layers:

1. Mobile agent

This human agent interacts with the system to access knowledge. The human agent then utilizes this knowledge, transfers it to others, or enters new knowledge back into the system. The knowledge management activities of knowledge presentation and acquisition form the interface between layers one and two.

2. Mobile devices

These devices are Java-enabled phones, PDAs, or laptop computers. Software on these devices determines information about bandwidth, screen resolution and location (by GPS). This data is sent to the contextualizer to request specific knowledge.

3. Contextualizer

The contextualizer is a middleware component that serves as a bridge between the human agent and the knowledge base. It takes care of communication with the knowledge base (knowledge storage and retrieval) as well as linking the server and client side according to the context (knowledge presentation and acquisition). The contextualizer receives an XML query from the mobile device, parses the request, and analyzes the context according to four context elements: user dependency, technological environment, situation and task. Finally the knowledge base retrieves the knowledge and sends it back to the client application (a small code sketch after the layer descriptions below illustrates the flow).

Context elements:

User dependency: not all members of an organization work with all available knowledge. A user's role determines what knowledge they are allowed to access and need to access. Authentication and authorization are incorporated in the startup phase. For example, presenting all available data would lead to information overflow for executive members of the organization; they need an abstract overview of the accessible knowledge.

Technological environment: the technical environment comprises elements of the device and the network. The device plays a major role for the contextualizer: variables like screen size, processing power, available colors, or programming interfaces are key determinants for putting the knowledge into context. The network connection is another important influencing factor; the available bandwidth determines the transferable data size.

Situational elements: situational elements are those connected with time and location. The time zone is important when retrieving time-critical information from databases in faraway countries. The position of the user is especially interesting for graphical knowledge representation in maps or sketches.

Task-specific elements: the contextualizer decides what knowledge to retrieve from which database based on task-specific elements.

4. Knowledge Base

This layer contains the knowledge sources, which are built up of different database types. In this model I represent the three most popular types: relational databases (SQL), object-oriented databases and XML databases. This layer is responsible for retrieving specific knowledge and sending it to the contextualizer, or for inserting new knowledge into the database.
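As a rough illustration of how these layers could interact (not a reference implementation), the sketch below parses a hypothetical XML query from a mobile device, extracts the four context elements, and tailors the response to the available bandwidth; every element name, the threshold, and the toy knowledge base are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML query a mobile device might send to the contextualizer.
REQUEST = """
<query>
  <user role="engineer"/>
  <device screen="320x240" bandwidth_kbps="64"/>
  <situation lat="52.52" lon="13.40" timezone="UTC+1"/>
  <task name="site-inspection"/>
  <topic>pump maintenance</topic>
</query>
"""

def contextualize(xml_text, knowledge_base):
    root = ET.fromstring(xml_text)
    context = {
        "role": root.find("user").get("role"),                         # user dependency
        "bandwidth": int(root.find("device").get("bandwidth_kbps")),   # technological environment
        "location": (root.find("situation").get("lat"),                # situational element
                     root.find("situation").get("lon")),
        "task": root.find("task").get("name"),                         # task-specific element
    }
    results = knowledge_base(root.findtext("topic"), context["task"], context["role"])
    # On a low-bandwidth link, send abstracts only instead of full documents.
    if context["bandwidth"] < 128:
        results = [doc["abstract"] for doc in results]
    return results

# Toy knowledge base standing in for the SQL / object-oriented / XML databases of layer 4.
def toy_knowledge_base(topic, task, role):
    return [{"abstract": f"{topic} checklist for {task} ({role})", "body": "..."}]

print(contextualize(REQUEST, toy_knowledge_base))
```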


Mobile Knowledge Management Applications:


· Mobile Knowledge Management can help professionals and car drivers diagnose problems with a car that has broken down. There are two scenarios for this problem: get a solution to the problem, or contact the nearest garage.

· Mobile tourist guide: a specific mobile tourist guide application may, for example, provide contextual information to some degree: photos and descriptive textual information are geo-referenced in order to retrieve information relevant to the user at their current location.

· Site inspection support: a mobile workforce application supporting engineers doing site inspections for maintenance purposes has to follow the specific workflow of the business processes.



