KNOWLEDGE DISCOVERY & RESEARCH STUDIES
During the last decade, we have seen explosive growth
in our ability to both generate and collect data. Advances in data
collection, widespread use of bar codes for most commercial products,
and the computerization of many business and government transactions
have flooded us with information and generated an urgent need for new
techniques and tools that can intelligently and automatically assist
us in transforming this data into useful knowledge. Meeting that need is
the aim of the emerging field of knowledge discovery in databases (KDD)
and data mining, which draws on statistics, databases, machine learning,
and artificial intelligence.
In the data is the knowledge. In the dissemination of
that knowledge lies the power. We use knowledge-based techniques to
find provable new value in our clients' data. Rather than replace current
database technology, we extend its typical query-and-response approach
to one which incorporates knowledge of the enterprise, its purposes,
processes, and problems, opening the door to the discovery of valuable
new business knowledge in existing data.
Amid a flood of data, there is a thirst for knowledge.
Are your people drowning in data? The metaphor is ubiquitous. Relying
on faster, larger, and cheaper storage media and using better database
management systems, the world of business is awash in data. Few would
disagree with Francis Bacon's claim of 400 years ago that "Knowledge
itself is power". But in this age of data (some would say too much
data), knowledge can be elusive. Diffuse, fragmented, and seemingly useful
only for narrow purposes, the very data we rely on for better decision making,
better marketing and operations, and better control of the enterprise
can overwhelm us. Beyond the bare facts, we search for the underlying
meaning. We look for insight. We seek the knowledge hidden in the data.
Existing knowledge is the catalyst for finding new knowledge.
Although we implement and use proprietary software, our focus is not
on computer programs, but rather on the client's issues, concerns, unmet
goals, or unsolved problems. And although we do extensive computation
on client data, we tailor the process to each case by capturing and
encoding its particulars.
The Artilligence Knowledge Discovery process consists
of five steps:
- Problem Specification: Starting with issues, concerns, and general
objectives, a problem description evolves into a final problem specification
that includes quantifiable measures for later testing and validation.
The business context, background and contextual knowledge, prior
practice, rules of operation, etc., are first recorded
and then encoded as computable objects.
- Data Preparation: Phases include encoding the data dictionary and
data field semantics, sample selection, and data cleaning. Technical
issues addressed include missing data fields, data uncertainty, and
ordering of events in time.
- Data Analysis: Selection, application, integration, and customization
of data mining and data analysis methods.
- Presentation of Results: Test, validate, evaluate, implement, and
report. Results must be provably novel, useful, and understandable.
- Systems Integration
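
To make the Data Preparation step more concrete, here is a minimal sketch in Python of two of the technical issues it names: missing data fields and the ordering of events in time. It is an illustration under stated assumptions, not the proprietary software described above. The record layout, the field names (`id`, `timestamp`, `amount`), the sample values, and the `prepare` function are all hypothetical.

```python
from datetime import datetime

# Hypothetical transaction records; field names and values are illustrative only.
RAW_RECORDS = [
    {"id": 3, "timestamp": "2001-03-05", "amount": "19.99"},
    {"id": 1, "timestamp": "2001-01-17", "amount": "5.00"},
    {"id": 2, "timestamp": "2001-02-02", "amount": ""},   # missing amount
    {"id": 4, "timestamp": "", "amount": "7.50"},         # missing timestamp
]


def prepare(records):
    """Return (clean, rejected): clean records sorted by time, plus rejects.

    Assumed policy for this sketch: a record without a timestamp cannot be
    placed in the event sequence, so it is set aside; a missing amount is
    kept as None and flagged rather than guessed.
    """
    clean, rejected = [], []
    for rec in records:
        if not rec.get("timestamp"):
            rejected.append(rec)
            continue
        amount = float(rec["amount"]) if rec.get("amount") else None
        clean.append({
            "id": rec["id"],
            "timestamp": datetime.strptime(rec["timestamp"], "%Y-%m-%d"),
            "amount": amount,
            "amount_missing": amount is None,
        })
    # Restore the ordering of events in time.
    clean.sort(key=lambda r: r["timestamp"])
    return clean, rejected


clean, rejected = prepare(RAW_RECORDS)
```

Flagging a missing value instead of filling it in keeps the uncertainty visible to the later analysis step; a real engagement would choose its own policy for each field.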