Current Activities
-
Text-Mining
With the growth of the World Wide Web in recent years, text-mining has
emerged as a specialised sub-activity within the field of knowledge
discovery/data-mining. One application of such techniques is analysis of a user's Web
browsing behaviour to induce a profile of their document preferences.
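As a rough illustration of profile induction, the sketch below builds a term-frequency profile from pages a user has visited and scores a new document against it. The function names, the token filter, and the scoring rule are assumptions made for this example, not a description of any particular system.

```python
import re
from collections import Counter


def build_profile(visited_pages, top_n=5):
    """Return the top_n most frequent terms across the visited pages."""
    counts = Counter()
    for text in visited_pages:
        # Lower-case the text and keep alphabetic tokens longer than three letters
        # (a crude, assumed stand-in for stop-word removal).
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if len(t) > 3)
    return [term for term, _ in counts.most_common(top_n)]


def score(profile, document):
    """Fraction of profile terms that occur in a candidate document."""
    if not profile:
        return 0.0
    words = set(re.findall(r"[a-z]+", document.lower()))
    return sum(term in words for term in profile) / len(profile)
```

A higher score suggests that the document is closer to what the user has previously read; a practical system would use a richer weighting scheme than raw counts.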
-
Feature Selection & Dimensionality Reduction
Real-world applications of machine learning and knowledge discovery are
often faced with the problem of massive data sets containing large numbers
of instances, often with sizeable feature sets. Methods for dimensionality
reduction attempt to reduce the size of the instance space or feature
space (or sometimes both) in order to make the learning task more
tractable and to improve performance.
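One of the simplest ways to shrink a feature space is to drop features that barely vary across instances. The sketch below shows that idea only; the variance filter, the threshold, and the function names are assumptions chosen for illustration, and other selection methods are equally valid.

```python
from statistics import pvariance


def select_features(rows, threshold=0.0):
    """Return the indices of columns whose variance exceeds threshold."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def project(rows, keep):
    """Keep only the selected feature columns in every instance."""
    return [[row[i] for i in keep] for row in rows]
```

Applied to a data set in which one column is constant, `select_features` omits that column and `project` returns the reduced instances.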
-
Clustering
Conventional techniques do
not make use of domain knowledge when performing clustering. The KC
approach does make use of such knowledge, in various forms: attribute
dependencies (both strong and weak) and goal attributes. Experimental
comparisons of KC with an earlier clustering system (COBWEB) have shown a
marked improvement in performance (as measured by predictive accuracy vs.
number of training instances).
-
Adaptive Information Agents
Our work focuses on the use of learning techniques to enhance the
capabilities of agent-based systems (user-interface agents, network agents,
and multi-agent systems), allowing agents to adapt to available resources,
the nature of tasks, user characteristics, etc.
-
Distributed Data-Mining
Increasingly, data exists in a shared, distributed environment. The
application of data-mining techniques in such an environment presents
significant challenges. Our recent work has investigated the problem from
both abstract and empirical perspectives.
-
Analogical Reasoning
The ACHAB system tackles the problem of the discovery of analogies in
large, multi-functional knowledge bases. The system is based on the fusion
of the access, mapping, and generalization phases of "classical" Analogical
Reasoning; this fusion allows for the search of analogies in a knowledge
base which has not been built specifically for the analogy task. ACHAB
also exploits abstraction, in the form of a set of abstraction operators,
to allow more distant analogies through the relaxation of mapping
constraints. More recently, the concept of competition among evolving
analogies has been introduced by exploiting concurrent processes: each
tentative analogy is built incrementally by a separate process, which has
to compete for resources with other processes that are attempting to
construct alternative analogies.
Former Activities
-
Predicting Abnormal States & Situations
The TIGON system accepts a data-dependency graph for the system to be
modelled, together with a number of labelled data sets. Using curve-fitting
techniques, it learns rules that describe how the variables relate in the
"normal" state and in each of the abnormal states (the states correspond to
the labels given to the data sets). TIGON then compares the variables in
the abnormal and normal states and produces rules which can be used to
predict abnormal states and situations. Part of the labelled data is held
back to test the inferred rules; if the predictions are not good on the
test data, then the expert is given a chance to modify the data-dependency
graph or to change the labelling of the data sets (by modifying existing
labels or introducing new ones). The approach has been
applied to gas turbines; several enhancements of the data labelling have so
far been suggested.
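The general idea of fitting one curve per labelled state and then using the fits for prediction can be sketched as follows. This is a minimal illustration under stated assumptions, not TIGON's actual method: it assumes a single pair of variables, a straight-line fit, and prediction by smallest residual, and all names (including the "overheating" label in the usage note) are hypothetical.

```python
def fit_line(points):
    """Least-squares fit of y = a*x + b to (x, y) pairs; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def learn_models(labelled_sets):
    """Fit one line per labelled data set (one per system state)."""
    return {label: fit_line(points) for label, points in labelled_sets.items()}


def predict_state(models, x, y):
    """Return the label whose fitted line lies closest to the observation."""
    def residual(label):
        a, b = models[label]
        return abs(y - (a * x + b))
    return min(models, key=residual)
```

For example, with `models = learn_models({"normal": [...], "overheating": [...]})`, calling `predict_state(models, x, y)` on held-back observations gives predictions that can be compared with their true labels, mirroring the hold-out test described above.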
-
Computational Models of Scientific Discovery
Scientific discovery is perhaps amongst the most complex of intellectual
activities and its study has attracted the attention of historians and
philosophers for many years. Advances in AI and cognitive
psychology have provided new approaches and fresh insights into the nature
of science through work on computational models of the discovery process.
Activities:
- A framework in which the complex process of Theory Formation is
formulated in terms of Informal Qualitative Models (IQMs). Using this
framework, we successfully replicated many of the 18th- and 19th-century
discoveries in the area of colligative properties of solutions (depression
of freezing point and osmotic pressure).
- A graphical tool to find and correct errors in a
scientific theory. In outline, the expert is asked to provide an initial
theory and to suggest a data set on which to test this theory. Statistical
techniques determine which points conform to the theory and highlight
those which do not. These "rogue" points are investigated
further to determine whether they have any features in common (the system
is provided with a large set of previously defined chemical concepts). If
an inconsistency is found then the issue is how to "patch" the original
theory. A prototype version of the system was implemented and used,
with reasonable success, to predict properties of molten slags. In the
course of this investigation, we were able to show that many of the
transition metal elements, e.g. iron, behave differently in different
situations. The system employed a GUI to enable the chemist or
metallurgist to see inherent trends clearly.
- The IULIAN system fused ideas from machine discovery and
case-based reasoning to discover new explanations which could be used to revise an
initial theory of some domain. The term exploratory discovery refers to an
integration of self-questioning and experimentation which aims to overcome
a weakness of current machine discovery systems, namely that they use
experimental results to generate explanations without using previous
experience. IULIAN employed case-based planning techniques to learn how to
improve not only its existing theory, but also its theory revision
methods.
- The majority of work concerned with developing computational models of
scientific discovery has focused on modelling individual scientists and
their endeavours. Such work has neglected one very important aspect of
science: the extent to which scientists communicate and cooperate during the
discovery process. The MAMaLS system was used to explore some of these
issues, by modelling the discovery of the structure of DNA. MAMaLS
represents individual scientists (agents) as objects and supports several
inter-agent communication strategies. Agents are provided with
problem-solving and learning capabilities, allowing them to apply
background knowledge to solve problems and to form generalisations over
results.
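The "rogue point" step of the graphical theory-correction tool described in the list above can be illustrated with a simple residual test. The sketch assumes the theory is a function of one variable and flags points whose residual lies more than k standard deviations from the mean residual; the function name, the choice of statistic, and the default k are assumptions for illustration, since the source does not specify which statistical techniques were used.

```python
from statistics import mean, pstdev


def find_rogue_points(theory, data, k=2.0):
    """Return the (x, y) points whose residual against theory(x) is unusual."""
    residuals = [y - theory(x) for x, y in data]
    centre = mean(residuals)
    spread = pstdev(residuals)
    if spread == 0:
        # Every point deviates from the theory by the same amount.
        return []
    return [pt for pt, r in zip(data, residuals) if abs(r - centre) > k * spread]
```

The flagged points would then be examined for shared features, which is the stage at which the expert considers how to patch the theory.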