Monday, November 30, 2015

Getting Human-Like Values into Advanced OpenCog AGIs

Some Speculations Regarding Value Systems for Hypothetical Powerful OpenCog AGIs

In a recent blog post, I proposed two general theses regarding the future value systems of human-level and transhuman AGI systems: the Value Learning Thesis (VLT) and the Value Evolution Thesis (VET).   This post pursues the same train of thought further, attempting to make these ideas more concrete by speculating about how the VLT and VET might manifest themselves in the context of an advanced version of the OpenCog AGI platform.

Currently OpenCog comprises a comprehensive design plus a partial implementation, and it cannot be known with certainty how functional a fully implemented version of the system will be.   The OpenCog project is ongoing and the system becomes more functional each year.  Independent of this, however, the design may be taken as representative of a certain class of AGI systems, and its conceptual properties explored.

An OpenCog system has a certain set of top-level goals, which initially are supplied by the human system programmers.   Much of its cognitive processing is centered on finding actions which, if executed, appear to have a high probability of achieving system goals.  The system carries out probabilistic reasoning aimed at estimating these probabilities.   Viewed this way, the goal of its reasoning is to infer propositions of the form “Context & Procedure ==> Goal”; but in order to estimate the probabilities of such propositions, it needs to form and estimate probabilities for a host of other propositions – concrete ones involving its sensory observations and actions, and more abstract generalizations as well.   Since precise probabilistic reasoning based on the total set of the system’s observations is infeasible, numerous heuristics are used alongside exact probability-theoretic calculations.   Part of the system’s inferencing involves figuring out what subgoals may help it achieve its top-level goals in various contexts.
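
To make the goal-driven selection process a bit more concrete, here is a toy sketch of the basic loop in Python.  This is nothing like OpenCog’s actual PLN machinery; the context, procedure and goal names are hypothetical, and the probability estimates are simply stipulated rather than inferred.

    from typing import Callable, Dict, List, Tuple

    # Toy sketch: choose the procedure whose estimated probability of achieving
    # a goal in the current context is highest.  estimate_probability stands in
    # for whatever inference and heuristics supply P(Goal | Context & Procedure).
    def choose_procedure(context: str,
                         procedures: List[str],
                         estimate_probability: Callable[[str, str], float]) -> str:
        scored = {p: estimate_probability(context, p) for p in procedures}
        return max(scored, key=scored.get)

    # Hypothetical estimates, standing in for evaluated
    # "Context & Procedure ==> Goal" propositions.
    estimates: Dict[Tuple[str, str], float] = {
        ("owner_nearby", "greet_owner"): 0.7,
        ("owner_nearby", "explore_room"): 0.2,
    }

    best = choose_procedure("owner_nearby",
                            ["greet_owner", "explore_room"],
                            lambda c, p: estimates.get((c, p), 0.05))
    print(best)  # -> greet_owner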

Exactly what set of top-level goals should be given to an OpenCog system aimed at advanced AGI is not yet fully clear, and will largely be determined via experimentation with early-stage OpenCog systems; but a first approximation, arrived at via a combination of theoretical and pragmatic considerations, is as follows.    The first four values on the list are drawn from the Cosmist ethical analysis presented in my books A Cosmist Manifesto and The Hidden Pattern; the others are included for fairly obvious pragmatic reasons to do with the nature of early-stage AGI development and social integration.  The order of the items on the list is arbitrary as given here; each OpenCog system would have a particular weighting for its top-level goals.

  • Joy: maximization of the amount of pleasure observed or estimated to be experienced by sentient beings across the universe
  • Growth: maximization of the amount of new pattern observed or estimated to be created throughout the universe
  • Choice: maximization of the degree to which sentient beings across the universe appear to be able to make choices (according e.g. to the notion of “natural autonomy”, a scientifically and rationally grounded analogue of the folk notion and subjective experience of “free will”)
  • Continuity:  persistence of patterns over time.   Obviously this is a counterbalance to Growth; the relative weighting of these two top-level goals will help determine the “conservatism” of a particular OpenCog system with the goal-set indicated here.
  • Novelty: the amount of new information in the system’s perceptions, actions and thoughts
  • Human pleasure and fulfillment: How much do humans, as a whole, appear to be pleased and fulfilled?
  • Human pleasure regarding the AGI system itself: How pleased do humans appear to be with the AGI system, and their interactions with it?
  • Self-preservation: a goal fulfilled if the system keeps itself “alive.”   This is actually somewhat subtle for a digital system.    It could be defined in a copying-friendly way, as preservation of the existence of sentiences whose mind-patterns have evolved from the mind-patterns of the current system with a reasonable degree of continuity.

This list of goals has a certain arbitrariness to it, and no doubt will evolve as OpenCog systems are experimented with.   However, it comprises a reasonable “first stab” at a “roughly human-like” set of goal-content for an AGI system.
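
For concreteness, here is a minimal sketch of what one particular weighting of these top-level goals might look like in code.  The goal names and numbers are purely illustrative placeholders, not a recommendation; the point is just that the relative weights (e.g. Continuity versus Growth) parameterize the character of a particular system.

    # Illustrative only: one possible weighting of the top-level goals listed above.
    top_level_goal_weights = {
        "joy":                     0.20,
        "growth":                  0.15,
        "choice":                  0.15,
        "continuity":              0.10,  # raise relative to growth for a more "conservative" system
        "novelty":                 0.10,
        "human_fulfillment":       0.15,
        "human_pleasure_with_agi": 0.10,
        "self_preservation":       0.05,
    }

    assert abs(sum(top_level_goal_weights.values()) - 1.0) < 1e-9

    def overall_goal_satisfaction(estimates):
        """Weighted sum of per-goal satisfaction estimates, each in [0, 1]."""
        return sum(weight * estimates.get(goal, 0.0)
                   for goal, weight in top_level_goal_weights.items())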

One might wonder how such goals would be specified for an AGI system.   Does one write source-code that attempts to embody some mathematical theory of continuity, pleasure, joy, etc.?    For some goals mathematical formulae may be appropriate, e.g. novelty, which can be gauged information-theoretically in a plausible way.   In most cases, though, I suspect the best way to define a goal for an AGI system will be using natural human language.   Natural language is intrinsically ambiguous, but so are human values, and these ambiguities are closely coupled and intertwined.   Even where a mathematical formula is given, it might be best to use natural language for the top-level goal, and supply the mathematical formula as an initially suggested means of achieving the NL-specified goal.
The AGI would need to be instructed – again, most likely in natural language – not to obsess over the specific wording supplied to it in its top-level goals, but rather to take the wording of its goals as indicative of general concepts that exist in human culture and can be expressed only approximately in concise sequences of words.     The specification of top-level goal content is not intended to precisely direct the AGI’s behavior in the way that, say, a thermostat is directed by the goal of keeping temperature within certain bounds.  Rather, it is intended to point the AGI’s self-organizing activity in certain informally-specified directions.

Alongside explicitly goal-oriented activity, OpenCog also includes “background processing” – cognition simply aimed at learning new knowledge, and forgetting relatively unimportant knowledge.   This knowledge provides background information useful for reasoning regarding goal-achievement, and also builds up a self-organizing, autonomously developing body of active information that may sometimes lead a system in unpredictable directions – for instance, to reinterpretation of its top-level goals.

The goals supplied to an OpenCog system by its programmers are best viewed as initial seeds around which the system forms its goals.  For instance, a top-level goal of “novelty” may be specified as a certain mathematical formula for calculating the novelty of the system’s recent observations, actions and thoughts.  However, this mathematical formula may be intractable in its purest and most general form, leading the system to develop various context-specific approximations to estimate the novelty experienced in different situations.   These approximations, rather than the top-level novelty formula, will be what the system actually works to achieve.   Improving these approximations will be part of the system’s activity, but how much attention to pay to improving them will be a choice the system has to make as part of its thinking process.    Potentially, if the approximations are bad, they might cause the system to delude itself that it is experiencing novelty (according to its top-level equation) when it actually isn’t, and also tell the system that there is no additional novelty to be found in improving its novelty estimation formulae.
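
As a concrete illustration of what such a formula might look like, here is one crude information-theoretic stand-in for novelty: the average surprisal of recent observations under a frequency model built from the system’s history.  This is just a toy sketch of the general idea, not the formula an actual OpenCog system would use; the symbols and numbers are made up.

    from collections import Counter
    import math

    # Crude stand-in for a "novelty formula": average surprisal (in bits) of the
    # recent observation stream, under a smoothed frequency model of the history.
    # A context-specific approximation might replace the full history with a model
    # fitted to one situation, which is exactly where self-deception could creep in.
    def novelty(history, recent, smoothing=1.0):
        counts = Counter(history)
        vocab = set(history) | set(recent)
        total = sum(counts.values()) + smoothing * len(vocab)
        def prob(x):
            return (counts[x] + smoothing) / total
        return sum(-math.log2(prob(x)) for x in recent) / max(len(recent), 1)

    print(novelty(list("aaabbbccc"), list("abc")))  # familiar symbols: ~1.6 bits each
    print(novelty(list("aaabbbccc"), list("xyz")))  # unseen symbols: ~3.9 bits each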

And this same sort of problem could occur with goals like “help cause people to be pleased and fulfilled.”   Subgoals of the top-level goal may be created via more or less crude approximations; and these subgoals may influence how much effort goes into improving the approximations.   Even if the system is wired to put a fixed amount of effort into improving its estimations regarding which subgoals should be pursued in pursuit of its top-level goals, the particular content of the subgoals will inevitably influence the particulars of how the system goes about improving these estimations.

The flexibility of an OpenCog system, its ability to ongoingly self-organize, learn and develop, brings the possibility that it could deviate from its in-built top-level goals in complex and unexpected ways.  But this same flexibility is what should – according to the design intention – allow an OpenCog system to effectively absorb the complexity of human values.   Via interacting with humans in rich ways – not just via getting reinforced on the goodness or badness of its actions (though such reinforcement will impact the system assuming it has goals such as “help cause human pleasure and fulfillment”), but via all sorts of joint activity with humans – the system will absorb the ins and outs of human psychology, culture and value.   It will learn subgoals that approximately imply its top-level goals, in a way that fits with human nature, and with the specific human culture and community it’s exposed to as it grows.

In the above I have been speaking as if an OpenCog system is ongoingly stuck with the top-level goals that its human programmers have provided it with; but this is not necessarily the case.   Operationally it is unproblematic to allow an OpenCog system to modify its top-level goals.   One might consider this undesirable, yet a reflection on the uncertainty and ignorance necessarily going into any choice of goal-set may make one think otherwise.  

A highly advanced intelligence, forced by design to retain top-level goals programmed by minds much more primitive than itself, could develop an undesirably contorted psychology, based on internally working around its fixed goal programming.   Human psychology is replete with examples of this sort of problem.  For instance, we humans are “programmed” with a great deal of highly-weighted goal content relevant to reproduction, sexuality and social status, but the more modern aspects of our minds have mixed feelings about these archaic evolved goals.   But it is very hard for us to simply excise these historical goals from our minds.   Instead we have created quite complex and subtle psychological and social patterns that indirectly and approximately achieve the archaic goals encoded in our brains, while also letting us go in the directions in which our minds and cultures have self-organized during recent millennia.    Hello Kitty, romantic love, birth control, athletic competitions, investment banks – the list of human-culture phenomena apparently explicable in this way is almost endless.

One key point to understand, closely relevant to the VLT, is that the foundation of OpenCog’s dynamics in explicit probabilistic inference will necessarily cause it to diverge somewhat from human judgments.   As a probabilistically grounded system, OpenCog will naturally try to accurately estimate the probability of each abstraction it makes actually applying in each context it deems relevant.    Humans sometimes do this – otherwise they wouldn’t be able to survive in the wild, let alone carry out complex activities like engineering computers or AI systems – but they also behave quite differently at times.   Among other issues, humans are strongly prone to “wishful thinking” of various sorts.   If one were to model human reasoning using a logical formalism, one might end up needing to include a rule of the rough form

X would imply achievement of my goals
therefore
X’s truth value gets boosted

Of course, a human being who applied this rule strongly to every X in their mind would become completely delusional and dysfunctional.  No human is like that.  But this sort of wishful thinking infuses human minds, alongside serious attempts at accurate probabilistic reasoning, plus various heuristics with well-documented systematic biases.   Belief revision combines conclusions drawn via wishful thinking with conclusions drawn via attempts at accurate inference, in complex and mainly unconscious ways.
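
Just to make the rough rule above slightly more concrete, here is a cartoon of the mixing process in code.  It is not a model of human belief revision (or of OpenCog’s); the mixing weight and the linear form are arbitrary choices for illustration.

    # Cartoon of the rough rule above: the revised confidence in X mixes an
    # evidence-based estimate with a "wishful" boost proportional to how much
    # X would help the agent's goals.  wishful_weight = 0 gives a purely
    # evidential reasoner; a human-like reasoner applies some nonzero weight,
    # mostly unconsciously.
    def revise_belief(evidence_prob, goal_benefit, wishful_weight=0.2):
        """evidence_prob and goal_benefit lie in [0, 1]."""
        return (1 - wishful_weight) * evidence_prob + wishful_weight * goal_benefit

    print(revise_belief(0.3, 0.9))                      # -> 0.42, inflated toward the wish
    print(revise_belief(0.3, 0.9, wishful_weight=0.0))  # -> 0.3, evidence only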

Some of the biases of human cognition are sensible consequences of trying to carry out complex probabilistic reasoning on complex data using limited space and time resources.  Others are less “forgivable” and appear to exist in the human psyche for “historical reasons”, e.g. because they were adaptive for some predecessor of modern humanity in some contexts and then just stuck around.

An advanced OpenCog AGI system, if thoroughly embedded in human society and infused with human values, would likely arrive at its own variation of human values, differing from nearly any human being’s particular value system in its bias toward logical and probabilistic consistency.   The closest approximation to such an OpenCog system’s value system might be the values of a human belonging to the human culture in which the OpenCog system was embedded, and who also had made great efforts to remove any (conscious or unconscious) logical inconsistencies in his value system.

What does this speculative scenario have to say about the VLT and VET?  

Firstly, it seems to support a limited version of the VLT.   An OpenCog system, due to its fundamentally different cognitive architecture, is not likely to inherit the logical and probabilistic inconsistencies of any particular human being’s value system.  Rather, one would expect it to (implicitly and explicitly) seek to find the best approximation to the value system of its human friends and teachers, within the constraint of approximate probabilistic/logical consistency that is implicit in its architecture.  

The precise nature of such a value system cannot be entirely clear at this moment, but is certainly an interesting topic for speculative thinking.    First of all, it is fairly clear which sorts of properties of typical human value systems would not be inherited by an OpenCog of this hypothetical nature.   For instance, humans have a tendency to place a great deal of extra value on goods or ills that occur in their direct sensory experience, far beyond what would be justified by the increased confidence associated with direct experience as opposed to indirect experience.   Humans tend to value feeding a starving child sitting right in front of them vastly more than feeding a starving child halfway across the world.  One would not expect a reasonably consistent human-like value system to display this property.
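
A toy calculation makes the point.  The confidence numbers below are invented purely for illustration; only the shape of the comparison matters.

    # If direct observation of a child's plight yields confidence 0.99 that help
    # is needed and will be effective, while a distant report yields confidence
    # 0.90, a probabilistically consistent valuation would differ by roughly the
    # ratio of the confidences -- about 1.1x -- not by the orders of magnitude
    # seen in typical human behavior.
    direct_confidence, distant_confidence = 0.99, 0.90
    print(direct_confidence / distant_confidence)   # -> 1.1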

Similarly, humans tend to be much more concerned with goods or ills occurring to individuals who share more properties with themselves – and the choice of which properties to weight more highly in this sort of judgment is highly idiosyncratic and culture-specific.    If an OpenCog system doesn’t have a top-level goal of “preserving patterns similar to the ones detected in my own mind and body”, then it would not be expected to have the same “tribal” value-system bias that humans tend to have.    Some level of “tribal” value bias can be expected to emerge via abductive reasoning based on the goal of self-preservation (assuming this goal is included), but it seems qualitatively that humans have a much more tribally-oriented value system than could be derived via this sort of indirect factor alone.   Humans evolved partially via tribe-level group selection; an AGI need not do so, and this would be expected to lead to significant value-system differences.    

Overall, one might reasonably expect an OpenCog created with the above set of goals and methodology of embodiment and instruction to arrive at a value system that is roughly human-like, but without the glaring inconsistencies plaguing most practical human value systems.   Many of the contradictory aspects of human values have to do with conflict between modern human culture and “historical” values that modern humans have carried over from early human history (e.g. tribalism).   One may expect that, in the AGI’s value system, the modern culture side of such dichotomies will generally win out – because it is what is closer to the surface in observed human behavior and hence easier to detect and reason about, and also because it is more consilient with the explicitly Cosmist values (Joy, Growth, Choice) in the proposed first-pass AGI goal system.  

So to a first approximation, one might expect an OpenCog system of this nature to settle into a value system that
  • Resembles the human values of the individuals who have instructed and interacted with it
  • Displays a strong (but still just approximate) logical and probabilistic consistency and coherence
  • Generally resolves contradictions in human values via selecting modern-culture value aspects over “archaic” historical value aspects


It seems likely that such a value system would generally be acceptable to human participants in modern culture who value logic, science and reason (alongside other human values).    Obviously human beings who prefer the more archaic aspects of human values, and consider modern culture largely an ethical and aesthetic degeneration, would tend to be less happy with this sort of value system.  

So in this view, an advanced OpenCog system appropriately architected and educated would validate the VLT, but with a moderately loose interpretation.   Its value system would be in the broad scope of human-like value systems, but with a particular bias and with a kind of consistency and purity not likely present in any particular human being’s value system.

What about the VET?   It seems intuitively likely that the ongoing growth and development of an OpenCog system as described above would parallel the growth and development of human uploads, cyborgs or biologically-enhanced humans who were, in the early stage of their posthuman evolution, specifically concerned with reducing their reliance on archaic values and increasing their coherence and logical and probabilistic consistency.   Of course, this category might not include all posthumans – e.g. some religious humans, given the choice, might use advanced technology to modify their brains to cause themselves to become devout in their particular religion to a degree beyond all human limits.   But it would seem that an OpenCog system as described above would be likely to evolve toward superhumanity in roughly the same direction as a human being with transhumanist proclivities and a roughly Cosmist outlook.    If indeed this is the case, it would validate the VET, at least in this particular sort of situation.

It will certainly be noted that the value system of “a human being with transhumanist proclivities and a Cosmist outlook” is essentially the value system of the author of this article, and the author of the first-pass, roughly sketched OpenCog goal content used as the basis of the discussion here.   Indeed, the goal system outlined above is closely matched to my own values.   For instance, I tend toward technoprogressivism as opposed to transhumanist political libertarianism – and this is reflected in my inclusion of values related to the well-being of all sentient beings, and lack of focus on values regarding private property.   

In fact, different weightings of the goals in the above-given goal-set would be expected to lead to different varieties of human-level and superhuman AGI value system – some of which would be more “technoprogressivist” in nature and some more “political libertarian” in nature, among many other differences.   In a cosmic sense, though, this sort of difference is ultimately fairly minor.  These are all variations of modern human value system, and occupy a very small region in the space of all possible value systems that could be adopted by intelligences in our universe.   Differences between different varieties of human value system often feel very important to us now, but may well appear quite insignificant to our superintelligent descendants.


Friday, November 20, 2015

What does Google’s tensorflow mean for AI?



Google’s release of their tensorflow machine learning library has attracted a lot of attention recently.   Like everyone else in the field I’ve felt moved to take a look.

(Microsoft's recent release of an open source distributed machine learning toolkit is also interesting.   But that would be another story; here I'll restrict myself to tensorflow...)

tensorflow as a Deep Machine Learning Toolkit


Folks familiar with tools for deep learning based machine vision will quickly see that the
tensorflow neural net library is fairly similar in concept to the Theano/pylearn2 library from Yoshua Bengio’s team at U. Montreal.   Its functionality is similar to Theano/pylearn2 and also to other modern deep ML toolkits like Caffe.   However, it looks like it may combine the strengths of the different existing toolkits in a novel way — an elegant, simple-to-use architecture like Theano/pylearn2, combined with rapid execution like one gets with Caffe.

Tensorflow is an infrastructure and toolkit, intended for building and running specific deep learning algorithms within it.  The specific algorithms initially released with the toolkit are well-known and fairly limited.   For instance, they give a 2D convolutional neural net but not a 3D one (though Facebook open-sourced a 3D CNN not long ago).

The currently released version of tensorflow runs on one machine only (though making efficient use of multiple processors).  But it seems they may release a distributed version some time fairly soon.

tensorflow as a Dataflow Framework


As well as a toolkit for implementing distributed deep learning algorithms, tensorflow is also — underneath — a fairly general framework for “dataflow”, for passing data around among the nodes of a graph.   However, looked at as a dataflow architecture it has some fairly strict limitations, which emerge directly from its purpose as an infrastructure for current deep learning neural net algorithms.

For one thing, tensorflow seems optimized for passing around pretty large chunks of data ....  So if one wanted to use it to spread activation around in a network, one wouldn't make an Operation per neuron; rather, one would make an "activation-spreading" Operation and have it act on a connection matrix or similar....
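
For example, something like the following minimal sketch captures the granularity point: one matrix-multiply Operation spreads activation across a whole (hypothetical) three-node network at once.  The weights are made up, and evaluation details differ by tensorflow version (a Session in the original graph-based API, eager execution in later versions).

    import tensorflow as tf

    # One "activation-spreading" Operation acting on a connection matrix,
    # rather than one Operation per neuron.
    connections = tf.constant([[0.0, 0.8, 0.1],
                               [0.2, 0.0, 0.7],
                               [0.5, 0.3, 0.0]])      # illustrative weights
    activation = tf.constant([[1.0], [0.0], [0.5]])   # current activation levels

    spread = tf.matmul(connections, activation)       # next round of activations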

Furthermore, tensorflow’s execution model seems to be fundamentally *synchronous*.  Even when run across multiple machines in distributed mode using Senders and Receivers, the basic mathematical operation of the network is synchronous.  This is fine for most current
deep learning algorithms, which are constructed of nodes that are assumed to pass information around among each other in a specific and synchronized way.  The control mechanisms tensorflow provides (e.g. for and while constructs) are flowchart-like rather than adaptive-network-like, and remain within the synchronized execution paradigm, so far as I can tell.

This is a marked contrast to ROS, which my team at OpenCog and Hanson Robotics is currently using for robotics work — in ROS one wraps up different functions in ROS nodes, which interact with each other autonomously and asynchronously.  It’s also a contrast to the BriCA framework for AGI and brain emulation produced recently by the Japanese Whole Brain Initiative.   BriCA’s nodes pass around vectors rather than tensors, but since a tensor is basically a multidimensional stack of vectors, this amounts to the same thing.  BriCA’s nodes interact asynchronously via a simple but elegant mechanism.   This reflects the fact that BriCA was engineered as a framework for neural net based AGI, whereas tensorflow was engineered as a framework for a valuable but relatively narrow class of deep learning based data processing algorithms.

That is: conceptually, it seems that tensorflow is made for executing precisely-orchestrated multi-node algorithms (potentially in a distributed way), in which interaction among nodes happens in a specifically synchronized and predetermined way based on a particular architecture; whereas BriCA can also be applied to more open-ended designs in which different nodes (components) react to each other's outputs on the fly, without the dynamic relations between the components' behaviors being worked out in advance within an overall architecture.  Philosophically, this relates to the more "open-ended" nature of AGI systems.
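
The contrast can be sketched schematically as follows.  This is not how tensorflow, ROS or BriCA are actually implemented; it is just the difference in control-flow shape, with all names invented for illustration.

    import queue

    # Synchronous dataflow: components are stepped in a fixed, predetermined
    # order, each consuming exactly the inputs the schedule says are ready.
    def synchronous_run(components, steps):
        for _ in range(steps):
            for component in components:
                component.step()

    # Asynchronous message passing: a component reacts to whatever arrives on
    # its input queue, whenever it arrives, with no global schedule.
    def asynchronous_loop(inbox: queue.Queue, outbox: queue.Queue, transform):
        while True:
            message = inbox.get()    # blocks until some other component sends
            if message is None:      # conventional shutdown signal
                break
            outbox.put(transform(message))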

tensorflow and OpenCog?


My view on the currently popular deep learning architectures for data processing (whose implementation, tweaking and application tensorflow is intended to ease) is that they are strong for perceptual pattern recognition, but do not constitute general-purpose cognitive architectures for general intelligence.

Contrasting tensorflow and OpenCog (which is worse by far than contrasting apples and oranges, but so be it…), one observation we can make is that an OpenCog Atom is a persistent store of information, whereas a TensorFlow graph is a collection of Operations (each translating input into output).  So, on the face of it, TensorFlow is best for (certain sorts of) procedural knowledge, whereas Atomspace is best for declarative knowledge....   It seems the "declarative knowledge" in a TensorFlow graph is pretty much contained in the numerical tensors that the Operations pass around...

In OpenCog’s MOSES component, small LISP-like programs called “Combo trees” are used to represent certain sorts of procedural knowledge; these are then mapped into the Atomspace for declarative analysis.  But deep learning neural nets are best suited to representing sorts of procedural knowledge different from those Combo trees handle — e.g. procedural knowledge used for low-level perception and action.  (The distinction between procedural and sensorimotor knowledge blurs a bit here, but that would be a topic for another blog post….)

I had been thinking about integrating deep learning based perception into OpenCog using Theano / pylearn2 as an underlying engine — making OpenCog Atoms that executed small neural networks on GPU, and using the OpenCog Atomspace to glue together these small neural networks (via the Atoms that refer to them) into an overall architecture.  See particulars here and here.

Now I am wondering whether we should do this using tensorflow instead, or as well….

In terms of OpenCog/tensorflow integration, the most straightforward thing would be to implement


  • TensorNode ... with subtypes as appropriate
  • GroundedSchemaNodes that wrap up TensorFlow "Operations"


This would allow us to basically embed TensorFlow graphs inside the Atomspace...

Deep learning operations like convolution are represented as opaque operations in tensorflow, and would also be opaque operations (wrapped inside GSNs) in OpenCog....
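
A rough sketch of what this embedding might look like, using OpenCog’s existing Python bindings and the usual “py:” naming convention for GroundedSchemaNodes, is below.  Since no TensorNode type exists yet, a ConceptNode keyed into an ordinary Python table of tensors stands in for it; the module, function and tensor names are all hypothetical, and the tensorflow call details vary by version.

    import tensorflow as tf
    from opencog.atomspace import AtomSpace, types

    atomspace = AtomSpace()

    # Stand-in for the proposed TensorNode: a ConceptNode whose name keys into a
    # Python-side table of tensors.  A real TensorNode type would hold the tensor.
    tensor_table = {"image_batch_0": tf.zeros([1, 28, 28, 1])}
    image_atom = atomspace.add_node(types.ConceptNode, "image_batch_0")

    def run_convolution(tensor_atom):
        """Function the GroundedSchemaNode points at: looks up the named tensor,
        applies an opaque tensorflow-side convolution, stores the result, and
        returns an Atom naming it."""
        result = tf.nn.conv2d(tensor_table[tensor_atom.name],
                              tf.ones([3, 3, 1, 1]),   # trivial 3x3 kernel
                              [1, 1, 1, 1], "SAME")
        tensor_table["conv_out_0"] = result
        return atomspace.add_node(types.ConceptNode, "conv_out_0")

    # GroundedSchemaNode wrapping the tensorflow-backed operation; "my_module" is
    # a hypothetical module path.  Applying it to the image Atom would go through
    # an ExecutionOutputLink as usual.
    conv_gsn = atomspace.add_node(types.GroundedSchemaNode,
                                  "py: my_module.run_convolution")
    exec_link = atomspace.add_link(
        types.ExecutionOutputLink,
        [conv_gsn, atomspace.add_link(types.ListLink, [image_atom])])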

The purported advantage over Theano would be that TensorFlow is supposed to be faster (we'll test), whereas Theano has an elegant interface but is slower than Caffe ...

Wrapping Operations inside GSN would add a level of indirection/inefficiency, but if the Operations are expensive things like running convolutions on images or multiplying big matrices, this doesn't matter much...

Anyway, we will evaluate and see what makes sense! …

Rambling Reflections on the Open-Source Ecosystem


The AI / proto-AGI landscape is certainly becoming interesting and complex these days.  It seems that AI went in just a few years from being obscure and marginalized (outside of science fiction) to being big-time corporate.  Which is exciting in terms of the R&D progress it will likely lead to, yet frustrating to those of us who aren’t thrilled with the domination of the world socioeconomy by megacorporations.

But then we also see a major trend of big companies sharing significant aspects of their AI code with the world at large via open-source releases like Facebook’s conv3D code and Google’s tensorflow, and so many others.   They are doing this for multiple reasons — one is that it keeps their research staff happy (most researchers want to feel they’re contributing to the scientific community at large rather than just to one company); and another is that other researchers, learning from and improving on the code they have released, will create new innovations they can use.  The interplay between the free-and-open R&D world and the corporate-and-proprietary R&D world becomes subtler and subtler.

Supposing we integrate tensorflow into OpenCog and it yields interesting results… Google could then choose to use OpenCog themselves and integrate it into their own systems.  Hopefully if they did so, they would push some of their OpenCog improvements into the open-source ecosystem as well.  Precisely where this sort of thing will lead business-wise is not entirely clear, given the shifting nature of current tech business models, but it’s already clear that companies like Google don’t derive the bulk of their business advantage from proprietary algorithms or code, but rather from the social dynamics associated with their products and their brand.

If open-source AI code were somehow coupled with a shift in the dynamics of online interaction, to something more peer-to-peer and less big-media and big-company and advertising dominated — THEN we would have a more dramatic shift, with interesting implications for everybody’s business model.  But that’s another topic that would lead us far afield from tensorflow.  For the time being, it seems that the open-source ecosystem is playing a fairly core role in the complex unfolding of AI algorithms, architectures and applications among various intellectual/socioeconomic actors … and funky stuff like tensorflow is emerging as a result.

(Interesting but Limited) Progress in Neural Net Based Language Learning


A team of UK-based researchers has published an interesting paper on language learning & reasoning using neural networks.   There has also been a somewhat sensationalist media article describing the work.

I was especially familiar with one of the authors, Angelo Cangelosi, who gave a keynote at the AGI-12 conference at Oxford, touching on some of his work with the iCub robot.

The news article (but not the research paper) says that the ANNABELL system reported here is the first time automated dialogue has been done w/ neural nets....  Actually, no.  I recall a paper by Alexander Borzenko giving similar results in the "Artificial Brains" special issue of Neurocomputing that Hugo de Garis and I co-edited some years ago….  And I’m pretty sure there were earlier examples as well.

When I pointed the ANNABELL work out to Japanese AGI researcher Koichi Takahashi, he noted a few recent related works, such as:




See also this nice survey on the emergent approach for language in robotics today.

So, what distinguishes this new work by Cangelosi and colleagues from other related stuff I’ve seen is more the sophistication of the underlying cognitive architecture.   Quite possibly ANNABELL works better than prior NNs trained for dialogue-response, or maybe it doesn't; careful comparison isn't given, which is understandable since there is no standard test corpus for this sort of thing, and prior researchers mostly didn't open their code.   But the cognitive architecture we see described here is very carefully constructed in a psychologically realistic way; combined with the interesting practical results, this is pretty nifty...

The training method is interesting: incrementally feeding the system facts of increasing complexity, interacting with it along the way, and letting it build up its knowledge bit by bit.   A couple of weeks ago, at RobotWorld in Seoul, I talked to a Russian company (whose name is unfortunately slipping my mind at the moment, though it began with a Z) that has been training a Russian NLP dialogue system in a similar way (again with those Russians!!).... But the demo they were showing that day was only in Russian, so I couldn’t really assess it.

To my mind, the key limitation of the approach we see here is that the passage from question to response occurs very close to the word and word-sequence level.  There is not much conceptualization going on here.  There is a bit of generalization, but it’s generalization very close to the level of sentence forms.  This is not an issue of symbolic versus connectionist, it’s a matter of the kinds of patterns the system recognizes and represents.

For instance, with this method, the system will respond to many questions involving the word "dad" without really knowing what a "dad" is (e.g. without knowing that a dad is a human or is older than a child, etc.).   This is just fine, and people can do this too.   But we should avoid assuming that just because it gives responses that, if heard from a human, would result from a certain sort of understanding, the system is demonstrating that same sort of understanding.    This system is building up question-response patterns from the data fed into it, and then performing some (real, yet fairly shallow) generalization.  The AI question is whether the kind of generalization it is performing is really the right kind to support generally intelligent cognition.

My feeling is that the kind of processing their network is doing actually plays only a minor supporting role in human question-answering and dialogue behavior.   They are using a somewhat realistic cognitive architecture for reactive processing, and a somewhat realistic neural learning mechanism -- but the way the learning mechanism is used within the architecture for processing language is not very much like the way the brain processes language.   The consequence of this difference is that their system is not really forming the kinds of abstractions that a human mind (even a child's mind) automatically forms when processing this kind of linguistic information....   The result is that the kinds of question-answering, question-asking, concept formation etc. their system can do will not actually resemble those of a human child, even though their system's answer-generation process may, under certain restrictions, give results resembling those you get from a human child...

The observations I’m making here do not really contradict anything said in the paper, though they of course contradict some of the more overheated phrasings in the media coverage….  We have here a cognitive architecture that is intended as a fragment of an overall cognitive architecture for human-level, human-like general intelligence.  Normally, this fragmentary architecture would not do much of anything on its own, certainly not anything significant regarding language.  But in order to get it to do something, the authors have paired their currently-fragmentary architecture with learning subsystems in a way that wires utterances to responses more directly than happens in a human mind, bypassing many important processes related to conceptualization, motivation and so forth.

It’s an interesting step, anyway.