The AICamp “Unconference” in Frankfurt follows the tradition of the Cloud Camps which have been taking place since 2009 and offer lectures and discussions.
I will talk about the "7 challenges of AI projects" and am looking forward to inspiring discussions afterwards.
A team of scientists at New York University has developed a new multi-scale, sliding-window approach that can be used for image classification, detection, and localization. Compared with other approaches on the ILSVRC datasets, it ranked 4th in classification, 1st in localization and 1st in detection.
Another important contribution of this paper is its explanation of how ConvNets can be used effectively for detection and localization tasks. The team is the first to clarify how this can be achieved on ImageNet. The proposed scheme includes substantial modifications to the network design used for classification. The approach also shows how different tasks can be learned simultaneously with a single shared network.
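To make the idea of a multi-scale sliding window more tangible, here is a minimal sketch in Python. It is not the team's implementation: the function names, the window size, the stride, the scales and the `mean_brightness` scoring function (a stand-in for a trained ConvNet) are all assumptions chosen for illustration, and the paper's efficient convolutional evaluation is not reproduced here.

```python
import numpy as np

def resize_nearest(image, scale):
    """Rescale a 2-D image by nearest-neighbour sampling (keeps the sketch dependency-free)."""
    h, w = image.shape
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]

def sliding_windows(image, window, stride):
    """Yield (row, col, patch) for every window position inside the image."""
    h, w = image.shape
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            yield r, c, image[r:r + window, c:c + window]

def detect(image, score_fn, window=8, stride=4, scales=(1.0, 0.5)):
    """Return the best-scoring box as (score, row, col, size) in original coordinates."""
    best = None
    for scale in scales:
        scaled = resize_nearest(image, scale)
        for r, c, patch in sliding_windows(scaled, window, stride):
            score = score_fn(patch)
            # Map the window back to the coordinates of the original image.
            box = (score, int(r / scale), int(c / scale), int(window / scale))
            if best is None or score > best[0]:
                best = box
    return best

def mean_brightness(patch):
    # Hypothetical stand-in for a trained ConvNet classifier.
    return float(patch.mean())

# Toy image: a bright 8x8 square on a dark background.
image = np.zeros((32, 32))
image[8:16, 12:20] = 1.0
best = detect(image, mean_brightness)
```

In this toy setup the same scoring function serves both classification (the score) and localization (the position and size of the best window), which is the intuition behind using one shared network for several tasks.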
The term Artificial Intelligence (A.I. for short) refers to the investigation and design of intelligent behavior in machines. The modern form of the field emerged in the 1950s. In that early period, as in other areas of technology, there was considerable optimism, but its promise could not be sustained over the following decades. Fundamental problems emerged that called the meaning and perception of Artificial Intelligence into question. Nevertheless, concrete successes in the design of credible intelligent systems were achieved as well.
Characteristic of this new skepticism is the scientific debate that later developed around the question of how machine intelligence can be demonstrated at all. In the early 1950s, Alan Turing proposed the test that was later named after him. In this test, a person communicates via a teleprinter with a counterpart without knowing whether it is a computer or a human being. If the counterpart is a computer and the test person nevertheless takes it for a human being, this would, according to Turing, be proof of machine intelligence.
The method of this test was criticized several times in the debate. In the 1960s, Joseph Weizenbaum created the ELIZA program, which was quite simply structured by today’s standards and could simulate a conversation partner in various roles. ELIZA, especially in the role of a psychotherapist, repeatedly managed to “deceive” its human conversation partners, to Weizenbaum’s own bemusement. The pseudo-therapeutic context suggested, however, that this was at least partly a projection on the human side: the test subjects read more into ELIZA’s remarks than was actually there.
There were also objections from a philosophical point of view: in 1980 John Searle formulated the Chinese room argument. In this thought experiment, a person with no knowledge of Chinese takes part. In the setting of a Turing test, this person sits in a room equipped with all imaginable books and resources on the Chinese language and culture. Another person puts questions to the test subject in Chinese and receives answers in Chinese. Searle’s argument is that the person in the Chinese room could read and answer the questions without “understanding” a word of Chinese. This is the same situation as that of a computer that receives and responds to input from a human user (in Chinese or not).
The machine does not “understand” what is going on; it works only on the basis of syntactic rules, without insight into the meaning of the content. Of course, linguistic and sinological objections can be raised against Searle’s thought experiment. Nevertheless, it raises a fundamental question: are intelligence and communication pure symbol processing on a syntactic basis, or is there a deeper semantic level of content and meaning?
Against the background of such fundamental problems, research and development in the field of Artificial Intelligence began to divide into two main strands. The first is Artificial General Intelligence, that is, the design of machines that can take on all conceivable tasks with intelligent methods. Its most extreme form is strong AI, which strives for the exact reproduction of human intelligence on a machine basis. The decisive assumption here is that if such a replica were equivalent to the model, the simulation would be identical with the original.
The second strand, narrow A.I., also known as weak or applied A.I., has been successful for a long time.
Here, machine intelligence is considered only in restricted and specialized areas. Examples of this are:
In recent decades, many of these areas have made great progress. Search engines master data mining and fuzzy search strategies, and mobile assistants in smartphones recognize the spoken word of their users almost flawlessly and access their search engines seamlessly. Robotics has become a part of artificial intelligence, on the assumption that well-founded knowledge of the world, as the basis of intelligent thinking and action, can only come about through physical contact with reality.
Artificial intelligence has therefore come a long way. While mobile assistants now communicate with human beings, the fundamental questions about intelligence and the distinction between simulation, syntactic symbol processing, and semantic understanding remain open.
While neural networks currently run as software implementations on classic CPU or GPU architectures, Intel goes one step further with its latest development and introduces a "neuromorphic" chip.
The Loihi chip has 128 cores with 1,024 artificial neurons each, or about 130,000 simulated neurons in total, with 130 million possible synaptic connections. This is roughly the complexity of a lobster’s brain, but far from the 80 billion neurons of a human brain. Learning is executed directly on the chip, which is supposed to make it up to 1,000 times faster than conventional processor-controlled architectures.
Practical application requires the development of appropriate algorithms. Therefore, the chip will initially be delivered to universities and research institutes at the beginning of 2018.
Google has released a new open-source tool named "Facets" for visualizing data sets for machine learning (https://research.googleblog.com/2017/07/facets-open-source-visualization-tool.html).
In this way, users can quickly understand and analyze the distribution of their data sets. The tool is built with Polymer and TypeScript and can be embedded in Jupyter notebooks and web pages.
We at aiso-lab are currently investigating whether the technology is also suitable for visualization in some of our products.
The Network Security Lab team at the University of Washington has discovered a supposed weakness of deep neural networks (https://arxiv.org/pdf/1703.06857v1.pdf). DNNs seem to have problems recognizing negative images.
If the network was trained to recognize an image in white, for example the digit three drawn in white, it has not thereby learned to recognize the same image in black. The researchers see this as a weakness in the ability of DNNs to generalize. A simple solution to this problem would be to generate the negative examples automatically for training.
However, the question arises how far such a generalization of the network is desired at all. For color images, it is important that the network does not ignore the color information. The network should generalize over, for example, different shades of red, but still be able to distinguish red from green. This shows the major role that careful selection of the training set and the learning function plays in training the network correctly for your own needs.
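The simple remedy mentioned above, adding automatically generated negatives to the training set, could look roughly like the following sketch. It assumes 8-bit grayscale images stored as NumPy arrays; the function names and the tiny example batch are made up for illustration and are not taken from the paper.

```python
import numpy as np

def negative(images, max_value=255):
    """Invert pixel intensities: white becomes black and vice versa."""
    return max_value - images

def augment_with_negatives(images, labels, max_value=255):
    """Append the negative of every image, keeping its label unchanged."""
    all_images = np.concatenate([images, negative(images, max_value)], axis=0)
    all_labels = np.concatenate([labels, labels], axis=0)
    return all_images, all_labels

# Tiny made-up batch: two 2x2 grayscale "images" with labels 3 and 7.
images = np.array([[[0, 255], [255, 0]],
                   [[10, 20], [30, 40]]], dtype=np.uint8)
labels = np.array([3, 7])
aug_images, aug_labels = augment_with_negatives(images, labels)
```

Whether this kind of augmentation is desirable depends on the task: for color images, inverting every channel would also swap the colors, which is exactly the information the network is often supposed to keep.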
In his latest paper, Yoshua Bengio—one of the world's leading AI professors at the Université de Montréal—established a link between deep learning and the concept of consciousness (https://arxiv.org/pdf/1709.08568.pdf).
The idea of a “Consciousness Prior” is inspired by the phenomenon of consciousness, defined as the formation of a low-dimensional combination of a few concepts constituting a conscious thought, i.e., consciousness manifests itself as awareness at a particular time or instant.
The notion is that the interim results generated by a Recurrent Neural Network (RNN) can be used to explain the past and to plan the future. The system does not act on the basis of input signals, such as images or texts, but rather controls the "consciousness" established by the information abstracted from input signals.
The Consciousness Prior could be used to translate information contained within trained neural networks back into natural language or into classical AI procedures with rules and facts. An implementation of this concept is not presented in the paper, but Bengio proposes integrating the approach into reinforcement learning systems.
Bengio’s ideas may very well lead the way to new frontiers in artificial intelligence. Time will tell whether his proposal is a revolutionary idea or just a “visionary” mind game.
New ImageNet results have been published. A team from Beijing has set a new top-5 classification record with an error rate of 2.25%. In the last two years, big companies such as Google, Microsoft and Facebook have not participated in this challenge.
In a new publication titled "Revisiting Unreasonable Effectiveness of Data in Deep Learning Era," Google describes the results of training a neural network on an image set with 300 times more images than ImageNet.
The conclusion was that even when the number of training examples is increased from 3 million to 300 million, the performance of the network continues to scale, growing logarithmically with the amount of training data. Even at 300 million images, no flattening of the learning curve was observed.
In addition, the trained network set a record in the COCO object detection benchmark. This result was achieved solely by increasing the amount of training data; no improvements were made to the model itself.
This is an impressive demonstration of the importance of big data in the context of deep learning. The best models can only be developed by companies that have the expertise to store and efficiently process enormous amounts of data.
aiso-lab is at your disposal as a competent partner for all challenges in the field of software and hardware.
Earlier this year, Google hosted a competition on Kaggle for YouTube video classification. Google provided 7 million videos with a total length of 450,000 hours, to be classified into 4,716 categories. The third-placed team, a group of researchers from Tsinghua University and Baidu, has recently published its approach.
With a 7-layer deep LSTM architecture, the team achieves a score of 82.75% on the Global Average Precision metric used in the competition.
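For readers unfamiliar with Global Average Precision, the following is a minimal sketch of how such a metric can be computed. It rests on assumptions that are not stated in this post: the top 20 predictions per video are pooled, sorted by confidence, and the precision at every hit is averaged over all ground-truth labels. The function name and the toy scores are hypothetical, and the official evaluation code may differ in details.

```python
import numpy as np

def global_average_precision(scores, labels, top_k=20):
    """scores, labels: arrays of shape (num_videos, num_classes); labels are 0 or 1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    k = min(top_k, scores.shape[1])
    # Keep only the k highest-scoring classes of each video.
    top_idx = np.argsort(-scores, axis=1)[:, :k]
    top_scores = np.take_along_axis(scores, top_idx, axis=1).ravel()
    top_hits = np.take_along_axis(labels, top_idx, axis=1).ravel()
    # Pool the predictions of all videos and sort them by confidence.
    order = np.argsort(-top_scores, kind="stable")
    hits = top_hits[order]
    total_positives = labels.sum()  # assumption: normalise by all ground-truth labels
    if total_positives == 0:
        return 0.0
    precision_at_i = np.cumsum(hits) / np.arange(1, hits.size + 1)
    # Recall increases by 1 / total_positives at every hit.
    return float(np.sum(precision_at_i * hits) / total_positives)

# Toy example with two videos and three classes (made-up numbers).
toy_scores = [[0.9, 0.1, 0.8],
              [0.2, 0.7, 0.6]]
toy_labels = [[1, 0, 0],
              [0, 0, 1]]
gap = global_average_precision(toy_scores, toy_labels, top_k=2)
```

In the toy example the pooled ranking is hit, miss, miss, hit, so the precision at the two hits is 1 and 0.5, which averages to 0.75.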
[Figure: architecture of the temporal residual CNN used. Source: Tsinghua University, 2017]