Understanding Artificial Intelligence through Algorithmic Information Theory

A MOOC by Institut Mines-Telecom available on the edX platform.

Can we characterize intelligent behavior?
Are there theoretical foundations on which
Artificial Intelligence can be grounded?
 This course on Algorithmic Information will offer you such a theoretical framework.
• You will be able to see machine learning, reasoning, mathematics, and even human intelligence as abstract computations aiming at compressing information.
• This new power of yours will not only help you understand what AI does (or can’t do!),
but also serve you as a guide to design AI systems.

Follow the course on the edX platform

 Kolmogorov complexity Randomness Aesthetics Coincidences

 Half a century ago, three mathematicians made the same discovery independently. They understood that the concept of information belonged to computer science; that computer science could say what information means. Algorithmic Information Theory was born.Algorithmic Information is what is left when all redundancy has been removed. This makes sense, as redundant content cannot add any useful information. Removing redundancy to extract meaningful information is something computer scientists are good at doing.Algorithmic information is a great conceptual tool. It describes what artificial intelligence actually does, and what it should do to make optimal choices. It also says what artificial intelligence can’t do. Algorithmic information is an essential component in the theoretical foundations of AI.

Follow the course on the edX platform

 Artificial Intelligence is more than just a collection of brilliant, innovative methods to solve problems. If you are interested in machine learning or are planning to explore it, the course will make you see artificial learning in an entirely new way. You will know how to formulate optimal hypotheses for a learning task. And you will be able to analyze learning techniques such as clustering or neural networks as just ways of compressing information. If you are interested in reasoning, you will understand that reasoning by analogy, reasoning by induction, explaining, proving, etc. are all alike; they all amount to providing more compact descriptions of situations. If you are interested in mathematics, you will be amazed at the fact that crucial notions such as probability and randomness can be redefined in terms of algorithmic information. You will also understand that there are theoretical limits to what artificial intelligence can do. If you are interested in human intelligence, you will find some intriguing results in this course. Thanks to algorithmic information, notions such as unexpectedness, interest and, to a certain extent, aesthetics, can be formally defined and computed, and this may change your views on what artificial intelligence can achieve in the future.

## What you will learn

• How to measure information through compression
• How to compare algorithmic information with Shannon’s information
• How to detect languages through joint compression
• How to use the Web to compute meaning similarity
• How probability and randomness can be defined in purely algorithmic terms
• How algorithmic information sets limits to the power of AI (Gödel’s theorem)

• a criterion to make optimal hypotheses in learning tasks
• a method to solve analogies and detect anomalies
• a new understanding of machine learning as a way to achieve compression

• why unexpected means abnormally simple
• why coincidences are unexpected
• why subjective information and interest are due to complexity drop
• why relevance, aesthetics, emotional intensity and humour rely on coding.

1. Caveat:    This course DOES NOT address the notion of "computational complexity" which measures the speed of algorithms.

Follow the course on the edX platform

If you have ever wanted to gain a deeper understanding of what machine learning actually does, if you are curious about what artificial intelligence can or cannot achieve, if you wish to know what the notion of information really means, not only in computer science but also in daily life, join us and enrol in this course.

## Intended audience

If you have a Bachelor’s degree in mathematics, in computer science or in engineering, and if you are interested in the fundamentals of computer science, then this course is for you.

### Prerequisites

Examples of what you are expected to know before embarking on this course:
• what a convex curve looks like,
• that $$\log(7^n)$$ is $$n \times \log(7)$$,
• that rational numbers have finite or periodic expansion,
• that rational numbers are countable, but that real numbers are not,
• that the probability of "A and B" is the probability of "A knowing B" times the probability of B,
• that 65 is 1000001 is in base 2 and 41 in base 16,
• how to compute the sum of a finite geometric series,
• that {'a':1, 'i':0} is a Python dictionary and why ('ab'*4)[::2] yields 'aaaa',
• that k-means is a clustering method,
• what Bayes’ theorem tells us,
• how Shannon’s information is related to probability,
• that what is called a Turing machine is NOT the machine that Alan Turing (Benedict Cumberbatch)
is using in the movie The imitation game.

Follow the course on the edX platform

## Course format

The Course alternates short texts, video clips, slideshows, small dialogs and ungraded exercises. A small evaluative quiz is proposed at the end of each chapter. There are also small interviews about ongoing research for those who are interested in knowing more.

## Syllabus

 Chapter 1 Describing Data Gentle introduction to description complexity, which is a central concept of the course. A few examples to begin with, and then a formal definition. examples:    complexity of passwords, complexity of π, coincidences Chapter 2 Algorithmic information & Compression How compression techniques or Web search engines can be used to approximate algorithmic information. examples:    language detection, semantic distance, Zipf’s law. Chapter 3 Algorithmic information & mathematics How theoretical computer science defines information, probability and randomness. Also: the power and limits of artificial intelligence (and the shortest proof of Gödel’s incompletude theorem). examples:    disjunctive ("universe") sequence, paradox of randomness, unprovable truths. Chapter 4 Algorithmic information & Machine learning How algorithmic information can guide machine learning by providing criteria for optimal learning. examples:    clustering as compression, best analogies, anomaly detection. Chapter 5 Algorithmic information & Cognitive Science How algorithmic information can improve artificial intelligence by defining notions such as subjective information, interest, relevance and even aesthetics. examples:    remarkable lottery draws, beautiful shapes, newsworthiness and, again, coincidences.

Follow the course on the edX platform

## Lecturers

Jean-Louis Dessalles is a senior lecturer at Telecom Paris (Institut Polytechnique de Paris).
In recent years, he has been developing a new theoretical framework called Simplicity Theory, which serves as a basis for modeling interest and relevance in human communication.

Pierre-Alexandre Murena (then PostDoc at Telecom Paris) is now researcher in machine learning in the PML group at Aalto University (Finland).

Do I need to be highly proficient in Python?

No. Basic knowledge of Python is sufficient. From time to time, you’ll have to use short Python programs, to understand them locally and to perform easy transformations.

I am feeling a bit rusty in maths. Is it a problem?

You need to feel comfortable with math concepts such as the ones listed above. If you are not sure about some of them ("oh, I knew that...!"), but are ready to refresh your memory using external Web resources such as Wikipedia, then make an attempt and visit the course. You will probably get most of it. Anyway, this is not a math course!

I already know a lot about Algorithmic Information. Will I learn something new?

Yes, definitely. This course includes original content. You may know mathematical aspects of AIT, but not its applications. Or the converse, you know for instance how to apply "MDL", but may not be fully aware of the underlying theoretical motivations. And you will be amazed to see how AIT applies to human intelligence!

## Proceed to the course.

The course is available on the edX platform.

Acknowledgments

This course was initially taught at Telecom Paris, a French "Grande Ecole", now part of Institut Polytechnique de Paris (and also the birthplace of the word "Telecommunication" some 120 years ago). It has been significantly improved and augmented to become a MOOC on the edX platform.

This upgrade into a MOOC was made possible under the lead of Institut Mines-Telecom, thanks to a grant of the Patrick and Lina Drahi Foundation.

Many thanks to Pierre-Alexandre Murena, who contributed to this MOOC during his PostDoc at Telecom-Paris.
Building this MOOC proved to be a challenging endeavor. Many thanks to Delphine Lalire (IMT), to Émilie Dupré (Mooncat studio) and to Bethany Cagnol Telecom Paris for all the efforts they put into bringing the MOOC to professional standards and making it a reality.
We are grateful to all the betatesters who helped improve the quality and accuracy of the content.
We hope you’ll appreciate the end result!

Follow the course on the edX platform

Watch
[in French]     [in French]

Alan Turing’s soul

 Alan Turing was just 24 years old when he defined the notion of computability. Mathematicians define all sorts of functions. Turing was able to prove that only a subclass of them is what he termed "computable". We could say that Computer Science was born then, on paper, thanks to Turing’s insight, ten years before actual computers were constructed.By the time the first operational computers were just running the same Alan Turing, ahead of his time again, started grappling with the question: “Can machines think?”, which in turn led to the founding of Artificial Intelligence.Another fifteen years later, in the 1960s, Andrei Kolmogorov and others extended Turing’s work on computability to redefine the notion of information. Algorithmic Information Theory (AIT) was born.This MOOC is about applying AIT to Artificial Intelligence. It owes everything to Turing’s initial ideas on mechanical computation and on mechanical intelligence. Two reasons, then, to feel Turing’s soul floating over its five chapters.

Follow the course on the edX platform