What is learning according to machine learning?

It is (for supervised learning) looking at numerous samples, decomposing them into input variables and their associated target variable, and deriving according to an algorithm how to predict the target variable given input variables.
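As a minimal sketch of that description, the snippet below splits a handful of made-up samples into an input variable and a target variable, then derives a predictor with an ordinary least-squares line fit. The data and the names `fit_line` and `predict` are assumptions chosen for illustration; any other learning algorithm could take the place of the line fit.

```python
# Hypothetical samples: each one is an (input variable, target variable) pair.
samples = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]


def fit_line(samples):
    """Derive slope and intercept of the least-squares line through the samples."""
    # Decompose the samples into inputs and targets.
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the target variable for a new input."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(samples)
prediction = predict(model, 5.0)  # an input that was never observed
print(model, prediction)
```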

It is the (potentially lossy) compression of the observed samples, where the learning algorithm describes the compression/decompression algorithm. The compressed data is the information necessary for the algorithm to make predictions (decompression).

It is the creation of some "memory" of the observed samples. Whereas an untrained model has no memory of the dataset since it hasn't seen the data, a trained model has some form of memory. A simple model such as sklearn's DummyRegressor (with its default strategy) will learn and memorize the mean of the target variable. It may not have learned and memorized much, but it has built its internal model of the data.
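To illustrate how little such a model remembers, here is a plain-Python stand-in for a mean-predicting baseline. `MeanRegressor` is a hypothetical class written for this sketch; it only mimics the behavior described above and is not sklearn's implementation.

```python
class MeanRegressor:
    """Hypothetical stand-in for a baseline that memorizes the target mean."""

    def __init__(self):
        self.mean_ = None  # untrained: no memory of any dataset

    def fit(self, X, y):
        # The whole "memory" of the dataset is this single number.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        if self.mean_ is None:
            raise RuntimeError("the model has not been trained yet")
        # The inputs are ignored: every prediction is the memorized mean.
        return [self.mean_ for _ in X]


X = [[1.0], [2.0], [3.0], [4.0]]
y = [10.0, 20.0, 30.0, 40.0]

model = MeanRegressor().fit(X, y)
predictions = model.predict([[100.0], [-5.0]])
print(model.mean_, predictions)
```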

It is to imitate as closely as possible the source of data it is trained on. This means that given input variables, it should produce target values that are as close as possible to those observed during training (learning).
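"As close as possible" needs a measure of closeness. A common choice, assumed here for illustration, is the mean squared error between the observed targets and the model's outputs. The two candidate models and the data below are made up.

```python
def mean_squared_error(targets, predictions):
    """Average squared gap between observed targets and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical training data produced by some source.
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [2.0, 4.0, 6.0, 8.0]


def constant_model(x):
    """Always answers with the mean of the targets."""
    return 5.0


def doubling_model(x):
    """Answers with twice the input."""
    return 2.0 * x


constant_error = mean_squared_error(targets, [constant_model(x) for x in inputs])
doubling_error = mean_squared_error(targets, [doubling_model(x) for x in inputs])
print(constant_error, doubling_error)
```

Under this measure the model with the lower error is the better imitation of the source on the training data.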

How does one determine the best material to learn a topic?

My approach has always been based on the wisdom of the crowd.

If I want to read a book on a given topic, I will first look for the books that are available and create a list. For each book on that list, I will then look at websites like Goodreads to gather people's opinions. I look for two factors: how many people have read the book, and its overall rating. The book should have been read by as many people as possible and have the highest rating. I will also read some of the low-rated reviews to get a sense of the negative feedback the book has received.

If I already have some knowledge about the topic I am studying, I might be looking for specific sub-topics to be covered. I would then inspect the table of contents of the books to determine if those sub-topics are covered. I may end up reading only a chapter of a book because it addresses something I'm interested in. The topic may also only be covered in this book.

In some cases, I will ask people who are more knowledgeable than I am to give me references. I use my knowledge of their expertise to determine the topics on which they can and cannot offer suggestions.

If there are multiple options of equally good books, then I will skim a few of them, maybe read a few passages to get a personal impression of the book. I prefer concise and non-repetitive content.

With my chosen book in hand, I will start to read through it from beginning to end. If at some point the content is less than ideal, I don't mind skipping it and looking for additional sources (e.g., books or websites) to fill those gaps.

16 Jan 2020

Identifying areas to improve

Questions

How can one identify the areas he needs to improve?

Through introspection.

As we go through our daily lives, there are events that we wish had unfolded better. As we notice these areas of weakness, it is important to write them down so that we can build a list of the areas we may want to spend time improving in the future.

Another way one can identify the areas he needs to improve is to use the Feynman technique.

  1. Choose a concept
  2. Pretend you are teaching it to someone else
  3. Identify gaps in your explanation, go back and learn some more, then try explaining some more
  4. Review and simplify

This technique can also be applied to the skills you lack. Instead of trying to teach a concept, you want to be able to explain the skill to someone else as accurately as possible. As you identify gaps in your description of how to perform the skill, you are documenting what you need to work on.

For example, you might want to determine which of your programming skills are lacking. You might start by asking yourself how you would design and implement things you use in your daily life: How is a text editor implemented? What happens when I type a URL and press enter in a browser? How are files read from a disk? How is this special effect rendered in a video game?

Make sure that you ask yourself questions that are relevant to the areas you want to improve. If you don't know how networking works, but that has no impact on you, then you do not need to address this weakness. Spend time improving skills that will be useful to you in the future.

Look for things that create friction in your life; those are generally places where you'll find potential for improvement.

Why are biology and genetics interesting to AGI researchers?

Because they may provide interesting ideas and clues that can help with the development of AGI.

We currently know of a single instance of a system that is able to produce human-level intelligence: a human being. AGI researchers often try to understand how specific components such as the brain work. A lot of valuable work on the neuron has led to the creation of the deep learning field. Deep learning has definitely proven its value, but I am more interested in something else.

Genetics is seen as the programming of life. What I find interesting is that we can see the current human DNA as our latest implementation of this code. Since this code did not come into existence out of nowhere, studying DNA's history can give us ideas as to how a seed AI might come to be. It is also useful to understand how the environment has shaped DNA's existence.

Initially, there were only atoms and molecules. Through different physical and chemical processes, these molecules aggregated and formed more and more complicated assemblies. Through a multitude of steps, we reached the point where there were cells that contained DNA. This process might have been entirely random, although it does not seem very likely that complex structures would form purely at random. The mechanisms or processes that helped create this order may be the equivalent of a pre-evolution natural selection, which makes them worth understanding.

My hope is that by studying such fields it is possible to discover how DNA increased in length, which steps and challenges forced it to increase in size, and what may have caused parts of DNA to change over time.

Just like a git repository, I'd like to be able to look at DNA's history and understand what happened to its code since its "Initial commit". It might also be interesting to figure out what kind of programmer nature is.

What are the differences between a brain and a CPU?

  • The brain is extremely parallel (each neuron processing many signals), while CPUs are currently limited to a few cores.
  • The brain appears to be able to do only a single thing at once (single process, single thread).
  • CPUs can explicitly control their memory access, while the brain's memory organization and access are unclear.
  • The brain is a lot slower in terms of sequential operations, processing at a maximum of 250-1000 Hz while current generation (2020) desktop CPUs are in the 3-5 GHz range.
  • The brain does not have a clear instruction set.
  • The brain consumes glucose for energy, while a CPU consumes electricity.
  • The human brain is much larger (on average 1273 cm³ for men and 1131 cm³ for women) than a CPU chip: the Intel Core i7-10710U package is 46 mm by 24 mm, and with a height that is unknown but definitely less than 10 mm, its volume is at most about 11 cm³.
  • Heat dissipation is done through cerebral circulation in the brain and through a heatsink attached to a CPU.
  • The brain is biodegradable, the CPU is not.
  • Signals are transmitted between neurons chemically, using neurotransmitters, while CPUs transmit signals between transistors electrically.
  • The organization of the brain evolves over time (in a single person), while a CPU chip will remain the same its whole life.
  • We currently cannot transplant a brain from one person to another, but we can transfer a CPU from one computer to another (as long as the motherboard is compatible).
  • The brain contains a large amount of memory, while the CPU has a small amount of memory and relies on larger memory stores (RAM, disks).
  • It is possible to reverse engineer a CPU by trying different combinations of inputs and recording the outputs, since its behavior is immutable. Doing the same with a part of the brain may yield different results, as the brain is mutable.
  • The brain may not have different levels of memory cache (we do however talk about short and long term memory).
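The reverse-engineering point in the list above can be sketched with two toy black boxes. `cpu_gate` stands in for a fixed circuit, and `AdaptiveUnit` is a hypothetical stateful unit whose response depends on how often it has been stimulated. Neither is a model of real hardware or real neurons; they only contrast immutable and mutable behavior under the same probing procedure.

```python
from itertools import product


def cpu_gate(a, b):
    """Stateless stand-in for a fixed circuit (here, an XOR gate)."""
    return a ^ b


class AdaptiveUnit:
    """Toy stateful unit: every stimulation changes its future responses.

    Purely illustrative; not a model of real neurons.
    """

    def __init__(self):
        self.stimulations = 0

    def respond(self, a, b):
        self.stimulations += 1
        # Responds to a non-zero input only after more than two stimulations.
        return int((a or b) and self.stimulations > 2)


def probe(black_box):
    """Record the output for every combination of two binary inputs."""
    return {inputs: black_box(*inputs) for inputs in product((0, 1), repeat=2)}


# Probing the immutable gate twice yields the same table both times.
first_cpu_table = probe(cpu_gate)
second_cpu_table = probe(cpu_gate)

# Probing the mutable unit twice yields two different tables.
unit = AdaptiveUnit()
first_unit_table = probe(unit.respond)
second_unit_table = probe(unit.respond)

print(first_cpu_table == second_cpu_table)
print(first_unit_table == second_unit_table)
```

Because probing itself changes the mutable unit, a recorded table describes only the state it was in while it was being measured.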