What simple questions can be used to think about a problem differently?
I learned to solve problems by thinking about what I already knew. I would look for parts of the solution that might make sense, then piece those parts together to get from the problem to the solution.
While I was doing some hobbyist research on AGI, I read George Pólya's classic "How to Solve It". Pólya covers many of the questions I would ask myself implicitly, so I was glad to see that someone had taken the time to write them down in a reusable format.
His four steps (1. understanding the problem, 2. devising a plan, 3. carrying out the plan and 4. looking back) lead you to ask yourself what you know and don't know about the problem you are trying to solve. Doing so is similar to the Feynman technique, where you attempt to teach what you know to someone else and, in doing so, discover the parts of your explanation that need improvement.
What I liked the most about Pólya's approach was that he was comfortable working with partial solutions. If you couldn't figure out how to get to the end, it was still important to put forward all the tools you had at your disposal in an attempt to solve the problem. This way you would get an idea of what was lacking in your solution.
- What is not known yet?
- What data do you have?
- What conditions are there?
- Is it possible to satisfy all those conditions?
- Have you seen this problem before?
- Have you seen a similar problem in a slightly different form?
- Do you know any related problems?
- Do you know something that could be useful to solve this problem?
- If you cannot solve this problem yet, can you solve a related problem?
- Can you see clearly the steps from beginning to end?
- Can you prove that your approach is correct?
- Can you check your solution?
- Can you get to your solution differently?
- Can you use your solution to solve other problems?
What are the tools I use daily that I could contribute to?
- Visual Studio Code: My text editor of choice when I don't need an IDE. I've implemented a few plugins for VS Code and use it daily to write personal notes as well as blog articles. I sometimes use its diff tool to merge changes from Sourcetree.
- PyCharm: My IDE of choice when I write Python. Very powerful, and easy to get used to if you've used other JetBrains IDEs (I've used PhpStorm for more than 5 years). Extremely useful for running a specific unit test with pytest or for debugging a complex issue by setting breakpoints and inspecting the internal state of the program.
- Docker: I use Docker at work to containerize all of our dependencies so that it is somewhat easy to deploy what we develop in "any" environment, that is, an environment where Docker (or similar, such as Kubernetes) is installed.
- Drone CI: We use Drone CI at work for continuous integration and I consider this tool to be an essential part of my daily work. When people push code to GitHub and it fails on Drone CI, I can use this information to help them fix their issues. We also use it as part of our PR process to ensure that a PR passes all the expected tests, so that we do not introduce faulty code into our master branch.
- Dependabot: A few months ago I introduced Dependabot into my dependency management practices. It was highly useful to get automated PRs with updates to libraries we depended on. However, since the release of poetry 1.0.0, Dependabot has not been able to update my Python dependencies and has been left unused. I've created a PR which I hope will move this issue forward and get Dependabot working again with poetry.
- Plotly: I used to use the Highcharts plotting library until someone at work introduced me to Plotly. Plotly.js is open source software released under the MIT license, which makes it an ideal library to use in personal as well as commercial software.
- Pandas: I do machine learning development for a living nowadays and I depend highly on pandas. I don't think there's a single work day that goes by in which I don't whip out at least one
- Scikit-learn: Similar to my dependency on pandas, I depend on Scikit-learn on a daily basis. Unlike what the DummyRegressor documentation suggests, I use it for real problems and it's definitely useful!
- Dask / Distributed: To scale machine learning problems both horizontally and vertically, I've leaned on dask and distributed. Their use of delayed and Futures has made it simple to migrate simple for-loop code into highly distributed tasks which can be monitored through a Bokeh dashboard.
- pytest: Who writes code without testing it? pytest is the PHPUnit of Python for me, an essential component that I use daily to ensure that code doesn't regress more than it needs to.
- mypy: Python's typing system is pretty weak in my opinion. I miss PHP's typing system, as well as its visibility system. Running mypy is similar to doing a compilation pass that verifies that the arguments given to a function match the types specified in its signature. It is useful for detecting mistakes in the arguments being passed to a function.
- isort: I'm a tidy man. I like when my imports are ordered alphabetically. That's what isort is there for.
- black: I don't particularly like discussing code style with others because everyone has their own quirks, and creating a code style that everyone agrees on is as difficult as agreeing on whether tabs or spaces should be used. black is highly opinionated and doesn't allow for much to be tweaked, and its style is sensible, even if it can sometimes drive you crazy.
- prospector: Prospector allows you to run a variety of linters on your code, which is quite useful when you like your code to be as standard and pretty as I like it. Some of the tools also measure code complexity, which helps you identify nightmares before they're in the master branch.
- poetry: I've used pip, I've used pipenv, requirements.txt, setup.py, etc. I didn't like setup.py because, ever since I used composer (for PHP), I've seen dependency management as something that shouldn't require code to define. I didn't like the requirements.txt/.lock variants because it was never clear how those files were generated or whether they were kept up to date together: you could use requirements.txt for soft dependencies and requirements.lock for hard dependencies that you had to freeze yourself (which many people didn't know about). pipenv's Pipfile was alright, but adding dependencies seemed to take longer and longer, which wasn't a pleasant experience, especially when the package you wanted to add didn't play nicely with the other packages. poetry was the closest experience I got to composer.
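To illustrate the remark in the Dask / Distributed item about migrating a simple for loop into futures, here is a minimal sketch. It deliberately uses the standard library's concurrent.futures as a stand-in rather than dask itself, so it runs without extra dependencies; the submit-then-gather pattern is similar in spirit to what dask's futures interface offers. The function `square` and the helper names are hypothetical and only serve the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def square(x: int) -> int:
    """Hypothetical unit of work standing in for a real task."""
    return x * x


def run_sequentially(values: List[int]) -> List[int]:
    # The original "simple for loop" version: one call after another.
    results = []
    for value in values:
        results.append(square(value))
    return results


def run_with_futures(values: List[int], max_workers: int = 4) -> List[int]:
    # Same loop body, but each call is submitted as a future
    # and the results are gathered afterwards, in submission order.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(square, value) for value in values]
        return [future.result() for future in futures]


if __name__ == "__main__":
    data = list(range(5))
    print(run_sequentially(data))
    print(run_with_futures(data))
```

The point of the sketch is that the loop body stays untouched; only the way calls are scheduled and collected changes, which is what makes this kind of migration cheap.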
Is a programmer commentary something that can provide value?
A few years ago I was learning how to drive. One of the techniques they teach to help you learn the basics of driving and reduce the stress that comes with it is to comment on what you are doing: I'm looking at this sign that says the maximum speed is 50 km/h, I am looking in my mirror, I am signaling that I want to move to the right lane, I'm checking my right blind spot, etc. The value of this technique is that it makes what you are doing in your head explicit by saying it out loud. There's a similar technique called pointing and calling that is used by Japanese and Chinese railway drivers in order to maintain focus and attention.
When we're pair programming, we often tell the other person what our current intent is and what we're trying to write down. It may help us clarify our objectives or clearly express the various steps we'll have to go through. In the case of pair programming, one of the benefits is that someone else is there to double-check what you are doing and what you're about to do.
In the context where you are programming by yourself, you can still benefit from talking out loud about what you are doing. If you are willing, you can record yourself and possibly even use speech-to-text technologies to create a log of what you were doing at the time. This can be useful to recall what you were working on if you are interrupted at some point. It can also be a good way to recover the context of a piece of code if you need to work on it again.
Writing down comments can also be useful. However, a lot of the thinking that goes into programming is only transient, and if it were all written as comments, it would make the code more difficult to read.
How do checklists help to avoid mistakes?
Checklists are designed to ensure that the most important things, whether items or steps in a process, are not forgotten.
A process checklist lists all the steps that are critical to the completion of a process. It makes clear which steps need to be done and verified, and in what order. Checklists are a way to communicate to new team members what the existing team processes are and what is expected of them. As steps are completed, they are checked off to indicate that the task was done or verified, which can serve as a progress indicator when a process takes a while to accomplish. What is also great about checklists is that they can be shared between teams as a way to share knowledge about processes.
If you often make mistakes in certain processes you do throughout the day, you can easily create a checklist that lists all the important things that you need to double-check when you do that process. You can then go through that list while completing the process or after completing it to ensure that all the important aspects of the process were correctly done.
Another benefit of a checklist is that it allows you to offload information that you would otherwise have to keep in your head at all times. A checklist for a process you do very infrequently can be an amazing time-saving tool, since you do not need to refresh your memory on all the important steps of that process.
Anything that you can memorize, you should write down so that you can consult it later and improve it over time.
How does one keep a library organized if people are moving books improperly at a certain rate?
In order to keep the library as ordered as possible, librarians should suggest that books that have been removed from the shelves be put onto carts or tables, from which they will be properly shelved again. Even when an individual thinks they have properly shelved a book, they might actually have introduced a slight error in the ordering of the books. If enough disorder is created this way, it becomes more challenging for librarians to keep the books ordered.
We know from computer science that efficient comparison-based sorting algorithms have a complexity of O(n log n). For a nearly sorted initial order, insertion sort is often considered the best choice, as its running time gets close to O(n) (while it is O(n^2) in the average and worst cases).
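As a rough illustration of why insertion sort suits a nearly sorted shelf, here is a minimal sketch that sorts a list and counts how many element shifts were needed. The shelf contents are made-up integers standing in for book positions, and the shift counter is an addition for demonstration purposes, not part of the algorithm itself.

```python
from typing import List, Tuple


def insertion_sort(items: List[int]) -> Tuple[List[int], int]:
    """Return a sorted copy of items and the number of element shifts performed."""
    result = list(items)
    shifts = 0
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Move larger elements one slot to the right until current fits.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
            shifts += 1
        result[j + 1] = current
    return result, shifts


# Hypothetical shelves: one with two misplaced neighbours, one fully reversed.
nearly_sorted = [1, 2, 4, 3, 5, 6, 8, 7, 9, 10]
reversed_order = list(range(10, 0, -1))

print(insertion_sort(nearly_sorted)[1])   # 2 shifts
print(insertion_sort(reversed_order)[1])  # 45 shifts
```

The number of shifts equals the number of out-of-order pairs, so a shelf with only a few misplaced books costs little more than a single pass, while a fully reversed shelf of n books needs n(n-1)/2 shifts.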